Plant Proteomics

Jesus V. Jorrin-Novo, Luis Valledor, Mari Angeles Castillejo, Maria-Dolores Rey
Editors

Methods and Protocols
Third Edition

Methods in Molecular Biology 2139

METHODS IN MOLECULAR BIOLOGY

Series Editor
John M. Walker

School of Life and Medical Sciences, University of Hertfordshire, Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Plant Proteomics

Methods and Protocols

Third Edition

Edited by

Jesus V. Jorrin-Novo

Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, Cordoba, Spain

Luis Valledor

Department of Organisms and Systems Biology, Institute of Biotechnology of Asturias, University of Oviedo, Oviedo, Asturias, Spain

Mari Angeles Castillejo

Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba UCO-CeiA3, Cordoba, Cordoba, Spain

Maria-Dolores Rey

Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, Cordoba, Spain

ISSN 1064-3745 ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-0716-0527-1 ISBN 978-1-0716-0528-8 (eBook)
https://doi.org/10.1007/978-1-0716-0528-8

© Springer Science+Business Media, LLC, part of Springer Nature 2020
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Preface

You now have in your hands the third edition of Plant Proteomics: Methods and Protocols, preceded by the first edition in 2007 (M. Zivy, C. Damerval, and V. Mechin, eds.) and the second one in 2014 (J. V. Jorrin Novo, S. Komatsu, W. Weckwerth, and S. Wienkoop, eds.). The success of the previous editions and the continuous advances and improvements in proteomic techniques, equipment, and bioinformatics tools, and their uses in basic and translational plant biology research, that have occurred in the past 5 years encouraged Humana Press to prepare a new updated version. Under the title Advances in Proteomics Techniques, Data Validation, and Integration with Other Classic and -Omics Approaches in the Systems Biology Direction, it contains 29 chapters written by internationally recognized scientists.

The monograph, which starts with an introductory chapter (Chapter 1), is a compilation of protocols commonly employed in plant biology research. They show recent advances at all workflow stages, starting from the laboratory (tissue and cell fractionation, protein extraction, depletion, purification, separation, MS analysis, quantification) and ending on the computer (algorithms for protein identification and quantification, bioinformatics tools for data analysis, databases and repositories).

Out of the 29 chapters, 6 are devoted to descriptive proteomics, with a special emphasis on subcellular protein profiling (Chapters 5–10), 6 to PTMs (Chapters 11 and 14–18), 3 to protein interactions (Chapters 19–21), and 2 to specific proteins, peroxidases (Chapter 24) and proteases and protease inhibitors (Chapter 26). The book reflects the new trajectory in MS-based protein identification and quantification, moving from the classic gel-based approaches to the most recent labeling (Chapters 10, 11, 29), shotgun (Chapters 5, 7, 12, 15), parallel reaction monitoring (Chapter 16), and targeted data acquisition (Chapter 13). MS imaging (Chapter 25), the only in vivo MS-based proteomics strategy, is far from being fully optimized and exploited in plant biology research. Confident protein identification and quantitation, especially in orphan species and for low-abundance proteins, is still a challenging topic (Chapters 4, 28).

This edition also gives a novel point of view to the proteomics approach with the description of different protocols for proteomics data validation and integration with other classic and -omics approaches in the systems biology direction. Chapter 2 reports on multiple extractions in a single experiment of the different biomolecules: nucleic acids, proteins, and metabolites. Chapter 27 describes how metabolic pathways can be reconstructed from multiple -omics data, and Chapter 3 is on network building. Finally, Chapters 22 and 23 deal with, respectively, the search for allele-specific proteins and proteogenomics.

Keeping in mind the history and evolution of proteomics, it is quite probable that the fourth edition will be published in a few years, as we are still at the beginning of deciphering the plant proteome to understand the central dogma of molecular biology in terms of proteins and to exploit the potential of the technique for translational purposes.

Cordoba, Spain    Jesus V. Jorrin-Novo
Oviedo, Spain    Luis Valledor
Cordoba, Spain    Mari Angeles Castillejo
Cordoba, Spain    Maria-Dolores Rey

Contents

Preface
Contributors

1 What Is New in (Plant) Proteomics Methods and Protocols: The 2015–2019 Quinquennium
Jesus V. Jorrin-Novo

2 Multiple Biomolecule Isolation Protocol Compatible with Mass Spectrometry and Other High-Throughput Analyses in Microalgae
Francisco Colina, María Carbó, Ana Alvarez, Mónica Meijón, María Jesus Canal, and Luis Valledor

3 Protein Interaction Networks: Functional and Statistical Approaches
Mónica Escandón, Laura Lamelas, Víctor Roces, Víctor M. Guerrero-Sanchez, Mónica Meijón, and Luis Valledor

4 Specific Protein Database Creation from Transcriptomics Data in Nonmodel Species: Holm Oak (Quercus ilex L.)
Víctor M. Guerrero-Sanchez, Ana M. Maldonado-Alconada, Rosa Sanchez-Lucas, and Maria-Dolores Rey

5 Subcellular Proteomics in Conifers: Purification of Nuclei and Chloroplast Proteomes
Laura Lamelas, Lara García, María Jesus Canal, and Mónica Meijón

6 Apoplastic Fluid Preparation from Arabidopsis thaliana Leaves Upon Interaction with a Nonadapted Powdery Mildew Pathogen
Ryohei Thomas Nakano, Nobuaki Ishihama, Yiming Wang, Junpei Takagi, Tomohiro Uemura, Paul Schulze-Lefert, and Hirofumi Nakagami

7 Shotgun Proteomics of Plant Plasma Membrane and Microdomain Proteins Using Nano-LC-MS/MS
Daisuke Takahashi, Bin Li, Takato Nakayama, Yukio Kawamura, and Matsuo Uemura

8 A Protocol for the Plasma Membrane Proteome Analysis of Rice Leaves
Ravi Gupta, Yu-Jin Kim, and Sun Tae Kim

9 Isolation, Purity Assessment, and Proteomic Analysis of Endoplasmic Reticulum
Xin Wang and Setsuko Komatsu

10 Dimethyl Labeling-Based Quantitative Proteomics of Recalcitrant Cocoa Pod Tissue
Yoel Esteve-Sanchez, Jaime A. Morante-Carriel, Ascensión Martínez-Marquez, Susana Selles-Marchart, and Roque Bru-Martinez

11 Quantitative Profiling of Protein Abundance and Phosphorylation State in Plant Tissues Using Tandem Mass Tags
Gaoyuan Song, Christian Montes, and Justin W. Walley

12 Optimizing Shotgun Proteomics Analysis for a Confident Protein Identification and Quantitation in Orphan Plant Species: The Case of Holm Oak (Quercus ilex)
Isabel Gómez-Galvez, Rosa Sanchez-Lucas, Bonoso San-Eufrasio, Luis Enrique Rodríguez de Francisco, Ana M. Maldonado-Alconada, Carlos Fuentes-Almagro, and Mari Angeles Castillejo

13 Combining Targeted and Untargeted Data Acquisition to Enhance Quantitative Plant Proteomics Experiments
Gene Hart-Smith

14 A Phosphoproteomic Analysis Pipeline for Peels of Tropical Fruits
Janet Juarez-Escobar, Jose M. Elizalde-Contreras, Víctor M. Loyola-Vargas, and Eliel Ruiz-May

15 Label-Free Quantitative Phosphoproteomics for Algae
Megan M. Ford, Sheldon R. Lawrence II, Emily G. Werth, Evan W. McConnell, and Leslie M. Hicks

16 Targeted Quantification of Phosphopeptides by Parallel Reaction Monitoring (PRM)
Sara Christina Stolze and Hirofumi Nakagami

17 Enrichment of N-Linked Glycopeptides and Their Identification by Complementary Fragmentation Techniques
Eduardo Antonio Ramirez-Rodriguez and Joshua L. Heazlewood

18 High-Resolution Lysine Acetylome Profiling by Offline Fractionation and Immunoprecipitation
Jonas Giese, Ines Lassowskat, and Iris Finkemeier

19 A Versatile Workflow for the Identification of Protein–Protein Interactions Using GFP-Trap Beads and Mass Spectrometry-Based Label-Free Quantification
Guillaume Nee, Priyadarshini Tilak, and Iris Finkemeier

20 In Vivo Cross-Linking to Analyze Transient Protein–Protein Interactions
Heidi Pertl-Obermeyer and Gerhard Obermeyer

21 Proteome Analysis of 14-3-3 Targets in Tomato Fruit Tissues
Yongming Luo, Yu Lu, Junji Yamaguchi, and Takeo Sato

22 The Use of Proteomics in Search of Allele-Specific Proteins in (Allo)polyploid Crops
Sebastien Christian Carpentier

23 Methods for Optimization of Protein Extraction and Proteogenomic Mapping in Sweet Potato
Thualfeqar Al-Mohanna, Norbert T. Bokros, Nagib Ahsan, George V. Popescu, and Sorina C. Popescu

24 In Silico Analysis of Class III Peroxidases: Hypothetical Structure, Ligand Binding Sites, Posttranslational Modifications, and Interaction with Substrates
Sabine Luthje and Kalaivani Ramanathan

25 MALDI Mass Spectrometry Imaging of Peptides in Medicago truncatula Root Nodules
Caitlin Keller, Erin Gemperline, and Lingjun Li

26 Cystatin Activity–Based Protease Profiling to Select Protease Inhibitors Useful in Plant Protection
Marie-Claire Goulet, Frank Sainsbury, and Dominique Michaud

27 A Pipeline for Metabolic Pathway Reconstruction in Plant Orphan Species
Cristina López-Hidalgo, Mónica Escandón, Luis Valledor, and Jesus V. Jorrin-Novo

28 Detection of Plant Low-Abundance Proteins by Means of Combinatorial Peptide Ligand Library Methods
Egisto Boschetti and Pier Giorgio Righetti

29 iTRAQ-Based Proteomic Analysis of Rice Grains
Marouane Baslam, Kentaro Kaneko, and Toshiaki Mitsui

Index

Contributors

NAGIB AHSAN • Division of Biology and Medicine, COBRE Center for Cancer Research Development, Proteomics Core Facility, Rhode Island, USA Hospital, Providence, Brown University, Providence, RI, USA; Division of Biology and Medicine, Brown University, Providence, RI, USA

THUALFEQAR AL-MOHANNA • Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS, USA

ANA ALVAREZ • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

MAROUANE BASLAM • Department of Biochemistry, Faculty of Agriculture, Niigata University, Niigata, Japan

NORBERT T. BOKROS • Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS, USA

EGISTO BOSCHETTI • Scientific Consultant, JAM Conseil, Neuilly-sur-Seine, France

ROQUE BRU-MARTINEZ • Plant Proteomics and Functional Genomics Group, Department of Agrochemistry and Biochemistry, Faculty of Sciences, University of Alicante, Alicante, Spain

MARIA JESUS CANAL • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

MARIA CARBO • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

SEBASTIEN CHRISTIAN CARPENTIER • SYBIOMA: Facility for Systems Biology-Based Mass Spectrometry, KULeuven, Leuven, Belgium; Bioversity International, Genetic Resources, Leuven, Belgium

MARI ANGELES CASTILLEJO • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

FRANCISCO COLINA • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

LUIS ENRIQUE RODRIGUEZ DE FRANCISCO • Laboratorio de Biología, Instituto Tecnológico de Santo Domingo, Santo Domingo, Republica Dominicana

JOSE M. ELIZALDE-CONTRERAS • Red de Estudios Moleculares Avanzados, Cluster Científico y Tecnológico BioMimic®, Instituto de Ecología A.C. (INECOL), Veracruz, Mexico

MONICA ESCANDON • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

YOEL ESTEVE-SANCHEZ • Plant Proteomics and Functional Genomics Group, Department of Agrochemistry and Biochemistry, Faculty of Sciences, University of Alicante, Alicante, Spain

IRIS FINKEMEIER • Plant Physiology, Institute of Plant Biology and Biotechnology, University of Munster, Munster, Germany

MEGAN M. FORD • Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

CARLOS FUENTES-ALMAGRO • Proteomics Facility, SCAI, University of Cordoba, Cordoba, Spain

LARA GARCIA • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

ERIN GEMPERLINE • Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA

JONAS GIESE • Plant Physiology, Institute of Plant Biology and Biotechnology, University of Munster, Munster, Germany

ISABEL GOMEZ-GALVEZ • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

MARIE-CLAIRE GOULET • Centre de Recherche et d’Innovation sur les Vegetaux, Universite Laval, Quebec, QC, Canada

VICTOR M. GUERRERO-SANCHEZ • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

RAVI GUPTA • Department of Plant Biosciences, Life and Energy Convergence Research Institute, Pusan National University, Miryang, South Korea

GENE HART-SMITH • Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia

JOSHUA L. HEAZLEWOOD • School of BioSciences, The University of Melbourne, Parkville, VIC, Australia

LESLIE M. HICKS • Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

NOBUAKI ISHIHAMA • RIKEN Center for Sustainable Resource Science, Yokohama, Japan

JESUS V. JORRIN-NOVO • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

JANET JUAREZ-ESCOBAR • Red de Estudios Moleculares Avanzados, Cluster Científico y Tecnológico BioMimic®, Instituto de Ecología A.C. (INECOL), Veracruz, Mexico

KENTARO KANEKO • Graduate School of Science and Technology, Niigata University, Niigata, Japan

YUKIO KAWAMURA • United Graduate School of Agricultural Sciences, Iwate University, Morioka, Japan; Department of Plant-bioscience, Faculty of Agriculture, Iwate University, Morioka, Japan

CAITLIN KELLER • Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA

SUN TAE KIM • Department of Plant Biosciences, Life and Energy Convergence Research Institute, Pusan National University, Miryang, South Korea

YU-JIN KIM • Graduate School of Biotechnology and Crop Biotech Institute, Kyung Hee University, Yongin, South Korea

SETSUKO KOMATSU • Faculty of Environmental and Information Sciences, Fukui University of Technology, Fukui, Japan

LAURA LAMELAS • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

INES LASSOWSKAT • Plant Physiology, Institute of Plant Biology and Biotechnology, University of Munster, Munster, Germany

SHELDON R. LAWRENCE II • Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

BIN LI • United Graduate School of Agricultural Sciences, Iwate University, Morioka, Japan

LINGJUN LI • Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA; School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA

CRISTINA LOPEZ-HIDALGO • Plant Physiology, Department of Organisms and Systems Biology, University Institute of Biotechnology of Asturias (IUBA), University of Oviedo, Oviedo, Asturias, Spain

VICTOR M. LOYOLA-VARGAS • Unidad de Bioquímica y Biología Molecular de Plantas, Centro de Investigación Científica de Yucatan (CICY), Merida, Yucatan, Mexico

YONGMING LUO • Faculty of Science and Graduate School of Life Science, Hokkaido University, Sapporo, Japan

SABINE LUTHJE • Oxidative Stress and Plant Proteomics Group, Institute for Plant Science and Microbiology, University of Hamburg, Hamburg, Germany

YU LU • Faculty of Science and Graduate School of Life Science, Hokkaido University, Sapporo, Japan; Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan

ANA M. MALDONADO-ALCONADA • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

ASCENSION MARTINEZ-MARQUEZ • Plant Proteomics and Functional Genomics Group, Department of Agrochemistry and Biochemistry, Faculty of Sciences, University of Alicante, Alicante, Spain

EVAN W. MCCONNELL • Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

MONICA MEIJON • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

DOMINIQUE MICHAUD • Centre de Recherche et d’Innovation sur les Vegetaux, Universite Laval, Quebec, QC, Canada

TOSHIAKI MITSUI • Department of Biochemistry, Faculty of Agriculture, Niigata University, Niigata, Japan; Graduate School of Science and Technology, Niigata University, Niigata, Japan

CHRISTIAN MONTES • Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA

JAIME A. MORANTE-CARRIEL • Biotechnology and Molecular Biology Group, Quevedo State Technical University, Quevedo, Ecuador

HIROFUMI NAKAGAMI • Protein Mass Spectrometry Group, Max Planck Institute for Plant Breeding Research, Cologne, Germany

RYOHEI THOMAS NAKANO • Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany; Cluster of Excellence on Plant Sciences (CEPLAS), Max Planck Institute for Plant Breeding Research, Cologne, Germany

TAKATO NAKAYAMA • Department of Plant-bioscience, Faculty of Agriculture, Iwate University, Morioka, Japan

GUILLAUME NEE • Plant Physiology, Institute of Plant Biology and Biotechnology, University of Munster, Munster, Germany

GERHARD OBERMEYER • Department of Biosciences, Membrane Biophysics, Paris-Lodron-University of Salzburg, Salzburg, Austria

HEIDI PERTL-OBERMEYER • Department of Biosciences, Membrane Biophysics, Paris-Lodron-University of Salzburg, Salzburg, Austria

GEORGE V. POPESCU • Institute for Genomics, Biocomputing, and Biotechnology, Mississippi State University, Mississippi State, MS, USA; The National Institute for Laser, Plasma & Radiation Physics, Bucharest, Romania

SORINA C. POPESCU • Department of Biochemistry, Molecular Biology, Entomology, and Plant Pathology, Mississippi State University, Mississippi State, MS, USA

KALAIVANI RAMANATHAN • Oxidative Stress and Plant Proteomics Group, Institute for Plant Science and Microbiology, University of Hamburg, Hamburg, Germany

EDUARDO ANTONIO RAMIREZ-RODRIGUEZ • School of BioSciences, The University of Melbourne, Parkville, VIC, Australia

MARIA-DOLORES REY • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

PIER GIORGIO RIGHETTI • Miles Gloriosus Academy, Milan, Italy

VICTOR ROCES • Plant Physiology, Department of Organisms and Systems Biology and University Institute of Biotechnology (IUBA), University of Oviedo, Oviedo, Spain

ELIEL RUIZ-MAY • Red de Estudios Moleculares Avanzados, Cluster Científico y Tecnológico BioMimic®, Instituto de Ecología A.C. (INECOL), Veracruz, Mexico

FRANK SAINSBURY • Centre de Recherche et d’Innovation sur les Vegetaux, Universite Laval, Quebec, QC, Canada; Griffith Institute for Drug Discovery, Griffith University, Brisbane, QLD, Australia

ROSA SANCHEZ-LUCAS • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

BONOSO SAN-EUFRASIO • Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain

TAKEO SATO • Faculty of Science and Graduate School of Life Science, Hokkaido University, Sapporo, Japan

PAUL SCHULZE-LEFERT • Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany; Cluster of Excellence on Plant Sciences (CEPLAS), Max Planck Institute for Plant Breeding Research, Cologne, Germany

SUSANA SELLES-MARCHART • Plant Proteomics and Functional Genomics Group, Department of Agrochemistry and Biochemistry, Faculty of Sciences, University of Alicante, Alicante, Spain

GAOYUAN SONG • Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA

SARA CHRISTINA STOLZE • Protein Mass Spectrometry Group, Max Planck Institute for Plant Breeding Research, Cologne, Germany

JUNPEI TAKAGI • Faculty of Science and Engineering, Konan University, Kobe, Japan

DAISUKE TAKAHASHI • Central Infrastructure Group: Genomics and Transcript Profiling, Max-Planck Institute of Molecular Plant Physiology, Potsdam, Germany; United Graduate School of Agricultural Sciences, Iwate University, Morioka, Japan; Graduate School of Science and Engineering, Saitama University, Saitama, Japan

PRIYADARSHINI TILAK • Plant Physiology, Institute of Plant Biology and Biotechnology, University of Munster, Munster, Germany

MATSUO UEMURA • United Graduate School of Agricultural Sciences, Iwate University, Morioka, Japan; Department of Plant-bioscience, Faculty of Agriculture, Iwate University, Morioka, Japan

TOMOHIRO UEMURA • Graduate School of Humanities and Sciences, Ochanomizu University, Tokyo, Japan

LUIS VALLEDOR • Department of Organisms and Systems Biology, Institute of Biotechnology of Asturias, University of Oviedo, Oviedo, Asturias, Spain

JUSTIN W. WALLEY • Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA

XIN WANG • College of Agronomy and Biotechnology, China Agricultural University, Beijing, China

YIMING WANG • Department of Plant Microbe Interactions, Max Planck Institute for Plant Breeding Research, Cologne, Germany; Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China

EMILY G. WERTH • Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

JUNJI YAMAGUCHI • Faculty of Science and Graduate School of Life Science, Hokkaido University, Sapporo, Japan

Chapter 1

What Is New in (Plant) Proteomics Methods and Protocols: The 2015–2019 Quinquennium

Jesus V. Jorrin-Novo

Abstract

The third edition of “Plant Proteomics Methods and Protocols,” with the title “Advances in Proteomics Techniques, Data Validation, and Integration with Other Classic and -Omics Approaches in the Systems Biology Direction,” was prompted by the success of the previous editions and by the continuous advances and improvements in proteomic techniques, equipment, and bioinformatics tools, and their uses in basic and translational plant biology research, that have occurred in the past 5 years (in round figures, of around 22,000 publications referenced in WoS, 2000 were devoted to plants).

The monograph contains 29 chapters with detailed proteomics protocols commonly employed in plant biology research. They present recent advances at all workflow stages, starting from the laboratory (tissue and cell fractionation, protein extraction, depletion, purification, separation, MS analysis, quantification) and ending on the computer (algorithms for protein identification and quantification, bioinformatics tools for data analysis, databases and repositories). At the end of each chapter there are enough explanatory notes and comments to make the protocols easily applicable to other biological systems and/or studies, discussing limitations, artifacts, or pitfalls. For that reason, as with the previous editions, it should be especially useful for beginners or novices.

Out of the 29 chapters, six are devoted to descriptive proteomics, with a special emphasis on subcellular protein profiling (Chapters 5–10), six to PTMs (Chapters 11 and 14–18), three to protein interactions (Chapters 19–21), and two to specific proteins, peroxidases (Chapter 24) and proteases and protease inhibitors (Chapter 26). The book reflects the new trajectory in MS-based protein identification and quantification, moving from the classic gel-based approaches to the most recent labeling (Chapters 10, 11, 29), shotgun (Chapters 5, 7, 12, 15), parallel reaction monitoring (Chapter 16), and targeted data acquisition (Chapter 13). MS imaging (Chapter 25), the only in vivo MS-based proteomics strategy, is far from being fully optimized and exploited in plant biology research. Confident protein identification and quantitation, especially in orphan species and for low-abundance proteins, is still a challenging task (Chapters 4, 28).

What is really new is the use of different techniques for proteomics data validation and their integration into other classic and -omics approaches in the systems biology direction. Chapter 2 reports on multiple extractions in a single experiment of the different biomolecules: nucleic acids, proteins, and metabolites. Chapter 27 describes how metabolic pathways can be reconstructed from multiple -omics data, and Chapter 3 covers network building. Finally, Chapters 22 and 23 deal with, respectively, the search for allele-specific proteins and proteogenomics.

Almost 1 year ago, around 200 groups were invited to take part in this edition. Unfortunately, only 10% of them kindly accepted. My gratitude goes to those who accepted our invitation but also to those who did not, as all of them have contributed to the plant proteomics field. In this introductory chapter I will list, following my own judgment, some of the relevant papers published in the past 5 years, those that have shown us how to enhance and exploit the potential of proteomics in plant biology research, without aiming to be exhaustive.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_1, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Key words Omics approaches, Plant proteomics, Protein interactions, PTMs, Proteogenomics, Quantitative proteomics, Shotgun proteomics, Systems biology, Targeted proteomics

1 Introduction

The success of the previous editions of “Plant Proteomics Methods and Protocols” (Springer Nature Methods in Molecular Biology, vol. 355, 2007, and vol. 1072, 2014; http://www.springer.com/series/7651) [1, 2] and the continuous advances and improvements in proteomic techniques, equipment, and bioinformatics tools, and their use in basic and translational plant biology research, have encouraged Humana Press to prepare a new, updated third edition with the title “Advances in Proteomics Techniques, Data Validation, and Integration with Other Classic and -Omics Approaches in the Systems Biology Direction,” edited by J.V. Jorrin-Novo, L. Valledor, M.A. Castillejo, and M.D. Rey.

Since the second edition, and in the very short period of 5 years (2014 to May 2019), the number of proteomics papers in general, and of those devoted to plant proteomics in particular, has increased continuously. A search at WoS returned 22,000 and 2000 hits for the keywords “proteomics” and “plant + proteomics,” respectively. These figures reflect, on the one hand, that the field of proteomics has been greatly enriched and updated with equipment, techniques, protocols, algorithms, databases, and repositories. Thus, the possibility now exists of having a deeper coverage of the proteome, a more confident protein identification and quantification, and a less speculative and more confident biological interpretation of the data and responses to biological questions based on the protein language. On the other hand, on glancing once again at the plant proteomics figures, the same conclusion is reached: “the full potential of proteomics is still far from being fully exploited in plant biology research” [3], and there are not many groups carrying out plant proteomics experiments using the latest technological advances and equipment in the field. There are more groups entering proteomics, with new plant experimental systems, proposing new biological studies, but they use classic approaches, keeping proteomics mostly descriptive and speculative. Given this situation, this new edition aims to show plant scientists how they can go one step forward by using proteomics as an experimental approach.


2 Novelties in the 2015–2019 Period

The main objective of a proteomics experiment is to identify, characterize, and quantify as many proteoforms or protein species as possible. Its success depends on the experimental system, the protocols for protein extraction and fractionation, the MS strategy, the equipment, and the algorithms and databases employed. Each technique and protocol has to be optimized to the experimental system, the biological process, and the starting hypothesis. Like any analytical technique, MS has to be validated, and its resolution, sensitivity, detection limit, and dynamic range determined (Chapter 12). With respect to the experimental system, consideration should be given to its biological characteristics, such as the level of ploidy, the availability of species-specific protein databases, and its recalcitrance, the latter related to the chemical composition (Chapters 22, 23, 29). In the plant proteomics scenario, orphan and recalcitrant species such as forest trees still remain challenging (Chapters 4 and 12).

Up to six consecutive generations of MS proteomics platforms have been developed and employed since the beginning of the field in the early 1990s, 25 years ago [4]. Human proteome research has moved fast in using the most recent technologies: gel-free/label-free or shotgun (fourth generation) [5]; single/multiple reaction monitoring, targeted or mass western (fifth generation) [6]; and data-independent acquisition, DIA, and its sequential windowed data-independent acquisition of the total high-resolution mass spectra, SWATH (sixth generation) [7]. However, plant investigators still cling to the employment of gel-MS, including difference gel electrophoresis, DIGE (first and second generation), isobaric or isotopic labeling, mostly isobaric tags for relative and absolute quantitation, iTRAQ (third generation), and shotgun (fourth generation) [8] (Chapters 10, 11, 13, 15).

The optimization of classic protocols for protein extraction [9] and purification [10, 11], together with advances in mass spectrometry techniques [12], the evolution of mass spectrometers, especially the Orbitrap family [13], the feasibility of sequencing and annotating complete genomes and transcriptomes more quickly and cheaply for protein database construction ([14, 15] and Chapters 4 and 12), and the development of algorithms and bioinformatics tools for protein identification, quantification, grouping, and statistical analysis of the data ([16] and Chapter 4, this volume), “[has] taken proteomics to an unimaginable achievement in terms of the number of protein species confidently identified, quantified, and characterized” [4]. We have progressed from the identification of hundreds to thousands of gene products in a single experiment. As a result, protein databases and repositories are being created or enriched [17, 18]. Even so, we are only able to visualize a small fraction of the whole proteome (1–5%). For higher coverage, subcellular or protein fractionation has been chosen. In this volume, different chapters deal with the proteome analysis of subcellular fractions, including apoplast, membrane systems, nuclei, and chloroplasts (Chapters 5–9). Chapter 28 describes the detection of low-abundance proteins by using the combinatorial peptide ligand library (CPLL) technique.

Descriptive and comparative proteomics remain the most represented areas in the current plant proteomics literature, with new plant systems and biological processes continuously being reported. The main interest lies in crops and processes related to productivity and other phenotypes of importance from an agronomic point of view [19, 20]. Stresses associated with climate change and biodiversity are two of the leading topics [21, 22].

It has been claimed that proteomics can lead us to the identification of protein markers [23–26] that are useful in plant breeding programs and in the selection of elite genotypes, but that is still far from reality. One of the difficulties in identifying protein markers is the existence of very similar proteins as members of a multigene family, allelic variants, or individual genes, that give rise to a variable number of proteoforms or protein species as a result of posttranscriptional (alternative splicing) or posttranslational (PTMs) events, without the biological role of each one of them being known [27, 28]. As bottom-up, peptide-centric platforms cannot give clear answers to this question, top-down strategies have to be improved [29]. Just as an example, in Chapter 22, by Prof. Carpentier, alternative protocols for allele-specific proteins are proposed.

Posttranslational modifications, PTMs, and interactomics remain a challenge, but more and more papers are appearing on these topics [30–32]. As a novelty with respect to the previous two editions, this third edition includes six chapters describing protocols for PTM analysis: Chapters 11, 14, 15, and 16 (phospho), Chapter 17 (glyco), and Chapter 18 (acetyl). PTM analysis can be done with gel-based, gel-free, labeling, and targeted parallel reaction monitoring approaches, a topic recently reviewed by Vu et al. [33]. The difficulty of PTM analysis depends on the type of modification, its stability, the stoichiometric levels of protein modification, the existence of multiple sites for specific or different PTMs, and the efficacy of the enrichment protocols for modified proteins and peptides, among other factors. Whereas in vitro analysis is quite feasible, changes in the in vivo PTM profiles remain somewhat elusive.

Three of the chapters, Chapters 19–21, address the study of protein interactions or the interactome, one of the main challenges in the postgenomic era. Interactomics shares with PTMs their methodological strategy and workflow, with a step prior to MS directed at purifying or enriching the complexes of the target and its partners. The difficulty in characterizing interactions is even greater than for PTMs because of the low stability of the interactions and the generation of false positives due to unspecific binding. In order to diminish those false positives, in vivo site-specific chemical cross-linking coupled to MS has emerged as a powerful technique [34], as it converts unstable complexes into stable ones that can be purified or enriched by using immunoaffinity techniques (Chapter 20 of this book). Both PTM and interaction studies are favored by computational analysis and in silico predicted PTM motifs and functional association networks of genes ([35–37] and Chapter 3 of this book). In Chapter 19, Nee et al. report on a mass spectrometry-based label-free quantification approach to identify protein interaction networks under native conditions. It uses a transgenic plant expressing the protein of interest fused to a GFP tag; enrichment of the GFP-tagged protein with its interaction partners is performed by immunoaffinity purification, with the captured purified proteins being analyzed by LC-MS/MS and label-free quantification. A FLAG-tag fusion is an alternative, as shown in Chapter 21 by Luo et al., who propose a protocol to study 14-3-3 interactors in tomato fruit.
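As a rough illustration of how label-free intensities from an immunoaffinity pulldown might be screened for candidate interactors, the sketch below compares mean intensities in the tagged-bait purification against a control purification. It is not the workflow of Chapter 19: the protein names, the intensity values, the pseudocount, and the fold-change cutoff are all hypothetical, and a real analysis would add replicate statistics and multiple-testing correction.

```python
import math
from statistics import mean


def log2_enrichment(bait, control, pseudocount=1.0):
    """log2 ratio of mean intensities; the pseudocount avoids division by zero."""
    return math.log2((mean(bait) + pseudocount) / (mean(control) + pseudocount))


def candidate_interactors(intensities, min_log2_fc=2.0):
    """Return proteins whose enrichment in the bait pulldown reaches the cutoff."""
    hits = {}
    for protein, groups in intensities.items():
        fc = log2_enrichment(groups["bait"], groups["control"])
        if fc >= min_log2_fc:
            hits[protein] = round(fc, 2)
    return hits


# Toy label-free intensities (three replicates each); values are invented.
toy_intensities = {
    "BAIT_GFP": {"bait": [8000, 8200, 7800], "control": [0, 0, 0]},
    "PARTNER_1": {"bait": [1599, 1599, 1599], "control": [99, 99, 99]},
    "BACKGROUND_1": {"bait": [499, 499, 499], "control": [499, 499, 499]},
}

if __name__ == "__main__":
    for name, fc in candidate_interactors(toy_intensities).items():
        print(f"{name}: log2 enrichment = {fc}")
```

Proteins that bind the affinity matrix unspecifically appear at similar intensity in both purifications and fall below the cutoff, which is the false-positive problem described above.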

Proteomics is being increasingly employed in a directed, targeted, hypothesis-based direction, thus changing the previous view of a holistic approach that did not need a hypothesis. The latter option was a good starting point, but it made proteomics mostly descriptive and speculative, without the possibility of comparing the data with those previously obtained by using other experimental approaches. In the end, experimental data have to be manually validated if we intend to interpret them confidently from a biological point of view, if we wish to escape from the tyranny of blind analysis based on computational tools, and if we wish to move from the forest (whole proteomes, subproteomes, functional or structural groups) to the tree (individual proteins). We need to understand when, where, how, and why thousands of proteins are orchestrated in order to construct the cellular building, to fit it into a developmental program, and to respond to a highly changeable environment.

Targeted (Mass Western) proteomics is a bottom-up approach based on the MS analysis of individual proteins, or a selected group of them, through a set of selected peptides, ideally proteotypic ones. This is the basis of a number of recently developed techniques such as single, multiple, or parallel reaction monitoring (SRM, MRM, and PRM), accurate inclusion mass screening (AIMS), and the sequential window acquisition of all theoretical fragments (SWATH) [38]. These approaches offer new possibilities in biomarker discovery and multiplexing analyses [39].
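The idea of a proteotypic peptide can be made concrete with a minimal sketch, assuming a simple trypsin rule (cleavage after K or R, not before P, no missed cleavages) and two invented isoform sequences. A peptide is treated here as proteotypic when it occurs in exactly one protein of the set; detectability in the mass spectrometer, which also matters in practice, is ignored.

```python
import re
from collections import defaultdict


def tryptic_peptides(sequence, min_length=6):
    """In silico trypsin digest: cleave after K or R unless the next residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def proteotypic_peptides(proteins, min_length=6):
    """Map each protein to the peptides found in that protein only."""
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_peptides(sequence, min_length):
            owners[peptide].add(name)
    unique = {name: [] for name in proteins}
    for peptide, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].append(peptide)
    return unique


# Hypothetical isoforms differing by a single residue (Y versus W).
toy_proteins = {
    "ISOFORM_A": "MAAGLSDEVKTFPEELLSRGGYTNDVAAK",
    "ISOFORM_B": "MAAGLSDEVKTFPEELLSRGGWTNDVAAK",
}

if __name__ == "__main__":
    for name, peptides in proteotypic_peptides(toy_proteins).items():
        print(name, peptides)
```

In this toy case only the C-terminal peptide distinguishes the two isoforms, so it would be the one to monitor in an SRM, MRM, or PRM assay.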

In Chapter 13, Dr. Hart-Smith addresses the combined use of targeted and untargeted LC-MS/MS data acquisition, a strategy termed TDA/DDA, and its application to a model quantitative plant proteomics experiment performed on Arabidopsis. This approach is compatible with different methodologies, including metabolic and chemical labeling and label-free approaches, and can be used to create tailored assay libraries to assist in the interpretation of quantitative proteomics data collected using data-independent acquisition (DIA).

MS techniques, in combination with classic protein purification approaches and in silico analyses of gene sequences at the genomic or transcriptomic level, are perfect for the chemical, structural, and functional characterization of proteins, as illustrated in Chapter 24 by Luthje and Ramanathan. They describe a protocol to perform in silico analysis of plant peroxidases, specifically those of the secretory pathway family, in order to determine amino acid sequence, PTMs, structure, and ligand sites, among others. Prediction models then have to be validated in wet experiments. In Chapter 26, Goulet et al. introduce an activity-based functional proteomics protocol for the selection of protease inhibitors, a group of peptides with a high biotechnological potential. This protocol is an alternative to the in vitro activity assay with synthetic peptides, with the advantage of additional information on specificity. The procedure involves the capture of target Cys proteases with biotinylated versions of the cystatins, followed by the identification and quantification of captured proteases by mass spectrometry.

Genomics, transcriptomics, and proteomics feed back into each other. Up to now, protein identification has been based on available protein sequences obtained from annotated genomes and transcriptomes. However, proteomics could be of great help in improving and correcting genome annotation. With this in mind, the term proteogenomics was coined following a publication by Church’s group in 2004, in which proteomics data were used to annotate the genome of Mycoplasma pneumoniae [40]. The field of proteogenomics has expanded and is being applied to a number of living organisms, including plants. Thus, by 2008, Castellana et al. [41], in an MS analysis of Arabidopsis tissues, found that 18,024 peptides did not correspond to annotated genes, discovering 778 new coding genes and refining, in addition, 695 more gene models. The topic of proteogenomics has recently been reviewed [42]. In Chapter 23 of this book, Al-Mohanna et al. propose a proteogenomic method for the peptide mapping of the haplotype-derived sweetpotato genome assembly. Proteogenomics is a very useful tool for genomics studies of species that, like sweet potato, have a complex, hexaploid genome (2n = 6x = 90).
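The core operation behind proteogenomic mapping, locating an MS-identified peptide in a genome without relying on its annotation, can be sketched as a six-frame translation followed by a substring search. The example below is a minimal illustration using the standard genetic code and an invented DNA fragment; it is not the method of Chapter 23, and real pipelines also deal with introns, stop codons, large genomes, and false discovery control.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g., containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands in all three reading frames (+1..+3, -1..-3)."""
    dna = dna.upper()
    frames = {}
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(sequence[offset:])
    return frames


def map_peptide(peptide, dna):
    """List the reading frames whose translation contains the peptide."""
    return [
        frame
        for frame, protein in six_frame_translations(dna).items()
        if peptide in protein
    ]


# Invented fragment: the peptide MKTAYR is encoded in the third forward frame.
toy_dna = "CCATGAAAACCGCTTATCGTTAA"

if __name__ == "__main__":
    print(map_peptide("MKTAYR", toy_dna))
```

A peptide that maps to a frame with no annotated gene model is the kind of evidence used to propose new coding genes or to refine existing models.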


3 Proteomics Data Validation, and Integration into Other Classic and -Omics Approaches in the Systems Biology Direction

Up to 2010, -omics approaches were developed independently, with not much interaction between them. This made proteomics and transcriptomics, as stated above, mostly descriptive and speculative. In this decade, papers reporting the integrated employment of two or three -omics approaches, mostly transcriptomics and proteomics, have started to appear [4]. While defining the contents of the present monograph, it was clear, as pointed out in the invitation letter to contributors, that chapters on protocols for proteomics data validation and integration with other classic and -omics approaches in the systems biology direction would constitute the main novelty in this new edition. As stated in Rey et al. [4], “The logical transition from reductionists to a holistic strategy and integration of multidimensional biological information is currently accepted by the scientific community as the only way to decipher the complexity of living organisms and predict through multiscale networks and models.” The integrated use of the -omics approaches will not only allow us to connect the phenotype and the genotype but also, more importantly, to deepen the knowledge of gene expression mechanisms, including posttranscriptional (RNA splicing, micro-RNAs, small interfering RNA, long noncoding RNAs) and posttranslational (phosphorylation, glycosylation, acetylation, methylation, etc.) events [43].

The new strategy requires novel methodologies, with bioinformatics and computer skills being the real bottleneck. The experimental setup is highly complex considering the heterogeneity of the molecules under study (DNA, RNA, proteins, and metabolites), the levels of analysis (next-generation sequencing for nucleic acids, mass spectrometry for proteins and metabolites), the huge amount of data produced, and the biases generated by each methodology.

In the wet lab, one limitation is the independent extraction of each type of biomolecule, making the results not fully comparable. In order to solve this, protocols for sequential extraction of the different types of biomolecules have been developed [44]. Valledor’s group, in Chapter 2, introduces a novel protocol, optimized for microalgae, that allows for the combined extraction of different levels, including total metabolites, or their pigment or lipid fractions, along with nucleic acids (DNA and RNA) and/or proteins from the same sample, reducing biological and time variations between different levels of data.

The workflow, including wet and dry steps, has recently been reviewed [4], including original articles and reviews related to the topic. In order to avoid repetitions, I suggest that the reader go through it.


Chapter 27, by Lopez-Hidalgo et al., is a good example of how to use the different -omics for gaining biological knowledge. They present a protocol based on a multiomics approach for metabolic pathway reconstruction in a recalcitrant and orphan plant species, the forest tree Holm oak (Quercus ilex). There are more examples in the very recent literature, such as the study of substantial equivalence in transgenic crops [45], seed germination in Arabidopsis [46], somatic embryogenesis [47], biotic stress in fruit crops [48], and root development [49].

While I was summarizing the advances in (plant) proteomics methods and protocols in the past 5 years since the second edition of this monograph was published, I began to wonder what the future holds for this discipline and I asked myself two questions: (a) How long will it take before a fourth edition is needed? and (b) Will this third edition become obsolete? The answers to these questions are, in my opinion, as follows: (a) In a few years’ time and (b) No. Proteomics, and more specifically plant proteomics, is in its infancy, at the descriptive stage, with the proteomes observed being just the tip of the iceberg. We are assembling the pieces of a puzzle that will help us to understand how the cell is built and how it works. We are striving to see light at the end of the very long tunnel that links genotype and phenotype, which, however, is still too dark. Every proteomics experiment shows us that life is more complex than we have ever imagined, while research continues to be reductionist and simple.

References

1. Thiellement H, Zivy M, Damerval C et al (eds) (2007) Plant proteomics methods and protocols. Methods Mol Biol 355:1–8

2. Jorrin-Novo JV, Komatsu S, Weckwerth W et al (2014) Plant proteomics methods and protocols. In: Methods in molecular biology, vol 1072, 2nd edn. Humana Press, Totowa

3. Jorrin Novo JV (2014) Plant proteomics methods and protocols. In: Novo J et al (eds) Chapter 1, plant proteomics methods and protocols, Methods in molecular biology, vol 1072, 2nd edn. Humana Press, Totowa, pp 3–13

4. Rey MD, Valledor L, Castillejo MA et al (2019) Recent advances in MS-based plant proteomics: proteomics data validation through integration with other classic –omics approaches. In: Progress in botany. Springer, Berlin, Heidelberg

5. Neilson KA, Ali NA, Muralidharan S et al (2011) Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics 11:535–553

6. Picotti P, Bodenmiller B, Aebersold R (2013) Proteomics meets the scientific method. Nat Methods 10:24–27

7. Gillet LC, Navarro P, Tate S et al (2012) Targeted data extraction of the MS/MS spectra generated by data independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics 11:O111.016717

8. Jorrin-Novo JV, Komatsu S, Sanchez-Lucas R et al (2018) Gel electrophoresis-based plant proteomics: past, present, and future. Happy 10th anniversary journal of proteomics. J Proteome 198:1–10

9. Luthria DL, Maria John KM, Marupaka R et al (2018) Recent update on methodologies for extraction and analysis of soybean seed proteins. J Sci Food Agric 98:5572–5580

10. Fesmire JD (2019) A brief review of other notable electrophoretic methods. Methods Mol Biol 1855:495–499

11. Minic Z, Dahms TES, Babu M (2018) Chromatographic separation strategies for precision mass spectrometry to study protein-protein interactions and protein phosphorylation. J Chromatogr B Analyt Technol Biomed Life Sci 1102-1103:96–108

12. Ankney JA, Muneer A, Chen X (2018) Relative and absolute quantitation in mass spectrometry-based proteomics. Annu Rev Anal Chem 11:49–77

13. Eliuk S, Makarov A (2015) Evolution of Orbitrap mass spectrometry instrumentation. Annu Rev Anal Chem 8:61–80

14. Jung H, Winefield C, Bombarely A et al (2019) Tools and strategies for long-read sequencing and de novo assembly of plant genomes. Trends Plant Sci 24(8):P700–P724. (in press)

15. Guerrero-Sanchez VM, Maldonado-Alconada A, Amil-Ruiz et al (2019) Ion torrent and Illumina, two complementary RNA-seq platforms for constructing the holm oak (Quercus ilex) transcriptome. PLoS One 14:e0210356

16. Misra BB (2018) Updates on resources, software tools, and databases for plant proteomics in 2016–2017. Electrophoresis 39:1543–1557

17. Subba P, Narayana Kotimoole C et al (2019) Plant proteome databases and bioinformatic tools: an expert review and comparative insights. OMICS 23:190–206

18. Martens L, Vizcaíno JA (2017) A golden age for working with public proteomics data. Trends Biochem Sci 42:333–341

19. Duncan O, Trosch J, Fenske R et al (2017) Resource: mapping the Triticum aestivum proteome. Plant J 89:601–616

20. Katam K, Jones KA, Sakata K (2015) Advances in proteomics and bioinformatics in agriculture research and crop improvement. J Proteomics Bioinform 8:3

21. Hu J, Rampitsch C, Bykova NV (2015) Advances in plant proteomics toward improvement of crop productivity and stress resistance. Front Plant Sci 6:209

22. Carrera DA, Oddsson S, Grossmann J et al (2018) Comparative proteomic analysis of plant acclimation to six different long-term environmental changes. Plant Cell Physiol 59:510–526

23. Schneider S, Harant D, Bachmann G et al (2019) Subcellular phenotyping: using proteomics to quantitatively link subcellular leaf protein and organelle distribution analyses of Pisum sativum cultivars. Front Plant Sci 10:638

24. de Lamo FJ, Constantin ME, Fresno DH et al (2018) Xylem sap proteomics reveals distinct differences between R gene- and endophyte-mediated resistance against Fusarium wilt disease in tomato. Front Microbiol 9:2977

25. Lankinen A, Abreha KB, Masini L et al (2018) Plant immunity in natural populations and agricultural fields: Low presence of pathogenesis-related proteins in Solanum leaves. PLoS One 13:e0207253

26. Ghatak A, Chaturvedi P, Weckwerth W (2017) Cereal crop proteomics: systemic analysis of crop drought stress responses towards marker-assisted selection breeding. Front Plant Sci 8:757

27. Schaffer LV, Millikin RJ, Miller RM et al (2019) Identification and quantification of proteoforms by mass spectrometry. Proteomics 19:SI 1800361

28. Naryzhny S (2019) Inventory of proteoforms as a current challenge of proteomics: some technical aspects. J Proteome 191:22–28

29. Toby TK, Fornelli L, Kelleher NL (2016) Progress in top-down proteomics and the analysis of proteoforms. Annu Rev Anal Chem (Palo Alto, Calif) 9:499–519

30. Hashiguchi A, Komatsu S (2017) Posttranslational modifications and plant-environment interaction. Methods Enzymol 586:97–113

31. Wu XL, Gong FP, Cao D et al (2016) Advances in crop proteomics: PTMs of proteins under abiotic stress. Proteomics 16:847–865

32. Friso G, van Wijk KJ (2015) Posttranslational protein modification in plant metabolism. Plant Physiol 3:1469–1487

33. Vu LD, Gevaert K, De Smet I (2018) Protein language: post-translational modifications talking to each other. Trends Plant Sci 12:1068–1080

34. Zhu XL, Yu FC, Yang Z et al (2016) In planta chemical cross-linking and mass spectrometry analysis of protein structure and interaction in Arabidopsis. Proteomics 16:1915–1927

35. Li GXH, Vogel C, Choi H (2018) PTMscape: an open source tool to predict generic post-translational modifications and map modification crosstalk in protein domains and biological processes. Mol Omics 14:197–209

36. Willems P, Horne A, Van Parys T et al (2019) The Plant PTM Viewer, a central resource for exploring plant protein modifications. Plant J. https://doi.org/10.1111/tpj.14345. [Epub ahead of print]

37. Yao H, Wang X, Chen P et al (2018) Predicted Arabidopsis interactome resource and gene set linkage analysis: a transcriptomic analysis resource. Plant Physiol 177:422–433

38. Rodiger A, Baginsky S (2018) Tailored use of targeted proteomics in plant-specific applications. Front Plant Sci 9:1204

39. Chawade A, Alexandersson E, Bengtsson T et al (2016) Targeted proteomics approach for precision plant breeding. J Proteome Res 15:638–646

40. Jaffe J, Berg HC, Church GM (2004) Proteogenomic mapping as a complementary method to perform genome annotation. Proteomics 4:59–77

41. Castellana NE, Payne SH, Shen Z (2008) Discovery and revision of Arabidopsis genes by proteogenomics. Proc Natl Acad Sci U S A 105:21034–21038

42. Low TY, Mohtar MA, Ang MY et al (2019) Connecting proteomics to next-generation sequencing: Proteogenomics and its current applications in biology. Proteomics 19:e1800235

43. Hong WJ, Kim YJ, Chandran AKN et al (2019) Infrastructures of systems biology that facilitate functional genomic study in rice. Rice 12:15

44. Xiong J, Yang Q, Kang J et al (2011) Simultaneous isolation of DNA, RNA, and protein from Medicago truncatula L. Electrophoresis 32:321–330

45. Corujo M, Pla M, van Dijk J et al (2019) Use of omics analytical methods in the study of genetically modified maize varieties tested in 90 days feeding trials. Food Chem 292:359–371

46. Ponnaiah M, Gilard F, Gakiere B et al (2019) Regulatory actors and alternative routes for Arabidopsis seed germination are revealed using a pathway-based analysis of transcriptomic datasets. Plant J 99:163–175

47. Pais MS (2019) Somatic embryogenesis induction in woody species: the future after omics data assessment. Front Plant Sci 10:240

48. Li T, Wang YH, Liu JX et al (2019) Advances in genomic, transcriptomic, proteomic, and metabolomic approaches to study biotic stress in fruit crops. Crit Rev Biotechnol 39:680–692

49. Proust H, Hartmann C, Crespi M et al (2018) Root development in Medicago truncatula: lessons from genetics to functional genomics. Methods Mol Biol 1822:205–239


Chapter 2

Multiple Biomolecule Isolation Protocol Compatible with Mass Spectrometry and Other High-Throughput Analyses in Microalgae

Francisco Colina, María Carbo, Ana Alvarez, Monica Meijon, María Jesus Canal, and Luis Valledor

Abstract

Microalgae are gaining attention in industry for their high value–added biomolecules and biomass production and for studying fundamental processes in biology. The introduction of novel approaches for understanding and modeling molecular networks at different omic levels is paramount for increasing the productivity of these organisms. However, the construction of these networks requires high-quality datasets that, if possible, overlap perfectly. The use of different materials for different biomolecule isolation protocols, even if they come from the same homogenate, is one of the commonest issues affecting quality. Hence, a new method has been developed, allowing for the combined extraction of different levels, including total metabolites, or their pigment or lipid fractions, along with nucleic acids (DNA and RNA) and/or proteins from the same sample, reducing biological and time variation between data levels.

Key words Microalgae, Proteomics, Lipids, Metabolite, Pigments, DNA, RNA

1 Introduction

Microalgae have gained attention in industry during the last decades. They constitute a sustainable production platform due to their high biomass production together with their generation of high value–added biomolecules such as biodiesel, β-carotene, astaxanthin, and omega-3. However, research is still necessary to make these microorganisms economically profitable producers. Moreover, microalgae research is not only industry-focused: their intermediate plant–animal phylogenetic position makes them a powerful and convenient model to study fundamental processes in biology [1].

Understanding microalgae metabolic networks is complex, but recent advances in omics and systems biology allow for the reliable characterization and (semi)quantitation of hundreds to thousands of transcripts, proteins, or metabolites and their integration into different functional networks, helping to better understand their functions and relationships.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_2, © Springer Science+Business Media, LLC, part of Springer Nature 2020

However, this high-throughput capability comes at a cost: the different omic levels require different isolation methods, analytical platforms, and specific data processing pipelines. The biases related to different sample processing could have a major impact on later bioinformatics analyses and metabolic reconstruction. The best strategy to avoid this potential flaw is the development of multiple-extraction protocols that allow a single sample to be fractionated into its different omic layers. These kinds of protocols have been developed for plants, animals, and microorganisms [2–4]. In Chlamydomonas, different strategies have been developed focusing on the multiple extraction of metabolites, nucleic acids, and protein fractions [2], but none of these is compatible with the commonly used spectrophotometry- and gravimetry-based physiological indexes such as total lipid content or pigment content. For this reason, we have developed a multiple extraction protocol allowing for either metabolite, total lipid, or pigment fraction extraction along with nucleic acid (DNA and RNA) and protein extraction from the same microalga sample (Fig. 1). Moreover, this protocol can be easily coupled to other procedures, including fluorescence-based lipid [5], enzyme-based starch [6], or phenolic compound [7] quantitation, and to various phenotyping workflows such as in vivo quantification of pigments [8] and lipids/carbohydrates [9], photosynthetic performance, and growth [10].

2 Materials

2.1 Cell Culture Materials

1. Chlamydomonas reinhardtii CC-503 cw92 mt+ agg1+ nit1 nit2 (available at the Chlamydomonas Culture Collection, Duke University).

Fig. 1 Workflow of microalgae metabolite, lipid, or pigment fraction extraction combined with nucleic acid and/or protein extraction from the same sample. [Figure not reproduced. Labels in the figure: 1900 g, 5 min; 1900 g, 2 min; discard supernatant; 700 µL ddH2O; spin; TUBE S, TUBE L, TUBE Pi, TUBE NAP, TUBE NPM, TUBE PM2, TUBE DNA, TUBE RNA, TUBE P; lipids extraction; pigments extraction; metabolites extraction (polar and nonpolar metabolites); nucleic acids extraction; proteins purification; protein fractionation (stacking/resolving gel, 1 cm); protein digestion (trypsin); desalting (C18 tips); storage (−20 °C) until LC/MS analysis.]

2. Tris-Acetate-Phosphate (TAP) medium (https://www.chlamycollection.org). For 1 L of medium, combine the following amounts of stock solutions and autoclave: 10 mL of TAP salts stock, 1 mL of TAP phosphate solution, 1 mL of Hutner’s trace elements stock, 2.42 g of Tris base, and 1 mL of glacial acetic acid. Adjust pH to 7.0–7.5.

3. Culture physical environment. Light intensity: 100 μmol/m² s PAR is a good level for photosynthetically competent cultures on agar. For liquid cultures, light intensities of 200–300 μmol/m² s, shaking at 110–150 rpm, and a temperature of 25 °C are recommended.

4. Material needed for the culture: flask, incubator, or culturechamber with temperature, light intensity, photoperiod, andshake control.
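When a batch other than 1 L is prepared, the TAP recipe of item 2 can be scaled linearly. The following is a minimal sketch in base R; the helper name `scale_tap` and the vector `tap_per_litre` are hypothetical and only restate the amounts listed above.

```r
# Amounts per 1 L of TAP medium, copied from item 2 of this list.
tap_per_litre <- c(
  tap_salts_mL    = 10,
  phosphate_mL    = 1,
  hutner_trace_mL = 1,
  tris_base_g     = 2.42,
  acetic_acid_mL  = 1
)

# Hypothetical helper: scale the recipe linearly to the requested volume (mL).
scale_tap <- function(volume_mL, recipe = tap_per_litre) {
  stopifnot(is.numeric(volume_mL), length(volume_mL) == 1, volume_mL > 0)
  recipe * volume_mL / 1000
}

# Amounts needed for a 250 mL batch
print(scale_tap(250))
```

The pH adjustment to 7.0–7.5 still has to be checked on the final volume.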

2.2 Sampling and Extraction Materials

1. 50 mL conical tubes, 1.5 mL tubes, 2 mL tubes, and 1.5 mL screw-cap tubes.

2. Refrigerated centrifuge.

3. Regimill/Fastprep (bead-beating system).

4. Vortex.

5. Vacuum concentrator (speedvac).

6. Heat block.

7. Ultrasound sonicator.

8. Freezers (−20 and −80 °C).

2.3 Sampling and Extraction Reagents and Solutions

1. Metabolite extraction buffer (MEB): methanol–chloroform–ddH2O (2.5:1:0.5). Store at 4 °C; it must be cold when added.

2. Phase separation mix (PSM): chloroform–ddH2O (1:1) (see Note 1).

3. Polar metabolites extraction buffer (PMEB): chloroform–ddH2O (1:1).

4. Pigment extraction buffer (PEB): acetone–1 M Tris pH 8–ddH2O (80:5:15).

5. Lipid extraction buffer 1 (LBE1): chloroform–isopropanol (1:1).

6. Lipid extraction buffer 2 (LBE2): hexane.

7. Washing buffer 1 (WB1): 0.75% (v/v) β-mercaptoethanol in 100% methanol.

8. Washing buffer 2 (WB2): 2 mM Tris pH 7.5, 20 mM NaCl, 0.1 mM EDTA, 90% ethanol.

9. Washing buffer 3 (WB3): 2 mM Tris pH 7.5, 20 mM NaCl, 0.1 mM EDTA, 70% ethanol.

Multiple Biomolecule Isolation in Microalgae 13

10. RNase solution: 300 μL of WB2 and 3 μL of 20 mg/mL PureLink RNase A (Invitrogen).

11. DNase solution: 300 μL of WB2, 3 μL of 10× DNase I Buffer, and 3 μL of 2 U/μL DNase I (Ambion).

12. Protein solubilization buffer (PSB): 7 M guanidine hydrochloride, 2% (v/v) TWEEN 20, 4% (v/v) NP-40, 50 mM Tris, pH 7.5, 1% (v/v) β-mercaptoethanol.

13. Phenol.

14. Protein phase separation mix (PPSM): phenol–ddH2O (0.92:1).

15. Phenol washing buffer (PWB): 0.7 M sucrose, 50 mM Tris–HCl pH 7.5, 50 mM EDTA, 0.5% β-mercaptoethanol, 0.5% (v/v) Plant Protease Inhibitor Cocktail (Sigma-Aldrich).

16. Protein precipitation buffer (PPB): 0.1 M ammonium acetate and 0.5% β-mercaptoethanol in methanol.

17. Methanol.

18. Protein pellet washing buffer (PPWB): acetone–ddH2O (85:15).

19. Protein pellet solubilization buffer (PPSB): 8 M urea with 4% SDS.
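Several of the solutions above are defined only by the ratio of their components. As a minimal sketch, and assuming that the ratios are volume-to-volume parts, the volume of each component for a given final volume can be computed as follows (`mix_volumes` is a hypothetical helper name, and the batch sizes are arbitrary):

```r
# Hypothetical helper: split a total volume among components given in parts.
# Assumes the ratios in the reagent list are volume:volume.
mix_volumes <- function(parts, total_uL) {
  stopifnot(is.numeric(parts), all(parts > 0), total_uL > 0)
  total_uL * parts / sum(parts)
}

# MEB: methanol-chloroform-ddH2O (2.5:1:0.5), here for a 10 mL batch
meb <- mix_volumes(c(methanol = 2.5, chloroform = 1, ddH2O = 0.5),
                   total_uL = 10000)
print(meb)

# PEB: acetone-1 M Tris pH 8-ddH2O (80:5:15), here for a 50 mL batch
peb <- mix_volumes(c(acetone = 80, tris_1M_pH8 = 5, ddH2O = 15),
                   total_uL = 50000)
print(peb)
```

The calculation ignores any volume contraction on mixing, which is usually acceptable for extraction buffers of this kind.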

3 Methods

3.1 Sampling Method

1. Harvest 50 mL of culture and centrifuge at 1900 × g for 5 min. Discard the supernatant (see Notes 2 and 3).

2. Resuspend the cell pellet in 700 μL of ddH2O and transfer to a 2 mL tube (tube S) (see Note 4).

3. Centrifuge at 1900 × g for 2 min and discard the supernatant. Then spin tube S briefly again and remove all the remaining supernatant (see Note 5).

4. Weigh tube S and determine the fresh weight (see Note 6).

3.2 Metabolite Extraction Method

The following steps must be performed on ice, and centrifugations at 4 °C, unless otherwise specified. Metabolite extraction is not compatible with lipid and pigment extractions.

1. Transfer the content of tube S to a new screw-cap tube with glass beads (tube SB). Add 600 μL of MEB and, if needed, resuspend the pellet by pipetting up and down (see Notes 7 and 8).

2. Homogenize the pellets by bead beating until completely homogenized.

3. Centrifuge at 20,000 × g for 6 min and transfer the supernatant to a tube containing 800 μL of PSM (tube M) (see Note 9). The pellet contains nucleic acids and proteins (tube NAP).

4. Mix well by vortexing and centrifuge tube M for 5 min at 15,000 × g.

5. During the centrifugation, add 500 μL of WB1 to tube NAP and vortex until the pellet is mostly disaggregated (see Note 10). Keep at 4 °C until metabolite extraction is finished.

6. After the centrifugation of step 4 is finished, two different phases should be clearly defined with a sharp interphase. Transfer the upper, aqueous layer to a new 2 mL microcentrifuge tube (tube PM, polar metabolites). Transfer the lower layer, containing nonpolar metabolites, to a new 2 mL tube (tube NPM) (see Notes 11 and 12).

7. Add 300 μL of PMEB to each PM tube. Mix for 1 min at room temperature and centrifuge at 15,000 × g for 4 min.

8. Transfer the upper layer to a new microcentrifuge tube (tube PM2) (see Note 11).

9. Dry the PM2 and NPM tubes in a speedvac or under a nitrogen stream. Keep the dried tubes at −20 °C or −80 °C until analysis.

10. Centrifuge tube NAP at 20,000 × g for 10 min. Discard the supernatant without disturbing the pellet.

3.3 Pigment Extraction Method

The following steps must be carried out at 4 °C unless other conditions are specified. All materials used must be acetone resistant. Pigment extraction is not compatible with metabolite and lipid extractions.

1. Add 500 μL of PEB to tube S to resuspend the pellet. Transfer to a screw-cap tube with glass beads (tube SB) (see Note 8).

2. Add another 500 μL of PEB to tube S and make sure the pellet is completely resuspended. Pool with the PEB from step 1 in tube SB.

3. Vortex vigorously for 30 s or homogenize in the Regimill/Fastprep for 30 s. Transfer to a new 1.5 mL tube (tube NAP).

4. Centrifuge for 5 min at 21,100 × g. Transfer the supernatant to a new tube (tube Pi). The pellet containing nucleic acids and proteins (tube NAP) should be whitish-brownish (see Note 13).

5. Read the absorbance of tube Pi (dilute the contents if necessary) immediately, since acetone is highly volatile. Absorbances to be read: 470 nm, 537 nm, 647 nm, and 663 nm. Take the background-subtracted mean absorbance of the three replicates (see Note 14).

6. The concentration of chlorophylls and carotenoids (in μmol mL⁻¹) can be obtained with the following equations (see Note 15) according to [11]:

$$\mathrm{Chl}_a = 0.01373\,A_{663} - 0.000897\,A_{537} - 0.003046\,A_{647}$$

$$\mathrm{Chl}_b = 0.02405\,A_{647} - 0.004305\,A_{537} - 0.005507\,A_{663}$$

$$\mathrm{Carotenoids} = \frac{A_{470} - 17.1\,(\mathrm{Chl}_a + \mathrm{Chl}_b)}{119.26}$$

7. Air-dry the pellets (tube NAP) at room temperature to evaporate the PEB (see Note 10).
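The equations of step 6 can be applied to a set of readings with a few lines of base R. This is a minimal sketch: the function name `pigments`, the absorbance values, and the fresh weight are invented for illustration, and the extract volume of 1 mL assumes the two 500 μL PEB additions of steps 1 and 2, ignoring the volume of the pellet.

```r
# Pigment concentrations (umol/mL) from background-subtracted absorbances,
# using the coefficients of the equations in step 6.
pigments <- function(A470, A537, A647, A663, dilution = 1) {
  chl_a <- 0.01373 * A663 - 0.000897 * A537 - 0.003046 * A647
  chl_b <- 0.02405 * A647 - 0.004305 * A537 - 0.005507 * A663
  car   <- (A470 - 17.1 * (chl_a + chl_b)) / 119.26
  # Note 15: every value is multiplied by the dilution factor of the sample.
  c(chl_a = chl_a, chl_b = chl_b, carotenoids = car) * dilution
}

# Hypothetical readings of a twofold-diluted extract
conc <- pigments(A470 = 0.62, A537 = 0.05, A647 = 0.31, A663 = 0.74,
                 dilution = 2)
print(conc)

# Pigment per mg of fresh weight (Note 15); both values are assumptions.
extract_mL      <- 1    # 2 x 500 uL of PEB (steps 1 and 2)
fresh_weight_mg <- 42   # hypothetical fresh weight from the sampling method
print(conc * extract_mL / fresh_weight_mg)   # umol pigment per mg fresh weight
```

Because all three equations are linear in the absorbances, multiplying the final values by the dilution factor is equivalent to correcting each absorbance before the calculation.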

3.4 Lipid Extraction Method

Lipid extraction is not compatible with pigment and metabolite extractions.

1. Add 200 μL of LBE1 to the cell pellet (tube S) and transfer to a screw-cap tube containing glass beads (tube SB).

2. Homogenize by bead beating until completely homogenized. Weigh an empty 1.5 mL tube (tube L).

3. Centrifuge at 14,000 × g for 5 min at room temperature and transfer the supernatant to tube L.

4. Repeat steps 1–3, pooling both fractions in the same tube L.

5. Reextract the pellet with 400 μL of LBE2 and vortex vigorously for 3 min.

6. Centrifuge at 14,000 × g for 5 min at room temperature and transfer the supernatant to tube L. The pellet contains proteins and nucleic acids (tube NAP) (see Notes 10 and 16).

7. Dry tube L in a speedvac or oven.

8. Determine lipid weight gravimetrically.
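The gravimetric determination reduces to two weight differences: the fresh weight (tube S with pellet minus empty tube S, see Note 4) and the lipid weight (dried tube L minus empty tube L). A minimal sketch in base R follows; the helper name `lipid_content` and all weights are hypothetical.

```r
# Hypothetical helper: lipid weight and lipid content relative to fresh weight.
# All arguments are weights in mg.
lipid_content <- function(tubeS_empty, tubeS_pellet, tubeL_empty, tubeL_dried) {
  fresh_weight <- tubeS_pellet - tubeS_empty
  lipid        <- tubeL_dried - tubeL_empty
  stopifnot(fresh_weight > 0, lipid >= 0)
  c(fresh_weight_mg = fresh_weight,
    lipid_mg        = lipid,
    lipid_pct_fw    = 100 * lipid / fresh_weight)
}

# Invented weights for illustration only
res <- lipid_content(tubeS_empty = 1010, tubeS_pellet = 1050,
                     tubeL_empty = 980,  tubeL_dried = 984)
print(res)
```

Weighing on the same balance before and after drying keeps the small differences involved comparable.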

3.5 Nucleic Acid Purification Method

The following steps must be carried out at 4 °C, unless other conditions are specified.

1. Add 500 μL of WB1 to tube NAP and vortex until the pellet is mostly disaggregated (see Note 17). Centrifuge at 20,000 × g for 10 min. Discard the supernatant without disturbing the pellet (see Note 18).

2. Resuspend the pellet in 400 μL of PSB and centrifuge at 14,000 × g for 3 min.

3. Transfer the supernatant to a new silica column (SC1) placed in a nuclease- and protease-free 2 mL tube (see Note 18). Centrifuge at 10,000 × g for 1 min.

4. Transfer the flow-through, containing RNA and proteins, to a new tube (tube RP). Reserve SC1, which contains the DNA, for the later washing steps.

5. Add 400 μL of acetonitrile to tube RP and mix, first by pipetting and then by vortexing.

6. Transfer the tube RP sample mix to a new silica column (SC2) placed in a nuclease- and protease-free 2 mL tube (tube P) (see Note 18).

7. Centrifuge SC2 at 12,000 × g for 2 min and save the flow-through containing proteins in tube P.

8. Wash the columns SC1 and SC2 with 600 μL of WB2. Centrifuge at 12,000 × g for 2 min and discard the flow-through.

9. Add 300 μL of RNase solution to SC1 and incubate for 30 min at room temperature. Add 360 μL of DNase solution to SC2 and incubate for 30 min at 37 °C.

10. Centrifuge SC1 and SC2 at 12,000 × g for 1 min. Discard the flow-through.

11. Add 600 μL of WB3 to SC1 and SC2. Centrifuge at 12,000 × g for 2 min. Discard the flow-through.

12. Centrifuge SC1 and SC2 for 1 min at 20,000 × g (see Note 19).

13. Place SC1 in a new tube (tube DNA) and SC2 in another one (tube RNA). Add 50 μL of ddH2O to the center of the membrane of SC1 and SC2. Incubate for 5 min at room temperature.

14. Centrifuge SC1 and SC2 at 12,000 × g for 1 min to elute the DNA (tube DNA) and the RNA (tube RNA).

3.6 Protein Extraction and Purification Methods

The following steps must be carried out at 4 °C unless other conditions are specified.

1. Add 100 μL of PSB and 300 μL of phenol to tube P. Mix by vortexing and incubate for 2 min at room temperature (see Note 20).

2. Add 1150 μL of PPSM to tube P and vortex for 1–2 min at room temperature.

3. Centrifuge for 5 min at 10,000 × g at room temperature to allow phase separation.

4. Transfer the upper phenolic phase containing proteins to a new tube (tube A) and add 600 μL of PWB. Vortex for 1–2 min and then centrifuge for 5 min at 10,000 × g at room temperature.

5. Transfer the upper phenolic phase to a new tube (tube B), taking care not to disturb the interphase (see Note 21).

6. Precipitate the proteins by adding 1.5 mL of PPB to tube B. Incubate overnight at −20 °C (see Notes 22 and 23).

7. Centrifuge tube B at 10,000 × g for 15 min and discard the supernatant carefully with a pipette so as not to disturb the pellet.

8. Fill tube B with methanol and disaggregate the pellet using an ultrasound sonicator.

9. Centrifuge at 10,000 × g for 10 min and discard the supernatant without disturbing the pellet.

10. Wash the pellet with 600 μL of PPWB. Mix until the pellet is completely disaggregated (see Note 24).

11. Centrifuge at 10,000 × g for 10 min and discard the supernatant without disturbing the pellet.

12. Air-dry the pellets and redissolve in an adequate buffer (see Notes 25 and 26).

13. Resolubilize and quantify proteins (see Note 26).

Proceed with protein fractionation, digestion, desalting, and concentration according to [12].

4 Notes

1. Phase separation mix should be prepared in the 1.5 mL tube.

2. Cell concentration should be 5 × 10^5–1 × 10^6 cells/mL.

3. All the sampling steps must be done quickly. If this is not possible, centrifuge 15 mL of cell culture in 35 mL of cold (−80 °C) methanol and keep it at −80 °C until the extraction is performed.

4. Weigh the 2 mL tubes S before transferring the cells.

5. Discarding all the supernatant is crucial for step 4.

6. Maximum fresh weight for extraction should be 50 mg.

7. Tissue should remain frozen throughout the process.

8. Cut the pipette tip for an easier resuspension.

9. If the resulting pellet is green (not whitish), homogenization was poor and the pellet must be rehomogenized. Add 200 μL of MEB to the tube containing the pellet. Vortex until the pellet is completely disaggregated. Centrifuge at 20,000 × g for 6 min and transfer the supernatant to tube M.

10. If nucleic acid extraction is required, perform the first step of the nucleic acid extraction immediately and keep the sample at 4 °C. If proceeding directly to protein extraction, keep the sample at 4 °C, or air-dry the pellets (tube NAP) at room temperature and keep them overnight at −20 °C. For long-term storage, keep at −80 °C until nucleic acid and protein extraction. This procedure is compatible with direct protein extraction and purification without prior nucleic acid purification.

11. Low-binding tube is preferred.

12. Sometimes the polar phase can be slightly cloudy, becoming transparent if the tube is warmed to room temperature. This indicates chloroform contamination. In this case, a second wash of the PM tube with WB1 is recommended. Transfer the upper layer to a new PM tube and the lower layer to the NPM tube.

13. If the pellet remains green, repeat steps 3 and 4.

14. The spectrophotometer response is linear with pigment concentration up to an absorbance of 1. When the peak absorbance of a sample exceeds 1, dilute the solution further and measure it again.

15. All of these values should be multiplied by the dilution factor of the sample (if the sample was diluted). The fresh weight can be used to obtain the moles of each pigment per milligram of fresh weight.

16. If the pellet remains green and the hexane is still pigmented, repeat steps 5 and 6, adding 500 μL of LBE2 instead of 400 μL. If the pellet still remains green, continue with the extraction, because the green pellet color may come from the chlorophyll hemo group separation from the dead cells.

17. Pipetting through the beads reduces the amount of insoluble particles that are taken up.

18. Avoid transferring pellet particles to the silica column.

19. This step eliminates residual ethanol and completely dries the column for a better elution of nucleic acids.

20. If a previous nucleic acid extraction was not performed, disaggregate the tube NAP pellet in 400 μL of PSB and transfer the dissolved pellet to a new tube (tube P). Then follow the protein extraction and purification protocol.

21. For maximum protein yield, the remaining aqueous phase of tube A can be reextracted with 550 μL of phenol, repeating steps 4–5.

22. If the aqueous phase was reextracted, transfer the upper phenolic phase to a 10 mL tube and precipitate the protein with 4 mL of PPB.

23. Pause point: precipitated proteins in acetone are stable for more than 1 week at room temperature, but we recommend keeping them at −20 °C until the extraction is resumed.

24. Pellets that are not completely dry (but with the acetone completely evaporated) are easier to solubilize.

25. Pellet solubilization should be done in an appropriate buffer depending on the downstream application of the proteins. For Chlamydomonas, the best solubilization buffer for protein pellets is 8 M urea, 4% SDS.

26. Choose the protein quantification method depending on its compatibility with the protein resuspension buffer used.

Acknowledgments

Our research group is generously funded by the Spanish Ministry of Science, Innovation and Universities (AGL2016-77633-P and AGL2017-83988-R). M.M., L.V., and F.C. were also supported by the Spanish Ministry of Science, Innovation and Universities through the Ramon y Cajal program (RYC-2014-14981 and RYC-2015-17871 to M.M. and L.V., respectively) and by the Programa de Ayudas Predoctorales Severo Ochoa, Autonomous Community of Asturias, Spain (BP14-138 to F.C.).

References

1. Sasso S, Stibor H, Mittag M et al (2018) From molecular manipulation of domesticated Chlamydomonas reinhardtii to survival in nature. eLife 7:e39233

2. Valledor L, Escandon M, Meijon M et al (2014) A universal protocol for the combined isolation of metabolites, DNA, long RNAs, small RNAs, and proteins from plants and microorganisms. Plant J 79:173–180

3. Nakayasu ES, Nicora CD, Sims AC et al (2016) MPLEx: a robust and universal protocol for single-sample integrative proteomic, metabolomic, and lipidomic analyses. mSystems 1:e00043-16

4. Salem MA, Juppner J, Bajdzienko K et al (2016) Protocol: a fast, comprehensive and reproducible one-step extraction method for the rapid preparation of polar and semi-polar metabolites, lipids, proteins, starch and cell wall polymers from a single sample. Plant Methods 12:45

5. Morschett H, Wiechert W, Oldiges M (2016) Automation of a Nile red staining assay enables high throughput quantification of microalgal lipid production. Microb Cell Factories 15:34

6. Smith AM, Zeeman SC (2006) Quantification of starch in plant tissues. Nat Protoc 1:1342

7. Singleton VL, Orthofer R, Lamuela-Raventos RM (1999) Analysis of total phenols and other oxidation substrates and antioxidants by means of Folin-Ciocalteu reagent. In: Methods in enzymology. Academic Press, Cambridge, pp 152–178

8. Gregor J, Marsalek B (2004) Freshwater phytoplankton quantification by chlorophyll a: a comparative study of in vitro, in vivo and in situ methods. Water Res 38:517–522

9. Chiu L, Ho S-H, Shimada R et al (2017) Rapid in vivo lipid/carbohydrate quantification of single microalgal cell by Raman spectral imaging to reveal salinity-induced starch-to-lipid shift. Biotechnol Biofuels 10:9

10. Strenkert D, Schmollinger S, Gallaher SD et al (2019) Multiomics resolution of molecular events during a day in the life of Chlamydomonas. PNAS 116:2374–2383

11. Sims DA, Gamon JA (2002) Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens Environ 81:337–354

12. Valledor L, Weckwerth W (2014) An improved detergent-compatible gel-fractionation LC-LTQ-Orbitrap-MS workflow for plant and microbial proteomics. In: Jorrin-Novo JV, Komatsu S, Weckwerth W, Wienkoop S (eds) Plant proteomics: methods and protocols. Humana Press, Totowa, NJ, pp 347–358

Chapter 3

Protein Interaction Networks: Functional and Statistical Approaches

Monica Escandon, Laura Lamelas, Víctor Roces, Víctor M. Guerrero-Sanchez, Monica Meijon, and Luis Valledor

Abstract

The evolution of next-generation sequencing and high-throughput technologies has created new opportunities and challenges in data science. Currently, a classic proteomics analysis can be complemented by going a step beyond the individual analysis of the proteome through integrative approaches. These integrations can focus on inferring relationships among proteins themselves or with other molecular levels, phenotype, or even environmental data, giving the researcher new tools to extract and determine the most relevant information in biological terms. Furthermore, it is also important to employ visualization methods that allow a correct and deep interpretation of the data.

To carry out these analyses, several bioinformatics and biostatistical tools are required. In this chapter, different workflows that enable the creation of interaction networks are proposed. The resulting networks reduce the complexity of the original datasets, depicting complex statistical relationships (through PLS analysis and variants), functional networks (STRING, ShinyGO), and a combination of both approaches. Recently developed methods for integrating different omics levels, such as coinertia analyses or DIABLO, are also described. Finally, the use of Cytoscape or Gephi is described for the representation and mining of the different networks.

This approach constitutes a new way of acquiring deeper knowledge of the function of proteins, such as the search for specific connections of each group to identify differentially connected modules, which may reflect the protein complexes and key pathways involved.

Key words Protein networks, String, Omics levels, sPLS, DIABLO, Cytoscape

1 Introduction

The classic workflow for proteome analysis, mainly based on the use of univariate statistics and PCA, is being quietly displaced in favor of new approaches that take advantage of protein interaction knowledge and advanced statistical tools. These novel methodologies allow for the study of the proteome and its interaction with other biomolecules, the environment, and even with itself, providing a holistic perspective. This kind of workflow gives the researcher the possibility of a deeper understanding of the biological responses behind the differences observed in the experimental systems.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_3, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-1-0716-0528-8_3) contains supplementary material, which is available to authorized users.

Integrative studies rely heavily on computational biology and require specific algorithms, methods, and models to extract and determine the most relevant information in biological terms [1, 2]. Common classification methods (including discriminant analysis; neural networks; decision trees; support vector machines, SVM; and random forests, RF) are suited to single-dataset analyses [2], whereas methods that build predictive models require multiple data sources, some acting as predictors and others as the predicted responses.

The most commonly employed approach for the characterization of multiple omics datasets is the combination of unsupervised multivariate statistics, like principal component analysis (PCA), with supervised methods, like partial least squares (PLS), PLS discriminant analysis (PLS-DA), and their variants [3–5]. PLS methods are suitable for integrating two datasets, considering one omic level as the predictor of a second omic level, the response. With these methods it is possible to get an overview of the most important variables (proteins, metabolites, or transcripts) by determining which variables of the predictor explain the maximum variance of the responses [6]. In addition, there are innovative multiple integration tools (for more than two different data inputs) that allow the construction of these relationship-based models, such as DIABLO (Data Integration Analysis for Biomarker discovery using a latent component method for Omics studies) [2], multiple coinertia analysis (MCIA) [7], and xMWAS [8].

Integrative analysis can be pushed beyond sample and variable biplotting or variable filtering, since the interactions determined can be depicted as networks, where the variables are the nodes and the relations among them are the edges. As a result, this simple representation captures the complexity of the original data as retrieved by the previous analyses. These networks can be topologically evaluated to determine the most connected nodes, or hubs, within the data, as well as subnetworks or clusters with the same (or opposite) behavior.
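The idea of nodes, edges, and hubs can be illustrated with a deliberately simple stand-in for the statistical methods of this chapter: a network built from pairwise Pearson correlations. The sketch below uses base R only; the abundance values, the variable names, and the 0.6 threshold are arbitrary assumptions, and a correlation cutoff is not the PLS-based inference described later.

```r
# Toy abundance matrix: variables (proteins) in rows, nine samples in columns.
x   <- 1:9
alt <- rep(c(1, -1), length.out = 9)
abund <- rbind(
  P1 = x + 2.6 * alt,   # shares signal with both P2 and P3
  P2 = x,
  P3 = alt,
  P4 = (x - 5)^2        # weakly related to the other variables
)

# Edges: pairs of variables whose absolute correlation reaches the threshold.
cor_mat   <- cor(t(abund))
threshold <- 0.6                      # arbitrary cutoff for this illustration
adj <- abs(cor_mat) >= threshold
diag(adj) <- FALSE                    # no self-loops

# Edge list (each pair once), a format that network viewers can import
idx <- which(adj & upper.tri(adj), arr.ind = TRUE)
edge_list <- data.frame(
  from = rownames(adj)[idx[, "row"]],
  to   = colnames(adj)[idx[, "col"]],
  r    = cor_mat[idx],
  stringsAsFactors = FALSE
)
print(edge_list)

# Node degree = number of edges per node; hubs are the most connected nodes.
degree <- rowSums(adj)
hubs   <- names(degree)[degree == max(degree)]
print(degree)
print(hubs)
```

The sign of `r` distinguishes variables with the same behavior from those with opposite behavior, which is the information used to color edges in most network viewers.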

The interaction networks described above, and their inferred relationships, are based on statistical analyses. However, it is also possible to create or enrich those networks with functional or biologically relevant annotations. This new information layer is obtained from specific tools and databases (STRING, ShinyGO) which gather known functional relations (protein–protein and protein–metabolite associations). In addition, we can create functional networks [9] even for species not included in these databases through BLAST and protein domain analyses.

22 Monica Escandon et al.

In this chapter, different workflows aimed at conducting all of these functional and statistical approaches, together with data visualization, are described.

2 Materials

Next, we describe different approaches aimed at obtaining networks that infer the connections of proteins among themselves and with other omics datasets, specifically with the metabolomic and transcriptomic levels in the example shown.

The experiment used as an example consists of a control and two experimental treatments (T1 treatment, T2 treatment) with three biological replicates each. The names given to the replicates are C-1, C-2, C-3, T1-1, T1-2, T1-3, T2-1, T2-2, and T2-3. The matrices used in the different workflows (Proteins matrix, Metabolites matrix, and Transcripts matrix) follow the template shown in Fig. 1 (where A: Protein matrix; B: Metabolite matrix; and C: Transcript matrix). The individual matrix of each dataset has the following arrangement: samples in columns (e.g., control, treatment1, treatment2) and variables in rows (protein1, protein2, ...). Following this arrangement when the matrices are entered as datasets in the different programs is crucial to obtain good results in the different workflows. A simplified dataset is provided as supplementary material to carry out the different workflows (Supplementary dataset S1). The networks shown in the chapter have been made with real data from experiments.
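Since every workflow depends on this arrangement, it can be worth checking a matrix before use. The following base R sketch is only an illustration: `check_layout` is a hypothetical helper, and the toy matrix stands in for a real protein matrix.

```r
# Sample names as given in the text (three replicates per condition)
expected_samples <- c("C-1", "C-2", "C-3", "T1-1", "T1-2", "T1-3",
                      "T2-1", "T2-2", "T2-3")

# Hypothetical helper: stop with an error unless the matrix has variables in
# rows (unique names) and the expected samples in columns.
check_layout <- function(mat, samples = expected_samples) {
  stopifnot(
    is.matrix(mat), is.numeric(mat),
    identical(colnames(mat), samples),
    !is.null(rownames(mat)),
    !anyDuplicated(rownames(mat))
  )
  invisible(TRUE)
}

# Toy matrix: 3 proteins x 9 samples
toy <- matrix(seq_len(27), nrow = 3,
              dimnames = list(paste0("protein", 1:3), expected_samples))
ok <- check_layout(toy)

# A transposed matrix (samples in rows) is rejected
transposed_fails <- inherits(try(check_layout(t(toy)), silent = TRUE),
                             "try-error")
print(c(ok = ok, transposed_fails = transposed_fails))
```

Note that `read.table()` with its default `check.names = TRUE` converts column names such as `C-1` into `C.1`, so the expected names passed to a check of this kind have to match the names as they appear after import.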

Protein identification and quantification from raw MS/MS data were performed with Thermo Proteome Discoverer™, metabolites were processed with MZmine 2 [10], and transcripts with Trinity software [11]. All datasets were normalized following the indications of [12], [13], and [14]. In each workflow, we will specify which matrices are needed as starting materials (e.g., Protein matrix with proteins identified, annotated, and quantified for each experimental situation).

The different workflows require different tools, which are listed below:

Software: R (v. 3.6.0), RStudio, Cytoscape (v. 3.7.1), Cytoscape STRING App (v. 1.4.2), Gephi (v. 0.9.2), and a spreadsheet program.

R libraries: Bioconductor, edgeR, ARTIVA, xMWAS, mixOmics, igraph, and RColorBrewer.

3 Methods

Depending on the approach used, we have developed different workflows, which are summarized in this protocol index:

Protein Interaction Networks 23

1. Selection of differential expression proteins (for targeted networks) (Subheading 3.1).

2. Integration tools for statistical networks.

(a) Statistical integration networks: dynamic protein–protein interaction networks (Subheading 3.2.1).

(b) With other omics datasets:

• Partial Least Square Regression (PLS) and variants (Subheading 3.2.2.1).

• Data-driven integration and differential network analysis, xMWAS (Subheading 3.2.2.2).

• Data Integration Analysis for Biomarker discovery using a Latent component method for Omics studies (DIABLO) (Subheading 3.2.2.3).

3. Biological interaction network enrichment.

(a) STRING (Subheading 3.3.1).

(b) ShinyGO (Subheading 3.3.2).

4. Merged functional and statistical interaction networks (Subheading 3.4).

5. Network visualization tools.

(a) Cytoscape (Subheading 3.5.1).

(b) Gephi (Subheading 3.5.2).

6. Future perspectives.

Fig. 1 Example matrices for data entry in R in csv format. (a) Protein matrix, (b) Metabolite matrix, and (c) Transcript matrix (RNA-seq data in this case). Each dataset has samples in columns and variables in rows

3.1 Selection of Differential Expression Proteins (for Targeted Networks)

For the analysis of protein–protein interactions, besides the global analysis of the proteome, it is possible to analyze specifically the interactions of the proteins that show differential expression within our experiment. Quantitative analysis of shotgun proteomic data can be performed with statistical tools commonly used to measure the differential expression of genes (proteins in our case), such as edgeR [15]. This package implements a range of statistical methodology based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models, and quasi-likelihood tests. This analysis makes it possible to better group proteins according to their function under certain conditions, reducing network complexity and keeping only the proteins significantly altered by a specific treatment.

Next, we explain the workflow to obtain a selection of differential proteins, from which the functionally enriched network will be obtained with programs such as STRING or ShinyGO (Subheading 3.3).

Workflow

1. Install and load the required packages. These are collections of functions, data, and R code that are stored in a folder according to a well-defined structure, easily accessible for R (see Note 1). In an R console or GUI (we recommend RStudio) type:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("edgeR")

library(edgeR)

2. Load your data (Protein matrix, Fig. 1a), indicating the path of the file that contains them and its format. In addition, we must assign a name to the columns of data, indicating the controls and the corresponding treatments.

proteins <- read.table("proteins.csv", header = TRUE,
                       row.names = 1, sep = ";")

3. Now we can create a DGEList variable (a list-based system designed to store quantification data and associated information from sequencing technologies (see Note 2)). In this case, protein quantification data will be used instead of RNA-seq data.

dpList <- DGEList(counts=proteins, genes=rownames(proteins))

4. To correct for variations between samples, the Trimmed Mean of M-values (TMM) method must be applied [16] in order to bring the average expression values of the different samples to the same scale, based on the assumption that the majority of proteins are not differentially expressed.

dpList <- calcNormFactors(dpList, method="TMM")

5. Once the dataset is ready, a matrix describing the setup of the experiment must be defined. Each row corresponds to a sample and each column to an experimental group (control or treatment); a value of 1 indicates that the sample belongs to that group. For implementing more treatments, see Note 3.

design <- matrix(c(c(1,1,1,0,0,0,0,0,0), c(0,0,0,1,1,1,0,0,0), c(0,0,0,0,0,0,1,1,1)),
                 ncol=3, dimnames = list(c('C.1','C.2','C.3','T1.1','T1.2','T1.3','T2.1','T2.2','T2.3'),
                                         c('C.','T1.','T2.')))

A common negative binomial dispersion parameter, considering the experimental design previously defined, must be estimated in order to model the variance of the dataset.

dpList <- estimateGLMCommonDisp(dpList, design=design)

dpList <- estimateGLMTrendedDisp(dpList, design=design)

dpList <- estimateGLMTagwiseDisp(dpList, design=design)

6. After calculating dispersions, differential expression values can be estimated by fitting a negative binomial model, because this approach provides greater benefits than others when using a small number of replicates [15]. A contrast matrix must be defined, indicating the comparisons to be made between treatments and controls; it must be adjusted for each experiment. Example:

contrast<- makeContrasts(T1vsC= T1.- C., T2vsC= T2. - C.,

levels = colnames(design))

fit <- glmFit(dpList, design)

lrt <- glmLRT(fit, contrast = contrast)

DEP <- topTags(lrt,n=Inf)

DEPdf <- as.data.frame(DEP)

7. A false discovery rate (FDR) correction must be applied in order to limit the number of false positives. In this example, an FDR threshold of 5% (FDR < 0.05) has been applied.

test <- DEPdf[which(DEPdf$FDR < 0.05),]

8. Differentially expressed proteins are then selected with a fold-change rule: at least double the amount for accumulated proteins (log2FC ≥ 1) or at most half the amount for depleted proteins (log2FC ≤ −1). Proteins meeting this criterion, and also having an FDR below the threshold, are selected and exported into a new object. The logFC columns are named after the contrasts defined in step 6 (check them with colnames(test) if other contrast names are used):

(a) Upregulated proteins Treatment 1:

proteinsup1 <- test[which(test$logFC.T1vsC >= 1),]

proteinsup1 <- proteinsup1[,c(1,2,3,5,6,7)]

colnames(proteinsup1) <- c("Protein", "log2T.1", "log2T.2", "LR", "p-value", "FDR")

Export the list of upregulated proteins of treatment 1 ("proteinsup1.txt"). The document will be saved in the working directory.

write.table(proteinsup1$Protein, "proteinsup1.txt", quote = FALSE, row.names=FALSE)

(b) Downregulated proteins Treatment 1:

proteinsdown1 <- test[which(test$logFC.T1vsC <= -1),]

proteinsdown1 <- proteinsdown1[,c(1,2,3,5,6,7)]

colnames(proteinsdown1) <- c("Protein", "log2T.1", "log2T.2", "LR", "p-value", "FDR")

Export the list of downregulated proteins of treatment 1 ("proteinsdown1.txt") to the working directory:

write.table(proteinsdown1$Protein, "proteinsdown1.txt", quote = FALSE, row.names=FALSE)

(c) Upregulated proteins Treatment 2:

proteinsup2 <- test[which(test$logFC.T2vsC >= 1),]

proteinsup2 <- proteinsup2[,c(1,2,3,5,6,7)]

colnames(proteinsup2) <- c("Protein", "log2T.1", "log2T.2", "LR", "p-value", "FDR")

Export the list of upregulated proteins of treatment 2 ("proteinsup2.txt") to the working directory.

write.table(proteinsup2$Protein, "proteinsup2.txt", quote = FALSE, row.names=FALSE)

(d) Downregulated proteins Treatment 2:

proteinsdown2 <- test[which(test$logFC.T2vsC <= -1),]

proteinsdown2 <- proteinsdown2[,c(1,2,3,5,6,7)]

colnames(proteinsdown2) <- c("Protein", "log2T.1", "log2T.2", "LR", "p-value", "FDR")

Export the list of downregulated proteins of treatment 2 ("proteinsdown2.txt") to the working directory:

write.table(proteinsdown2$Protein, "proteinsdown2.txt", quote = FALSE, row.names=FALSE)

9. The final files of overexpressed or repressed differential proteins will be used for the protein–protein interaction analysis with the STRING database and for the functional network visualization with ShinyGO. Example: STRING only needs the names or the FASTA files of the selected proteins (upregulated or downregulated), which are entered in the STRING web interface.

10. Continue the protocol with the STRING/ShinyGO workflow (Subheading 3.3).

(a) Subheading 3.3.1 for STRING.

(b) Subheading 3.3.2 for ShinyGO.
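The four filters of step 8 follow a single pattern, so they can be wrapped in a small helper. The following is only a minimal sketch and not part of the original protocol: the function name export_regulated and the toy table are hypothetical, and it assumes, as in step 8, that the fold-change columns are named logFC.T1. and logFC.T2. and that the first column holds the protein identifiers.

```r
# Hypothetical helper: keep the proteins whose log2 fold change passes the
# threshold in the chosen direction and, optionally, write their names to a file.
export_regulated <- function(tab, fc_col, direction = c("up", "down"),
                             threshold = 1, file = NULL) {
  direction <- match.arg(direction)
  keep <- if (direction == "up") {
    which(tab[[fc_col]] >= threshold)
  } else {
    which(tab[[fc_col]] <= -threshold)
  }
  out <- tab[keep, , drop = FALSE]
  if (!is.null(file)) {
    # col.names = FALSE avoids a header line in the exported name list
    write.table(out[[1]], file, quote = FALSE, row.names = FALSE,
                col.names = FALSE)
  }
  out
}

# Toy table standing in for the `test` object of step 8 (values invented).
test <- data.frame(
  Protein   = paste0("Prot", 1:5),
  logFC.T1. = c(2.1, -1.4, 0.3, 1.0, -0.2),
  logFC.T2. = c(0.1, -2.0, 1.7, -1.0, 0.5)
)

up1   <- export_regulated(test, "logFC.T1.", "up")
down2 <- export_regulated(test, "logFC.T2.", "down")
```

Passing a file name (e.g., file = "proteinsup1.txt") writes the protein names only, which is the form that can be pasted into STRING.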

3.2 Integration Tools

3.2.1 Statistical Integration Networks: Dynamic Protein–Protein Interaction Networks

While most approaches focus on constant interactions and assume that biological systems are static, nonhomogeneous dynamic Bayesian statistics with the ARTIVA (autoregressive time varying) algorithm can reveal causal interactions by considering temporal information. This method can detect indirect associations and regulation loops of high biological interest, thus giving a more complete picture of the system. The ARTIVA model divides the global dynamics of the dataset (heterogeneity) into several uniform (homogeneous) phases called changepoints (CPs). In each CP, it searches for relations between two types of variables (regulators and targets) while taking into account a user-defined time delay. As a result, it reveals dynamic interactions across time [17]. This type of model has three main limitations: (1) Interactions that occur at a time scale shorter than the interval between sampling points cannot be detected, which may lead to wrong conclusions. If the time between two consecutive sampling points is too long, other approaches are recommended. (2) Datasets with a low number of samples and a high number of variables can lead to erroneous inference [18]. It is desirable to find a balance by filtering the variables with differential expression tests or by functional categories. (3) The process is time-consuming. An example of ARTIVA networks is shown in Fig. 2.
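Limitation (2) can be eased by reducing the number of variables before running ARTIVA. The sketch below illustrates one possible differential expression filter, a one-way ANOVA per protein across sampling points with Benjamini-Hochberg correction. It is an assumption made for illustration, not the authors' procedure: the toy matrix, the object names, and the 0.05 threshold are all invented.

```r
set.seed(1)

# Toy design: 6 sampling points x 3 replicates = 18 samples (invented data).
time_point <- factor(rep(1:6, each = 3))
n_prot <- 20
toy <- matrix(rnorm(n_prot * length(time_point)), nrow = n_prot,
              dimnames = list(paste0("Prot", seq_len(n_prot)), NULL))

# Give the first five proteins a strong trend across sampling points.
toy[1:5, ] <- sweep(toy[1:5, ], 2, 5 * as.integer(time_point), "+")

# One-way ANOVA per protein: does abundance differ between sampling points?
p_values <- apply(toy, 1, function(x) {
  anova(lm(x ~ time_point))[["Pr(>F)"]][1]
})
p_adjusted <- p.adjust(p_values, method = "BH")

# Keep only the proteins that change over time; these would go on to ARTIVA.
toy_filtered <- toy[p_adjusted < 0.05, , drop = FALSE]
```

Any other test suited to the experimental design, or a selection by functional category, could replace the ANOVA step.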


Workflow

1. Install all necessary packages.

install.packages("ARTIVA")

2. Load all necessary packages.

library(ARTIVA)

3. Set the working directory to the folder containing the dataset (protein matrix, Fig. 1a) and import it (see Note 4). Protein names must be unique. The matrix is named ART_data:

ART_data <- read.table("proteins.csv", header = T, row.names=1, sep=";")

4. Transform the data to allow a better performance of the algorithm.

ART_data <- log10(ART_data + 1)

[Figure 2: four network panels. Changepoint 1; Sampling points 1-2. Changepoint 2; Sampling points 2-3. Changepoint 3; Sampling points 3-4. Changepoint 4; Sampling points 4-5-6.]

Fig. 2 ARTIVA networks representing the interactions across time between three different groups of regulators (blue, red, and green nodes) and targets (gray nodes). The interactions of each changepoint are represented independently. The orange circle shows nodes that have steadily gained interactions over time and therefore greater importance. This network was made using the Cytoscape workflow. Changepoint 1 = interactions that only occur on sampling point 1; Changepoint 2 = interactions that only occur on sampling point 2; Changepoint 3 = interactions that only occur on sampling point 3; Changepoint 4 = interactions that only occur on sampling points 4-5-6

5. Select and subset the regulators. Each group of regulators should have a similar number of variables (see Note 5). Classical examples of regulators are transcription factors and epigenetic-related proteins. Regulators_1 is a numeric vector containing the row numbers of the desired proteins. In this example there is one group of regulators:

Regulators_1 <- c(43, 46, 47, 79, 131, 154, 160, 164, 187, 202,

218, 230, 242, 269, 289, 344, 345, 350, 428, 433, 438, 442,

468, 501, 510, 522, 561, 586, 605, 625, 669, 698)

6. Run ARTIVA with regulators vs. targets (see Note 6) and filter the output (see Note 7) based on posterior probability (PostProb). The posterior probability expresses the degree of belief in a particular interaction.

(a) Run ARTIVA with regulators vs. targets. For a detailed explanation of the arguments, see Note 8:

DBN <- ARTIVAnet(

targetData = as.matrix(ART_data[-Regulators_1, ] ),

parentData = as.matrix(ART_data[Regulators_1,]),

targetNames = rownames(ART_data[-Regulators_1, ] ),

parentNames = rownames(ART_data[Regulators_1,]),

niter = 50000,

dataDescription = rep(1:6, each=3),

nbCPinit = 1, maxCP = 5, segMinLength = 1,

savePictures = FALSE, saveEstimations = FALSE,

saveIterations = FALSE,

dyn = 0, edgesThreshold = 0.6)

(b) Filter the output:

DBN <- DBN[which(DBN$PostProb > 0.6),]

7. Export the output.

write.table(DBN, file = "DBN.txt")

8. Continue the protocol with the Cytoscape/Gephi workflow (Subheading 3.5).
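The indexing used in steps 5 to 7 can be checked on a small toy matrix before launching the long ARTIVA run. The following sketch uses base R only and does not call ARTIVA. The toy data, the object names, and the Parent and Target columns of the mock result table are assumptions for illustration; only the PostProb column and the 0.6 threshold are taken from step 6.

```r
# Toy matrix: 10 proteins (rows) x 18 samples (6 sampling points x 3 replicates).
set.seed(2)
ART_toy <- matrix(runif(10 * 18, 0, 1000), nrow = 10,
                  dimnames = list(paste0("Prot", 1:10), paste0("S", 1:18)))
ART_toy <- log10(ART_toy + 1)

# Row numbers of the (hypothetical) regulators; every other row is a target.
regulators <- c(2, 5, 9)
parents <- ART_toy[regulators, , drop = FALSE]
targets <- ART_toy[-regulators, , drop = FALSE]

# One label per column: the sampling point each replicate belongs to.
data_description <- rep(1:6, each = 3)
stopifnot(length(data_description) == ncol(ART_toy))

# Mock result table: only the PostProb column name comes from step 6b,
# the other columns and all values are invented for illustration.
DBN_toy <- data.frame(
  Parent   = c("Prot2", "Prot5", "Prot9", "Prot2"),
  Target   = c("Prot1", "Prot3", "Prot4", "Prot7"),
  PostProb = c(0.92, 0.41, 0.75, 0.60)
)
DBN_kept <- DBN_toy[which(DBN_toy$PostProb > 0.6), ]
```

Because the filter uses a strict inequality, an interaction with a posterior probability of exactly 0.6 is discarded.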

3.2.2 Statistical Interaction Networks between Proteins with Other Omics Datasets

Partial Least Square Regression (PLS) and Variates

The statistical interaction networks between proteins and other omics datasets (metabolomics, transcriptomics, ...) are obtained through Partial Least Squares regression (PLS) or its variant, sparse Partial Least Squares regression (sPLS). PLS is a multivariate analysis that integrates two matrices: X, a predictor matrix (e.g., proteins), and Y, a response matrix (e.g., metabolites) [19, 20]. This method is used when the number of predictors exceeds the number of observations or cases and when the variables considered are correlated [21], a common situation in data obtained from omics techniques. The advantage of PLS is that it can handle many noisy and collinear variables, while sPLS [22] additionally selects variables simultaneously in the two datasets, reducing the complexity of omics data.

In this section, sPLS is used to obtain interaction networks between our protein matrix and another molecular level (transcripts, in this case). An example of an sPLS network is shown in Fig. 3.
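The idea behind the first PLS component can be illustrated in a few lines of base R. The sketch below is not the mixOmics implementation, which also handles further components and, for sPLS, penalizes the weights to select variables. It only shows, on invented data, that the leading singular vectors of the cross-product matrix give the pair of weight vectors whose components have maximal covariance. All object names are hypothetical.

```r
set.seed(3)
n <- 9                                # samples (e.g., 3 treatments x 3 replicates)
X <- matrix(rnorm(n * 6), nrow = n)   # toy predictor block (e.g., transcripts)
Y <- matrix(rnorm(n * 4), nrow = n)   # toy response block (e.g., proteins)

# Center and scale each variable before the decomposition.
Xs <- scale(X)
Ys <- scale(Y)

# The first pair of weight vectors are the leading singular vectors
# of the cross-product matrix t(Xs) %*% Ys.
decomposition <- svd(crossprod(Xs, Ys))
u <- decomposition$u[, 1]             # weights for the X variables
v <- decomposition$v[, 1]             # weights for the Y variables

# Latent components (variates): one score per sample and block.
t_x <- drop(Xs %*% u)
t_y <- drop(Ys %*% v)

# Their covariance equals the leading singular value divided by n - 1.
cov_xy <- cov(t_x, t_y)
```

Variables with weights close to zero contribute little to the component, which is the intuition behind the variable selection performed by sPLS.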

Workflow

1. Install all necessary packages.

install.packages("mixOmics")

install.packages("igraph")

install.packages("RColorBrewer")

2. Load all necessary packages:

library(mixOmics)

library(igraph)

library(RColorBrewer)

3. Import the necessary datasets (in this case, two matrices): the transcripts (Fig. 1c) as the predictor matrix (X) and the proteins (Fig. 1a) as the response matrix (Y). First, in the Session tab of R, set the working directory to the folder containing all the dataset matrices.

Xmatrix<-read.table("contigs.csv",header=T,row.names=1,sep=";")

Ymatrix <- read.table("proteins.csv", header = T, row.names=1, sep=";")

4. Transform the matrices to the format required by the PLS function, which expects the transposed matrices (samples in rows).

Xmatrix <- as.data.frame(t(Xmatrix))

Ymatrix <- as.data.frame(t(Ymatrix))

[Figure 3: network image. Node legend: Proteins, Transcripts. Edge legend: -0.97 to 0.99.]

Fig. 3 sPLS network using the transcript matrix as the predictor matrix and the protein matrix as the response matrix. The edge legend represents the weight of the links between proteins and transcripts. The network representation was drawn with Cytoscape

5. Run the sPLS analysis of the mixOmics package as follows. To run a PLS analysis instead, simply change the spls function to pls.

spls.analysis <- spls(Xmatrix,Ymatrix,ncomp=4,max.iter=2500)

6. Select the two components that best explain your experimental model (normally components 1 and 2). Use the plots and other functions included in mixOmics (see Le Cao et al. [22] for more information) to decide which components to use.

(a) To visualize the plots: create a vector of colors for your experiment (pal) to improve the visualization and use the plotIndiv function to draw the plots (see Note 9). Change the number of treatments and replicates to fit your experiment; in this case, three treatments with three replicates each.

treatmentnumber <- 3

replicates <- 3

pal <- brewer.pal(treatmentnumber,"Set1")

pal <- rep(pal,each=replicates)

plotIndiv(spls.analysis,comp=1:2,cex=4,col=pal,rep.space="XY-variate",pch=16,Y.label="comp 2", X.label="comp 1")

plotIndiv(spls.analysis,comp=3:4,cex=4,col=pal,rep.space="XY-variate",pch=16,Y.label="comp 4", X.label="comp 3")

7. Build the sPLS network with the two selected components, in this case components 1 and 2 (set in the “comp” argument, e.g., comp = 1:2). The cutoff selected for this network is 0.7 (only interactions beyond ±0.7 are included). The cutoff depends on each experiment. We recommend using a low cutoff when creating the network for export to Cytoscape and raising it later within Cytoscape. Sometimes the plot may not be displayed in RStudio because of margin issues. The plot can be saved as an image using the arguments pdf.save and name.save. An example using the pdf function is given below.

net <- network(spls.analysis, comp = 1:2, color.node = c("white","pink"),

cutoff=0.7, shape.node = c("rectangle", "rectangle"),show.color.key = TRUE)

(a) Example for pdf function:

pdf(file="networksPLS_0.7.pdf")

net <- network(spls.analysis, comp = 1:2, color.node = c("white","pink"),

cutoff=0.7,shape.node = c("rectangle", "rectangle"),show.color.key = TRUE)

dev.off()

8. Export the network in a format suitable for opening with Cytoscape.

write.graph(net$gR, file = "network_sPLS(0.7).gml", format = "gml")

9. Continue the protocol with the Cytoscape workflow (Subheading 3.5.1).
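The effect of the cutoff in step 7 can be previewed on a small association matrix. The sketch below uses base R only and does not call mixOmics. The matrix, its names, and its values are invented, and the strict inequality follows the wording of step 7 rather than the internals of the network function.

```r
# Toy association matrix: rows = transcripts (X), columns = proteins (Y).
# All names and values are invented for illustration.
assoc <- matrix(c( 0.95, -0.20,  0.71,
                  -0.82,  0.10,  0.30,
                   0.05,  0.69, -0.97),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("Contig", 1:3), paste0("Prot", 1:3)))

cutoff <- 0.7

# Row and column positions of the pairs whose absolute association
# exceeds the cutoff.
hits <- which(abs(assoc) > cutoff, arr.ind = TRUE)

# Edge list: one row per retained transcript-protein link.
edges <- data.frame(
  transcript = rownames(assoc)[hits[, "row"]],
  protein    = colnames(assoc)[hits[, "col"]],
  weight     = assoc[hits]
)
```

Raising the cutoff shortens the edge list, which is why a permissive value at export leaves room for stricter filtering later in Cytoscape.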

Data-Driven Integration and Differential Network Analysis (xMWAS)

The PLS algorithm and its variants have many advantages, some of them especially useful for omics research as stated in Subheading 3.2.2.1, but they are restricted to two omics layers. Statistically based integrative networks aim to collect the relations among all the studied omics layers of molecular information, “the more, the better,” to obtain the most accurate and complete network. To this end, other tools that extend the sPLS algorithm have been developed, allowing the integration of more levels (up to four). xMWAS uses this integration method and performs network analysis to visualize positive or negative associations between different datasets [8].

In this section, the xMWAS package is used to obtain interaction networks from three omics datasets: metabolites, proteins, and transcripts. An example of an xMWAS network is shown in Fig. 4.

[Figure 4: network image. Node legend: Proteins, Transcripts, Metabolites.]

Fig. 4 xMWAS, multi sPLS network using transcripts, proteins, and metabolites. The clusters are formed by the 500 most relevant variables of each omics dataset whose presence is greater than 20% along the treatments. Negative connections are shown in red and positive connections in blue. The network representation was drawn with Cytoscape

Workflow

1. Install all necessary packages.

(a) Install R dependencies.

source("https://bioconductor.org/biocLite.R");

biocLite(c("GO.db","graph","RBGL","impute","preprocessCore"),dependencies=TRUE);

install.packages(c("devtools","WGCNA","mixOmics","snow","igraph","plyr","plsgenomics"),dependencies=TRUE,type="binary", repos="http://cran.r-project.org")

(b) Install R package xMWAS.

library(devtools); install_github("kuppal2/xMWAS")

2. Load all necessary packages:

library(xMWAS)

3. Import the necessary datasets (in this case, the three matrices described in Materials): the transcripts (Fig. 1c), proteins (Fig. 1a), and metabolites (Fig. 1b). First, in the Session tab of R, set the working directory to the folder containing all the dataset matrices.

transcripts<- read.table("contigs.csv", header = T, row.names=1, sep=";")

proteins<- read.table("proteins.csv", header = T, row.names=1, sep=";")

metabolites<- read.table("metabolites.csv", header = T, row.names=1, sep=";")

4. Create a dataframe describing the samples, with one column for the sample names and another for the treatments, and list it with the previous datasets:

SampleID <- colnames(metabolites)

Class <- rep(c("Control", "Treatment1", "Treatment2"), each = 3)

classlabels <- cbind("SampleID" = SampleID, "Class" = Class)

dataset <- list("transcripts" = transcripts, "proteins" = proteins, "metabolites" = metabolites, "classlabels" = classlabels)

5. Create the path for the outputs, in the working directory.

output <- getwd()

6. Run the function (it may take a while). For further details on the arguments, see Note 10. In the following example, the regression mode of sPLS is used, selecting the 500 most relevant variables of each omics dataset and ten components. Transcripts, proteins, and metabolites are depicted in the RStudio Viewer window as gold rectangles, green circles, and cyan triangles, respectively. All variables with a presence of less than 20% along the treatments are discarded.

xmwas_res<-run_xmwas(Xome_data = dataset$transcripts, Yome_data =

dataset$proteins, Zome_data = dataset$metabolites, outloc = output, classlabels =

dataset$classlabels, xmwasmethod = "spls", plsmode = "regression", max_xvar =

5000, max_yvar = 5000, max_zvar = 5000, rsd.filt.thresh = 1, corthresh = 0.5,

keepX = 500, keepY = 500, keepZ = 500, pairedanalysis = FALSE, optselect =

FALSE, rawPthresh = 0.1, numcomps = 10, net_edge_colors = c("blue","red"),

net_node_colors = c("gold", "green", "cyan"), net_node_shape =

c("rectangle","circle","triangle"), all.missing.thresh = 0.2, seednum = 100,

label.cex = 0.2, vertex.size = 6, graphclustering = TRUE, interactive = FALSE,

max_connections = 10000, centrality_method = "eigenvector", use.X.reference =

FALSE, removeRda = TRUE)

7. Continue the protocol with the Cytoscape workflow (Subheading 3.5.1) by opening the gml file created in the output directory (Multidata_Network_threshold0.59cytoscapeall.gml).
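Since the class labels of step 4 are taken from the column names of one matrix only, it can be worth confirming that the three omics layers describe the same samples in the same order before running xMWAS. The following is a minimal sketch on invented data. The helper make_toy and the sample names are hypothetical, and the matrices are assumed to hold variables in rows and samples in columns, as in step 3.

```r
# Toy matrices with variables in rows and samples in columns (invented data).
samples <- paste0(rep(c("C", "T1", "T2"), each = 3), "_", 1:3)
make_toy <- function(n_var, prefix) {
  m <- matrix(runif(n_var * length(samples)), nrow = n_var,
              dimnames = list(paste0(prefix, seq_len(n_var)), samples))
  as.data.frame(m)
}
set.seed(4)
transcripts <- make_toy(5, "Contig")
proteins    <- make_toy(4, "Prot")
metabolites <- make_toy(6, "Met")

# All layers must describe the same samples, in the same column order.
stopifnot(identical(colnames(transcripts), colnames(proteins)),
          identical(colnames(proteins), colnames(metabolites)))

# Class labels built as in step 4, then summarized per treatment.
classlabels <- cbind(SampleID = colnames(metabolites),
                     Class = rep(c("Control", "Treatment1", "Treatment2"),
                                 each = 3))
samples_per_class <- table(classlabels[, "Class"])
```

If the first check fails, the columns of the offending matrix can be reordered by sample name before building the dataset list.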

DIABLO

DIABLO stands for Data Integration Analysis for Biomarker discovery using a Latent component method for Omics studies. DIABLO models maximize the correlation between pairs of prespecified omics datasets to unravel similar functional relationships between those omics data [23]. As a practical approach, DIABLO allows the selection of relevant correlated and discriminatory biomarkers, using synthetic variables as well as multi-omics datasets [2]. DIABLO builds on Projection to Latent Structure models [19] and substantially extends both sparse PLS-DA [5] to multi-omics analyses and sparse generalized canonical correlation analysis [24] to a discriminant analysis framework. An example of a DIABLO network is shown in Fig. 5.

[Figure 5: network image. Node legend: Proteins, Transcripts, Metabolites. Edge legend: -1.00 to 1.00.]

Fig. 5 DIABLO network showing interactions between different omic levels. The edge legend represents the weight of the links between proteins, transcripts, and metabolites. The network representation was drawn with Cytoscape

Workflow

1. Install all necessary packages.

install.packages("mixOmics")

install.packages("snow")

install.packages("RColorBrewer")

2. Load all necessary packages:

library(mixOmics)

library(snow)

library(igraph)

3. Import all datasets (in this case, three matrices) individually. They are named proteins for the protein matrix (Fig. 1a), metabolites for the metabolite matrix (Fig. 1b), and contigs for the transcript matrix (Fig. 1c) (see Note 11). In the Session tab of R, set the working directory to the folder containing all the dataset matrices. The csv files are the individual matrices of each dataset, saved from Excel as csv (see Fig. 1 and Supplementary dataset S1).

metabolites <- read.table("metabolites.csv", header = T, row.names=1, sep=";")

proteins <- read.table("proteins.csv", header = T, row.names=1, sep=";")

contigs <- read.table("contigs.csv", header = T, row.names=1, sep=";")

metabolites <- as.data.frame(t(metabolites))

proteins <- as.data.frame(t(proteins))

contigs <- as.data.frame(t(contigs))

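4. Check that all matrices have the same number of samples. The first number returned by lapply(datasetsdiablo,dim) must be the same for the three matrices. The matrices were already transposed in step 3 (samples in rows), so they are listed here without transposing them again.

datasetsdiablo <- list(met=as.matrix(metabolites),prot=as.matrix(proteins),mrna=as.matrix(contigs))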

lapply(datasetsdiablo,dim)

5. Create diablovectorY, a vector with one element per row (sample): for example, three treatments (C, T1, and T2) with three replicates each (each = 3) give a diablovectorY of length 9. Then set ncomp, which equals the number of levels (treatments) minus 1 (e.g., 3 - 1 = 2).

diablovectorY <- rep(c("C","T1","T2"),each=3)

ncomp = 2

6. Prepare the functions with which DIABLO chooses the optimal number of variates for each dataset:

(a) Design matrix: the design matrix determines which blocks (variates) should be connected to maximize the correlation or covariance between components. The values may range from 0 (no correlation) to 1 (correlation to maximize), in a symmetric matrix with a diagonal of 0. In the example, we choose a design where all the blocks are connected with a link of 1 (see Note 12). Check that design is a matrix of 1 (or the chosen value in each case) with a diagonal of 0.

design <- matrix(1,ncol=length(datasetsdiablo),nrow=length(datasetsdiablo),dimnames = list(names(datasetsdiablo),names(datasetsdiablo)))

diag(design) <- 0

design

(b) list.keepX: the tuning function is used to tune the keepX parameters of the block.splsda function. We choose the optimal number of variables to select in each dataset using the tune function over a grid of keepX values (see Note 13). First, set the range of values to test for each dataset with test.keepX (e.g., we have used 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30, 35, 40, 45, 50). Metabolites are usually highly correlated compared with the other matrices. If necessary, expand the number of variables that can be selected from this dataset (e.g., to 100 metabolites; see Note 14).

test.keepX <- list ("met" = c(5:9, seq(10, 18, 2), seq(20,50,5)),"prot" =

c(5:9, seq(10, 18, 2), seq(20,50,5)),"mrna" = c(5:9, seq(10, 18, 2),

seq(20,50,5)))

optimal.variables <- tune.block.splsda(X=datasetsdiablo,

Y=diablovectorY, ncomp=2, test.keepX = test.keepX, design =

design,nrepeat=1, cpus=1, folds=2)

This process is long (more than 3000 models are fitted for each component and each nrepeat), and a computer with more than one CPU is recommended. If your computer has six CPUs, change cpus = 1 to cpus = 6.

list.keepX <- optimal.variables$choice.keepX

list.keepX

You get a list of the variables selected for each dataset for the ncomp (e.g., $met [1] 50 [2] 10 $prot [1] 8 [2] 10 $mrna [1] 10 [2] 12). This means that the optimal variables for metabolites are in component [1] 50 metabolites, component [2] 10 metabolites; those for proteins are in component [1] 8 proteins, component [2] 10 proteins; and those for mRNA are in component [1] 10 transcripts, component [2] 12 transcripts. Alternatively, you can manually input those parameters (see Note 15).

7. Run DIABLO with the objects prepared previously (datasetsdiablo, diablovectorY, list.keepX, and design).

DIABLOanalysis <- block.splsda(X= datasetsdiablo, Y=diablovectorY,

ncomp=2, keepX=list.keepX, design=design)

8. Once DIABLO has run, build the corresponding network as follows. The cutoff is the parameter that restricts the degree of correlation (correlations range from -1 to 1). With a cutoff of 0.8, only interactions whose absolute correlation exceeds this threshold are kept, and the weaker ones are eliminated. Adapt the cutoff to each experiment (more than 0.7 is an acceptable correlation).

net <- network(DIABLOanalysis, cutoff =0.8, blocks=c(1,2,3),

row.names=FALSE,col.names =FALSE ,color.node =

c("pink","gold","blue"),shape.node = c("circle","rectangle","circle"))

9. Export the pdf of the network as follows:

pdf("DIABLOanalysis_network_0.8.pdf",10,10)

net<-network(DIABLOanalysis, cutoff =0.8, blocks=c(1,2,3),

row.names=FALSE,col.names =FALSE ,color.node =

c("pink","gold","blue"),shape.node = c("circle","rectangle","circle"))

dev.off()

10. Export the network in a format suitable for opening with Cytoscape.

write.graph(net$gR, file = "DiabloNet(0.8).gml", format = "gml")

11. Continue the protocol with the Cytoscape workflow (Subheading 3.5.1).
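The design matrix and the size of the tuning grid of step 6 can be checked in base R before starting the long tuning run. The sketch below does not call mixOmics. The block names follow those used in test.keepX, and the object names grid_values and n_combinations are introduced here only for illustration.

```r
blocks <- c("met", "prot", "mrna")

# Full design: every pair of blocks linked with weight 1, zero diagonal.
design <- matrix(1, nrow = length(blocks), ncol = length(blocks),
                 dimnames = list(blocks, blocks))
diag(design) <- 0
stopifnot(isSymmetric(design), all(diag(design) == 0))

# Candidate numbers of variables per block, as in step 6b.
grid_values <- c(5:9, seq(10, 18, 2), seq(20, 50, 5))
test.keepX <- setNames(rep(list(grid_values), length(blocks)), blocks)

# Every combination of one value per block is a candidate model.
n_combinations <- nrow(expand.grid(test.keepX))

# Number of components suggested in step 5: treatment levels minus one.
diablovectorY <- rep(c("C", "T1", "T2"), each = 3)
ncomp <- length(unique(diablovectorY)) - 1
```

With 17 candidate values per block, the grid holds 4913 combinations, which agrees with the remark that more than 3000 models are fitted and explains the long run time.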

3.3 Biological Interaction Network Enrichment

The functional interaction network is very useful for fully understanding biological phenomena [25]. For this purpose, different databases and programs help to compile and integrate all protein–protein interactions, including both direct (physical) and indirect (functional) relationships. In this section we develop two workflows using two of the most widely used tools (STRING and ShinyGO) to obtain networks for species included and not included in these databases. The initial input data can be the complete protein dataset or the differentially expressed proteins obtained in Subheading 3.1.

3.3.1 STRING

STRING is a Search Tool for Recurring Instances of Neighboring Genes. The STRING [9, 25–27] database compiles all public sources of information on protein–protein interactions. It is an online tool available at https://string-db.org/ that allows the creation of functional networks from protein identifiers or sequences. The latest version of STRING (11.0) covers 5090 organisms, of which 56 are plants (Embryophyta), including some tree species such as Populus, Eucalyptus, Prunus, Citrus, or Malus. An example of the networks produced by STRING is shown in Fig. 6.

Workflow

1. Go to the STRING website: https://string-db.org/

2. Enter your proteins (upload a file or paste them into the input box). The STRING website offers many input options (including one for protein families only); for multiple proteins use:

(a) Option Multiple proteins for a list of names/identifiers (Fig. 7a).

(b) Option Multiple sequences for a FASTA (or .txt) file of protein sequences (fewer than 2000 sequences). If the file is larger, split it into several files. For species not included among the organisms available in STRING, we recommend using sequences (Fig. 7b, Supplementary S2: Sequences in txt).

3. Select the Organism. For a plant species not included in the STRING databases, type plants in the Organism field and select Embryophyta.

4. STRING lists all the plant species included in its databases, ordered from the highest to the lowest number of proteins matching your sequences/identifiers/protein names (nr of proteins matching in STRING).

5. Select the species with the highest nr (number of proteins in the species that have BLAST bit scores higher than 60) (see Note 16).

Fig. 6 STRING-based interaction network represented in the web tool (a) and in Cytoscape (b). This network employed Vitis vinifera as reference organism and protein sequences as input (Supplementary S2). Edges represent confidence in the STRING web network, and the score in the Cytoscape-represented network


6. STRING returns a list of the proteins that your input matched; a single input protein may have more than one identification. Review all annotations according to the identity and bit score reported by STRING, and choose the best hit or the one that matches your own identification. By default, STRING takes the identification with the highest bit score.

7. Once the protein identifications in the database of the selected organism have been chosen, download the document (option MAPPING) to record which protein corresponds to each STRING identification in the organism database (Supplementary material S3). It will be used later in Cytoscape.

8. There are two possibilities for viewing the network: continue in the web tool, or go to Cytoscape and work with the STRING app available there.

(a) STRING web tool:

• Click the Continue button and STRING builds an interaction network for your proteins on the web.

• Improve the visualization of your network using the tabs below it. Here are some examples; for more, see [26]:

Fig. 7 Dataset input. (a) Multiple proteins as names/identifiers (e.g., E0CQ31, D7SKD8, D7THC1, Q6B4V4, A5AWT3, etc.). (b) Extract of a FASTA file with protein sequences (each record is a header line such as >protein000006.2 followed by its amino acid sequence)


– Meaning of the network edges: There are three display options (evidence, confidence, or molecular action), with evidence as default. Choose according to the level of proven interaction evidence you want to consider. For an explanation of the interaction types, see Note 17.

– Minimum required interaction score: Define a threshold for trusting biological interactions (STRING assigns 0.4 by default). With the option Custom value, you can raise it above 0.9 (highest confidence). Try to find a balance between biological significance and network complexity (see Note 18).

– Max number of interactors to show: This option lets you add intermediate interactors that are not present in your dataset but interact with your proteins and link them (see Note 19).

– Display simplifications: Activate the options “hide disconnected nodes in the network” and “disable structure previews inside network bubbles”. This removes disconnected nodes and the structure images shown inside each protein node.

• Export your network in the Export tab (bitmap, vector graphic, network coordinates, etc.), as well as all documents generated or used (protein sequences, XML summary, simple tabular text output, etc.).

(b) STRING Cytoscape App:

• Install the STRING App using Cytoscape’s App Manager.

• File > Import > Network > from Public Databases.

– Data Source: STRING: Protein query.

– Species: Select the organism previously selected on the STRING website (the species with the highest nr, for which the MAPPING file was downloaded).

– Enter the protein names or identifiers (the column “preferredName” of the STRING mapping file, Supplementary material S3, downloaded in step 7 of this workflow).

– Leave the default values for Confidence (score) cutoff (0.4) and Maximum number of interactors (0).

– Select Layout > Apply Preferred Layout to arrange the network.


• STRING creates the default network. The edge attributes include the overall confidence score (see Note 17). This network will depend on the study species. For an example of how to proceed according to the network obtained, see Note 20.

• See the workflow in Subheading 3.5.1 for information on improving the network visualization.
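Step 2b asks for FASTA files with fewer than 2000 sequences. The base R sketch below shows one way to split a larger file into chunks. The helper `split_fasta`, its arguments, and the toy sequences are hypothetical and are not part of STRING or of the supplementary material.

```r
# Split a FASTA file into chunks holding at most max_seqs sequences each
split_fasta <- function(path, max_seqs = 1999, prefix = "chunk") {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                 # drop empty lines
  seq_id <- cumsum(startsWith(lines, ">"))      # running sequence index per line
  chunk_id <- (seq_id - 1) %/% max_seqs + 1     # chunk each sequence belongs to
  out_files <- character(0)
  for (k in unique(chunk_id)) {
    out <- sprintf("%s_%02d.fasta", prefix, k)
    writeLines(lines[chunk_id == k], out)
    out_files <- c(out_files, out)
  }
  out_files
}

# Toy example: five short made-up sequences, at most two per output file
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">p1", "MAKV", ">p2", "MWFS", "LFVL",
             ">p3", "MFLV", ">p4", "MSNS", ">p5", "MAIT"), tmp)
files <- split_fasta(tmp, max_seqs = 2,
                     prefix = file.path(tempdir(), "string_input"))
print(files)
```

With the default `max_seqs = 1999`, every output file stays below the 2000-sequence limit mentioned in step 2b; each chunk is then submitted separately.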

3.3.2 ShinyGO

ShinyGO [28] is an intuitive, graphical web platform covering more than 200 plant and animal species, annotated from GO and other databases. It allows graphical visualization of enrichment results and provides access, through the STRING application programming interface (API), to protein–protein interaction networks [28]. An example of ShinyGO networks is shown in Fig. 8.

Workflow

1. Go to the ShinyGO website: http://bioinformatics.sdstate.edu/go/.

2. Select the Organism. Default = Best matching species. Alternatively, use the option Select or search for your species.

3. Enter your proteins. Provide your protein list, i.e., the list of genes that encode those proteins (ACC1, LHY, HSA32, FTSH6, CHS, etc.), in the option Paste genes:

(a) All protein dataset (list with all genes that encode all proteins).

Fig. 8 ShinyGO networks using the website application. (a) Protein network with enrichment of GO Biological Process (all proteins dataset) and (b) response-to-stress protein network with enrichment of a selected GO category (GO category: response to stress; only proteins in this category were entered)


(b) Differentially expressed proteins (e.g., upregulated or downregulated list of genes that encode those proteins).

(c) Selected proteins in a GO biological process or KEGG pathway (in the Groups tab you can see the proteins included in these classifications once they have been uploaded to ShinyGO).

4. Select the category you wish to analyze (GO Biological Process, KEGG pathways, GO Molecular Function, or others). You can change this option at any time in the left-hand panel, and the network shown in the right-hand panel will change accordingly.

5. Set a P-value cutoff (FDR). Default = 0.05 (see Note 21).

6. Set the number of most significant terms to show. Default = 30 (see Note 22).

7. Click Submit.

8. Click on the Network tab to visualize the enriched terms as a network. Click Change layout to change the network view. You can also change the organism, since ShinyGO shows all matched species (with the number of matched genes); for example, you may choose a species with fewer matches that is phylogenetically closer. Repeat the steps with this new species.

9. Click on the download button to save your interaction networks to a local file (see Note 23).
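To illustrate what the FDR cutoff and the number of terms in steps 5 and 6 control, the sketch below runs a generic enrichment calculation in base R: a one-sided hypergeometric test per term followed by Benjamini-Hochberg adjustment. All counts are invented, and the procedure is an assumption chosen for illustration; it is not taken from the ShinyGO source code.

```r
# Hypothetical enrichment counts for four GO terms
terms <- data.frame(
  term      = c("response to stress", "photosynthesis", "translation", "transport"),
  in_list   = c(30, 12, 8, 5),        # query proteins annotated to the term
  in_genome = c(400, 150, 300, 250),  # background proteins annotated to the term
  stringsAsFactors = FALSE
)
n_list   <- 100    # size of the query list
n_genome <- 5000   # size of the background

# One-sided hypergeometric test: probability of seeing in_list or more hits
terms$p_value <- phyper(terms$in_list - 1, terms$in_genome,
                        n_genome - terms$in_genome, n_list,
                        lower.tail = FALSE)

# Benjamini-Hochberg adjustment, then the two user-defined filters
terms$fdr <- p.adjust(terms$p_value, method = "BH")
enriched <- terms[terms$fdr < 0.05, ]
enriched <- head(enriched[order(enriched$fdr), ], 30)
print(enriched)
```

Lowering the FDR cutoff or the number of terms shown reduces the number of nodes in the enrichment network, at the price of hiding weaker signals.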

3.4 Merged Functional and Statistical Interaction Networks

Once the experimental (statistical) and biological networks have been built separately, they can be merged to gain a deeper understanding of the system. With this approach, already known biological information can be visualized in a single graph and expanded with our experimental results. This depiction can also be used to check and validate the reliability of the experimental network. The networks can be merged either with the Cytoscape built-in function or manually. An example of merged networks is shown in Fig. 9.

Create a STRING Network for Merged Networks

1. Go to the STRING web page (version 11.0) and select “Proteins with Values/Ranks”.

2. Enter a list containing the experimental proteins and one numeric value each, such as the fold change calculated in Subheading 3.1.1. The protein identifiers in the matrix must be compatible with the STRING database (see Subheading 3.3.1).

3. Select the organism you are working with.

4. Click Search.

5. Download the mapping document and continue.

6. Download the depicted network.


7. Network visualization can be improved with Cytoscape or Gephi (see specific workflows in Subheading 3.5).

3.4.1 Cytoscape Merged Functional and Statistical Interaction Network

Workflow

1. Open the previously built networks in the same Cytoscape session. Example: STRING network and sPLS network.

2. Cytoscape > Tools > Merge > Networks and select the networks to combine.

3. Choose between union, intersection, and difference.

4. Check the option Enable merge nodes/edges in the same network.

5. Select the matching columns.

6. See the Cytoscape workflow (Subheading 3.5.1) to improve the network visualization.
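The three merge operations in step 3 correspond to ordinary set operations on the edges of the two networks. The base R sketch below makes this concrete for two hypothetical undirected edge lists. The node names and the helper `edge_key` are illustrative assumptions, and the code does not reproduce how Cytoscape matches nodes through the selected columns.

```r
# Two hypothetical undirected edge lists sharing node identifiers
string_edges <- data.frame(Source = c("A", "B", "C"),
                           Target = c("B", "C", "D"),
                           stringsAsFactors = FALSE)
spls_edges   <- data.frame(Source = c("B", "C", "E"),
                           Target = c("A", "D", "A"),
                           stringsAsFactors = FALSE)

# Order-independent key so that A-B and B-A count as the same undirected edge
edge_key <- function(df) {
  paste(pmin(df$Source, df$Target), pmax(df$Source, df$Target), sep = "|")
}

k1 <- edge_key(string_edges)
k2 <- edge_key(spls_edges)

edges_union        <- union(k1, k2)      # edges present in either network
edges_intersection <- intersect(k1, k2)  # edges present in both networks
edges_difference   <- setdiff(k1, k2)    # edges present only in the first network

print(edges_union)
print(edges_intersection)
print(edges_difference)
```

The intersection is the most conservative choice (edges supported by both sources), whereas the union keeps everything and relies on later filtering.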


Fig. 9 Merged network combining two different mathematical methods (sPLS, ARTIVA) and the STRING network. sPLS edges are shown in blue, ARTIVA in red, and STRING in green. The network representation was drawn with Cytoscape


3.4.2 Manually Merged Functional and Statistical Interaction Network

Another, more flexible option is to paste both network tables together, simply concatenating the data. The only requirement for the subsequent visualization to work is that node names are the same in both networks. In this way, we will be able to create a consensus network from different statistical analyses, such as those mentioned above, or from complementary annotation tools. This network table will be compatible with both visualization tools described below. An example of the resulting network is shown in Fig. 9.

Workflow

1. Check that all nodes have consistent identifiers across the different networks.

2. Paste the network tables into the spreadsheet, keeping track of the source of each one. Please note that if directed (sPLS) and undirected (STRING) edges are mixed together, the whole resulting network should be treated as undirected.

3. Add a column “Method” to distinguish the provenance of each type of relation (sPLS, STRING, ARTIVA, ...).

4. Save the table (Table 1) in xlsx format to open it with a network visualization tool (Subheading 3.5.1).
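The manual merge can also be scripted. The sketch below, in base R, concatenates three hypothetical edge tables and adds the “Method” column described in step 3. The contig names, weights, and output file name are invented for illustration, and the table is written as .csv (one of the table formats accepted by the visualization tools) because base R has no xlsx writer.

```r
# Hypothetical edge tables from three methods, already using the same node identifiers
spls   <- data.frame(Source = c("Contig_1", "Contig_2"),
                     Target = c("Contig_3", "Contig_3"),
                     Weight = c(0.71, -0.76), stringsAsFactors = FALSE)
string <- data.frame(Source = "Contig_4", Target = "Contig_3",
                     Weight = 0.74, stringsAsFactors = FALSE)
artiva <- data.frame(Source = "Contig_5", Target = "Contig_3",
                     Weight = -0.71, stringsAsFactors = FALSE)

# Tag each table with its provenance and concatenate the rows
tables <- list(sPLS = spls, STRING = string, ARTIVA = artiva)
merged <- do.call(rbind, lapply(names(tables), function(m) {
  tab <- tables[[m]]
  tab$Method <- m
  tab
}))
rownames(merged) <- NULL

# Save the merged table for Cytoscape or Gephi
out_file <- file.path(tempdir(), "merged_network.csv")
write.csv(merged, out_file, row.names = FALSE)
print(merged)
```

Keeping the provenance in its own column makes it possible to color edges by method later, as in Fig. 9.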

3.5 Network Visualization Tools

Two workflows are presented for the two most widely used, freely accessible platforms for network visualization: Cytoscape and Gephi. Networks can be imported into these programs in a variety of formats; the two most widely used are a table (.csv, .xlsx) and a gml file.

Table 1 Example of input table combining two different mathematical methods (sPLS, ARTIVA) and the STRING network

Source            Target            Weight/CoeffMean   Relation  Method
Contig_16059_6_5  Contig_00060_5_2  0.705818839393222  Positive  sPLS
Contig_59320_4_6  Contig_00060_5_2  0.763347400356101  Positive  sPLS
Contig_01307_5_8  Contig_00109_4_9  0.763094905388721  Positive  sPLS
Contig_03989_5_4  Contig_00109_4_9  0.74432961834302   Positive  STRING
Contig_04066_5_5  Contig_00109_4_9  0.72448590539238   Negative  STRING
Contig_04206_5_1  Contig_00109_4_9  0.867245688624438  Positive  STRING
Contig_04418_4_3  Contig_00109_4_9  0.777849239004387  Positive  STRING
Contig_04940_5_3  Contig_00109_4_9  0.757539908309518  Positive  ARTIVA
Contig_05478_5_9  Contig_00109_4_9  0.712740724773293  Negative  ARTIVA
Contig_06439_5_1  Contig_00109_4_9  0.725660528940782  Positive  ARTIVA


3.5.1 Cytoscape

Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework [29]. Some examples of resulting networks are shown in Figs. 2, 3, 4, 5, 6b, and 9.

Workflow

1. Go to the Cytoscape website (https://cytoscape.org/) and download the program (v. 3.7.1 or later).

2. Open Cytoscape.

3. Import a network from a file, for example one of those generated in previous sections (Subheadings 3.2.1, 3.2.2.1, 3.2.2.2, 3.2.2.3, 3.3.1, or 3.4.2):

File > Import > Network from File > select the desired file, or open the desired network from a file (see Note 24).

4. Arrange the network by applying a layout. In the Layout menu, select Apply Preferred Layout to switch the network visualization to a prefuse force-directed layout (width is used by default). This layout is a good basis for visualizing your data, but Cytoscape offers other layouts that may fit your data better (grid, hierarchical, circular, ...), and the prefuse force-directed layout can also be based on other attributes such as width, weight, or labelcex. The network display is obtained with this option.

5. The following optional improvements can help with the comprehension and visualization of the network:

(a) Graphically analyze the network (the network contains only undirected edges):

Tools > NetworkAnalyzer > Network Analysis > Analyze Network.

• In the Results Panel, click Visualize Parameters to open the window in which they can be changed.

• Map node size and Map node color: For continuous mappings such as size, a variable such as Radiality can be selected. Click on Map node size/color and select the attribute Radiality. Node size or color will then reflect the relevance of each node in the network (the more connections, the larger the size or the stronger the color). For Map edge size or color, select the attribute Weight. Edges will then be drawn according to their interaction weight in the network (from −1 to 1; by default Cytoscape uses red for negative values and blue for positive values).

• Apply and close the tab.

(b) Apply the desired style to nodes (color, shape, width, label) in the Control Panel by selecting the Style tab and switching to the Node tab at the bottom.


• Change the name of your nodes: To make the network easier to understand, it is advisable to replace the node names with the names of the protein identifications. In the Label property, change the selected column (the “name” column by default) to the column in your table that holds the identification. If your table does not already have such a column, you can add it as follows:

– Export the names of your nodes to include them in a new document.

File > Export > Table to File > select “default node” in .csv format.

– Create a new document with two columns (Supplementary S4): the first column with the node names exported previously (they must match the Cytoscape table) and the second column with the corresponding identification names (e.g., name: contig542; identification: Aquaporin PIP2). Save the document in .csv or .xlsx format.

– Import the table into your Cytoscape network:

File > Import > Table from File > select the new two-column document.

– In Label (Control Panel, Style, Node tab), change to the column with the identification.

(c) Apply the desired style to edges (color, arrow, width) in the Control Panel by selecting the Edge tab.

• Edge legend: “Stroke Color” shows the range of colors given to the edges, as well as the maximum and minimum values (weight). Use it to create a legend for your network.

(d) Increase the cutoff to reduce the size of your network and keep only the strongest connections:

• In the Control Panel, open the Select tab, click on the + and select “Column Filter”.

• Select “Edge: weight”. The range of your data is now shown, and you can select a narrower one. Example:

Select Edge: weight, “is not”, and between (−0.97 and 0.97), then click Apply. This selects the edges outside that range; then:

File > New Network > From Selected Nodes, Selected Edges.

• The new network contains only the nodes connected by edges outside that range (all connections between −0.97 and 0.97 were eliminated), and it will appear in the Control Panel in the Network tab.

(e) All nodes can be moved and placed in the most understandable position. Groups of interconnected nodes can also be selected to move them together or to extract them from the network. For further details on network depiction and customization, see Note 25.

6. Export the network as a PDF: File > Export > Network to Image > Export file format: PDF.
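The two-column table used in step 5b to relabel nodes can be prepared in R instead of by hand. The sketch below is a minimal example with a hypothetical edge table and a hypothetical lookup vector `ids`. The column called `name` is the one assumed to match the Cytoscape node table, and “Heat shock protein 70” is an invented annotation.

```r
# Hypothetical edge table and identification lookup (names = node names)
edges <- data.frame(Source = c("contig542", "contig17"),
                    Target = c("contig17", "contig903"),
                    stringsAsFactors = FALSE)
ids <- c(contig542 = "Aquaporin PIP2", contig17 = "Heat shock protein 70")

# One row per node; nodes without an identification keep their original name
nodes <- data.frame(name = unique(c(edges$Source, edges$Target)),
                    stringsAsFactors = FALSE)
nodes$identification <- ifelse(nodes$name %in% names(ids),
                               ids[nodes$name], nodes$name)

# Write the table to be imported as node attributes
node_file <- file.path(tempdir(), "node_identifications.csv")
write.csv(nodes, node_file, row.names = FALSE)
print(nodes)
```

Falling back to the original name for unannotated nodes avoids empty labels once the Label mapping is switched to the identification column.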

3.5.2 Gephi

Gephi is open source software that allows you to visualize, manipulate, and explore all kinds of graphs and networks for all types of data (a general-purpose platform), whereas Cytoscape is commonly used in the biology domain, with specific apps and plugins. Gephi has some advantages over Cytoscape, such as good presets and an integrated statistical analysis module. Furthermore, Gephi is recommended for visualizing large networks (up to 100,000 nodes). An example of a network created with Gephi is shown in Fig. 10.

Workflow

1. Go to the Gephi website (https://gephi.org/) and download the program (v. 0.9.2 or later).

2. Open Gephi.

3. Select New Project.

4. Open your network:

(a) From a spreadsheet (e.g., from sPLS analysis): File > Import Spreadsheet > select the desired file.

(b) From a .gml file: File > Open > select the desired file.

5. Follow the program instructions and import your table as an Edges table. You will find the imported data in Window > Context > Data Table, where you can edit your input data as well as add new node or link information such as annotations or labels.

6. Create the first view of your raw network with Window > Graph.

7. Arrange the network with Window > Layout > ForceAtlas 2 > Run. This layout is one of the most widely used and is a good starting point for understanding the specific topology of your network.

8. Return to the Graph window and, once your network has spread out, go to the Layout panel and stop the ForceAtlas algorithm. To configure the spread rate of the layout, see Note 26.

9. Graphical analysis can be done with Window > Statistics.

10. Apply the desired style to the network in Window > Appearance.


(a) Node color: Change the color of the nodes in the Appearance > Nodes menu. Click on the palette icon and choose among “Unique” for all nodes in the same color, “Partition” to color clusters of nodes (according to, for example, a statistical network parameter such as degree or clustering coefficient, chosen in the “Choose an attribute” drop-down), or “Ranking” for a continuous color scale. Click Apply to save the changes and see the result.

(b) Node size: Change node size with the next icon to the right; the possibilities are a unique value for all nodes or a Ranking, as a continuous sizing scale. Click Apply to save the changes and see the result.

(c) Edge color: Change edge color in the Appearance > Edges menu, following the instructions given for node color in step 10a.


Fig. 10 Protein–protein interaction network depicted using Gephi. Node colors indicate the different subnetwork groups based on the cluster betweenness graphical analysis. Arrows between nodes and labels are colored as the parental node


11. Customize other visual parameters in the preview menu: Window > Preview Settings.

(a) Add node names to the plot by selecting Show Labels in the Node Labels tab, and change the label color (e.g., to the same color as the node by choosing “parent” in the Node Labels > Color drop-down). Please note that the labels added are those in the “Label” column of the data table. Therefore, it may be necessary to add them in the Data Table tab.

(b) Change the size of the edges or edge arrows.

(c) Click Refresh to update the graph and preview it.

12. For further details, see Note 27.

13. Export the network as a PDF file: File > Export > SVG/PDF/PNG file.
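Because the labels shown in step 11a come from the “Label” column of the data table, it can be convenient to prepare an edge table and a node table before import. The base R sketch below assumes the column names commonly recognized by Gephi's spreadsheet importer (Source, Target, Weight, and Type for edges; Id and Label for nodes). The protein names, weights, and file names are hypothetical.

```r
# Hypothetical edge list with the column names assumed for Gephi's edge table
edges <- data.frame(Source = c("Protein 1", "Protein 2", "Protein 1"),
                    Target = c("Protein 2", "Protein 3", "Protein 3"),
                    Weight = c(0.91, 0.84, 0.78),
                    Type   = "Undirected",
                    stringsAsFactors = FALSE)

# Node table: Id must match the names used in the edge table,
# and Label holds the text to be displayed in the preview
nodes <- data.frame(Id = sort(unique(c(edges$Source, edges$Target))),
                    stringsAsFactors = FALSE)
nodes$Label <- nodes$Id

edge_file <- file.path(tempdir(), "gephi_edges.csv")
node_file <- file.path(tempdir(), "gephi_nodes.csv")
write.csv(edges, edge_file, row.names = FALSE)
write.csv(nodes, node_file, row.names = FALSE)
print(nodes)
```

The Label column can later be overwritten with protein identifications without touching the Id column that links nodes to edges.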

3.6 Future Perspectives

The evolution of next-generation sequencing and high-throughput technologies has created new possibilities and opportunities in data analysis. Although recent advances in regulatory networks have focused on transcriptomics (RNA-seq, microarrays), most of them can be straightforwardly extended to proteomics research as long as one assumption is satisfied: regulator abundance must influence target abundance. The study of gene regulatory networks will allow the proteomics community to anticipate future approaches in protein network construction.

During the last few years, a large number of algorithms have been implemented and, as this chapter has shown, each method is biased toward a certain type of biological interaction. To overcome this problem, a “wisdom of crowds” approach is needed. This concept refers to the phenomenon in which the collective knowledge of a community is greater than the knowledge of any individual; in other words, aggregating several networks into a metanetwork significantly improves network accuracy [30]. A metanetwork community approach and user-friendly software like Seidr [31] need to be established as a priority in the near future of protein network construction.
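One simple way to aggregate several networks is to rank the candidate edges within each method and average the ranks. The base R sketch below illustrates this idea with invented scores. Average-rank aggregation is used here only as an assumed example of a community approach and is not a description of the algorithm implemented in Seidr.

```r
# Hypothetical edge scores from three inference methods (higher = more confident)
scores <- data.frame(
  edge    = c("A-B", "A-C", "B-C", "C-D"),
  method1 = c(0.9, 0.2, 0.6, 0.1),
  method2 = c(0.8, 0.7, 0.3, 0.2),
  method3 = c(0.4, 0.9, 0.5, 0.1),
  stringsAsFactors = FALSE
)

# Rank the edges within each method (1 = best), then average the ranks across methods
ranks <- sapply(scores[, -1], function(s) rank(-s, ties.method = "average"))
scores$mean_rank <- rowMeans(ranks)

# Consensus list: edges ordered from the best to the worst mean rank
consensus <- scores[order(scores$mean_rank), c("edge", "mean_rank")]
print(consensus)
```

Working with ranks rather than raw scores makes methods with different score scales (correlations, posterior probabilities, confidence scores) comparable.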

Another interesting topic is comparing networks between different groups/conditions. Differential network analysis can elucidate the different roles proteins play and constitutes a new way to acquire a deeper knowledge of protein function. Briefly, it searches for group-specific connections to identify differentially connected modules, which may reflect key pathways or protein complexes involved. This method is particularly suitable for cases in which variations are caused by rewiring in the network [32].
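As a minimal illustration of the idea, the sketch below compares correlation-based connections between two conditions in base R and lists the protein pairs whose correlation changes most. The abundance values, the threshold of 1, and the use of plain Pearson correlation differences are assumptions made for this example; dedicated differential network methods are more elaborate.

```r
# Hypothetical abundances (rows: samples, columns: proteins) in two conditions
control <- cbind(P1 = c(1, 2, 3, 4, 5),
                 P2 = c(2, 4, 6, 8, 10),
                 P3 = c(5, 3, 4, 1, 2))
stress  <- cbind(P1 = c(1, 2, 3, 4, 5),
                 P2 = c(10, 8, 6, 4, 2),
                 P3 = c(5, 3, 4, 1, 2))

# Change in pairwise correlation between the two conditions
diff_cor <- cor(stress) - cor(control)

# Report each pair once (upper triangle) when the change is large
idx <- which(upper.tri(diff_cor) & abs(diff_cor) >= 1, arr.ind = TRUE)
rewired <- data.frame(Source = rownames(diff_cor)[idx[, "row"]],
                      Target = colnames(diff_cor)[idx[, "col"]],
                      Delta  = diff_cor[idx],
                      stringsAsFactors = FALSE)
print(rewired)
```

In this toy dataset only the connections involving P2 change between conditions, which is the kind of rewiring that differential network analysis aims to detect.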

Understanding the advantages and limitations of current and future network construction methods is critical to address one of the most challenging tasks in molecular and computational biology, which is genome-scale inference of protein interaction networks from exclusively protein abundance datasets.


4 Notes

1. If you have problems installing the packages from the console, download them from the BiocManager (https://cran.r-project.org/web/packages/BiocManager/index.html) and edgeR (https://bioconductor.org/packages/release/bioc/html/edgeR.html) web pages.

2. For more examples see https://www.rdocumentation.org/packages/edgeR/versions/3.10.5/topics/DGEList-class

3. Adapt to more treatments following this example. For four treatments (C, treatment T1, treatment T2, and treatment T3):

design <- matrix(c(c(1,1,1,0,0,0,0,0,0,0,0,0),
                   c(0,0,0,1,1,1,0,0,0,0,0,0),
                   c(0,0,0,0,0,0,1,1,1,0,0,0),
                   c(0,0,0,0,0,0,0,0,0,1,1,1)),
                 ncol = 4,
                 dimnames = list(c('C.1','C.2','C.3','T1.1','T1.2','T1.3',
                                   'T2.1','T2.2','T2.3','T3.1','T3.2','T3.3'),
                                 c('C.','T1.','T2.','T3.')))

4. Another R function suited to this purpose:

# Copy the protein matrix (Fig. 1A) to the clipboard beforehand
# ("clipboard" is available on Windows; on macOS use pipe("pbpaste") instead)
ART_data <- read.delim("clipboard", header = TRUE, row.names = 1)

5. This is important in order to maintain sensitivity and the detection of true positives. For further details see Lebre et al. [17]. It is advisable to select regulators by manually reviewing the protein list before running the workflow.

6. Another possibility is to run ARTIVA with Regulators_1 vs. Regulators_1 to infer dynamic interactions between regulators:

all_DBNsub <- list()
for (i in seq_along(Regulators_1)) {
  Targets <- Regulators_1[i]      # one regulator
  Regulators <- Regulators_1[-i]  # the rest of the regulators
  DBNsub <- ARTIVAsubnet(
    targetData = as.vector(as.matrix(ART_data[Targets, ])),
    parentData = as.matrix(ART_data[Regulators, ]),
    targetName = rownames(ART_data[Targets, ]),
    parentNames = rownames(ART_data[Regulators, ]),
    niter = 50000,
    dataDescription = rep(1:6, each = 3),
    nbCPinit = 1, maxCP = 5, segMinLength = 1,
    savePictures = FALSE, saveIterations = FALSE, saveEstimations = FALSE,
    dyn = 0, edgesThreshold = 0.6)
  # Save network information in the list
  all_DBNsub[[i]] <- DBNsub$network
}
# Bind the rows of all data in the list
all_DBNsub_df <- Reduce(rbind, all_DBNsub)
# Filter the output
all_DBNsub_df <- all_DBNsub_df[which(all_DBNsub_df$PostProb > 0.6), ]

7. Output format: Parent = source of the interaction; Target = destination of the interaction; CPstart = time point when the interaction begins; CPend = time point when the interaction ends; CoeffMean = sign and strength of the interaction.

8. The entered parameters are: niter = number of iterations in the Reversible-Jump Markov chain Monte Carlo (RJ-MCMC) sampling, which is needed to determine an approximation of each time-varying network; dataDescription = number of replicates for each sampling point; nbCPinit = the initial number of changepoints to be considered; maxCP = maximum number of changepoints to be considered; segMinLength = minimum number of time points that constitute a changepoint; savePictures, saveEstimations, and saveIterations = algorithm execution reports; dyn = time delay used to search for relationships between the abundance of targets and regulators. If dyn = 0, the algorithm searches for relationships between the abundance of targets and regulators at the same time point (t). If dyn = 1, the algorithm searches for relationships between the abundance of targets at (t) and the abundance of regulators at (t-1).
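The effect of the dyn argument can be pictured as a change in how target and regulator values are paired before any relationship is sought. The short Python sketch below only illustrates that pairing; it is not ARTIVA code, and pair_with_lag and the abundance values are hypothetical.

```python
def pair_with_lag(target, regulator, dyn=0):
    """Pair target abundance at time t with regulator abundance at time t - dyn."""
    if len(target) != len(regulator):
        raise ValueError("both series need the same number of time points")
    if dyn < 0 or dyn >= len(target):
        raise ValueError("dyn must be between 0 and the number of time points - 1")
    return [(target[t], regulator[t - dyn]) for t in range(dyn, len(target))]


# Made-up mean abundances over four sampling points
target = [1.0, 1.5, 2.5, 4.0]
regulator = [0.5, 1.0, 2.0, 3.5]

print(pair_with_lag(target, regulator, dyn=0))  # same time point (t, t)
print(pair_with_lag(target, regulator, dyn=1))  # target at t, regulator at t-1
```

With a delay of one time point, the first target value has no matching regulator value and one pair is lost, which matters for short time series.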

9. Use the argument ind.names = TRUE to see the name of each replicate in the plot. Example:

plotIndiv(spls.analysis, comp=1:2, cex=4, col=pal, rep.space="XY-variate", ind.names=TRUE, Y.label="comp 2", X.label="comp 1")

10. For further details of the arguments please see https://github.com/kuppal2/xMWAS/blob/master/example_manual_tutorial/xMWAS-manual.pdf.

11. For further details see MixOmics [33].

12. A compromise needs to be achieved between maximizing the correlation between blocks and discriminating the outcome; for this purpose, the weight between blocks in the design matrix can be set to <1. We recommend decreasing the value gradually (0.9–0.1) and choosing the highest value that yields an acceptable number of variables chosen by DIABLO (in test.keepX) for each data set.

13. Note that the function has been set to favor a smallish signature while still yielding a sufficient number of variables for downstream validation/interpretation. See Singh et al. [34] for further information.

14. An example of manual entry of parameters:

test.keepX <- list("met"  = c(5:9, seq(10, 18, 2), seq(20, 50, 5)),
                   "prot" = c(5:9, seq(10, 18, 2), seq(20, 50, 5)),
                   "mrna" = c(5:9, seq(10, 18, 2), seq(20, 50, 5)))

15. To include a list.keepX for our example (3 datasets with 2 components):

list.keepX <- list(met = c(50,10), prot = c(8,10), mrna = c(10,12))

16. If there are several species with a good number of nr, build the STRING network for each species and choose the one that best suits your experiment.

17. For each interaction, the edge attributes include the overall confidence score and the subscores from seven individual evidence channels. These channels are as follows:

(a) The experiments channel: Evidence comes from actual experiments in the lab.

(b) The database channel: Evidence that has been asserted by a human expert curator.

(c) The textmining channel: Pairs of proteins are given an association score when they are frequently mentioned together in the same paper, abstract, or even sentence.

(d) The coexpression channel: Pairs of proteins that are consistently similar in their expression patterns, under a variety of conditions, will receive a high association score.

(e) The neighborhood channel: Genes are given an association score where they are consistently observed in each other's genome neighborhood (such as in the case of conserved, cotranscribed "operons").

(f) The fusion channel: Pairs of proteins are given an association score when there is at least one organism where their respective orthologs have fused into a single, protein-coding gene.

(g) The co-occurrence channel: STRING evaluates the phylogenetic distribution of orthologs of all proteins in a given organism.

Accordingly, three display options are available:

Evidence: includes known interactions (from curated databases and experimentally determined), predicted interactions (gene neighborhood, gene fusions, and co-occurrence), and others (textmining, coexpression, and protein homology).

Confidence: the overall confidence score obtained from the seven channels.

Molecular action: activation, binding, phenotype, reaction, inhibition, catalysis, posttranslational modification, and transcriptional regulation, in three classes (positive, negative, or unspecified).

18. Higher confidence values will produce highly significant networks, but at the cost of losing real interactions, which reduces the biological meaning of the network. A balance should be determined by the researcher considering organisms, experimental setups, and network topologies.
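The trade-off described in this note can be explored by counting how many edges and connected nodes survive at different cutoffs. The Python sketch below uses a made-up edge list; filter_edges, the protein names, and the scores are hypothetical and do not come from STRING output.

```python
def filter_edges(edges, cutoff):
    """Keep edges with a combined score >= cutoff; also return the nodes they connect."""
    kept = [(a, b, score) for a, b, score in edges if score >= cutoff]
    nodes = {node for a, b, _ in kept for node in (a, b)}
    return kept, nodes


# Hypothetical edge list: (protein A, protein B, combined confidence score)
edges = [
    ("P1", "P2", 0.95),
    ("P2", "P3", 0.72),
    ("P3", "P4", 0.45),
    ("P1", "P4", 0.81),
]

for cutoff in (0.4, 0.7, 0.9):
    kept, nodes = filter_edges(edges, cutoff)
    print(f"cutoff {cutoff}: {len(kept)} edges, {len(nodes)} connected nodes")
```

Plotting such counts against the cutoff is one simple way to see where the network starts to fragment.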

19. Use this option if your network has few connections, or many unconnected nodes, or you want to discover potential candidate molecules not present in your samples.

20. An example of how to proceed according to the network obtained:

(a) If your network has few connections, or many unconnected nodes, include interactors. Interactors are proteins not included in your data, but they will act as connection nodes to your proteins. Select the whole network and, in the STRING tab, click on Expand network, where you can increase the number of interactors.

(b) If you have a very dense network, it is possible to apply new filters for its simplification:

• Increase the cutoff of your network by adding a default filter (column filter, Edge:score) in the control panel. For example, with Edge:score between 0.8 and 0.999, STRING will select the edges within that range.

• In the tab Select, select the nodes connected by the selected edges.

• Create a new network in File: New Network from selected nodes, selected edges.

• Make different networks until you get the one that most closely explains the experiment.

21. An FDR-adjusted p-value of 0.05 implies that about 5% of the tests declared significant are expected to be false positives. If the FDR threshold is increased, the probability of finding false positives will be higher; if, on the other hand, it is decreased, that probability will be lower.
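One widely used way of obtaining FDR-adjusted p-values is the Benjamini-Hochberg procedure. Whether a given enrichment tool uses exactly this method should be checked in its documentation; the Python sketch below is only an illustration with made-up p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.001, 0.008, 0.039, 0.041, 0.27]  # made-up raw p-values
adj = benjamini_hochberg(pvals)
significant = [p for p, q in zip(pvals, adj) if q <= 0.05]
print(adj)
print(significant)
```

In this toy case two raw p-values below 0.05 (0.039 and 0.041) are no longer significant after adjustment, which shows how the correction reduces the number of false positives.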

22. If many results are generated, a subset with the most significant terms can be selected. If, for example, we select 30, ShinyGO will show the 30 most significant terms (even if there are more than 50 generated).

23. It is also possible to download the resulting table from the network using the STRING API tab and use the Cytoscape workflow for improvement.

24. The visualization of the network may not be clear until some visual criteria are applied. Continue the protocol to improve this visualization.

25. Please see Cytoscape manual in: manual.cytoscape.org/.

26. To configure the spread rate:

(a) If the depicted network is too spread out, increase gravity to 1.5 (default 1) in Layout sheet > Tuning > Gravity and repeat steps 7 and 8; if your network is too contracted, decrease the gravity value.

(b) In this menu, it is advisable to mark "prevent Overlap" in the Behavior alternatives drop-down.

27. See Gephi manuals, available at gephi.org/users/.

Acknowledgments

This work was supported by the projects financed by the Spanish Ministry of Economy and Competitiveness (AGL2016-77633-P and AGL2017-83988-R). The Spanish Ministry of Economy and Competitiveness supported M.E., L.V., and M.M. by Juan de la Cierva (FJCI-2017-31613) and Ramon y Cajal Programs (RYC-2015-17871 and RYC-2014-14981), respectively. L.L. and V.R. were supported by fellowships from the FPI (BES-2017-082092) and FPU (FPU18/02953) (Ministry of Science, Innovation and Universities, Spain), respectively. V.S. was supported by the Youth Employment Operational Program (EJI-17-AGR-164), cofinanced by the Regional Government of Andalusia and the European Social Fund (ESF).

References

1. Valledor L, Jorrín J (2011) Back to the basics: maximizing the information obtained by quantitative two dimensional gel electrophoresis analyses by an appropriate experimental design and statistical analyses. J Proteome 74:1–18

2. Singh A, Gautier B, Shannon CP et al (2016) DIABLO – an integrative, multi-omics, multivariate method for multi-group classification. bioRxiv. https://doi.org/10.1101/067611

3. Scholz M, Selbig J (2007) Visualization and analysis of molecular data. In: Weckwerth W (ed) Metabolomics methods protocol. Humana Press, Totowa, NJ, pp 87–104

4. Steuer R, Morgenthal K, Weckwerth W et al (2007) A gentle guide to the analysis of metabolomic data. Methods Mol Biol 358:105–126

5. Le Cao K-A, Boitard S, Besse P (2011) Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12:253

6. Groth D, Hartmann S, Klie S et al (2013) Principal components analysis. In: Reisfeld B, Mayeno AN (eds) Computational toxicology, Methods in molecular biology, vol II. Humana Press, New York City. p chapter 22

7. Meng C, Kuster B, Culhane AC et al (2014) A multivariate approach to the integration of multi-omics datasets. BMC Bioinformatics 15:162

8. Uppal K, Go Y-M, Jones DP (2017) xMWAS: an R package for data-driven integration and differential network analysis. bioRxiv:122432

9. Von Mering C, Jensen LJ, Snel B et al (2005) STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 33:D433–D437

10. Pluskal T, Castillo S, Villar-briones A et al (2010) MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11:395

11. Haas BJ, Papanicolaou A, Yassour M et al (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8:1494

12. Valledor L, Romero-Rodriguez MC, Jorrin-Novo JV (2014) Standardization of data processing and statistical analysis in comparative plant proteomics experiment. Methods Mol Biol 1072:51–60

13. Escandon M, Valledor L, Pascual J et al (2017) System-wide analysis of short-term response to high temperature in Pinus radiata. J Exp Bot 68:3629–3641

14. Pascual J, Canal MJ, Escandon M et al (2017) Integrated physiological, proteomic, and metabolomic analysis of ultra violet (UV) stress responses and adaptation mechanisms in Pinus radiata. MCP 16:485–501

15. Branson OE, Freitas MA (2016) A multi-model statistical approach for proteomic spectral count quantitation. J Proteome 144:23–32

16. Robinson MD, Oshlack A (2010) A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. https://doi.org/10.1186/gb-2010-11-3-r25

17. Lebre S, Becq J, Devaux F et al (2010) Statistical inference of the time-varying structure of gene-regulation networks. BMC Syst Biol 4:130

18. Nagarajan R, Scutari M, Lebre S (2013) Bayesian networks in R with applications in systems biology. Springer-Verlag, New York. https://doi.org/10.1007/978-1-4614-6446-4

19. Wold H (1966) Estimation of principal components and related models by iterative least squares. Multivariate analysis, New York. Academic Press, Cambridge

20. Wold S, Sjostrom M, Eriksson L (2001) PLS-regression: a basic tool of chemometrics. Chemom Intell Lab Syst 58:109–130

21. Cramer RD (1993) Partial least squares (PLS): its strengths and limitations. Perspect Drug Discov Des 1:269–278

22. Le Cao K-A, Rossouw D, Robert-Granie C et al (2008) A sparse PLS for variable selection when integrating omics data. Stat Appl Genet Mol Biol 7:35

23. Lee HK, Hsu AK, Sajdak J et al (2004) Coexpression analysis of human genes across many microarray data sets. Genome Res 14:1085–1094

24. Tenenhaus A, Philippe C, Guillemot V et al (2014) Variable selection for generalized canonical correlation analysis. Biostatistics 15:569–583

25. Szklarczyk D, Gable AL, Lyon D et al (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 47:D607–D613

26. Szklarczyk D, Franceschini A, Wyder S et al (2015) STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43:D447–D452

27. Szklarczyk D, Morris JH, Cook H et al (2017) The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res 45:D362–D368

28. Ge SX, Jung D (2018) ShinyGO: a graphical enrichment tool for animals and plants. bioRxiv:315150

29. Shannon P, Markiel A, Ozier O et al (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504

30. Marbach D, Costello JC, Kuffner R et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9:796

31. Schiffthaler B, Serrano A, Delhomme N et al (2019) Seidr: a gene meta-network calculation toolkit. bioRxiv:250696

32. Grimes T, Potter SS, Datta S (2019) Integrating gene regulatory pathways into differential network analysis of gene expression data. Sci Rep 9:5479

33. Rohart F, Gautier B, Singh A et al (2017) mixOmics: an R package for 'omics feature selection and multiple data integration. PLoS Comput Biol 13:e1005752

34. Singh A, Gautier B, Shannon CP et al (2018) DIABLO: from multi-omics assays to biomarker discovery, an integrative approach. bioRxiv. https://doi.org/10.1101/067611


Chapter 4

Specific Protein Database Creation from Transcriptomics Data in Nonmodel Species: Holm Oak (Quercus ilex L.)

Víctor M. Guerrero-Sanchez, Ana M. Maldonado-Alconada, Rosa Sanchez-Lucas, and Maria-Dolores Rey

Abstract

Proteomics encompasses efforts to identify all the proteins of a proteome, with most studies in plant proteomics based on a bottom-up mass spectrometry (MS) strategy, in which the proteins are subjected to digestion by trypsin and the tryptic fragments are subjected to MS analysis. The identification of proteins from MS/MS spectra has been performed using different algorithms (Mascot, Sequest) against plant protein sequence databases such as UniProtKB or NCBI_Viridiplantae. But these databases are not the best choice for nonmodel species where they are underrepresented, resulting in poor identification rates. A high identification rate requires a sequenced and well-annotated genome of the species under investigation. For nonmodel organisms, the identification of proteins is challenging since, in the best of cases, only hits or orthologs instead of gene products are identified. However, in the absence of a sequenced genome, this situation can be improved using transcriptome data to generate a species-specific database to compare proteins. In this chapter, we report the protein database construction from RNA-Seq data in a nonmodel species, in this particular case Holm oak (Q. ilex).

Key words Quercus ilex, Nonmodel species proteomics, Protein databases, RNA-Seq analysis, Custom protein databases

1 Introduction

The main objective of a proteomics experiment is to identify and quantify as many protein species or proteoforms as possible. The identification of proteins is generally based on the comparison between the experimental m/z data and the theoretical data deduced from the in silico translation of DNA or RNA sequences, following a well-known bottom-up scheme. Currently, several algorithms are widely used in proteomics experiments, such as SEQUEST [1], MASCOT [2], and ANDROMEDA [3], among others. The availability of a well-annotated genome (e.g., Arabidopsis and rice) makes possible a quick identification of gene products, including mRNA splicing or posttranslational variants (PTMs). However, the situation is quite different when working with organisms scarcely represented in databases.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_4, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Considering the above concerns, the question arises, "What is the best way of facing proteomics research with orphan species?" Genome sequencing and annotation would be the first option, but in most cases performing this step is unrealistic for many reasons (too expensive and time consuming to be conducted by individual laboratories). A more feasible alternative is to generate a de novo transcriptome and then build a protein database out of the transcript reads. Furthermore, these sequence reads can be complemented by adding all available sequences (DNA or RNA) dispersed in the literature or deposited in different databases. This approach has been used with orphan forest tree species such as Quercus ilex [4] and Pinus radiata [5]. We observed that the use of such a database notably increased the number of proteins identified and the confidence of the identification [4, 5].

In this chapter, a workflow for the creation of a species-specific protein sequence database from RNA-Seq data that enables effective proteomic profiling is described. This protocol is illustrated using as an example the transcriptome of Holm oak, generated from RNA sequencing experiments of a pool of equal amounts of homogenized tissue from acorn embryos, leaves, and roots [6, 7]. As part of the workflow, we first explain the generation of the transcriptome of Holm oak and the bioinformatics tools used, including the following steps: (1) trimming to select only high-quality sequences; (2) de novo assembling of all the clean reads; (3) evaluation of the structure and completeness of the de novo transcriptome; (4) annotating candidate transcripts; and, finally, (5) constructing and annotating a protein database.

2 Materials

The development of this protocol was carried out using a Linux Ubuntu distribution (GNU/Linux distribution Ubuntu 18.04 or higher) as the operating system. It is also applicable to systems based on Ubuntu, such as Linux Mint.

2.1 Nucleotide Sequences

mRNA reads in FastQ format obtained from the Illumina sequencing platform.

2.1.1 Required Software

All the software required for completing this protocol is publicly available. Documentation and software can be freely downloaded from the following addresses.

2.2 Raw Reads Quality Control

https://www.bioinformatics.babraham.ac.uk/projects/fastqc/


2.3 Preprocessing Raw Data

http://hannonlab.cshl.edu/fastx_toolkit/
https://cutadapt.readthedocs.io/

2.4 Assembling Raw Data

http://denovoassembler.sourceforge.net/
https://www.mirametrics.com/
https://github.com/trinityrnaseq/trinityrnaseq/
https://www.ebi.ac.uk/~zerbino/velvet/
http://www.bcgsc.ca/platform/bioinfo/software/trans-abyss
https://sourceforge.net/projects/soapdenovotrans/

2.5 Removing Redundant Transcripts

http://bioinformatics.org/cd-hit/

2.6 Evaluating the Assembly Structure and Completeness of a Transcriptome

http://cab.spbu.ru/software/rnaquast/
http://deweylab.biostat.wisc.edu/detonate/
https://busco.ezlab.org/

2.7 Annotation of a Transcriptome

http://www.bioinfocabd.upo.es/node/11

2.8 Construction of a Custom Protein Database

https://github.com/TransDecoder/TransDecoder/

3 Methods

3.1 Nucleotide Sequences

1. Download raw transcriptome data of Holm oak from the European Nucleotide Archive (ENA) at the European Bioinformatics Institute (EBI) in FastQ format (Illumina_R1 and Illumina_R2). The repository can be accessed via http://www.ebi.ac.uk/ena. Sequencing reads were uploaded under the accession number SRR5815058 (see Note 1).

$ wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR581/008/SRR5815058/SRR5815058_1.fastq.gz
$ wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR581/008/SRR5815058/SRR5815058_2.fastq.gz

Both FASTQ files (SRR5815058_1 and SRR5815058_2) should be downloaded, as they were obtained from an Illumina sequencing platform, which is capable of paired-end sequencing.

3.2 Raw Reads Quality Control (See Note 2)

Quality control (QC) is critical to ensure that RNA-seq data are of high quality and suitable for subsequent analyses. Because of intrinsic biases and limitations, such as nucleotide composition bias, GC bias, and RCA bias, quality control should be carried out in every RNA-seq experiment. These biases directly affect the accuracy of many RNA-seq applications [8, 9], and they can be checked from raw sequences using tools like FastQC. The FastQC software provides a report for a set of high-throughput sequencing reads, allowing sequencing errors or biases to be identified for later trimming. Basic statistics (total sequences, filtered sequences, sequence length, GC %) are also provided by this tool.

1. Download and install the FastQC software v0.11.8.

$ wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip
$ unzip fastqc_v0.11.8.zip

2. Unzip both FASTQ files.

$ gunzip SRR5815058_1.fastq.gz

$ gunzip SRR5815058_2.fastq.gz

3. Run FastQC (see https://dnacore.missouri.edu/PDF/FastQC_Manual.pdf).

$ ./fastqc SRR5815058_1.fastq

$ ./fastqc SRR5815058_2.fastq

3.3 Preprocessing Raw Data

Preprocess all raw data using Fastx_Toolkit version 0.0.13 and Cutadapt version 1.9 [10]. The aim of this step is to retain only high-quality sequences by removing low-quality reads with ambiguous bases and adapter sequences.

1. Download and install Fastx_Toolkit.

$ wget http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar.bz2
$ tar -xjvf fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar.bz2

Filter fastq sequences using a minimum Phred score (the Phred score indicates the quality of the identification of the nucleobases generated by DNA sequencing):

$ ./fastq_quality_filter -q 20 -i SRR5815058_1.fastq -o SRR5815058_1_f.fastq
$ ./fastq_quality_filter -q 20 -i SRR5815058_2.fastq -o SRR5815058_2_f.fastq

-q indicates the minimum quality score (commonly between 20 and 40).

2. Download and install cutadapt.

$ sudo apt install python3-pip

$ pip3 install --user --upgrade cutadapt

3. Remove all the overrepresented sequences.

$ cutadapt -m 100 -a ATCGGAAGAGCACACGTCTGAACTCCAGTCACCGGCTATGATCTCGTATG -o SRR5815058_1_fc.fastq SRR5815058_1_f.fastq
$ cutadapt -m 100 -a ATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGCCTCTATGTGTAGATCTC -o SRR5815058_2_fc.fastq SRR5815058_2_f.fastq

-a gives the adapter sequence to be removed from the 3' end of the reads.
-m discards processed reads shorter than the given length (here, 100 nucleotides).
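The two preprocessing operations of this section (a Phred-based quality filter, and adapter trimming followed by a minimum-length filter) can be sketched in a few lines of Python. This is a simplified illustration and not the code of Fastx_Toolkit or cutadapt: it assumes Phred+33 encoded qualities, matches the adapter exactly (cutadapt itself tolerates some mismatches), and the function names, the min_percent parameter, and the shortened adapter are hypothetical.

```python
def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def passes_quality(quality_string, min_quality=20, min_percent=100, offset=33):
    """True if at least min_percent of the bases have a score >= min_quality."""
    scores = [ord(ch) - offset for ch in quality_string]
    good = sum(1 for s in scores if s >= min_quality)
    return 100 * good >= min_percent * len(scores)


def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter found in full, or as a partial prefix at the read end."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    max_k = min(len(adapter) - 1, len(read))
    for k in range(max_k, min_overlap - 1, -1):  # try the longest partial match first
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def preprocess(read, quality, adapter, min_quality=20, min_length=10):
    """Return the trimmed read, or None if it fails the quality or length filter."""
    if not passes_quality(quality, min_quality):
        return None
    trimmed = trim_3prime_adapter(read, adapter)
    return trimmed if len(trimmed) >= min_length else None


ADAPTER = "ATCGGAAGAGC"  # shortened adapter, for illustration only
read = "ACGTACGTAC" + ADAPTER + "TT"
print(phred_to_error_prob(20))                        # Phred 20: 1 error in 100 calls
print(preprocess(read, "I" * len(read), ADAPTER))     # 'I' encodes Phred 40
print(preprocess(read, "#" * len(read), ADAPTER))     # '#' encodes Phred 2
```

Note that a minimum length equal to the original read length, as in the commands above, discards every read from which any adapter bases were trimmed.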

3.4 Assembling Raw Data

The assembly of the de novo transcriptome can be carried out using different assemblers such as Velvet [11], Trans-ABySS [12], SOAPdenovo [13], TRINITY [14], MIRA [15], or RAY [16], among others. In Holm oak, we used TRINITY version 2.5.1, RAY version 2.3.1, and MIRA V5rc1 (see Note 3). More than a single assembler should be used in order to assess the quality of an assembly [17], because read lengths, read counts, and error profiles differ among next-generation sequencing technologies [17].

1. Download and install RAY.

$ sudo apt install ray

2. Run RAY with the following parameters (see Note 4 and http://denovoassembler.sourceforge.net/).

$ mpiexec -n 8 Ray -k 31 -p SRR5815058_1_fc.fastq SRR5815058_2_fc.fastq -o ray_assembly_folder

3. Download and install MIRA.

$ sudo apt install mira-assembler

4. Run MIRA selecting the following launched parameters:

job = denovo,est,accurate > parameters = NW:maxreadnamelength=100

COMMON SETTINGS

GENERAL: number of threads = 12

MERSTATISTICS: lossless digital normalisation = yes

ALIGN:min relative score = 70


ASSEMBLY:minimum read length = 100

CLIPPING:quality clip = no

CLIPPING:qc minimum quality = 15

CLIPPING:qc window length = 20

CLIPPING:clip polyat = yes

CLIPPING:cp min sequence len = 12

technology = solexa

5. Download and install Trinity.

$ wget https://github.com/trinityrnaseq/trinityrnaseq/archive/Trinity-v2.8.4.zip
$ unzip Trinity-v2.8.4.zip
$ sudo apt-get install cmake
$ make
$ make plugins
$ make install

6. Run TRINITY with the following parameters.

$ ./Trinity --seqType fq --left SRR5815058_1_fc.fastq --right SRR5815058_2_fc.fastq --CPU 20 --max_memory 1000G

3.5 Removing Redundant Transcripts

A high redundancy of transcripts increases the amount of data to process and the computing requirements. To cluster and compare nucleotide or protein sequences, the CD-HIT algorithm [18, 19] is highly recommended.

1. Download and install CD-HIT.

$ sudo apt install cd-hit

2. Run CD-HIT (see http://www.bioinformatics.org/cd-hit/cd-hit-user-guide.pdf and Note 5).

$ cdhit -i Contigs.fasta -o clusteredassembly.fasta -c 0.95 -T 4

-i input file (Contigs.fasta).
-o output file (clusteredassembly.fasta).
-c 0.95 is the clustering threshold (95% identity).
-T number of threads: 4.

3.6 Evaluating the Assembly Structure and Completeness of a Transcriptome

Evaluate the structure of the generated assemblies using the rnaQUAST (Quality Assessment Tool for Transcriptome Assemblies) software version 1.5.1 [20] (see Note 6).
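Among the structural metrics used to compare assemblies are N50, L50, N75, and L75 (see Note 6). As an illustration of what these values mean, the following Python sketch computes them from a list of contig lengths; it is not rnaQUAST code, and the function name nx_lx and the contig lengths are made up.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Contigs are sorted from longest to shortest; Nx is the length of the contig
    at which the cumulative length first reaches `fraction` of the total, and
    Lx is the number of contigs needed to get there.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    cumulative = 0
    for count, length in enumerate(ordered, start=1):
        cumulative += length
        if cumulative >= threshold:
            return length, count


contigs = [100, 200, 300, 400, 500]  # made-up contig lengths (nucleotides)
n50, l50 = nx_lx(contigs, 0.5)
n75, l75 = nx_lx(contigs, 0.75)
print(f"N50 = {n50}, L50 = {l50}, N75 = {n75}, L75 = {l75}")
```

A higher N50 for a similar total length indicates that more of the assembly is contained in long contigs, which is why it is used as one of the selection criteria.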

1. Download and install rnaQUAST.

$ wget http://cab.spbu.ru/files/rnaquast/release1.5.2/rnaQUAST-1.5.2.tar.gz
$ tar -xzvf rnaQUAST-1.5.2.tar.gz
$ sudo apt-get install python3-matplotlib
$ pip install joblib
$ pip install gffutils
$ sudo apt-get install ncbi-blast+
$ wget http://research-pub.gene.com/gmap/src/gmap-gsnap-2019-03-15.tar.gz
$ tar -xzvf gmap-gsnap-2019-03-15.tar.gz
$ ./configure
$ make
$ make install

2. Run rnaQUAST (see http://cab.spbu.ru/software/rnaquast/).

$ python rnaQUAST.py --transcripts clusteredassembly.fasta --reference OCV4_assembly_final.fsa

This command uses the transcriptome of Q. robur and Q. petraea as the reference (https://urgi.versailles.inra.fr/download/oak/OCV4_assembly_final.fsa).

If a reference genome is available, the following command line should be used:

$ python rnaQUAST.py --transcripts clusteredassembly.fasta --reference Qrob_PM1N.fa --gtf Qrob_PM1N_genes_20161004.gff

This command uses the genome of Q. robur (https://urgi.versailles.inra.fr/download/oak/Qrob_PM1N.fa.gz and https://urgi.versailles.inra.fr/download/oak/Qrob_PM1N_genes_20161004.gff.gz).

The analysis of the structure of a de novo transcriptome can be complemented with other transcriptome-specific metrics obtained by using the DETONATE (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) package version 1.11 [21]. The RSEM-EVAL [22] assembly evaluator of DETONATE realigns the reads used to generate the assembly against the assembled contigs in order to obtain the value of E90N50, the overall score values, the length of alignable reads, and the total number of alignments, among others.

1. Download and install DETONATE.

$ wget http://deweylab.biostat.wisc.edu/detonate/detonate-1.11-precompiled.tar.gz
$ tar -xzvf detonate-1.11-precompiled.tar.gz

2. Run DETONATE.

$ ./rsem-eval-estimate-transcript-length-distribution clusteredassembly.fasta parameter_file
$ ./rsem-eval-calculate-score -p 8 --transcript-length-parameters parameter_file SRR5815058_1_fc.fastq SRR5815058_2_fc.fastq clusteredassembly.fasta output_folder

Evaluate the completeness of the transcriptome with BUSCO (Benchmarking Universal Single-Copy Orthologs) version 3.0.2 [23, 24].

3. Download and install BUSCO.

$ wget https://gitlab.com/ezlab/busco/-/archive/master/busco-master.zip
$ unzip busco-master.zip
$ sudo apt-get install ncbi-blast+
$ sudo apt-get install hmmer
$ wget http://bioinf.uni-greifswald.de/augustus/binaries/augustus.current.tar.gz
$ tar -xzvf augustus.current.tar.gz
$ make
$ make install

According to the BUSCO manual (see https://busco.ezlab.org/v1/files/BUSCO_userguide.pdf): Do not forget to create a config.ini file in the config/ subfolder. You can set the BUSCO_CONFIG_FILE environment variable to define a custom path (including the filename) to the config.ini file, useful for switching between configurations or in a multiuser environment. Augustus uses several executables and PERL scripts. Please refer to Augustus documentation for PERL requirements. In addition to the entries in the config.ini file, Augustus requires environment variables to be declared as follows:

$ export PATH="/path/to/AUGUSTUS/augustus-3.2.3/bin:$PATH"
$ export PATH="/path/to/AUGUSTUS/augustus-3.2.3/scripts:$PATH"
$ export AUGUSTUS_CONFIG_PATH="/path/to/AUGUSTUS/augustus-3.2.3/config/"
$ sudo python setup.py install
$ wget https://busco.ezlab.org/datasets/embryophyta_odb9.tar.gz
$ tar -xf embryophyta_odb9.tar.gz

4. Run BUSCO (see https://busco.ezlab.org/).

$ python run_BUSCO.py -i clusteredassembly.fasta -o output_name -l embryophyta_odb9 -m tran

-m or --mode sets the assessment mode: genome (-m geno), proteins (-m prot), transcriptome (-m tran).
-l indicates the location of the BUSCO lineage data to use.

$ python generate_plot.py -wd "working directory"

-wd name or full path to the folder containing the BUSCO short_summary files.

3.7 Annotation of a Transcriptome

Annotate the transcriptome using the Sma3s annotator version 2 [25, 26] (see Note 7). Sma3s (Sequence massive annotation by three modules) is an easy-to-use tool for high-throughput annotation that provides both accuracy and broad applicability. Annotating biological sequences consists of associating biological information with the sequences of interest (see http://www.bioinfocabd.upo.es/node/11).

1. Download Sma3s (see Note 8).

$ wget http://www.bioinfocabd.upo.es/node/11#sma3s.pl

$ wget http://www.bioinfocabd.upo.es/sma3s/db/uniref90.fasta.gz

$ gunzip uniref90.fasta.gz

$ wget http://www.bioinfocabd.upo.es/sma3s/db/uniref90.annot.gz

$ gunzip uniref90.annot.gz

2. Run Sma3s.

$ perl sma3s_v2.pl -num_threads 10 -i clusteredassembly.fasta -d uniref90.fasta -go -goslim -nucl

3.8 Construction of a Custom Protein Database

Generate a six-frame translation for each sequence of the transcriptome using the TransDecoder software version 5.5.0 [27].
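The idea behind a six-frame translation followed by a minimum-length filter can be sketched in Python as shown below. This is a conceptual illustration using the standard genetic code, not TransDecoder's algorithm; the function names and the toy sequence are hypothetical, and requiring a starting methionine is a simplification of this sketch.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate codon by codon; codons with unknown bases become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return {frame: protein} for frames +1..+3 (forward) and -1..-3 (reverse)."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def long_orfs(protein, min_length):
    """Stop-free stretches that start at M and have at least min_length residues."""
    orfs = []
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start >= min_length:
            orfs.append(segment[start:])
    return orfs


frames = six_frame_translation("ATGGCCAAATGA")  # toy transcript
for name, protein in frames.items():
    print(name, protein, long_orfs(protein, 3))
```

In a real transcriptome the minimum length would be much larger than in this toy example, so that short spurious open reading frames do not inflate the database.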

1. Download TransDecoder.

$ wget https://github.com/TransDecoder/TransDecoder/archive/TransDecoder-v5.5.0.zip
$ unzip TransDecoder-v5.5.0.zip

2. Run TransDecoder (see https://github.com/TransDecoder/TransDecoder/wiki).

$ ./TransDecoder.LongOrfs -t clusteredassembly.fasta -m 75 -o outputfolder

-m minimum protein length, in amino acids.

3. Process the spectra using the SEQUEST algorithm available in Proteome Discoverer version 2.1 (Thermo-Scientific, Massachusetts, USA).

4. Use the FASTA sequences generated previously (see Note 9).

5. Select peptides following these settings: (1) precursor mass tolerance set to 10 ppm and fragment ion mass tolerance set to 0.8 Da, (2) only charge states +2 or greater, (3) identification confidence set at 5% FDR (false discovery rate), (4) variable modifications such as oxidation of methionine and fixed modifications such as carbamidomethyl cysteine formation, and (5) a maximum of two missed cleavages for all searches.

6. Save archive with all the identified proteins generating a customspecies database.
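To give a feeling for what a precursor mass tolerance of 10 ppm means in absolute terms, the following Python sketch converts a ppm tolerance into a mass window and checks whether an observed mass falls inside it. The function names and the masses are hypothetical and are not part of Proteome Discoverer.

```python
def ppm_window(mass, tolerance_ppm):
    """Return the (lower, upper) mass limits for a tolerance in parts per million."""
    delta = mass * tolerance_ppm / 1e6
    return mass - delta, mass + delta


def within_ppm(observed, theoretical, tolerance_ppm=10.0):
    """True if the observed mass deviates from the theoretical one by <= tolerance_ppm."""
    error_ppm = (observed - theoretical) / theoretical * 1e6
    return abs(error_ppm) <= tolerance_ppm


low, high = ppm_window(1500.0, 10)  # hypothetical peptide mass in Da
print(f"10 ppm at 1500 Da: {low:.3f} to {high:.3f} Da")
print(within_ppm(1500.012, 1500.0))  # deviation of about 8 ppm
print(within_ppm(1500.020, 1500.0))  # deviation of about 13 ppm
```

Because the tolerance is relative, the absolute window widens with the peptide mass, in contrast to the fragment ion tolerance, which is given in Da.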


4 Notes

1. This protocol has been employed in the construction of a Holm oak-specific transcriptome with the Illumina HiSeq 2500 sequencing platform using 100 bp paired-end sequencing. Additionally, this protocol is compatible with raw data from the Ion Torrent sequencing platform, as previously described in [7]. At ENA, raw data can be downloaded by either experiment accession (SRX2993508) or run accession (SRR5815058).

2. Raw data provided by the platform contained hundreds of reads in a single run; hence, before drawing biological conclusions, an analysis of the sequences should always be performed to monitor the quality of the raw data.

3. Trinity version 2.5.1 uses an algorithm based on de Bruijn graphs [14]. MIRA version 4.9.6 is based on the strategy known as Overlap / Layout / Consensus [15]. Ray version 2.3.1 also uses de Bruijn graphs, but its framework is not based on Eulerian steps [16].

4. Other parameters can be used in the assembly of a de novo transcriptome using RAY:

-minimum-seed-length minimumSeedLength: changes the minimum seed length; the default is 100 nucleotides.

-minimum-contig-length minimumContigLength: changes the minimum contig length; the default is 100 nucleotides.

-use-maximum-seed-coverage maximumSeedCoverageDepth: ignores any seed with a coverage depth above this threshold; the default is 4294967295.

-use-minimum-seed-coverage minimumSeedCoverageDepth: sets the minimum seed coverage depth; any path with a coverage depth lower than this will be discarded. The default is 0.

5. The cutoff could be increased to 99% or 100%, depending on the strictness required in the experiment.

6. To carry out an evaluation using rnaQUAST, it is recommended to provide either FASTA files with all transcripts or transcripts aligned to the reference genome [20]. This yields the N50, L50, N75, L75, and GC % values of the contigs generated in each assembly, so that the assemblies with the highest N50 and the best contig size distribution can be chosen. Assemblies can also be aligned against complete, annotated transcriptome sequences of phylogenetically related species. For Q. ilex, the transcriptomes of Q. suber were used in the alignment of the de novo Q. ilex transcriptome.

66 Víctor M. Guerrero-Sanchez et al.

7. Annotation should be done against a complete database (e.g., UniRef, Swiss-Prot, or NR-NCBI) with an e-value cutoff lower than 10^-6 (the default is lower than 10^-6). Moreover, other software such as Trinotate or Blast2GO can be used to annotate a transcriptome [28, 29].

8. According to Sma3s (see http://www.bioinfocabd.upo.es/node/11): Sma3s has low computing requirements and can be used on virtually any computer. It is written in Perl language, and you need its interpreter (http://www.perl.com), which is preinstalled in Linux and Mac OS X (in Windows it will not be necessary). Additionally, you need to install the BLAST+ package for your operating system.

9. De novo peptide sequencing and annotation can be performed with the Novor software [30].
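The assembly statistics mentioned in Note 6 are straightforward to compute from contig lengths. The following Python sketch is not rnaQUAST code; it shows the usual definition, in which N50 is the length of the contig at which the cumulative length of the contigs, sorted from longest to shortest, first reaches half of the total assembly length, and L50 is the number of contigs needed to get there. The function names are chosen for this example.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for the given fraction (0.5 gives N50/L50, 0.75 gives N75/L75)."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("no contig lengths supplied")


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    contig_lengths = [100, 200, 300, 400, 500]  # toy values, not real data
    print("N50, L50:", nx_lx(contig_lengths, 0.5))
    print("N75, L75:", nx_lx(contig_lengths, 0.75))
    print("GC %:", gc_percent("GGCCAATT"))
```

For the toy lengths above, the total is 1500 nucleotides, so the two longest contigs (500 + 400) already exceed half of it, which gives N50 = 400 and L50 = 2.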

Acknowledgments

The authors thank the University of Cordoba (UCO-CeiA3) and the staff of the Central Service for Research Support (SCAI) at the University of Cordoba (Spain) for their technical support in the bioinformatics data analysis. This research was supported by the grant ENCINOMICA BIO2015-64737-R from the Spanish Ministry of Economy and Competitiveness. MD-R and LV thank, respectively, the contracts “Ayudas Juan de la Cierva-Formacion (FJCI-2016-28296)” and “Programa Ramon y Cajal (RYC-2015-17871)” of the Spanish Ministry of Science, Innovation, and Universities.

References

1. Eng JK, McCormack AL, Yates JR (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom 5:976–989

2. Perkins DN, Pappin DJ, Creasy DM et al (1999) Probability-based protein identification by searching sequence databases using mass spectra. Electrophoresis 20:3551–3567

3. Cox J, Neuhauser N, Michalski A et al (2011) Andromeda: a peptide search engine integrated into the MaxQuant environment. J Proteome Res 10:1794–1805

4. Romero-Rodríguez MC, Pascual J, Valledor L et al (2014) Improving the quality of protein identification in non-model species. Characterization of Quercus ilex seed and Pinus radiata needle proteomes by using SEQUEST and custom databases. J Proteomics 105:85–91

5. Valledor L, Jorrín-Novo JV, Rodríguez JL et al (2010) Combined proteomic and transcriptomic analysis identifies differentially expressed pathways associated to Pinus radiata needle maturation. J Proteome Res 9:3954–3979

6. Guerrero-Sanchez VM, Maldonado-Alconada AM, Amil-Ruiz F et al (2017) Holm oak (Quercus ilex) transcriptome. De novo sequencing and assembly analysis. Front Mol Biosci 4:70

7. Guerrero-Sanchez VM, Maldonado-Alconada AM, Amil-Ruiz F et al (2019) Ion Torrent and Illumina, two complementary RNA-seq platforms for constructing the holm oak (Quercus ilex) transcriptome. PLoS One 14:e0210356

8. Benjamini Y, Speed TP (2012) Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res 40:e72

9. Hansen K, Brenner S (2010) Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res 38:e131

10. Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17:10–20

11. Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829

12. Simpson JT, Wong K, Jackman SD et al (2009) ABySS: a parallel assembler for short read sequence data. Genome Res 19:1117–1123

13. Li R, Zhu H, Ruan J et al (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20:265–272

14. Grabherr MG, Haas BJ, Yassour M et al (2011) Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nat Biotechnol 29:644–652

15. Chevreux B, Wetter T, Suhai S (1999) Genome sequence assembly using trace signals and additional sequence information. German Conf Bioinformatics 99:45–56

16. Boisvert S, Laviolette F, Corbeil J (2010) Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol 17:1519–1533

17. Bradnam KR, Fass JN, Alexandrov A et al (2013) Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience 2:10

18. Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659

19. Fu L, Niu B, Zhu Z et al (2012) CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28:3150–3152

20. Bushmanova E, Antipov D, Lapidus A et al (2016) RnaQUAST: a quality assessment tool for de novo transcriptome assemblies. Bioinformatics 32:2210–2212

21. Li B, Fillmore N, Bai Y et al (2014) Evaluation of de novo transcriptome assemblies from RNA-Seq data. Genome Biol 15:553

22. Li B, Dewey CN (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12:323

23. Simao FA, Waterhouse RM, Ioannidis P et al (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212

24. Waterhouse RM, Seppey M, Simao FA et al (2017) BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol 35:543–548

25. Munoz-Merida A, Viguera E, Claros MG et al (2014) Sma3s: a three-step modular annotator for large sequence datasets. DNA Res 21:341–353

26. Casimiro-Soriguer CS, Munoz-Merida A, Perez-Pulido AJ (2017) Sma3s: a universal tool for easy functional annotation of proteomes and transcriptomes. Proteomics 17:1700071

27. Haas B, Papanicolaou A (2017) TransDecoder. https://transdecoder.github.io

28. Bryant DM, Johnson K, DiTommaso T et al (2017) A tissue-mapped axolotl de novo transcriptome enables identification of limb regeneration factors. Cell Rep 18:762–776

29. Conesa A, Gotz S (2008) Blast2GO: a comprehensive suite for functional analysis in plant genomics. Int J Plant Genomics 2008:619832

30. Ma B (2015) Novor: real-time peptide de novo sequencing software. J Am Soc Mass Spectrom 26:1885–1894


Chapter 5

Subcellular Proteomics in Conifers: Purification of Nuclei and Chloroplast Proteomes

Laura Lamelas, Lara García, María Jesus Canal, and Monica Meijon

Abstract

The complexity of the plant cell proteome, which comprises thousands of proteins whose abundance spans several orders of magnitude, makes it impossible to cover most plant proteins using standard shotgun-based approaches. Although complexity in itself is no longer a major obstacle (current protocols and instrumentation allow for the identification of several thousand proteins per injection), low- and medium-abundance proteins often go undetected, so fractionation or targeted analyses are needed to detect and quantify them. Among the fractionation options, separating the cell into its organelles is a good strategy for gaining not only deeper proteome coverage but also a basis for understanding organelle function, protein dynamics, and trafficking within the cell, such as nucleus–chloroplast communication. This approach is used routinely in many labs working with model species; however, protocols focusing on tree species are scarce. In this chapter, we provide a simple but robust protocol for isolating nuclei and chloroplasts from pine needles that is fully compatible with subsequent mass spectrometry-based proteome analysis.

Key words Nucleus, Chloroplast, Subcellular proteomics, Forest species, Pinus

1 Introduction

Current LC-MS-based methodologies allow for the identification of thousands of proteins per injection and come close to complete proteome coverage for simple organisms [1]. Although analyzing more than three thousand proteins in a single injection is a remarkable milestone, undreamt of some years ago, this analytical capability is not enough to cover the cells of more complex organisms such as plants. Plant cells, with tens of thousands of proteins, are fascinating living systems for proteome analysis, but obtaining a complete proteome in a single injection is currently beyond our reach. In practice, single-injection untargeted shotgun proteomics quantifies about two thousand proteins, three thousand in the best of cases. If less abundant proteins are to be targeted, fractionation is required. It can be performed after protein isolation (MudPIT) [2] or at the cellular level, by purifying the different organelles separately [3].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_5, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Subcellular proteomics, besides decreasing proteome complexity, allows for the targeted study of the compartments of eukaryotic cells, leading to better knowledge of organelle function, protein dynamics, and trafficking, and to an understanding of the proliferation of multigene families and the specialization of cellular functions [4]. Exploring the proteome at the subcellular level is therefore both a practical approach and a functional necessity: proper interpretation of proteome dynamics requires detailed information about the compartmentation of the protein machinery.

In plants and other organisms, the functions of the nucleus are crucial for cell proliferation and the regulation of gene expression during development and/or in response to biotic/abiotic stresses [5]. Knowing nuclear proteome dynamics is essential to increase our understanding of how both environmental and cytoplasmic signals are sensed and translated into molecular responses, mainly through the proteins that guide and control gene expression. Nuclear proteomics is an advantageous approach for investigating the mechanisms underlying plant responses to abiotic stresses, including protein–protein interactions, the spliceosome complex, histones, enzyme activities, posttranslational modifications, and intrinsically disordered proteins [5, 6].

The chloroplast, in turn, is a major plant cell organelle that fulfills basic metabolic and biosynthetic functions [7]. It is indispensable for plant growth, development, and responses to environmental stresses, and its function is regulated by different plant hormones. The chloroplast proteome, encoded by both the chloroplast and nuclear genomes, plays essential roles in photosynthesis, metabolism, and other biological processes [8]. Chlorophyll precursors, photosynthetic electron transport, and sugars have all been shown to be involved in signaling from the chloroplast to the nucleus, suggesting the presence of multiple signaling pathways that coordinate both cellular compartments.

Chloroplast function requires the import of both nucleus-encoded photosynthetic proteins and cytoplasmic factors that regulate the expression of chloroplast genes. The plastid also plays a role in nuclear gene expression, with signals that originate in the chloroplast acting to regulate transcription of nucleus-encoded photosynthetic genes, a process called retrograde signaling. In the last 10 years, many studies have revealed the nature of nucleus-derived molecules that affect chloroplast gene expression at all levels [9]. Although it has been known for many years that the expression of a subset of nuclear genes, whose products are involved in photosynthesis, depends on the presence of functional plastids in the cell, little progress has been made in elucidating the signaling molecules or mechanisms involved in retrograde signaling. However, several recent discoveries have made inroads into this complex mechanism and have begun to shed light on the black box of signaling from the chloroplast to the nucleus [9–11]. For example, it has recently been discovered that plants experiencing stress or damage from various sources use chloroplast-to-nucleus communication to regulate gene expression and help them cope.

Subcellular proteomics stands on the shoulders of decades of biochemical research that has developed methods for the separation of subcellular compartments. Numerous laboratories have worked over the years to improve separation techniques, progressively reducing contamination in isolation methods [4]. Such subcellular fractionation protocols typically use density-gradient centrifugation and have enabled the enrichment of crude microsomes, the cytosol, the plasmalemma, nuclei, and chloroplasts. In this context, this chapter describes the experimental steps involved in the enrichment of nuclei and chloroplasts from conifer needles, a tissue that is especially complex in biochemical terms, in order to analyze their proteomes and the cross talk signaling between both organelles. An overview of the workflow for the purification of the nuclei and chloroplast subproteomes is presented in Fig. 1.

2 Materials

All solutions must be prepared using ultrapure water (prepared by purifying deionized water to attain a resistivity of 18 MΩ·cm at 25 °C) and analytical grade reagents.

2.1 Nuclei Isolation

1. Plant material. Approximately 1 g of needles (see Note 1).

2. Reagents and solutions. All buffers must be made fresh and kept on ice (4 °C).

– Organelle Extraction Buffer (OEB): 0.44 M sucrose, 10 mM Tris–HCl pH 8.0, 5 mM β-mercaptoethanol, 0.015 mM PMSF.

– Plastid Disruption Buffer (PDB): 0.25 M sucrose, 10 mM Tris–HCl pH 8.0, 10 mM MgCl2, 1% (v/v) Triton X-100, 5 mM β-mercaptoethanol, 0.015 mM PMSF.

– Washing Buffer (WB): 0.25 M sucrose, 10 mM Tris–HCl pH 8.0, 10 mM MgCl2, 5 mM β-mercaptoethanol, 0.015 mM PMSF.

– Pellet Suspension Buffer (PSB): WB:ddH2O (2:1, v/v).

– Discontinuous Sucrose Gradient (DSG) (see Note 2):

Pine Nuclei and Chloroplast Proteomics 71

Fig. 1 Overview of the workflow of nuclei and chloroplast subproteome purification. Stages shown for the nucleus and chloroplast branches: homogenization (1 g of tissue; mortar and liquid N2, or rotor–stator), plastid isolation, purification by discontinuous gradient, organelle cleaning, protein extraction, protein purification, protein resuspension, protein assessment, and LC-MS and bioinformatic analysis


Solution a. 0.32 M sucrose, 3 mM CaCl2, 2 mM Mg(C2H3O2)2, 0.1 mM EDTA, 10 mM Tris–HCl pH 8.0, 1 mM DTT, 0.5% (v/v) NP-40.

Solution b. 2 M sucrose, 5 mM Mg(C2H3O2)2, 0.1 mM EDTA, 10 mM Tris–HCl pH 8.0, 1 mM DTT.

Solution c. 3 M sucrose, 5 mM Mg(C2H3O2)2, 0.1 mM EDTA, 10 mM Tris–HCl pH 8.0, 1 mM DTT.

– Liquid Nitrogen.

– Ultrapure Water.

3. Equipment.

– Mortar and pestle.

– Cheesecloth or miracloth (filters).

– Centrifuge and disposable centrifuge tubes (50 and 15 mL).

– Vortex/microtube mixer.

– Micropipettes.
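Because the buffers in this subheading must be made fresh, it can help to compute the amounts to weigh out in advance. The following Python sketch converts the molar concentrations of the solid OEB components into grams for a chosen batch volume. The molar masses (sucrose 342.30 g/mol, Tris base 121.14 g/mol) and the 50 mL batch size are assumptions of this example and should be checked against the reagents actually used; liquid or stock-solution components such as β-mercaptoethanol and PMSF are left out.

```python
# Molar masses in g/mol (assumed values; check the reagent labels).
MOLAR_MASS = {"sucrose": 342.30, "Tris": 121.14}

# Final concentrations of the solid OEB components, in mol/L.
OEB_SOLIDS = {"sucrose": 0.44, "Tris": 0.010}


def grams_needed(molarity, volume_ml, molar_mass):
    """Mass of solute (g) = concentration (mol/L) x volume (L) x molar mass (g/mol)."""
    return molarity * (volume_ml / 1000.0) * molar_mass


def weigh_out(recipe, volume_ml):
    """Return grams of each solid component for the requested batch volume."""
    return {
        name: round(grams_needed(conc, volume_ml, MOLAR_MASS[name]), 4)
        for name, conc in recipe.items()
    }


if __name__ == "__main__":
    for name, grams in weigh_out(OEB_SOLIDS, 50).items():
        print(f"{name}: {grams} g for 50 mL")
```

The same function can be reused for PDB and WB by adding their components and molar masses to the two dictionaries.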

2.2 Chloroplast Isolation

1. Plant material. Approximately 1 g of needles (see Note 1).

2. Reagents and solutions. All buffers must be made fresh just before performing chloroplast isolation and kept on ice (4 °C).

– Chloroplast Isolation Buffer (CIB): Sorbitol 0.35 mM, HEPES-KOH 50 mM pH 7.4, EDTA 5 mM, MgCl2 5 mM, 15 mM β-mercaptoethanol, PMSF 0.5 mM, BSA 1% (w/v) (see Note 3).

– Discontinuous Percoll-Sucrose Gradient (DPSG) (see Note 4):

Solution a. 9 vol 3 M sucrose, 5 mM Mg(C2H3O2)2, 0.1 mM EDTA, 10 mM Tris–HCl pH 8.0, 1 mM DTT, 1 vol CIB.

Solution b. Percoll 70% (v/v) diluted in CIB.

Solution c. CIB.

– Liquid nitrogen.

– Ultrapure water.

3. Equipment.

– Cheesecloth or miracloth (filters).

– Centrifuge and disposable centrifuge tubes (50 and 15 mL).

– Rotor–stator homogenizer.

– Pipettes.
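Several solutions in this chapter are defined by volume ratios rather than by absolute amounts (DPSG solution a is 9 vol of the 3 M sucrose solution plus 1 vol of CIB; solution b is 70% (v/v) Percoll in CIB). The Python sketch below works out the resulting concentrations and volumes. It assumes that volumes are additive, that CIB contributes no sucrose, and that the Percoll stock is taken as 100%; the function names are chosen for this example.

```python
def mixed_concentration(parts):
    """Final concentration after mixing (volume, concentration) pairs, assuming additive volumes."""
    total_volume = sum(volume for volume, _ in parts)
    return sum(volume * conc for volume, conc in parts) / total_volume


def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock needed, from C1 x V1 = C2 x V2."""
    return final_conc * final_volume / stock_conc


if __name__ == "__main__":
    # DPSG solution a: 9 vol of 3 M sucrose solution + 1 vol of CIB (no sucrose assumed).
    print("Sucrose in solution a (M):", mixed_concentration([(9, 3.0), (1, 0.0)]))
    # DPSG solution b: 3 mL at 70% (v/v) Percoll from an assumed 100% stock.
    percoll_ml = stock_volume(70, 3.0, 100)
    print("Percoll (mL):", percoll_ml, "CIB (mL):", round(3.0 - percoll_ml, 2))
```

Under these assumptions, solution a ends up at 2.7 M sucrose, and 3 mL of solution b requires 2.1 mL of Percoll and 0.9 mL of CIB.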


2.3 Protein Extraction

1. Plant material. Nuclei or chloroplast purified pellets.

2. Reagents and solutions. All buffers must be made fresh just before performing protein extraction and kept on ice (4 °C).

– Protein Extraction Buffer (PEB): 100 mM Tris–HCl pH 8.0, 5% (w/v) SDS, 10% (v/v) glycerol, 2 mM PMSF, 10 mM DTT, 1.2% (v/v) plant protease inhibitor cocktail (Sigma P9599) (see Note 5).

– Buffer Z (BZ): 1.5 M sucrose, 10 mM DTT, 1% (v/v) plant protease inhibitor cocktail (Sigma P9599).

– Protein Precipitation Solution (PPS): 0.1 M ammoniumacetate in methanol.

– Protein Solubilization Solution (PSS): 1.5% (w/v) SDS,8 M urea/6 M urea, 2 M thiourea.

– SDS: 20% (w/v).

– Acetone.

– Methanol.

– Phenol.

3. Equipment.

– Microcentrifuge and disposable microcentrifuge tubes(1.5 mL).

– Vortex/microtube mixer.

– Ultrasound bath.

– Pipettes.

3 Methods

3.1 Nuclei Isolation

Unless otherwise specified, all steps of nuclei isolation must be performed at 4 °C.

1. Cell lysis.

– Collect pine needles and keep them at −80 °C or in liquid nitrogen until use.

– Grind 1 g of plant material to a fine powder using a mortar and pestle in liquid nitrogen. Immediately transfer the powder to a 50 mL tube with 10 mL of Organelle Extraction Buffer (OEB) and mix gently by inversion.

– Incubate on ice for 30 min, mixing carefully by inversion every 5 min to avoid the formation of clumps.

2. Nuclei isolation.


– Filter the solution through three layers of cheesecloth or miracloth previously soaked with OEB.

– Centrifuge the filtered samples for 15 min at 3000 × g in a swinging rotor at 4 °C and discard the supernatant.

– Resuspend the pellet in 5 mL of Plastid Disruption Buffer (PDB) by gentle pipetting and incubate on ice for 10 min (during this period, mix by inversion every 2 min).

– Centrifuge for 10 min at 3000 × g and 4 °C and discard the supernatant.

– Repeat the PDB resuspension and centrifugation until whitish pellets are obtained (see Note 6).

– Wash the pellets with 8 mL of Washing Buffer (WB).

– Centrifuge at 3000 × g for 10 min and remove the supernatant.

3. Nuclei fraction purification.

– Add 3 mL of previously cooled Discontinuous Sucrose Gradient (DSG) solution c to a 12 mL tube, then overlay 3 mL of solution b followed by 3 mL of solution a, and keep the gradient on ice. At this point two sharp interfaces should be observed in the tube.

– Carefully resuspend the pellets in 400 μL of Pellet Suspension Buffer (PSB), load them onto the gradient, and centrifuge for 12 min at 3000 × g. Intact nuclei collect at the interface between DSG solutions b and c.

– Harvest the nuclei fraction and clean it with Pellet Suspension Buffer (PSB) in a 1.5 mL disposable tube, centrifuge at 3000 × g, and discard the supernatant.

∗Pause point: At this point nuclei can be stored at −20 °C.
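The centrifugation steps above are specified as relative centrifugal force (× g), whereas many centrifuges are set in rpm. The standard conversion is RCF = 1.118 × 10^-5 × r × rpm^2, with the rotor radius r in centimeters. The Python sketch below applies it in both directions; the 10 cm radius is a placeholder assumption and must be replaced by the radius of the rotor actually used.

```python
import math

RCF_CONSTANT = 1.118e-5  # valid for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Speed in rpm needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 10.0  # assumed rotor radius; read the real value from the rotor manual
    print(f"3000 x g needs about {rpm_from_rcf(3000, radius_cm):.0f} rpm")
    print(f"4000 rpm gives about {rcf_from_rpm(4000, radius_cm):.0f} x g")
```

With a 10 cm radius, 3000 × g corresponds to roughly 5180 rpm; a rotor with a smaller radius needs a higher speed to reach the same force.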

3.2 Chloroplast Isolation

1. Cell lysis.

– Collect pine needles just before starting chloroplast isolation and keep them on ice.

– Cut the needles into 2–3 mm pieces and immediately homogenize them in 12 mL of precooled Chloroplast Isolation Buffer (CIB) using a rotor–stator homogenizer at 6500 min^-1, three times for 20 s each.

– Clean the rotor–stator system in a fresh tube with 8 mL of CIB. Combine this CIB with the homogenized needles, mix, and filter through four layers of cheesecloth/Miracloth.

2. Chloroplast isolation.

– Centrifuge the filtered solution for 3 min at 200 × g at 4 °C in a swinging rotor.

– Transfer the supernatant to a new tube and centrifuge for 20 min at 3000 × g at 4 °C.

– Discard the supernatant and wash the raw chloroplast pellet with 10 mL of CIB.

– Repeat the centrifugation step and suspend the cleaned pellet in 3 mL of CIB.

3. Chloroplast fraction purification.

– In a 15 mL tube, add 3 mL of Discontinuous Percoll-Sucrose Gradient (DPSG) solution a and then overlay 3 mL of DPSG solution b. A sharp interface between the two layers should be observed.

– Resuspend the chloroplast pellet (Subheading 3.2, step 2) in 3 mL of CIB and carefully layer it on top of the discontinuous gradient. Centrifuge for 30 min at 3300 × g at 4 °C in a swinging rotor with smooth acceleration and deceleration (see Note 7). Intact chloroplasts are located in the lower phase (see Note 8). It is recommended to check the composition of each layer by microscopy.

– Transfer the lower dark green phase of the gradient to a 12 mL tube, fill the tube with CIB, mix gently by inversion until a homogeneous color is obtained, and centrifuge at 3000 × g for 10 min. Discard the supernatant.

∗Pause point: The pellet obtained, which consists of intact chloroplasts, can be stored at −80 °C.

3.3 Protein Extraction

– Resuspend nuclei or chloroplast pellets in 300 μL of Protein Extraction Buffer (PEB) and sonicate them for 15 s at 60% amplitude (Hielscher UP200S). Then vortex at maximum speed for 15 min at room temperature. Add 100 μL of 20% SDS to the sample tube. Incubate for 2–5 min at 95 °C, vortexing in between.

– Add 300 μL of Buffer Z (BZ) and 300 μL of phenol. Mix vigorously, and centrifuge for 5 min at 17,000 × g and room temperature.

– After centrifugation, save the phenolic (upper) phase and re-extract the lower phase by adding 300 μL of phenol.

– Centrifuge again for 5 min at 17,000 × g and merge both phenolic phases.

– Clean the pooled phenolic phase with BZ in the same way and keep only the upper phase.

– Add two volumes of Protein Precipitation Solution (PPS) and incubate overnight at −20 °C. White flakes of protein may be seen immediately.

– Centrifuge the tubes and wash the protein pellet with acetone twice.

– Dry the pellets at room temperature and dissolve them in the minimum amount of Protein Solubilization Solution (PSS) (30–40 μL or more) (see Note 9).

– Protein content can be quantified by BCA assay [12], and the enrichment in nuclear or chloroplast proteins can be assessed by 1-DE SDS-PAGE by comparing the total protein fraction with the nuclei/chloroplast fraction.
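Quantification by BCA assay relies on a standard curve built from samples of known concentration. The following Python sketch fits a straight line to absorbance readings of standards by least squares and interpolates the concentration of an unknown sample. The standard concentrations and absorbance values are invented for illustration, and the assumption of a linear response is a simplification that only holds within the working range of the assay.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for any sample dilution."""
    return (absorbance - intercept) / slope * dilution_factor


if __name__ == "__main__":
    # Hypothetical standards (ug/mL) and their blank-uncorrected absorbances.
    standards = [0, 250, 500, 1000]
    absorbances = [0.10, 0.35, 0.60, 1.10]
    slope, intercept = linear_fit(standards, absorbances)
    sample = concentration_from_absorbance(0.60, slope, intercept, dilution_factor=10)
    print(f"Estimated concentration: {sample:.0f} ug/mL")
```

Samples whose absorbance falls outside the range spanned by the standards should be diluted and measured again rather than extrapolated.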

4 Expected Results

The suitability of an untargeted proteomics protocol depends heavily on two main factors: the protein yield, understood as the amount of purified protein per weight of starting material, and the diversity of the proteins obtained.

The protocols described in this chapter proved material-efficient and allowed for the identification of a wide variety of subcellular proteins, as summarized in Table 1.
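The values in Table 1 can be turned into working estimates for planning an experiment. The Python sketch below computes the protein concentration expected when the pellet from a given amount of tissue is dissolved in a given volume of PSS, and expresses the identifications of each fraction relative to the total proteome count. It assumes complete recovery of the yield reported in Table 1, which is an idealization, and the ratio is a plain comparison of counts rather than a statement that one list is a subset of the other.

```python
# Values taken from Table 1 of this chapter.
TABLE_1 = {
    "nucleus": {"yield_ug_per_g": 100, "identifications": 1057},
    "chloroplast": {"yield_ug_per_g": 700, "identifications": 1342},
}
TOTAL_IDENTIFICATIONS = 3652


def expected_concentration(yield_ug_per_g, tissue_g, volume_ul):
    """Expected protein concentration (ug/uL), assuming full recovery."""
    return yield_ug_per_g * tissue_g / volume_ul


def relative_to_total(identifications, total=TOTAL_IDENTIFICATIONS):
    """Identifications in a fraction as a percentage of the total proteome count."""
    return 100.0 * identifications / total


if __name__ == "__main__":
    for organelle, values in TABLE_1.items():
        conc = expected_concentration(values["yield_ug_per_g"], 1.0, 40)
        ratio = relative_to_total(values["identifications"])
        print(f"{organelle}: {conc:.1f} ug/uL in 40 uL, {ratio:.1f}% of total count")
```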

5 Notes

1. Starting plant material can be fresh or frozen. For tissue homogenization, frozen material should be disrupted in liquid nitrogen with a mortar and pestle, and fresh material with a rotor–stator homogenizer. Both approaches are valid, but fresh material is preferred when possible, as it leads to better purification and hence a higher protein yield.

2. Add NP-40 and DTT just prior to use.

3. β-mercaptoethanol, PMSF, and BSA must be freshly added.

4. Add DTT just prior to use.

5. Add DTT, PMSF, and protease inhibitor cocktail just before use.

6. If pellets are still greenish after three washes, it is recommended to increase the Triton X-100 concentration of the PDB up to 3%.

Table 1 Expected results of protein extraction yield and identifications

                         Nucleus proteome          Chloroplast proteome      Total proteome
Protein yield            100 μg/g frozen needles   700 μg/g fresh needles    –
Protein identifications  1057                      1342                      3652

7. It is essential to use an automatic rate controller, if available, to avoid mixing of the gradient layers during deceleration; if this feature is not available, it is recommended to disconnect the brake or to decrease the rotor speed manually.

8. The chloroplast density depends heavily on the amount of starch they contain; if starch content varies across the experimental system, this has to be taken into consideration for intact chloroplast recovery.

9. Freeze and thaw the samples to help the proteins denature and dissolve.

Acknowledgments

Our research group is generously funded by the Spanish Ministry of Science, Innovation, and Universities (AGL2016-77633-P and AGL2017-83988-R). M.M. and L.L. were also supported by the Spanish Ministry of Science, Innovation, and Universities through the Ramon y Cajal (RYC-2014-14981) and Spanish PhD training (BES-2017-082092 to L.L.) programs. L.G. was supported by the Government of the Principado de Asturias (Spain) through the Severo Ochoa program (BP19-146).

References

1. Hebert AS, Richards AL, Bailey DJ et al (2014) The one hour yeast proteome. Mol Cell Proteomics 13:339–347

2. Kislinger T, Gramolini AO, MacLennan et al (2005) Multidimensional protein identification technology (MudPIT): technical overview of a profiling method optimized for the comprehensive proteomic investigation of normal and diseased heart tissue. J Am Soc Mass Spectrom 16:1207–1220

3. Pascual J, Alegre S, Nagler et al (2016) The variations in the nuclear proteome reveal new transcription factors and mechanisms involved in UV stress response in Pinus radiata. J Proteomics 143:390–400

4. Millar AH, Taylor NL (2014) Subcellular proteomics-where cell biology meets protein chemistry. Front Plant Sci 5:55

5. Goto C, Hashizume S, Fukao Y et al (2019) Comprehensive nuclear proteome of Arabidopsis obtained by sequential extraction. Nucleus 10:81–92

6. Yin X, Komatsu S (2016) Plant nuclear proteomics for unraveling physiological function. New Biotechnol 33:644–654

7. Bouchnak I, Brugiere S, Moyet L et al (2019) Unraveling hidden components of the chloroplast envelope proteome: opportunities and limits of better MS sensitivity. Mol Cell Proteomics 18:1285–1306

8. Wu W, Yan Y (2018) Chloroplast proteome analysis of Nicotiana tabacum overexpressing TERF1 under drought stress condition. Bot Stud 59:26

9. Brown EC, Somanchi A, Mayfield SP (2001) Interorganellar crosstalk: new perspectives on signaling from the chloroplast to the nucleus. Genome Biol 2:REVIEWS1021

10. Colombo M, Tadini L, Peracchio C et al (2016) GUN1, a jack-of-all-trades in chloroplast protein homeostasis and signaling. Front Plant Sci 7:1427

11. Zhao X, Huang J, Chory J (2019) GUN1 interacts with MORF2 to regulate plastid RNA editing during retrograde signaling. Proc Natl Acad Sci U S A 116:10162–10167

12. Smith PK, Krohn RI, Hermanson GT et al (1985) Measurement of protein using bicinchoninic acid. Anal Biochem 150:76–85


Chapter 6

Apoplastic Fluid Preparation from Arabidopsis thaliana Leaves Upon Interaction with a Nonadapted Powdery Mildew Pathogen

Ryohei Thomas Nakano, Nobuaki Ishihama, Yiming Wang, Junpei Takagi, Tomohiro Uemura, Paul Schulze-Lefert, and Hirofumi Nakagami

Abstract

Proteins in the extracellular space (apoplast) play a crucial role at the interface between plant cells and their proximal environment. Consequently, it is not surprising that plants actively control the apoplastic proteomic profile in response to biotic and abiotic cues. Comparative quantitative proteomics of plant apoplastic fluids is therefore of general interest in plant physiology. We here describe an efficient method to isolate apoplastic fluids from Arabidopsis thaliana leaves inoculated with a nonadapted powdery mildew pathogen.

Key words Apoplast, Protein secretion, Membrane trafficking, Arabidopsis thaliana, Cell wall

1 Introduction

The “extracellular space” of plant cells constitutes a compartment external to the plasma membrane, including cell walls, the intercellular space, and the apoplastic fluid. Given its location at the interface between plant cells and their proximate environment, molecules in this compartment are expected to play fundamental roles in intrinsic developmental processes and adaptation to fluctuating environmental conditions. It is therefore not surprising that the extracellular compartment of plants contains a multitude of partly diffusible small molecules and proteins as well as carbohydrate polymers, the latter of which provide structure to plant cells. All of these molecules must be transported from the cell interior to the extracellular space. The transport mechanism itself is likely subject to controlled activity, so that supply to the extracellular compartment matches the changing demands of development and environmental adaptation. The primary pathway to deliver proteins to the apoplast is mediated by a vesicular protein transport system, a process referred to as membrane trafficking. Earlier work has shown that in leaves the secreted protein profile (“secretome”) is dependent on environmental conditions. For instance, biotic stress stimuli such as pathogen inoculation are known to trigger the secretion of a specific set of inducible defense-related proteins, including PATHOGENESIS RELATED-1 (PR-1) and PR-2 [1–3]. However, the mechanism(s) enabling plants to shift their secretome profiles both locally and systemically are still incompletely understood.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_6, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Several protocols have been described to comprehensively analyze the proteomic profile of the secretome [1]. One method takes advantage of suspension-cultured plant cells to collect the culture supernatant as the secretome fraction [4–7]. This approach is technically straightforward, minimizes contamination from damaged cells, and permits the collection of large volumes of apoplastic fluid under defined nutritional stress conditions and/or upon phytohormone treatment or application of ligands (elicitors) that trigger innate immune responses. Another nondestructive approach is to collect proteins from the hydroponic culture media in which plant seedlings are growing. This permits the identification of proteins that are actively secreted in vivo from the whole organism, which consists of diverse tissue types and organs [8]. Besides apoplastically secreted polypeptides normally found in the intercellular space in the interior of a plant organ, these proteins also comprise molecules that are exported to the plant surface, for example, polypeptides present in root exudates. However, these approaches are often not applicable for in-depth studies of interactions between plants and microorganisms that are adapted to colonize aboveground plant organs. On the other hand, vacuum infiltration centrifugation (VIC) methods have been widely used for many plant species to isolate apoplastic fluid from plants growing in natural soil substrate [2, 7, 9–18]. The VIC method is essentially based on collecting vacuum-infiltrated buffer solution from plant tissues by centrifugation; the recovered solution is assumed to consist mainly of plant apoplastic fluid. Experimental details can be modified according to the research question, for example by altering the contents of the infiltration buffer and/or using different sample preparation procedures before buffer infiltration. We here describe a method to isolate leaf apoplastic proteins from Arabidopsis thaliana leaves inoculated with a nonadapted powdery mildew pathogen for subsequent quantitative proteomic analysis. The method described here can be applied to investigate interactions of plant leaves with microbes that have different lifestyles, for example, pathogens, mutualists, or commensals.

80 Ryohei Thomas Nakano et al.

2 Materials

Prepare all solutions using ultrapure water and analytical grade reagents. Prepare and store all reagents at room temperature, unless indicated otherwise.

2.1 Plant Growth and Pathogen Inoculation

1. 70% ethanol in Milli-Q water.

2. 99.9% ethanol (VWR Chemicals).

3. 1.5- or 2-mL microtubes.

4. Greenhouse soil (see Note 1).

5. Pots (9 × 9 × 9.5 cm).

6. A. thaliana seeds.

7. Hordeum vulgare (cultivar Golden Promise) seedlings.

8. Blumeria graminis f. sp. hordei (Bgh) isolate K1 [19].

9. Controlled environment growth cabinet for A. thaliana (Day: 21 °C, 10 h, 60% humidity; Night: 21 °C, 14 h, 60% humidity).

10. Controlled environment growth cabinet for H. vulgare (Day: 21 °C, 16 h, 55% humidity; Night: 21 °C, 8 h, 55% humidity).

2.2 Apoplastic Fluid Preparation

1. Infiltration Buffer: 5 mM sodium acetate, 0.2 M calcium chloride, pH 4.3 (see Note 12). Store at 4 °C.

2. Protease inhibitor cocktail, EDTA-free (Roche).

3. Flat-bottom glass beakers (300 mL and 200 mL).

4. A vacuum pump and a vacuum desiccator (see Note 2).

5. Paper towel.

6. Blunt-end needleless syringe (20 mL).

7. 50-mL low-polymer conical tubes (see Note 3).

8. 2-mL low-polymer microtubes (see Note 3).

2.3 Protein Purification by Chloroform–Methanol Precipitation

1. Ultrafree-CL Centrifugal Filter, 0.22 μm pore size (Merck Millipore).

2. 50-mL conical centrifuge tubes, compatible with chloroform (see Note 4).

3. 100% methanol (VWR Chemicals).

4. 99% chloroform (Merck Millipore).

5. Sterile Milli-Q water.

6. Vortex.

7. SpeedVac vacuum concentrator.

Leaf Secretome of Pathogen-Challenged Arabidopsis Thaliana 81

3 Methods

3.1 Plant Growth and Pathogen Inoculation

1. To sterilize the seed surface, mix a scoop of A. thaliana seeds (200–500 seeds) with 1 mL of 70% ethanol in a 1.5- or 2-mL microtube and shake or rotate for 10 min. Discard the ethanol and briefly rinse the seeds with 99.9% ethanol. Leave the tube on a bench with its lid open for 30–60 min to dry the seeds (see Note 5).

2. Sow a generous number of seeds (see Note 5) in 9 × 9 × 9.5 cm pots filled with greenhouse soil. Start plant cultivation in the growth chamber for A. thaliana (see Note 6).

3. Water the plants every 2–3 days. Avoid excessive watering to minimize algal growth on the soil surface.

4. After 2 weeks, remove the majority of seedlings so that each pot harbors 4–5 plants of similar size. In case of plant dwarfism (e.g., due to particular gene defects), this number can be increased up to 10–25 plants/pot (see Note 7).

5. Continue plant cultivation under the same conditions for another 2–3 weeks.

6. One week before pathogen inoculation, bulk up fresh conidiospores of Bgh isolate K1 on 7-day-old susceptible H. vulgare seedlings and incubate in a growth chamber for barley cultivation.

7. Inoculate Bgh conidiospores on A. thaliana plants by tapping the leaf blades of infected barley plants (see Note 8). A subset of plants is used as non-inoculated samples (0 hpi; hours post inoculation) and proceeds directly to preparation of apoplastic fluid.

8. Incubate inoculated A. thaliana plants under the same conditions for the desired time period.

3.2 Apoplastic Fluid Preparation

1. Collect entire shoots by cutting hypocotyls in a 300-mL glass beaker containing 100 mL of the Infiltration Buffer, freshly supplemented with two tablets of the Protease Inhibitor Cocktail (see Notes 9 and 10).

2. Submerge shoots in the Infiltration Buffer by placing an empty 200-mL glass beaker on top (Fig. 1a).

3. Apply vacuum for 10 min and release it gently. Releasing the vacuum should take longer than 10 min (see Note 11).

4. Remove excess buffer from the leaf surface by blot drying on a paper towel (Fig. 1b).

5. Introduce plant shoots into a 20-mL blunt-end needleless syringe after removal of its plunger.

6. Introduce the syringe into a 50-mL conical centrifuge tube (Fig. 1c).

7. Centrifuge at 1000 × g for 20 min at 4 °C. The apoplastic fluid will be collected at the bottom of the conical tube as a flow-through (Fig. 1c). Residual plant material can be stored in a separate microtube at −80 °C as “total fraction.”

3.3 Protein Purification by Chloroform–Methanol Precipitation

1. Apply the obtained apoplastic fluid to a centrifuge tube with a 0.22-μm filter column on ice and centrifuge at 12,000 × g for 2 min at 4 °C.

2. Transfer the flow-through to a 50-mL conical tube that is resistant to chloroform (see Note 4) on ice. Estimate the amount of flow-through using a pipette (see Note 13).

3. Add 4 volumes of 100% methanol on ice and mix well by vortexing.

4. Add 1 volume of 99% chloroform on ice and mix well by vortexing. The mixed solution can be stored at −20 °C for a few hours.

Fig. 1 Apoplastic fluid preparation. (a) Buffer infiltration. A. thaliana shoots are submerged into the Infiltration Buffer in a 300-mL flat-bottom beaker, and a 200-mL flat-bottom beaker is used as a weight during vacuum infiltration. (b) Shoots are collected from the beaker, carefully flattened, and blot-dried on a paper towel. (c) Shoots are introduced into a 20-mL blunt-end needleless syringe within a 50-mL conical tube and centrifuged at 1000 × g for 20 min. After centrifugation, the apoplastic fluid is collected in the space between the syringe and the surrounding conical tube


5. Add 3 volumes of sterile Milli-Q water on ice and mix well by vortexing.

6. Centrifuge at 15,000 × g for 2 min at 4 °C. Proteins will accumulate at the liquid interface.

7. Remove the top aqueous layer and add 4 volumes of 100% methanol on ice. Mix well by vortexing.

8. Centrifuge at 15,000 × g for 2 min at 4 °C.

9. Remove supernatant until the remaining volume is less than 2 mL.

10. Mix the residual supernatant and precipitate by pipetting and transfer to a new 2-mL microtube on ice.

11. Centrifuge at 15,000 × g for 10 min at 4 °C.

12. Carefully reduce the amount of supernatant as much as possible (see Note 14).

13. Dry the pellet in a vacuum concentrator (e.g., SpeedVac) (see Note 15).

14. Dissolve the pellet in an optimal buffer for protein digestion or protein gel electrophoresis for subsequent proteomic analysis (see Note 16). The pellet can be stored at −80 °C until subsequent analyses.
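The reagent volumes in steps 3–7 scale with the measured flow-through volume. The following is a minimal sketch of that arithmetic, assuming that "volume" in each step means a multiple of the original flow-through volume estimated in step 2; the function name and dictionary keys are illustrative and not part of the protocol.

```python
def precipitation_volumes(sample_ml: float) -> dict:
    """Return reagent volumes (mL) for chloroform-methanol precipitation.

    Assumption: every "volume" in steps 3-7 refers to the original
    flow-through volume (sample_ml) measured in step 2.
    """
    if sample_ml <= 0:
        raise ValueError("sample volume must be positive")
    methanol = 4 * sample_ml        # step 3: 4 volumes of 100% methanol
    chloroform = 1 * sample_ml      # step 4: 1 volume of 99% chloroform
    water = 3 * sample_ml           # step 5: 3 volumes of sterile Milli-Q water
    methanol_wash = 4 * sample_ml   # step 7: 4 volumes of 100% methanol
    return {
        "methanol_ml": methanol,
        "chloroform_ml": chloroform,
        "water_ml": water,
        "methanol_wash_ml": methanol_wash,
        # Total liquid in the tube at the first 15,000 x g spin (step 6).
        "total_before_spin_ml": sample_ml + methanol + chloroform + water,
    }


if __name__ == "__main__":
    # Typical yields reported in Note 13 are 0.5-2.0 mL of apoplastic fluid.
    for volume in (0.5, 2.0):
        print(volume, precipitation_volumes(volume))
```

Under this assumption, even the upper end of the yields in Note 13 (2.0 mL) gives a total of 18 mL before the first spin, which fits in a single 50-mL chloroform-resistant tube.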

4 Notes

1. Any soil suitable for growing A. thaliana and H. vulgare can be used. We normally use “Mini Tray” soils (EINHEITS ERDE) without additional fertilization.

2. Any conventional vacuum diaphragm pump and compatible desiccator that allow the vacuum to be released gently can be used. A glass vacuum desiccator of ~25-cm diameter is able to accommodate 5–6 samples at once (see Fig. 1a).

3. To obtain a high signal-to-noise ratio in the mass spectrometer, it is crucial to avoid polymer contaminations derived from plastics throughout the procedures. Toward this end, we recommend the use of Eppendorf 50-mL Conical Tubes and SARSTEDT SafeSeal 2-mL microtubes.

4. To avoid physical damage caused by chloroform to conical tubes made of polypropylene (PP), an alternative chloroform-tolerant material such as a fluorocarbon polymer, e.g., fluorinated ethylene propylene (FEP), should be used. We use Nalgene® Oak Ridge Centrifuge Tube FEP (Sigma-Aldrich).

5. Although surface-sterilized seeds can be stored for weeks up to months, we recommend freshly sterilized seeds for every use, as seed vigor and germination rate decrease rapidly during storage. In most cases stratification is not essential for germinating A. thaliana seeds but can be applied (e.g., for old seeds). The amount of seeds needed for an experiment depends on the growth and germination rate of each plant line to be tested. We typically sow at least five times more seeds than the necessary number of plants (e.g., 25–30 seeds to have five plants of similar size).

6. At least four to five technical replicates (i.e., independent protein samples) must be prepared. When multiple plant genotypes are included in an experiment, grow all genotypes in the same tray in a randomized design (Fig. 2). When multiple treatments are to be compared, the plants of all treatments within a technical replicate must be grown in the same plant growth chamber, ideally side-by-side on the same shelf.

7. For one protein sample, we normally use 16 5-week-old plants growing in four separate pots. This may be scaled up if plants are expected to be smaller, for instance due to genetic dwarfism or upon abiotic/biotic stress treatment.

8. Bgh conidiospores can be easily detached from infected barley plants by tapping the leaf blades directly above A. thaliana rosettes. Conidiospores will settle immediately on the surface of target leaves by gravity. A settling tower can be used for a more efficient and homogeneous conidiospore inoculation [20].

Fig. 2 Randomized design of plant cultivation. Pots need to be randomized when multiple genotypes are to be compared. A schematic example for six genotypes (genotypes A–F) in a single technical replicate (four pots per replicate) is shown


9. We avoid washing plant leaves before buffer infiltration in order to also collect proteins that are secreted to the leaf surface. This is relevant because Bgh is a leaf epiphyte and attacks exclusively leaf epidermal cells. When apoplastic fluid within the leaf organ is the primary target, a thorough washing step can be added before submerging the leaves in the Infiltration Buffer.

10. Samples to be directly compared within a technical replicate should be processed jointly at this step and throughout the following steps.

11. Ensure that all plants are submerged in the buffer. Vacuum must be released very slowly and gently; otherwise plant cells will collapse, which increases contamination with cytoplasmic proteins.

12. Buffer composition can be optimized depending on the target proteins. For instance, calcium chloride is required for extracting cell wall-associated proteins but can be omitted if these proteins are not of interest. Alternative pH values can also be used, but it should be noted that the plant extracellular space is usually around pH 5.0 or below [21]. It was reported that infiltration with sodium acetate buffer (pH 4.3) performs best in avoiding contamination with cytoplasmic proteins [18, 22].

13. The amount of flow-through depends on the number/size of the plants. From 16 5-week-old A. thaliana shoots, we typically obtain 0.5–2.0 mL of apoplastic fluid.

14. The protein precipitate after methanol–chloroform extraction is usually very fragile, and disruption of the precipitate decreases the final protein yield. To minimize the yield loss, we do not remove the whole amount of supernatant but rather decrease its amount by pipetting and use a SpeedVac concentrator to completely remove the residual liquid.

15. The time period required for complete drying of protein pellets depends mainly on the amount of residual liquid.

16. After untargeted proteomic analysis, we typically quantify thousands of protein groups, and 25–30% of them contain predicted N-terminal signal peptides [3]. The rest of the proteins may be delivered into the apoplast via noncanonical secretory pathway(s) or represent contaminations, for example, due to physical cell disruption. Whether proteins lacking an N-terminal signal peptide should be considered as true signals or be removed in silico prior to a deeper computational analysis depends on the specific hypotheses to be tested.
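The randomized pot arrangement recommended in Note 6 and Fig. 2 can be generated programmatically. The sketch below is one possible approach, assuming simple complete randomization of all pots within a tray; it is not necessarily the scheme used to draw Fig. 2, and the function name, the default of four pots per genotype (taken from Note 7), and the seed parameter are illustrative.

```python
import random
from collections import Counter


def randomized_tray(genotypes, pots_per_genotype=4, seed=None):
    """Return a shuffled list of pot labels for one technical replicate.

    Assumption: complete randomization of all pots in the tray.
    A fixed seed makes the layout reproducible for record keeping.
    """
    rng = random.Random(seed)
    # One label per pot, e.g. four pots for each genotype.
    pots = [g for g in genotypes for _ in range(pots_per_genotype)]
    rng.shuffle(pots)
    return pots


if __name__ == "__main__":
    layout = randomized_tray(list("ABCDEF"), pots_per_genotype=4, seed=1)
    # Print the 24 pots as a tray of 4 rows x 6 columns.
    for row_start in range(0, len(layout), 6):
        print(" ".join(layout[row_start:row_start + 6]))
    # Every genotype still occupies exactly four pots.
    print(Counter(layout))
```

Drawing a new layout (a different seed) for each technical replicate avoids confounding genotype with position in the growth chamber.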


Acknowledgments

This work was supported by Ministry of Education, Culture, Sports, Science, and Technology of Japan Grants-in-Aid for Scientific Research to T. Uemura (No. 15H04627), by the Asahi Glass Foundation to T. Uemura, by the Max Planck Society to P.S.-L. and H.N., and by the “Cluster of Excellence on Plant Sciences (CEPLAS)” program funded by the Deutsche Forschungsgemeinschaft (DFG) to P.S.-L.

References

1. Delaunois B, Jeandet P, Clement C et al (2014) Uncovering plant-pathogen crosstalk through apoplastic proteomic studies. Front Plant Sci 5:249

2. Uemura T, Nakano RT, Takagi J et al (2019) A Golgi-released subpopulation of the trans-Golgi network mediates protein secretion in Arabidopsis. Plant Physiol 179:519–532

3. Ruhe J, Agler MT, Placzek A et al (2016) Obligate biotroph pathogens of the genus Albugo are better adapted to active host defense compared to niche competitors. Front Plant Sci 7:820

4. Kusumawati L, Imin N, Djordjevic MA (2008) Characterization of the secretome of suspension cultures of Medicago species reveals proteins important for defense and development. J Proteome Res 7:4508–4520

5. Oh IS, Park AR, Bae MS et al (2005) Secretome analysis reveals an Arabidopsis lipase involved in defense against Alternaria brassicicola. Plant Cell 17:2832

6. Okushima Y, Koizumi N, Kusano T, Sano H (2000) Secreted proteins of tobacco cultured BY2 cells: identification of a new member of pathogenesis-related proteins. Plant Mol Biol 42:479–488

7. Cho WK, Chen XY, Chu H et al (2009) Proteomic analysis of the secretome of rice calli. Physiol Plant 135:331–341

8. Waghmare S, Lileikyte E, Karnik R et al (2018) SNAREs SYP121 and SYP122 mediate the secretion of distinct cargo subsets. Plant Physiol 178:1679–1688

9. Lohaus G, Pennewiss K, Sattelmacher B et al (2001) Is the infiltration-centrifugation technique appropriate for the isolation of apoplastic fluid? A critical evaluation with different plant species. Physiol Plant 111:457–465

10. Delaunois B, Colby T, Belloy N et al (2013) Large-scale proteomic analysis of the grapevine leaf apoplastic fluid reveals mainly stress-related proteins and cell wall modifying enzymes. BMC Plant Biol 1:24

11. Konozy EHE, Rogniaux H, Causse M, Faurobert M (2013) Proteomic analysis of tomato (Solanum lycopersicum) secretome. J Plant Res 126:251–266

12. Wen F, VanEtten HD, Tsaprailis G, Hawes MC (2007) Extracellular proteins in pea root tip and border cell exudates. Plant Physiol 143:773

13. Delannoy M, Alves G, Vertommen D et al (2008) Identification of peptidases in Nicotiana tabacum leaf intercellular fluid. Proteomics 8:2285–2298

14. Soares NC, Francisco R, Ricardo CP, Jackson PA (2007) Proteomics of ionically bound and soluble extracellular proteins in Medicago truncatula leaves. Proteomics 7:2070–2082

15. Witzel K, Shahzad M, Matros A et al (2011) Comparative evaluation of extraction methods for apoplastic proteins from maize leaves. Plant Methods 7(1):48

16. Agrawal GK, Jwa N-S, Lebrun M-H et al (2010) Plant secretome: unlocking secrets of the secreted proteins. Proteomics 10:799–827

17. Casasoli M, Spadoni S, Lilley KS et al (2008) Identification by 2-D DIGE of apoplastic proteins regulated by oligogalacturonides in Arabidopsis thaliana. Proteomics 8:1042–1054

18. Wang Y, Kim SG, Wu J et al (2014) Differential proteome and secretome analysis during rice-pathogen interaction. Methods Mol Biol 1072:563–572

19. Zhou F, Kurth J, Wei F et al (2001) Cell-autonomous expression of barley Mla1 confers race-specific resistance to the powdery mildew fungus via a Rar1-independent signaling pathway. Plant Cell 13(2):337–350

20. Adam L, Somerville SC (1996) Genetic characterization of five powdery mildew disease resistance loci in Arabidopsis thaliana. Plant J 9(3):341–356. https://doi.org/10.1046/j.1365-313X.1996.09030341.x

21. Barbez E, Dunser K, Gaidora A et al (2017) Auxin steers root cell expansion via apoplastic pH regulation in Arabidopsis thaliana. Proc Natl Acad Sci U S A 114(24):E4884–E4893. https://doi.org/10.1073/pnas.1613499114

22. Sehrawat A, Deswal R (2014) S-nitrosylation analysis in Brassica juncea apoplast highlights the importance of nitric oxide in cold-stress signaling. J Proteome Res 13(5):2599–2619. https://doi.org/10.1021/pr500082u


Chapter 7

Shotgun Proteomics of Plant Plasma Membrane and Microdomain Proteins Using Nano-LC-MS/MS

Daisuke Takahashi, Bin Li, Takato Nakayama, Yukio Kawamura, and Matsuo Uemura

Abstract

Shotgun proteomics allows for the comprehensive analysis of proteins extracted from plant cells, subcellular organelles, and membranes. Previously, two-dimensional gel electrophoresis-based proteomics was used for mass spectrometric analysis of plasma membrane proteins. However, this method is not fully applicable for highly hydrophobic proteins with multiple transmembrane domains. In order to solve this problem, we here describe a shotgun proteomics method using nano-LC-MS/MS for proteins in the plasma membrane and plasma membrane microdomain fractions. The results obtained are easily applicable to label-free protein semiquantification.

Key words Plasma membrane, Detergent-resistant membrane, Microdomain, Nano-LC-MS/MS, Shotgun proteomics, Label-free semiquantification, In-gel digestion, In-solution digestion

1 Introduction

Comprehensive protein identification involves solubilization and preseparation of proteins, digestion of proteins into peptides using trypsin, and separation and detection of each peptide with liquid chromatography–tandem mass spectrometry (LC-MS/MS) [1–4]. Compared with soluble proteins, the preparation steps for proteomics of cellular membranes, including the plasma membrane (PM), are difficult because many membrane proteins are highly hydrophobic and reside in highly hydrophobic lipid environments [5–7]. Although membrane proteomics has been performed by two-dimensional gel electrophoresis (2-DE)–based proteomics [5, 7, 8], PM proteins are particularly difficult to solubilize and 2-DE–based proteomics requires a large amount of valuable PM proteins [9]. In addition, microdomains, which are considered to exist as extremely hydrophobic compartments in the PM because of the enrichment of specific lipids and proteins, have attracted the attention of many researchers because of their involvement in important cellular processes such as signal transduction, membrane trafficking, and pathogen infection. However, microdomains, which are often isolated as detergent-resistant membrane (DRM) fractions, are even more intractable to comprehensive proteomics study due to their highly hydrophobic characteristics [10–15]. The hydrophobicity of the membranes has made it challenging for researchers to obtain a comprehensive view of their proteomic profiles.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_7, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Liquid chromatography and mass spectrometry technologies have advanced rapidly, providing higher resolution and reliable results for a huge number of peptides [16, 17]. In particular, nano-flow reverse phase liquid chromatography allows for the separation of proteins without preseparation using 1-DE or 2-DE [17]. Here, we describe PM and DRM protein preparation methods that are adapted to nano-LC-MS/MS-based shotgun proteomics using two sample preparation methods, “in-gel” and “in-solution” peptide digestion. We applied this method to the aerial parts of oat plants as an example, and the methods described are applicable to other plants such as rye [18], Arabidopsis [19], and Brachypodium distachyon (data not shown). In the “In-gel digestion protocol,” solubilized PM and DRM proteins are applied to a sodium dodecyl sulfate (SDS)–polyacrylamide gel to remove nonproteinaceous materials and subsequently subjected to tryptic digestion. In the “In-solution digestion protocol,” we use an MPEX PTS reagent kit (GL Sciences, Tokyo, Japan), which has been widely used for solubilization of proteins in mammalian and bacterial PMs, such as those of HeLa cells and Escherichia coli cells, respectively [20]. Using these methods, thousands of PM and DRM proteins have been consistently identified, including highly hydrophobic proteins with multiple transmembrane domains. Furthermore, these data can be used for protein quantification.

2 Materials

All solutions are prepared using ultrapure water (prepared by purifying deionized water to attain a resistivity of 18.2 MΩ cm at 25 °C) and analytical grade reagents. All reagents included in this chapter are prepared and stored at room temperature (unless indicated otherwise). Diligently follow all waste disposal regulations when disposing of waste materials.

2.1 Plasma Membrane Purification Components

Ultrapure water, the Polytron homogenizer, centrifuge, and ultracentrifuge rotors should be kept and used at 4 °C.

1. Homogenizing medium: 0.5 M sorbitol, 50 mM Mops-KOH (pH 7.6), 5 mM EGTA (pH 8.0), 5 mM EDTA (pH 8.0), 5% (w/v) polyvinylpyrrolidone-40 (molecular weight 40,000), 0.5% (w/v) bovine serum albumin (BSA), 2.5 mM phenylmethanesulfonyl fluoride (PMSF), 4 mM salicylhydroxamic acid (SHAM), 2.5 mM 1,4-dithiothreitol (DTT). Store at 4 °C (see Note 1).

2. Microsome (MS)-suspension medium: 10 mM KH2PO4/K2HPO4 (K-P) buffer (pH 7.8), 0.25 M sucrose. Store at 4 °C (see Note 2).

3. NaCl medium: Add 1.17 g of NaCl to 180 mL MS suspension medium and stir moderately using a stirring bar. Add MS suspension medium up to 200 mL with a graduated cylinder. Store at 4 °C (see Note 3).

4. Plasma membrane (PM)-suspension medium: 10 mM Mops-KOH (pH 7.3), 2 mM EGTA (pH 8.0), 0.25 M sucrose. Store at 4 °C (see Note 4).

5. Two-phase partition medium: Weigh 1.45 g of polyethylene glycol 3350 and 1.45 g of dextran in a 40 mL centrifuge tube. Add 9.3 mL MS suspension medium and 7.3 mL NaCl medium to the centrifuge tube and mix well by shaking. Incubate at 4 °C overnight to completely dissolve the polymers (see Note 5). Prepare three tubes per sample for repeating the two-phase partition in order to increase the purity of the PM fraction.

6. Bio-Rad Protein Assay Kit (Bio-Rad Laboratories, CA, USA): Store at 4 °C.

2.2 Detergent-Resistant Membrane Extraction Components

Ultracentrifuge rotors should be prechilled at 4 °C.

1. TED buffer: 50 mM Tris–HCl (pH 7.4), 3 mM EGTA (pH 8.0), 1 mM DTT. This buffer should be freshly prepared.

2. 10% (w/v) Triton X-100 buffer: Add 1 g of Triton X-100 to TED buffer and then adjust to 10 mL volume. Shake the Triton X-100 buffer with a shaker for 3 h to completely dissolve Triton X-100. This buffer should be freshly prepared.

3. 65%, 48%, 35%, 30%, and 5% (w/w) sucrose solutions (in TED buffer): Weigh 65 g, 24 g, 17.5 g, 15 g, and 2.5 g of sucrose and dissolve in 35 g, 26 g, 32.5 g, 35 g, and 47.5 g of TED buffer, respectively. These solutions should be freshly prepared.
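The sucrose and buffer masses in item 3 follow directly from the definition of a weight-per-weight percentage. The sketch below reproduces them, assuming a batch of 100 g for the 65% solution and 50 g for the others (these totals are implied by the listed masses); the function name is illustrative.

```python
def sucrose_recipe(percent_ww: float, total_g: float) -> tuple:
    """Return (sucrose_g, buffer_g) for a w/w sucrose solution.

    percent_ww is grams of sucrose per 100 g of final solution.
    """
    if not 0 <= percent_ww <= 100:
        raise ValueError("percentage must be between 0 and 100")
    sucrose_g = total_g * percent_ww / 100
    buffer_g = total_g - sucrose_g
    return sucrose_g, buffer_g


if __name__ == "__main__":
    # Batch sizes inferred from the masses listed in item 3.
    batches = [(65, 100), (48, 50), (35, 50), (30, 50), (5, 50)]
    for percent, total in batches:
        sucrose, buffer = sucrose_recipe(percent, total)
        print(f"{percent}% (w/w): {sucrose:g} g sucrose + {buffer:g} g TED buffer")
```

The same function can be used to scale the gradient solutions to a different number of samples by changing the batch mass.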

2.3 In-Gel Tryptic Digestion

2.3.1 SDS–Polyacrylamide Gel Components

1. Running gel solution: 1.5 M Tris–HCl, pH 8.8. Add approximately 900 mL water to a 1 L glass beaker. Add 181.7 g Tris and stir moderately using a stirring bar. After Tris is completely dissolved, adjust pH with HCl using a pH meter. Add water up to 1 L with a graduated cylinder (see Note 6).

2. SDS sample buffer (2×): 2% (w/v) SDS, 50 mM Tris–HCl (pH 6.8), 6% (v/v) β-mercaptoethanol, 10% (w/v) glycerol, 0.001% (w/v) bromophenol blue (BPB). Store at 4 °C for current use or at −30 °C for long-term storage (see Note 7).

3. TGS running buffer: 0.025 M Tris, 0.188 M glycine, 0.1% (w/v) SDS (see Note 8).

4. 30% (w/v) acrylamide solution: Weigh 29 g of acrylamide monomer and 1 g of methylenebisacrylamide. Add to 50 mL water in a 100 mL glass beaker and stir moderately using a stirring bar. After the solids have completely dissolved, add water up to 100 mL with a graduated cylinder and store at 4 °C, protected from light in a light-shielding bottle or by wrapping with aluminum foil (see Note 9).

5. 10% (w/v) SDS: Add 10 g of SDS to 50 mL water and stir moderately using a stirring bar. Add water up to 100 mL with a graduated cylinder.

6. 10% (w/v) ammonium persulfate: Add 1 g of ammonium persulfate to 8 mL water and stir moderately using a stirring bar. Add water up to 10 mL with a graduated cylinder. Store at 4 °C for current use or at −30 °C for long-term storage.

7. N,N,N′,N′-tetramethylethylenediamine (TEMED) (Wako Pure Chemical Industries, Tokyo, Japan). Store at 4 °C.

2.3.2 Tryptic Digestion Components

All of these processes should be carefully performed at a clean bench with gloves and a clean lab coat throughout to avoid contamination by keratin, dust, and other exogenous proteinaceous materials.

1. Fixation solution: Mix 50 mL of water, 40 mL of ethanol (99.5%, w/w) and 10 mL of acetic acid (100%, w/w).

2. 0.1 M ammonium bicarbonate: Weigh 3.95 g of ammonium bicarbonate and add to 400 mL of water in a glass beaker. Stir moderately using a stirring bar and add water up to 500 mL with a graduated cylinder. Store at 4 °C (see Note 10).

3. Acetonitrile (LC-MS grade) (Wako Pure Chemical Industries, Tokyo, Japan). Store at 4 °C (see Note 10).

4. 25 mM ammonium bicarbonate/50% (v/v) acetonitrile: Mix 50 mL of acetonitrile (LC-MS grade), 25 mL of 0.1 M ammonium bicarbonate, and 25 mL of water. Store at 4 °C (see Note 10).

5. Reduction buffer: Weigh 7.7 mg DTT and add to 5 mL of 0.1 M ammonium bicarbonate in a conical tube (see Note 11).

6. 55 mM iodoacetamide (IAA)/0.1 M ammonium bicarbonate: Weigh 51 mg IAA and add to 5 mL of 0.1 M ammonium bicarbonate in a conical tube (see Note 11).

7. Protease solution: Add 2 mL of 0.1 M ammonium bicarbonate into a vial containing 20 μg of trypsin (Sequence grade modified, Promega KK, Tokyo, Japan) and mix well. Store at −20 °C.

8. 5% (v/v) trifluoroacetic acid (TFA)/50% (v/v) acetonitrile: Mix 450 μL of water and 500 μL of acetonitrile in a 1.5 mL microtube. Quickly add 50 μL of TFA into the solution and mix well (see Note 11).

9. 0.1% (v/v) TFA: Quickly add 1 μL of TFA into 999 μL of water and mix well (see Note 11).
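The weighed amounts in the recipes above can be cross-checked against the stated molarities. The sketch below does this with standard molar masses, which are textbook values and are not given in this chapter; the function names and dictionary keys are illustrative. Note that the chapter does not state the DTT concentration of the reduction buffer, so the roughly 10 mM printed by the script is only what the listed mass and volume imply.

```python
# Standard molar masses in g/mol (assumed reference values, not from the protocol).
MOLAR_MASS = {
    "Tris": 121.14,
    "NH4HCO3": 79.06,   # ammonium bicarbonate
    "DTT": 154.25,
    "IAA": 184.96,      # iodoacetamide
}


def mass_g(compound: str, molar: float, volume_l: float) -> float:
    """Mass (g) needed for a solution of given molarity (mol/L) and volume (L)."""
    return MOLAR_MASS[compound] * molar * volume_l


def molarity(compound: str, grams: float, volume_l: float) -> float:
    """Molarity (mol/L) obtained by dissolving `grams` in `volume_l` litres."""
    return grams / MOLAR_MASS[compound] / volume_l


if __name__ == "__main__":
    print(f"1.5 M Tris, 1 L:          {mass_g('Tris', 1.5, 1.0):.1f} g")
    print(f"0.1 M NH4HCO3, 500 mL:    {mass_g('NH4HCO3', 0.1, 0.5):.2f} g")
    print(f"55 mM IAA, 5 mL:          {mass_g('IAA', 0.055, 0.005) * 1000:.1f} mg")
    print(f"7.7 mg DTT in 5 mL gives: {molarity('DTT', 0.0077, 0.005) * 1000:.1f} mM")
```

The computed masses agree with the 181.7 g of Tris, 3.95 g of ammonium bicarbonate, and 51 mg of IAA listed in the recipes.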

2.4 In-Solution Tryptic Digestion

In order to avoid keratin, dust, and other exogenous proteinaceous materials, all of these processes must be carefully performed at a clean bench with gloves and a clean lab coat throughout the in-solution tryptic digestion.

1. MPEX PTS reagents kit (GL Sciences, Tokyo, Japan): Make solution B, DTT solution, IAA solution, and trypsin solution according to the manufacturer’s instruction manual. Only solution B should be stored at 4 °C. Prepare fresh DTT solution, IAA solution, and trypsin solution immediately before use.

2. 5% (v/v) acetonitrile/0.1% (v/v) TFA: Add 50 μL of acetonitrile and 1 μL of TFA into 949 μL of water and mix well (see Note 11).

3. Pierce BCA protein assay kit (Thermo Fisher Scientific, MA, USA): Store at room temperature.

2.5 Peptide Purification Components

1. SPE C-TIP T-300 (Nikkyo Technos Co., Ltd., Tokyo, Japan).

2. 1.5 mL microtubes: Make a hole of 3 mm in diameter in the cap with a soldering iron. Prepare two tubes per sample (two microtubes with a hole in the lid per sample preparation will be used in two steps of the peptide purification, Subheading 3.4, steps 1 and 6, below).

3. Solution A: Add 800 μL of acetonitrile and 5 μL of TFA into 195 μL of water and mix well (see Note 11).

4. Solution B: Add 40 μL of acetonitrile and 5 μL of TFA into 955 μL of water and mix well (see Note 11).

5. 0.1% (v/v) TFA: Quickly add 1 μL of TFA into 999 μL of water and mix well (see Note 11).
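The recipes for Solutions A and B give pipetting volumes but not the resulting percentages. The sketch below converts component volumes into nominal volume-per-volume percentages, assuming that volumes are additive on mixing (an approximation for acetonitrile–water mixtures); the function name and dictionary keys are illustrative.

```python
def percent_vv(components_ul: dict) -> dict:
    """Return the nominal % (v/v) of each component in a mixture.

    Assumption: volumes are additive, so the total is the simple sum.
    """
    total = sum(components_ul.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return {name: 100 * vol / total for name, vol in components_ul.items()}


if __name__ == "__main__":
    solution_a = {"acetonitrile": 800, "TFA": 5, "water": 195}
    solution_b = {"acetonitrile": 40, "TFA": 5, "water": 955}
    for label, recipe in (("Solution A", solution_a), ("Solution B", solution_b)):
        composition = percent_vv(recipe)
        print(label, {k: round(v, 2) for k, v in composition.items()})
```

Under this assumption, Solution A is nominally 80% acetonitrile with 0.5% TFA and Solution B is 4% acetonitrile with 0.5% TFA.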

3 Methods

Wear gloves and a clean lab coat throughout the processes to avoid contamination by keratin, dust, and other exogenous proteinaceous materials. It is preferable to use low protein absorption microtubes at all stages.


3.1 Plasma Membrane Purification

Perform all steps on crushed ice (unless indicated otherwise). A schematic outline of the procedure is shown in Fig. 1.

1. Cut off the aerial parts of oat seedlings and weigh the samples (10–70 g in fresh weight is suitable for the plasma membrane purification). Put the harvested plants in a plastic container and wash with 500 mL of chilled water. Wash twice. Drain on a paper towel and put on crushed ice.

2. Cut into small pieces with razor blades. Immediately put into four volumes of chilled homogenizing medium and mix well with a spatula. The homogenizing medium containing the samples should be cooled again on crushed ice (see Note 12).

3. Homogenize with a chilled Polytron generator (PT10SK, Kinematica Inc., Lucerne, Switzerland) until the samples are broken down into tiny pieces (speed 6 for 60–90 s). Filter the homogenates through four layers of gauze and squeeze tightly. Put the filtrates into 40 mL centrifuge tubes and balance them in pairs. Centrifuge at 10,000 × g for 15 min with a chilled rotor to remove debris and heavy membrane fractions (see Note 13).

Fig. 1 Schematic outline of the plasma membrane preparation. All steps from harvesting leaves to suspending the purified plasma membrane fractions are described

4. Transfer the supernatants into ultracentrifuge tubes by decantation. Centrifuge at 231,000 × g for 50 min with a chilled ultracentrifuge rotor to precipitate the microsome fractions. Discard supernatants by decantation.

5. Add an aliquot of MS suspension medium (approximately 0.5–1.0 mL) to each tube and homogenize the pellets with a Teflon–glass homogenizer. Collect the microsomal suspensions with a large Pasteur pipette into ultracentrifuge tubes. Balance ultracentrifuge tubes in pairs with MS suspension medium.

6. Ultracentrifuge at 231,000 × g for 50 min as described in step 4. Put 5 mL of MS suspension medium in a Teflon–glass homogenizer and mark the liquid surface on the glass homogenizer as an indication of 5 mL volume. After centrifugation, discard the supernatant with an aspirator.

7. Add 2 mL of MS suspension medium and break up the precipitated pellet with a glass rod. Transfer into a Teflon–glass homogenizer using a large Pasteur pipette. Put another 2 mL of MS suspension medium into the tube and pipet up and down to break up the remaining pellet. Transfer into the Teflon–glass homogenizer and add MS suspension medium to 5 mL. Homogenize well with an electric Teflon–glass homogenizer (moving up and down five times) with cooling on ice (see Note 14).

8. Put all of the homogenized sample in a centrifuge tube containing two-phase partition medium (tube A). Put 5 mL of MS suspension medium into each of two other two-phase partition systems (tubes B and C). Chill on crushed ice for 10 min. During this time, mix well every 2 min.

9. Centrifuge tubes A and B at 650 × g for 5 min in a chilled rotor. Two phases should be observed to have settled in tubes A and B. Discard the upper phase of tube B with a Pasteur pipette and transfer the upper phase of tube A into tube B. Chill on crushed ice for 10 min. During this time, mix well every 2 min (see Note 15).

10. Centrifuge tubes B and C at 650 × g for 5 min in a chilled rotor. Discard the upper phase of tube C with a Pasteur pipette and transfer the upper phase of tube B into tube C. Balance tube C with another centrifuge tube filled with water. Chill on crushed ice for 10 min. During this time, mix well every 2 min (see Note 15).


11. Centrifuge at 650 × g for 5 min and split the resultant upper phase of tube C into two ultracentrifuge tubes. Fill up the tubes with PM suspension medium and balance them. Ultracentrifuge at 231,000 × g for 50 min, as described in step 4 (see Note 15).

12. Discard the supernatant with an aspirator. Add an appropriate quantity of PM suspension medium to each tube and homogenize the pellets with a Teflon–glass homogenizer. Collect the plasma membrane suspensions with a large Pasteur pipette into ultracentrifuge tubes. Balance the ultracentrifuge tubes in pairs with PM suspension medium. Ultracentrifuge again at 231,000 × g for 35 min.

13. Discard the supernatant with an aspirator. Add a minimal quantity of PM suspension medium to the plasma membrane pellets. Break up the pellets with a glass rod. Transfer into a Teflon–glass homogenizer and homogenize well using an electric Teflon–glass homogenizer (moving up and down five times) with cooling on ice. Transfer into a 1.5 mL microtube.

14. Measure the protein content using the Bradford assay (Bio-Rad Protein Assay Kit). Use 10 μg of protein for tryptic digestion and LC-MS/MS analysis. The remaining PM fractions should be frozen in liquid nitrogen immediately and stored at −80 °C.
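Because step 14 specifies an amount of protein (10 μg) rather than a volume, the volume to pipette depends on the Bradford result. The following is a minimal sketch of that conversion; the function name and the example concentration are illustrative assumptions, not values from the protocol.

```python
def volume_for_protein_ul(target_ug: float, conc_ug_per_ul: float) -> float:
    """Return the volume (in microlitres) that contains target_ug of protein."""
    if conc_ug_per_ul <= 0:
        raise ValueError("protein concentration must be positive")
    return target_ug / conc_ug_per_ul


# Hypothetical Bradford result of 2.5 ug/uL (assumed for illustration only).
example_volume_ul = volume_for_protein_ul(10.0, 2.5)
print(f"Pipette {example_volume_ul:.1f} uL for 10 ug of protein")
```

The same helper applies to the 5 μg and 100 μg amounts used in later sections once the concentration of the fraction is known.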

3.2 Detergent-Resistant Membrane Extraction

Perform all steps on crushed ice (unless indicated otherwise).

1. Prepare PM with approximately 2.5 mg protein and dilute with PM suspension medium in an ultracentrifuge tube. After balancing the tubes in pairs, ultracentrifuge at 231,000 × g for 35 min (see Note 16).

2. Add 2000 μL of PM suspension medium to the ultracentrifuge tube and break up the pellets with a glass rod. Transfer into a Teflon–glass homogenizer and homogenize well using an electric homogenizer (moving up and down five times). Measure the protein content by Bradford assay and place PM samples with 2 mg of protein into a 35 mL swing rotor tube. Adjust the volume to 2.7 mL by adding PM suspension medium.

3. Add 300 μL of 10% (w/v) Triton X-100 buffer and mix well (at this point, the protein–detergent ratio is 1:15). Incubate for 30 min.

4. Add 12 mL of 65% (w/w) sucrose solution and mix well (at this point, the final concentration of sucrose is 52%). Slowly overlay 5 mL each of 48%, 35%, 30%, and 5% (w/w) sucrose solution, in that sequence (see Note 17).

5. Balance the swing rotor tubes in pairs by adding 5% (w/w) sucrose solution and ultracentrifuge in a swing rotor at 141,000 × g for 20 h.

6. DRMs will be visible as a white layer at the interface of the 35%/48% (w/w) sucrose solutions. Recover the white layer and place it in an ultracentrifuge tube. Dilute with TED buffer and balance the tubes in pairs. Ultracentrifuge at 231,000 × g for 35 min (see Note 18).

7. Discard the supernatant. Add an appropriate quantity of PM suspension medium to each tube and homogenize the pellets with a Teflon–glass homogenizer. After balancing in pairs, ultracentrifuge at 231,000 × g for 35 min.

8. Discard the supernatant with an aspirator. Add a minimal quantity of PM suspension medium to the sample tube. Break up the pellets with a glass rod, transfer into a Teflon–glass homogenizer, and homogenize well using an electric Teflon–glass homogenizer (moving up and down five times). Transfer into a 1.5 mL microtube. The DRM fraction should be frozen in liquid nitrogen immediately and stored at −80 °C.
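The ratios quoted in steps 3 and 4 (protein–detergent 1:15 and 52% sucrose) can be checked with simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the sucrose calculation uses a volume-weighted average that ignores density differences between the (w/w) solutions.

```python
def solute_mg(volume_ul: float, percent_wv: float) -> float:
    """Mass (mg) of solute in volume_ul of a percent (w/v) solution.

    1% (w/v) = 1 g per 100 mL = 0.01 mg per microlitre.
    """
    return volume_ul * percent_wv / 100.0


def mixed_percent(vol_a_ml: float, pct_a: float,
                  vol_b_ml: float, pct_b: float = 0.0) -> float:
    """Volume-weighted percentage after mixing two solutions (approximation)."""
    return (vol_a_ml * pct_a + vol_b_ml * pct_b) / (vol_a_ml + vol_b_ml)


# Step 3: 300 uL of 10% (w/v) Triton X-100 added to 2 mg of PM protein.
triton_mg = solute_mg(300, 10)
detergent_per_protein = triton_mg / 2.0

# Step 4: 12 mL of 65% sucrose mixed with the 3.0 mL sample (2.7 mL + 0.3 mL).
sucrose_percent = mixed_percent(12.0, 65.0, 3.0)

# Four 5 mL overlay layers on top of the 15 mL bottom layer.
tube_volume_ml = 12.0 + 3.0 + 4 * 5.0

print(detergent_per_protein, sucrose_percent, tube_volume_ml)
```

Under these assumptions the detergent mass is 15 times the protein mass, the bottom layer is 52% sucrose, and the layers together fill the 35 mL swing rotor tube, which agrees with the values stated in the steps.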

3.3 In-Gel Tryptic Digestion

3.3.1 SDS–Polyacrylamide Gel Electrophoresis

1. Mix 2.5 mL running gel solution, 3.35 mL 30% (w/v) acrylamide solution, and 3.95 mL water in a conical flask. Degas with a vacuum pump for 5 min. Add 100 μL of 10% (w/v) ammonium persulfate, 100 μL of 10% (w/v) SDS, and 5 μL of TEMED. Immediately cast the gel into a 90 mm (W) × 83 mm (H) × 1 mm (T) gel cassette. Insert a 14-well comb without introducing air bubbles. Incubate at room temperature for 1 h (see Note 19).

2. Mix 5 μg of PM or DRM protein sample (within 10 μL) with an equal volume of SDS sample buffer. Vortex and centrifuge the tubes briefly. Heat at 95 °C for 5 min. Centrifuge and cool to room temperature.

3. Wash out the wells by pipetting up and down. Slowly load the samples onto the gel. Electrophorese at 100 V until the upper edge of the sample dye band has migrated 2 mm from the well (see Note 20).

4. Pry the gel plates open with a knife. On a glass plate, cut out the gel slice from the well to 2 mm in front of the BPB dye with a scalpel. Cut the gel slice into four equal pieces (Fig. 2) and put them into 1.5 mL microtubes (see Note 21).

5. Add 200 μL of fixation solution and agitate for 10 min. Centrifuge briefly and discard the supernatant. Repeat these steps twice.
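As a quick check of the gel recipe in step 1, the final acrylamide concentration can be worked out from the listed volumes. This is a sketch under stated assumptions: the dictionary keys are labels chosen here, and the SDS figure ignores any SDS that the running gel solution itself may already contain.

```python
# Volumes from step 1, in millilitres.
components_ml = {
    "running gel solution": 2.5,
    "30% (w/v) acrylamide": 3.35,
    "water": 3.95,
    "10% (w/v) ammonium persulfate": 0.100,
    "10% (w/v) SDS": 0.100,
    "TEMED": 0.005,
}

total_ml = sum(components_ml.values())


def final_percent(stock_percent: float, stock_ml: float, total: float) -> float:
    """Final percent (w/v) of a component after dilution into the gel mix."""
    return stock_percent * stock_ml / total


acrylamide_percent = final_percent(30.0, components_ml["30% (w/v) acrylamide"], total_ml)
added_sds_percent = final_percent(10.0, components_ml["10% (w/v) SDS"], total_ml)

print(f"total {total_ml:.3f} mL, acrylamide {acrylamide_percent:.2f}%, "
      f"added SDS {added_sds_percent:.3f}%")
```

The recipe therefore yields roughly 10 mL of an approximately 10% acrylamide gel.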

3.3.2 In-Gel Tryptic Digestion for Nano-LC-MS/MS

All of these procedures should be performed at room temperature (unless otherwise specified).

1. Add 200 μL of water and agitate for 10 min. Centrifuge briefly and discard the supernatant.

2. Add 400 μL of 25 mM ammonium bicarbonate/50% (v/v) acetonitrile and agitate for 10 min. Centrifuge briefly and discard the supernatant.

3. Add 200 μL of acetonitrile and incubate at room temperature for 5 min. Centrifuge briefly and discard the supernatant (see Note 22).

4. Add 100 μL of 0.1 M ammonium bicarbonate and centrifuge briefly. Incubate at room temperature for 5 min (see Note 23).

5. Add 100 μL of acetonitrile and centrifuge briefly. Incubate at room temperature for 15 min. Centrifuge briefly and discard the supernatant (see Note 24).

6. Dry the gel samples using a centrifugal concentrator for 45 min (see Note 25).

7. Add 100 μL of reduction buffer and centrifuge briefly. Incubate at 56 °C for 45 min. Discard the supernatant.

8. Add 100 μL of 55 mM IAA/0.1 M ammonium bicarbonate and centrifuge briefly. Incubate in the dark at room temperature for 30 min. Discard the supernatant.

Fig. 2 Excision of a protein band from the gel. Samples are loaded in separated wells to avoid cross-contamination during loading and electrophoresis. After the BPB dye has migrated about 2 mm into the gel, a 4 mm gel piece centered on the BPB dye band is cut out. Subsequently, the gel piece is cut into four equal pieces. Each gel piece is put into a separate 1.5 mL microtube

9. Add 200 μL of water and agitate for 10 min. Centrifuge briefly and discard the supernatant.

10. Add 400 μL of 25 mM ammonium bicarbonate/50% (v/v) acetonitrile and agitate for 10 min. Centrifuge briefly and discard the supernatant.

11. Add 200 μL of acetonitrile and incubate at room temperature for 5 min. Centrifuge briefly and discard the supernatant (see Note 22).

12. Add 100 μL of 0.1 M ammonium bicarbonate and centrifuge briefly. Incubate at room temperature for 5 min (see Note 23).

13. Add 100 μL of acetonitrile and centrifuge briefly. Incubate at room temperature for 15 min. Centrifuge briefly and discard the supernatant (see Note 24).

14. Dry the gel samples using a centrifugal concentrator for 45 min (see Note 25).

15. Cool to room temperature and put on ice. Add 25 μL of protease solution to each tube and centrifuge the tubes briefly. Incubate on ice for 45 min (see Note 23).

16. Discard the supernatant and add 100 μL of 0.1 M ammonium bicarbonate. Centrifuge the tubes briefly and incubate at 37 °C for 20 h.

17. Agitate for 15 min and add 100 μL of acetonitrile. Agitate for 15 min and collect the supernatant (see Note 24).

18. Add 5% (v/v) TFA/50% (v/v) acetonitrile and agitate for 15 min. Centrifuge the tubes briefly. Collect the supernatant in tubes, as described in step 17 (see Note 24).

19. Dry the collected supernatant using a centrifugal concentrator for 1 h. Add 30 μL of 0.1% TFA. Store at −30 °C (see Note 26).

3.4 In-Solution Tryptic Digestion

All of these procedures should be performed at room temperature (unless otherwise specified).

1. Precipitate 100 μg of PM or DRM protein using an ultracentrifuge (231,000 × g, 4 °C, 50 min).

2. Add solution B to the centrifuge tubes and homogenize with a Teflon–glass homogenizer. Transfer to 1.5 mL microtubes.

3. Solubilize the samples and measure the protein concentration with a Pierce BCA protein assay kit according to the manufacturer's instruction manual.

4. Transfer 5 μg of PM protein to another 1.5 mL microtube. Make up to 20 μL with solution A.

5. Perform reductive alkylation and tryptic digestion according to the instruction manual, and store at −30 °C (see Note 26).

3.5 Peptide Purification

All of these procedures should be performed at a clean bench whenever possible and at room temperature (unless otherwise specified).

1. Insert a SPE C-TIP into the 3 mm hole in the microtube top (Fig. 3).

2. Add 30 μL of solution A to the upper side of the SPE C-TIP for preconditioning. Centrifuge at 1000 × g for 30 s to pass solution A through the tip column.

3. Add 30 μL of solution B to the upper side of the SPE C-TIP for preconditioning. Centrifuge at 1000 × g for 30 s to pass solution B through the tip column.

4. After confirming that the column is moist, add the entire trypsin-digested peptide sample to the upper side of the SPE C-TIP for adsorption onto the column. Centrifuge at 1000 × g for 30 s to pass the sample solution through the tip column.

5. Add 30 μL of solution B to the upper side of the SPE C-TIP for cleaning. Centrifuge at 1000 × g for 30 s to pass solution B through the tip column.

6. Put a vial insert for the LC-MS/MS sampler into another holed microtube. Transfer the SPE C-TIP into this microtube.

Fig. 3 Peptide purification assembly. A C-TIP is inserted in a hole made with a soldering iron in a 1.5 mL microtube cap. Solution A, solution B, and the sample solution are put into the C-TIP, in that order, and the assembly is centrifuged after each addition. Each solution passes through a C18 column, and peptides are adsorbed on, or eluted from, the C-TIP

7. Add 30 μL of solution A to the upper side of the SPE C-TIP for elution. Centrifuge at 1000 × g for 30 s to pass solution A through the tip column. Discard the SPE C-TIP.

8. Dry the eluted samples using a centrifugal concentrator for 15 min. Add 15 μL of 0.1% (v/v) TFA. Put the vial insert into the vial and close the lid. Store at −30 °C (see Note 23).

3.6 Nano-LC-MS/MS Analysis

Separate the digested and purified peptide solutions on a C18 column by nano-flow LC. Run a linear gradient of acetonitrile (from 5% [v/v] to 45% [v/v]) at a flow rate of 500 nL/min for 100 min. Detect and analyze the separated and ionized peptides in a mass spectrometer. Examples of results obtained using 5 μg of oat PM and DRM proteins are shown in Figs. 4, 5, and 6. Details are given in the figure legends.
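The linear gradient described above can be written as a simple function of run time. The sketch below is an illustration only; the function names are hypothetical and the defaults are taken from the values in the text (5% to 45% acetonitrile over 100 min at 500 nL/min).

```python
def acetonitrile_percent(t_min: float, start: float = 5.0,
                         end: float = 45.0, duration: float = 100.0) -> float:
    """Acetonitrile percentage (v/v) at time t_min of a linear gradient."""
    if not 0.0 <= t_min <= duration:
        raise ValueError("time is outside the gradient")
    return start + (end - start) * t_min / duration


def volume_delivered_ul(flow_nl_per_min: float = 500.0,
                        duration_min: float = 100.0) -> float:
    """Total mobile-phase volume (uL) delivered over the gradient."""
    return flow_nl_per_min * duration_min / 1000.0


for t in (0, 25, 50, 75, 100):
    print(t, acetonitrile_percent(t))
print("total volume (uL):", volume_delivered_ul())
```

With these values the gradient rises by 0.4% acetonitrile per minute and consumes 50 μL of mobile phase per run.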

4 Notes

1. Mops-KOH (pH 7.6), EGTA (pH 8.0), and EDTA (pH 8.0) should be prepared as 0.5 M stock solutions and stored at 4 °C. The pH of EGTA and EDTA should be adjusted using NaOH. Before BSA is dissolved, the BSA powder should be brought to room temperature. PMSF and SHAM should be prepared as 1 M and 1.6 M stock solutions in DMSO, respectively, and stored at 4 °C. DTT should be stored at −30 °C as a 1 M stock solution. PMSF, SHAM, and DTT should be diluted only as needed, just before use. If you prepare an Arabidopsis PM fraction, the homogenizing medium should consist of 0.5 M sorbitol, 50 mM Mops-KOH (pH 7.6), 5 mM EGTA (pH 8.0), 5 mM EDTA (pH 8.0), 1.5% (w/v) polyvinylpyrrolidone-40 (molecular weight 40,000), 0.5% (w/v) BSA, 2 mM PMSF, 4 mM SHAM, and 2.5 mM DTT.

2. KH2PO4/K2HPO4 (K-P) buffer (pH 7.8) should be prepared as a 0.5 M stock solution and diluted to make the MS suspension medium. First, 200 mL of 0.5 M K2HPO4 and 30 mL of 0.5 M KH2PO4 are prepared. The pH of the 0.5 M K2HPO4 is adjusted to 7.8 by adding 0.5 M KH2PO4 while monitoring with a pH meter. If you prepare an Arabidopsis PM fraction, the MS suspension medium should contain 10 mM KH2PO4/K2HPO4 (K-P) buffer (pH 7.8) and 0.3 M sucrose.

3. If you prepare an Arabidopsis PM fraction, the final concentration of NaCl should be adjusted to 100 mM in the MS suspension medium.

4. Mops-KOH (pH 7.3) and EGTA (pH 8.0) should be prepared as 0.5 M stock solutions and stored at 4 °C. If you prepare an Arabidopsis PM fraction, the PM suspension medium consists of 10 mM Mops-KOH (pH 7.3), 1 mM EGTA (pH 8.0), and 0.3 M sucrose.

5. For Arabidopsis PM preparation, weigh 1.4 g of polyethylene glycol 3350 and 1.4 g of dextran, and add them to 9.4 mL MS suspension medium and 7.3 mL NaCl medium in a 40 mL centrifuge tube.

6. When the pH of the Tris buffer is adjusted, the buffer solution should be at room temperature, because the pH of Tris is affected by the temperature of the solution. Addition of HCl raises the temperature through the heat of neutralization and dilution. To avoid a temperature increase, add the HCl slowly and intermittently.

Fig. 4 The number of transmembrane domains in oat PM proteins separated and identified following in-gel or in-solution tryptic digestion. Peptide sequences were searched against the NCBI database (version 20120216, comprising 17,282,984 sequences), taxonomy Viridiplantae. Transmembrane domains were estimated by the SOSUI engine ver. 1.10 (http://bp.nuap.nagoya-u.ac.jp/sosui/). (a) Proteins with up to 24 transmembrane domains were identified in oat PM by in-gel digestion in four biological replicates. On average, 700 proteins with transmembrane domains were identified in the four replicates. (b) Proteins with up to 24 transmembrane domains were identified in oat PM by in-solution digestion in four biological replicates. On average, 397 proteins with transmembrane domains were identified in the four replicates

Fig. 5 Distribution of the number of transmembrane domains in oat DRM proteins identified using the “In-Solution Digestion” protocol. Peptides were analyzed as described in Fig. 4. Compared to PM (Fig. 4b), more transmembrane proteins, especially those containing more than five transmembrane domains, could be identified in DRM

Fig. 6 Estimated number of proteins with secretory signal peptides in oat PM. 173 and 81 proteins with signal peptides were identified from four replicates of in-gel and in-solution tryptic digests of 5 μg oat PM proteins, respectively

7. β-mercaptoethanol is a reducing agent and should be added to the sample buffer just before use.

8. TGS buffer is normally made as a 10× stock solution. First, make 1 L of 10× TGS buffer consisting of 30.3 g Tris, 141.4 g glycine, and 10 g SDS. Just before use, dilute 100 mL of 10× TGS buffer with 900 mL of water.

9. Unpolymerized acrylamide is neurotoxic. Acrylamide powder requires careful handling. Wear gloves, a clean lab coat, and a mask, and pay attention to people around you when weighing acrylamide. Store at 4 °C. Add polymerization agent before discarding any spare acrylamide solution.

10. These solutions should be dispensed in small volumes and sealed with Parafilm to prevent contamination and evaporation. Store at 4 °C and use within 1 month of preparation.

11. DTT and IAA are readily modified in solution within a short period, and TFA evaporates quickly. Solutions containing DTT, IAA, or TFA should be freshly prepared immediately before use.

12. For Arabidopsis PM preparation, plants must be put in homogenizing medium directly and immediately after harvest and washing. Subsequently, plants should be cut with clean scissors in the medium.

13. For Arabidopsis PM preparation, the homogenates should be centrifuged at 5000 × g for 10 min.

14. In this step, homogenization should not be too long or too vigorous because harsh homogenization can severely disrupt membrane integrity.

15. Two-phase partitioning is the most important step for preparing highly purified PM. When the upper phase of the two-phase partition medium is removed, the Pasteur pipette should be moved from left to right near the boundary of the two phases to avoid taking up the lower phase. For Arabidopsis PM preparation, the two phases should be centrifuged at 440 × g for 5 min.

16. The yield of the PM preparation is expected to be 2.5 mg protein from 70 to 100 g (FW) of oat leaves.

17. One of the keys to making a good step gradient with sucrose solutions is pouring the solution slowly along the inner wall of the tube.

18. In this step, the upper portions of the white band should be discarded first and then the DRM layer should be collected carefully.

19. All parts of the gel cassette should be wiped with ethanol or acetone on cleaning tissue to prevent contamination with other proteins, including keratin.

20. Leave empty wells between samples to prevent cross-contamination during loading and electrophoresis. Set the power supply to constant-voltage mode.

21. Be careful not to include bands from the adjacent sample. Illuminate the glass plate from below with a fluorescent lamp to see the gels easily.

22. At this stage, dehydrated, compressed, and completely bleached gels should be observed. If the gels do not change, repeat this step twice.

23. At this stage, rehydrated and swollen gels should be observed. If the gels do not change, repeat this step twice.

24. Gels are sometimes only partly bleached, but this is acceptable.

25. At this stage, dehydrated and compressed gels are easily lost through static electricity. Take extra care.

26. Digested and purified peptides should be analyzed by nano-LC-MS/MS within 1 week.
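The 10× TGS recipe in Note 8 can be converted into working (1×) concentrations. The sketch below is a worked example; the molecular weights (Tris base 121.14 g/mol, glycine 75.07 g/mol) are standard values that are not stated in the protocol, and the function names are hypothetical.

```python
TRIS_MW = 121.14     # g/mol, Tris base (assumed form)
GLYCINE_MW = 75.07   # g/mol


def working_mm(grams_per_litre: float, mol_weight: float, dilution: float = 10.0) -> float:
    """Millimolar concentration after diluting a stock by the given factor."""
    return grams_per_litre / mol_weight * 1000.0 / dilution


def working_percent_wv(grams_per_litre: float, dilution: float = 10.0) -> float:
    """Percent (w/v) after dilution; 10 g/L corresponds to 1% (w/v)."""
    return grams_per_litre / 10.0 / dilution


tris_mm = working_mm(30.3, TRIS_MW)
glycine_mm = working_mm(141.4, GLYCINE_MW)
sds_percent = working_percent_wv(10.0)

print(f"1x TGS: {tris_mm:.1f} mM Tris, {glycine_mm:.1f} mM glycine, {sds_percent:.1f}% SDS")
```

Under these assumptions the 1× running buffer contains about 25 mM Tris, 188 mM glycine, and 0.1% (w/v) SDS.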

Acknowledgments

This work was supported in part by Grants-in-Aid for Scientific Research (#22120003 and #24370018) from MEXT, Japan to Y.K. and M.U.

References

1. Aebersold R, Mann M (2003) Mass spectrometry-based proteomics. Nature 422:198–207

2. Domon B, Aebersold R (2006) Mass spectrometry and protein analysis. Science 312:212–217

3. Kersten B, Burkle L, Kuhn E-J et al (2002) Large-scale plant proteomics. Plant Mol Biol 48:133–141

4. van Wijk KJ (2001) Challenges and prospects of plant proteomics. Plant Physiol 126:501–508

5. Santoni V, Kieffer S, Desclaux D et al (2000) Membrane proteomics: use of additive main effects with multiplicative interaction model to classify plasma membrane proteins according to their solubility and electrophoretic properties. Electrophoresis 21:3329–3344

6. Luche S, Santoni V, Rabilloud T (2003) Evaluation of nonionic and zwitterionic detergents as membrane protein solubilizers in two-dimensional electrophoresis. Proteomics 3:249–253

7. Gorg A, Weiss W, Dunn M-J (2004) Current two-dimensional electrophoresis technology for proteomics. Proteomics 4:3665–3685

8. Rabilloud T (2009) Membrane proteins and proteomics: love is possible, but so difficult. Electrophoresis 30:174–180

9. Rabilloud T (2002) Two-dimensional gel electrophoresis in proteomics: old, old fashioned, but it still climbs up the mountains. Proteomics 2:3–10

10. Simons K, Ikonen E (1997) Functional rafts in cell membranes. Nature 387:569–572

11. Peskan T, Westermann M, Oelmuller R (2000) Identification of low-density Triton X-100-insoluble plasma membrane microdomains in higher plants. Eur J Biochem 267:6989–6995

12. Mongrand S, Morel J, Laroche J et al (2004) Lipid rafts in higher plant cells. J Biol Chem 279:36277–36286

13. Bhat RA, Panstruga R (2005) Lipid rafts in plants. Planta 223:5–19

14. Martin SW, Glover BJ, Davies JM (2005) Lipid microdomains: plant membranes get organized. Trends Plant Sci 10:263–265

15. Grennan AK (2007) Lipid rafts in plants. Plant Physiol 143:1083–1085

16. Thakur S-S, Geiger T, Chatterjee B et al (2011) Deep and highly sensitive proteome coverage by LC-MS/MS without prefractionation. Mol Cell Proteomics 10:M110.003699

17. Matros A, Kasper S, Witzel K et al (2011) Recent progress in liquid chromatography-based separation and label-free quantitative plant proteomics. Phytochemistry 72:963–974

18. Takahashi D, Kawamura Y, Yamashita T et al (2011) Detergent-resistant plasma membrane proteome in oat and rye: similarities and dissimilarities between two monocotyledonous plants. J Proteome Res 11:1654–1665

19. Li B, Takahashi D, Kawamura Y et al (2012) Comparison of plasma membrane proteomic changes of Arabidopsis suspension cells (T87 line) after cold and abscisic acid treatment in association with freezing tolerance development. Plant Cell Physiol 53:543–554

20. Masuda T, Tomita M, Ishihama Y (2008) Phase transfer surfactant-aided trypsin digestion for membrane proteome analysis. J Proteome Res 7:731–740

Chapter 8

A Protocol for the Plasma Membrane Proteome Analysis of Rice Leaves

Ravi Gupta, Yu-Jin Kim, and Sun Tae Kim

Abstract

Subcellular proteome analysis is one of the most effective ways to reduce the complexity of the total proteome. With advances in protein extraction methodologies, it is now possible to fractionate and isolate proteins from subcellular compartments without significant contamination from the cytoplasm and other organelles. Of the different subcellular proteomes, the plasma membrane has remained largely uncharacterized because of the difficulty of isolating contamination-free plasma membrane proteins. Moreover, proteome analysis in the past two decades relied largely on two-dimensional gel electrophoresis, which showed limited protein loading capacity and poor separation of highly hydrophobic plasma membrane proteins. The development of shotgun proteomics methods has facilitated the identification and quantification of hydrophobic proteins isolated from the plasma membrane or other cellular membranes. Here, we present a simplified procedure for the isolation of plasma membrane proteins by a two-phase partitioning method and their identification by a shotgun proteomics approach using rice as a model plant.

Key words Plasma membrane, Shotgun proteomics, Rice, Two-phase partitioning, Signaling

1 Introduction

The plasma membrane is the outermost membrane that physically separates a cell from its external environment and plays a central role in intracellular signaling [1]. Transmission of signals from the external environment to the inside of the cell is mediated by the lipids, receptors, and other proteins that are integral components of the plasma membrane [2]. The majority of plasma membrane-localized proteins contain a transmembrane domain with hydrophobic regions spanning the membrane and hydrophilic domains located toward the apoplast and symplast [3]. Plasma membrane proteins are not only involved in cell signaling but also play pivotal roles in transport by functioning as carrier and channel proteins [4]. Thus, analysis of the plasma membrane proteome can provide important clues regarding cell-to-cell communication, signaling, and transport [4]. Furthermore, some biological processes, such as plant–pathogen interactions, cannot be fully understood without the analysis of plasma membrane proteins. In particular, understanding pattern-triggered immunity (PTI) responses requires isolation and characterization of plasma membrane proteins, as pathogen-secreted molecules bind to the plasma membrane-localized receptors of plants [5]. This binding of pathogen-secreted molecules, known as pathogen-associated molecular patterns (PAMPs), to the plasma membrane-localized pattern-recognition receptors (PRRs) of plants activates a myriad of signaling events culminating in defense responses [6, 7].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_8, © Springer Science+Business Media, LLC, part of Springer Nature 2020

In this chapter, we describe the pipeline for plasma membrane proteome analysis using a shotgun proteomics approach. The plasma membrane protein isolation method described here is modified from the method described in [8].

2 Materials

Prepare all solutions for protein extraction and trypsin digestion in deionized water, while solutions for peptide desalting and mass spectrometry should be prepared in HPLC or LC-MS grade water. After trypsin digestion, the use of low-protein-binding tubes is recommended to minimize peptide loss. Wear a lab coat and nitrile gloves throughout the experiment to avoid keratin contamination.

2.1 Plant Material

1. Rice plants (see Note 1).

2.2 Reagents, Equipment, and Software

1. 2D-Quant kit (GE Healthcare) (see Note 2).

2. 30 kDa spin filter (Amicon Ultra Centrifugal filters, Catalog number UFC503096) or Microcon-30 kDa Centrifugal Filter Unit with Ultracel-30 membrane (Millipore, Catalog number MRCF0R030).

3. Trypsin Gold, Mass Spectrometry grade (Promega, Catalog number V5280) (see Note 3).

4. 3M Empore HP Extraction disk cartridge (C18-SD), 7 mm/3 mL (Catalog No. 4215SD).

5. Eppendorf LoBind microcentrifuge tubes (Catalog number Z666505) or low protein binding collection tubes (Thermo Fisher Scientific, Catalog number 90410).

6. Acclaim PepMap 100 trap column (100 μm × 2 cm, nanoViper C18, 5 μm, 100 Å) (Thermo Fisher Scientific).

7. Acclaim PepMap 100 capillary column (75 μm × 15 cm, nanoViper C18, 3 μm, 100 Å).

8. UHPLC Dionex UltiMate® 3000 (Thermo Fisher Scientific).

9. QExactive™ Orbitrap High-Resolution Mass Spectrometer (Thermo Fisher Scientific) for high mass accuracy, high resolution, and high scan speed. Other nano-LC-MS/MS systems can also be used for protein identification; however, the total number of identified proteins may vary.

10. MaxQuant and Perseus software.

11. Ultracentrifuge (Beckman Coulter).

12. pH strips.

13. Sonicator (both probe type and water bath type).

14. Dry bath preferably (see Note 4).

2.3 Buffers

1. Homogenization buffer: 50 mM Tris, pH 8.0, 500 mM sucrose, 10% glycerol (v/v), 20 mM EDTA, 20 mM EGTA, 0.6% PVP, and 10 mM ascorbic acid. Adjust the pH of the buffer to 8.0 using 2-morpholinoethanesulfonic acid (MES) (see Note 5).

2. Upper phase solution: 5 mM phosphate buffer, pH 7.8, 330 mM sucrose, and 2 mM DTT (see Note 5).

3. Lower phase solution: 5 mM phosphate buffer, pH 7.8, 5 mM KCl, 300 mM sucrose, 6.4% Dextran T-500 (w/w), 6.4% PEG-3350 (w/w) (see Note 5).

4. TEAB buffer: 100 mM triethylammonium bicarbonate, pH 8.5 (see Note 6).

5. SDT-lysis buffer: 4% SDS, 100 mM DTT in 100 mM TEAB, pH 8.5 (see Note 7).

6. UA buffer: 8 M urea in 100 mM TEAB, pH 8.5 (see Note 7).

7. Alkylation buffer: 50 mM iodoacetamide in 100 mM TEAB, pH 8.5 (see Note 7).

8. Trypsin solution: Dissolve 100 μg of lyophilized trypsin powder in 95 μL of 100 mM TEAB and add 5 μL of ACN (acetonitrile) (see Note 8).

9. Solvent A: water–ACN, 98:2 v/v; 0.1% formic acid (FA).

10. Solvent B: 100% ACN, 0.1% FA (v/v).

11. Solvent C: 0.1% FA in water (v/v).
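When preparing the buffers listed above, the mass of each solid follows from its target concentration, the batch volume, and its molecular weight. The sketch below illustrates this for three components of the homogenization buffer; the molecular weights are standard values not given in the protocol, the 40 mL batch size is an assumption matching the volume used per extraction, and EDTA, EGTA, and PVP are omitted because their values depend on the salt form or polymer grade.

```python
# Standard molecular weights in g/mol (assumed; not stated in the protocol).
MOL_WEIGHT = {
    "Tris": 121.14,
    "sucrose": 342.30,
    "ascorbic acid": 176.12,
}


def grams_needed(conc_mm: float, volume_ml: float, mol_weight: float) -> float:
    """Mass (g) of solute needed for conc_mm (mM) in volume_ml (mL)."""
    return (conc_mm / 1000.0) * (volume_ml / 1000.0) * mol_weight


batch_ml = 40.0  # hypothetical batch size
recipe_mm = {"Tris": 50.0, "sucrose": 500.0, "ascorbic acid": 10.0}

for name, conc in recipe_mm.items():
    mass = grams_needed(conc, batch_ml, MOL_WEIGHT[name])
    print(f"{name}: {mass:.3f} g")
```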

3 Methods

3.1 Extraction of Plasma Membrane Proteins

1. Take 20 g of fresh, healthy green rice leaves and grind them in a prechilled mortar and pestle using liquid nitrogen.

2. Add 40 mL of homogenization buffer and vortex the homogenate for 5 min.

3. Filter the homogenate using nylon cloth and centrifuge at 26,000 × g for 25 min. Save 100 μL of this fraction as total cellular proteins (T) for SDS-PAGE analysis.

4. Centrifuge the supernatant again at 84,000 × g for 25 min to pellet the microsomal fraction.

5. Discard the supernatant, add 9 mL of upper phase solution, and sonicate for 3 min using an ultrasonic probe sonicator.

6. Add 18 mL of lower phase solution, vortex, and incubate on ice for 5 min (label it as tube A).

7. Centrifuge at 2000 × g for 10 min for phase separation.

8. Carefully transfer the upper phase to a new tube (label it as tube B).

9. To maximize the yield, add 9 mL of upper phase solution again to the lower phase of tube A and vortex.

10. Incubate on ice for 5 min and centrifuge at 2000 × g for 10 min.

11. Carefully collect the upper phase and combine it with the upper phase in tube B.

12. Add 18 mL of lower phase solution to the combined upper phase (tube B), vortex, and centrifuge at 2000 × g for 10 min.

13. Collect the upper phase and dilute it with 5 volumes of water. Mix well and incubate on ice for 5 min.

14. Centrifuge at 84,000 × g for 10 min at 4 °C. Remove the supernatant; the pellet contains the plasma membrane proteins.

15. Dissolve a part of the pellet in SDS-loading buffer and resolve the isolated proteins by SDS-PAGE. A typical SDS-PAGE gel profile of total and plasma membrane proteins from young rice leaves should look like the one shown in Fig. 1a.

16. Check the quality of the isolated plasma membrane proteins by Western blotting using organelle-specific markers. An enrichment of the plasma membrane protein marker should be observed, while cytoplasmic and nuclear protein markers should not be detected in the isolated plasma membrane protein fraction (Fig. 1b).
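The centrifugation steps above are specified in relative centrifugal force (× g), whereas many instruments are set in rpm. The two are related by the standard formula RCF = 1.118 × 10^-5 × r × rpm^2, with r the rotor radius in centimetres. The sketch below applies it; the 10 cm radius is a hypothetical value, so the actual radius of the rotor in use must be substituted.

```python
import math

RCF_CONSTANT = 1.118e-5  # for radius in cm and speed in rpm


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ROTOR_RADIUS_CM = 10.0  # hypothetical rotor radius, for illustration only

for rcf in (2000, 26000, 84000):
    print(f"{rcf} x g -> {rpm_for_rcf(rcf, ROTOR_RADIUS_CM):.0f} rpm")
```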

3.2 In-Solution Trypsin Digestion by Filter-Aided Sample Preparation (FASP)

1. Dissolve the remaining plasma membrane protein pellet in 300 μL of SDT-lysis buffer and sonicate for 5 min in a water bath sonicator.

2. Incubate the sample at 99 °C for 30 min, followed by sonication again for 5 min.

3. Centrifuge at 12,000 × g for 5 min to pellet the insoluble debris.

4. Quantify the proteins using the 2D-Quant kit following the manufacturer's protocol.

5. Take 100 μg of protein, dilute it to 300 μL with UA buffer, and load it on a 30 kDa spin filter.

6. Centrifuge at 14,000 × g for 10 min and wash three times with UA buffer for complete removal of SDS.

7. Add 200 μL of alkylation buffer and incubate for 1 h in the dark.

8. Add 300 μL of UA buffer and centrifuge at 14,000 × g for 10 min (two times).

9. Add 300 μL of 50 mM TEAB and centrifuge at 14,000 × g for 10 min (three times).

10. Add 290 μL of 50 mM TEAB and 10 μL of trypsin solution and incubate at 37 °C overnight in a dry bath (see Note 9).

11. Transfer the spin filter to a new collection tube and collect the digested peptides by centrifugation at 14,000 × g for 10 min.

12. Add 100 μL of 50 mM TEAB and 50 μL of 0.5 M NaCl and collect the filtrate (digested peptides) (see Note 10).

13. Add 0.5 μL of FA to stop the reaction (see Note 11).
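The enzyme-to-protein ratio implied by these steps can be derived from the trypsin solution recipe (100 μg in 95 μL TEAB plus 5 μL ACN) and the volumes in steps 5 and 10. The sketch below is a worked calculation only; the function names are hypothetical, and it assumes that all 100 μg of loaded protein is retained on the filter.

```python
def trypsin_conc_ug_per_ul(trypsin_ug: float = 100.0,
                           teab_ul: float = 95.0,
                           acn_ul: float = 5.0) -> float:
    """Concentration of the trypsin solution described in the buffer list."""
    return trypsin_ug / (teab_ul + acn_ul)


def protein_per_enzyme(protein_ug: float, trypsin_volume_ul: float,
                       trypsin_conc: float) -> float:
    """Micrograms of protein per microgram of trypsin in the digest."""
    return protein_ug / (trypsin_volume_ul * trypsin_conc)


conc = trypsin_conc_ug_per_ul()               # ug trypsin per uL
ratio = protein_per_enzyme(100.0, 10.0, conc)  # steps 5 and 10
digest_volume_ul = 290.0 + 10.0                # step 10

print(f"trypsin {conc:.1f} ug/uL, enzyme:protein = 1:{ratio:.0f}, "
      f"digest volume {digest_volume_ul:.0f} uL")
```

With the volumes as written, 10 μg of trypsin is added to 100 μg of protein, that is, an enzyme-to-protein ratio of 1:10 (w/w).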

3.3 Desalting of Peptides

1. Centrifuge the acidified digest at 2000 × g for 15 min. Transfer the supernatant to a new low-protein-binding tube and discard the precipitate.

Fig. 1 (a) SDS-PAGE showing the protein profile of total cellular (T) as well as plasma membrane (PM) proteins isolated from young rice leaves. (b) Immunoblots showing total (T) and plasma membrane proteins probed with anti-glutamine synthase (GS), anti-histone 1 (H1), and anti-plasma membrane intrinsic protein 2 (PIP2) antibodies used as cytosolic, nuclear, and plasma membrane markers, respectively. (c) Functional annotation of the MS-identified proteins using the MapMan program, showing the signaling overview

2. Take a new C18-SD extraction disk cartridge and wash sequentially with 3 mL each of Solvent B and Solvent C (see Note 12).

3. Carefully load the acidified digest and collect the flow-through.

4. Reload the flow-through onto the column and allow it to pass through once more for efficient binding of the peptides to the column matrix.

5. Wash the column with 1 mL of Solvent C. Repeat this step three times.

6. Elute the peptides with 300 μL each of 40%, 60%, and 80% ACN containing 0.1% FA.

7. Lyophilize the peptides in a SpeedVac (see Note 13).

3.4 Mass Spectrometry

1. Reconstitute lyophilized peptides in 40 μL of Solvent-A.

2. Inject the desalted peptides in a UHPLC Dionex UltiMate®

3000 instrument equipped with Acclaim PepMap 100 trapcolumn and perform washing with 98% solvent A for 6 min ata flow rate of 6 μL/min.

3. Separate peptides continuously by reversed-phase chromatog-raphy using an Acclaim PepMap 100 capillary column at a flowrate of 400 nL/min.

4. Use a liquid chromatography–tandem mass spectrometry(LC-MS/MS) coupled with an electrospray ionization sourceto the quadrupole-based mass spectrometer QExactive™Orbitrap High-Resolution Mass Spectrometer and run theLC analytical gradient at 2%–35% solvent B over 90 min, then35–95% over 10 min, followed by 90% solvent B for 5 min, andfinally 5% solvent B for 15 min.

5. Let the resulting peptides electrosprayed through the coatedsilica emitted tip at an ion spray voltage of 2000 eV.

6. Acquire the MS spectra at a resolution of 70,000 (200m/z) ina mass range of 350–1800 m/z. Set a maximum injection timeto 100 ms for ion accumulation.

7. Subject the eluted peptides to MS/MS events (resolution of 17,500), measured in a data-dependent mode for the 10 most abundant peaks (Top10 method), in the high mass accuracy Orbitrap after ion activation/dissociation with Higher Energy C-trap Dissociation (HCD) at a collision energy of 27 in a 100–1650 m/z mass range.
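The gradient in step 4 can be written down as a small table to check the total run time. The following is a minimal sketch: the segment list simply transcribes the values given above (including the 90% hold as stated), and the names `GRADIENT`, `total_runtime_min`, and `slope_percent_b_per_min` are illustrative and not part of any instrument software.

```python
# Gradient segments transcribed from step 4: (start %B, end %B, duration in min).
GRADIENT = [
    (2, 35, 90),   # analytical gradient
    (35, 95, 10),  # ramp to high organic
    (90, 90, 5),   # hold, as written in the protocol
    (5, 5, 15),    # re-equilibration
]


def total_runtime_min(segments):
    """Sum the durations of all gradient segments."""
    return sum(duration for _, _, duration in segments)


def slope_percent_b_per_min(segment):
    """Change in %B per minute within one segment."""
    start, end, duration = segment
    return (end - start) / duration
```

Under these values the method occupies the instrument for about two hours per injection, which helps when planning the sample queue.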

3.5 Data Processing Using MaxQuant Software

1. Export all the raw files and load them into the MaxQuant software (see Note 14).

2. Download the rice protein database file (Osativa_373, 52424 sequences) from Phytozome and upload it to MaxQuant as the database file (see Note 15).


3. Search the acquired MS/MS spectra against this database using the integrated Andromeda search engine.

4. Apply an FDR of <0.01 for proteins, peptides, and modifications, and select trypsin as the cleavage enzyme, cysteine carbamidomethylation as a fixed modification, and oxidation of methionine and acetylation (protein N-term) as variable modifications.

5. Specify a minimum peptide length of six amino acids and enable “match between runs” (MBR) with a matching time window of 0.7 min. Click start and allow MaxQuant to run.

6. After the MaxQuant run has finished, upload the proteinGroups file (saved automatically in the combined/txt folder) to Perseus and remove all contaminants, reverse hits, and proteins identified only by site.

7. Select the protein IDs and load them into MapMan for the functional annotation of the identified proteins.

8. Download the Osa_MSU_v7 mapping file from the MapMan store and select it for mapping of the rice Phytozome IDs.

9. A typical pie chart of the functional annotation of the identified plasma membrane proteins from young rice leaves should be similar to the one presented in Fig. 1c.
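The filtering in step 6 is performed in Perseus, but the same three filters can be checked with a short script. The following is a minimal sketch, not a replacement for Perseus: it assumes a tab-separated proteinGroups table in which flagged rows carry a "+" in the columns "Potential contaminant", "Reverse", and "Only identified by site" (the contaminant column is named differently in some older MaxQuant versions), and the function names are illustrative.

```python
import csv

# Flag columns assumed to be present in the proteinGroups table.
FLAG_COLUMNS = ("Potential contaminant", "Reverse", "Only identified by site")


def filter_protein_groups(lines):
    """Yield rows in which none of the flag columns is marked with '+'."""
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        flagged = any((row.get(col) or "").strip() == "+" for col in FLAG_COLUMNS)
        if not flagged:
            yield row


def majority_ids(rows):
    """Collect the semicolon-separated majority protein IDs of the kept rows."""
    ids = []
    for row in rows:
        ids.extend(i for i in row["Majority protein IDs"].split(";") if i)
    return ids
```

In use, the open file handle of combined/txt/proteinGroups.txt would be passed to `filter_protein_groups`, and the resulting ID list could then be loaded into MapMan as in step 7.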

4 Notes

1. Sterilize rice seeds with 0.05% Spotak solution (Bayer CropScience, South Korea) overnight at 4 °C, followed by five washings with deionized water. Place sterilized seeds on moist tissue paper and incubate at 28 °C in the dark for germination. Transfer germinated seeds to sterilized soil in a growth chamber at 24 °C ± 1 °C, 70% relative humidity, and a 16/8 h day/night cycle. Harvest only leaf sheaths of primary and secondary leaves of 4-week-old rice plants and use them for protein extraction [9].

2. Use of this kit is recommended because the proteins are dissolved in a high concentration of urea and detergents, which interfere with protein quantification by commonly used methods such as Bradford’s method.

3. Trypsin/Lys-C Mix, Mass Spectrometry grade (Promega, Catalog number V5071) can also be used to increase the efficiency of protein digestion.

4. Avoid using a water bath during in-solution trypsin digestion, as digestion is carried out in filter units and water can leak inside, resulting in sample contamination.


5. These buffers can be stored at −20 °C for months.

6. A stock solution of 1 M TEAB can be purchased from Sigma-Aldrich (Catalog number T7408).

7. Always prepare these solutions fresh, as some of them include volatile solvents which may evaporate during storage.

8. Store remaining trypsin solution at −70 °C. Alternatively, trypsin can be dissolved in 50 mM acetic acid (pH < 3.0) for long-term storage. At higher pH, trypsin activity is not inhibited, resulting in trypsin autolysis.

9. Lys-C can also be used along with trypsin to increase the efficiency of protein digestion.

10. The concentration of the peptides can be checked after this step using the Pierce quantitative fluorometric peptide assay kit. Alternatively, the concentration of the peptides can be roughly estimated by measuring the absorbance of the digest at 280 nm, assuming that a 1 mg/mL solution has an absorbance of 1.1. Use only quartz cuvettes and a UV–visible spectrometer for the latter approach.

11. Check the pH of the eluate using a pH strip. If the pH is >3.0, add more FA until the pH drops below 3.0.

12. Avoid air bubbles, which may become trapped when changing solvents, and do not allow the column to dry out in between.

13. Lyophilized peptides can be stored at −70 °C for months.

14. Proteome Discoverer software can also be used for the database search.

15. The rice protein database file can also be downloaded from UniProt. However, in that case, UniProt IDs should be converted to Phytozome IDs by BLAST analysis. Alternatively, functional annotation of the identified proteins can be done by KEGG pathway or Gene Ontology analysis directly using UniProt IDs.
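The rough A280 estimate described in Note 10 amounts to a single division. The following is a minimal sketch under the assumption stated in that note (a 1 mg/mL peptide solution gives an absorbance of 1.1) and a 1 cm path length; the function name and the dilution-factor argument are illustrative additions.

```python
def peptide_conc_mg_per_ml(a280, dilution_factor=1.0, a280_per_mg_ml=1.1):
    """Rough peptide concentration (mg/mL) from the absorbance at 280 nm.

    a280            -- measured absorbance of the (possibly diluted) digest
    dilution_factor -- fold dilution applied before the measurement
    a280_per_mg_ml  -- absorbance assumed for a 1 mg/mL solution (Note 10)
    """
    if a280 < 0:
        raise ValueError("absorbance must be non-negative")
    return a280 / a280_per_mg_ml * dilution_factor
```

For example, an undiluted digest reading 0.55 would correspond to roughly 0.5 mg/mL under this assumption.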

Acknowledgments

This work was supported by the National Research Foundation of Korea through the Basic Research Lab (BRL) program (2018R1A4A1025158) and (2019R1A2C2085868) provided to S.T.K.

References

1. Li B, Takahashi D, Kawamura Y et al (2018) Plasma membrane proteomics of Arabidopsis suspension-cultured cells associated with growth phase using nano-LC-MS/MS. In: Plant membrane proteomics. Springer, Cham, pp 185–194

2. Santoni V, Molloy M, Rabilloud T (2000) Membrane proteins and proteomics: un amour impossible? Electrophoresis 21:1054–1070

3. Cordwell SJ, Thingholm TE (2010) Technologies for plasma membrane proteomics. Proteomics 10:611–627

4. Voothuluru P, Anderson JC, Sharp RE et al (2016) Plasma membrane proteomics in the maize primary root growth zone: novel insights into root growth adaptation to water stress. Plant Cell Environ 39:2043–2054

5. Gupta R, Lee SE, Agrawal GK et al (2015) Understanding the plant-pathogen interactions in the context of proteomics-generated apoplastic proteins inventory. Front Plant Sci 6:352

6. Wang Y, Gupta R, Song W et al (2017) Label-free quantitative secretome analysis of Xanthomonas oryzae pv. oryzae highlights the involvement of a novel cysteine protease in its pathogenicity. J Proteome 169:202–214

7. Wang Y, Wu J, Kim SG et al (2016) Magnaporthe oryzae-secreted protein MSP1 induces cell death and elicits defense responses in rice. Mol Plant-Microbe Interact 29:299–312

8. Santoni V (2007) Plant plasma membrane protein extraction and solubilization for proteomic analysis. In: Plant proteomics. Springer, Cham, pp 93–109

9. Meng Q, Gupta R, Min CW et al (2019) A proteomic insight into the MSP1 and flg22 induced signaling in Oryza sativa leaves. J Proteome 196:120–130


Chapter 9

Isolation, Purity Assessment, and Proteomic Analysis of Endoplasmic Reticulum

Xin Wang and Setsuko Komatsu

Abstract

Subcellular proteomics includes, in its experimental workflow, steps aimed at purifying organelles. The purity of the subcellular fraction should be assessed before mass spectrometry analysis, in order to confidently conclude the presence of associated specific proteoforms, deepening the knowledge of its biological function. In this chapter, a protocol for isolating endoplasmic reticulum (ER) and assessing its purity is reported, preceding proteomic analysis through a gel-free/label-free approach. Dysfunction of quality-control mechanisms of protein metabolism in the ER leads to ER stress. Additionally, the ER, which is a calcium-storage organelle, is responsible for signaling and homeostatic functions, and calcium homeostasis is required for plant stress tolerance. Given these predominant cell functions, effective protocols to fractionate highly purified ER are needed. Here, isolation methods and purity assessments of ER are described. In addition, a gel-free/label-free proteomic approach applied to ER is presented.

Key words Endoplasmic reticulum isolation, Endoplasmic reticulum purity, Subcellular proteomics, Shotgun proteomics

1 Introduction

The endoplasmic reticulum (ER) is a specialized organelle formed by a continuous membrane system without (smooth ER) or with (rough ER) studded ribosomes [1]. The ER lumen is the place where protein traffic, modification, and metabolism take place, including folding, refolding, degradation, and secretion [2]. Pathways of ER-associated degradation and the unfolded protein response are activated to degrade misfolded or unfolded proteins, which accumulate in response to ER stress caused by environmental cues [3]. In addition, the ER is a calcium-storage organelle, whose homeostasis is controlled through the distribution of calcium-handling proteins, namely calcium-binding proteins, calcium pumps, and calcium-release channels [4]. Endoplasmic reticulum-associated degradation was necessary for plants to overcome salt stress because a defect in a component of the endoplasmic reticulum-associated

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_9, © Springer Science+Business Media, LLC, part of Springer Nature 2020


degradation complex led to alteration of the unfolded protein response and increased plant sensitivity [5]. In addition, calcium released from the ER was involved in elevation of the unfolded protein response [5], and increased cytosolic calcium disturbed the proper environment for protein folding in the ER [6]. This documented information addresses the vital roles of the ER in protein quality-control mechanisms, calcium homeostasis, and stress tolerance in plants under salt, drought, and flooding conditions.

Because of the important cellular functions of the ER, studies of increasing technical complexity and sophistication have been carried out to obtain protein profiles of this organelle [7]. In castor bean, a gel-based proteomic analysis was conducted on ER isolated from developing and germinating seeds, indicating that proteins involved in protein folding were dominant components of the ER [8]. In rice endosperm, an iTRAQ-based proteomic approach was used to explore the metabolism induced by ER stress, which results from the accumulation of unfolded or misfolded proteins in the ER, suggesting that the pathways of protein processing in the ER and the degradation-related proteasome were predominantly affected [9]. In barley, glycosylation sites of secreted proteins in gibberellic acid-induced aleurone layers were investigated under ER-stressed conditions through a gel-based proteomic technique [10]. Additionally, both gel-based [11] and gel-free/label-free [6] proteomic approaches were applied to ER isolated from soybean, suggesting that suppressed protein glycosylation and disturbed calcium homeostasis were induced by flooding. These studies emphasize the pivotal role of proteomic techniques in investigating ER functions in plants during seed development and under stress conditions.

For subcellular proteomics, it is critical to isolate highly purified ER prior to conducting mass spectrometry (MS) analysis. A number of methods have been developed to enrich the ER fraction, such as continuous iodixanol gradients [12], discontinuous sucrose gradients [13], continuous sucrose gradients [14], and ultracentrifugation combined with [15–17] or without [18] sucrose gradients. Although these methods are available, the whole process is rather complex and time consuming [19]. Simple methods using the Endoplasmic Reticulum Enrichment kit (Novus, Littleton, CO, USA; Catalog Number NBP2-29482) were applied for ER enrichment in soybean with high efficiency [6, 11], though they were developed for animal materials. These findings present various approaches for ER isolation; however, compared with animal materials, isolation protocols for ER from plants are rare, and alternative methods need to be developed.

Purity validation of the ER is conducted through microscopic, immunoblot, and enzymatic analyses. Electron microscopy has been applied to estimate the integrity and purity of the targeted fraction; however, it is largely dependent on commercial dyes developed for specific subcellular organelles. To overcome this shortcoming,


immunoblot and enzymatic analyses are utilized as alternative ways to estimate ER purity. Antibodies for immunoblot analysis and marker proteins for enzyme assays have been reviewed for indicating contamination from cytosol, plasma membrane, nuclei, mitochondria, and chloroplast [19, 20]. Furthermore, for immunoblot analysis, Tlg2p, ALP, and Pep12p were used as markers to assess contamination from Golgi, vacuole, and endosome, respectively, while ER-resident proteins, such as Sec63p, Dpm1p, Kar2p, Sey1p, and Yop1p, were used as ER-specific markers [18]. These studies provide multiple methods to isolate the intact ER fraction with high purity, and they can be used alternatively depending on the advantages of the available techniques.

Here, protocols applied for ER enrichment from soybean are presented [6, 11]. In addition, the purity of the isolated ER fraction is estimated through immunoblot and enzymatic analyses. Furthermore, the gel-free/label-free proteomic approach applied to the ER is described.

2 Materials

2.1 Isolation of ER Fraction

1. Isosmotic homogenization buffer: the isosmotic homogenization buffer and protease inhibitor cocktail are provided in the Endoplasmic Reticulum Enrichment kit (Novus, Littleton, CO, USA); the buffer consists of HEPES, sucrose, and KCl. Prepare fresh and keep on ice until use. Mix 1× isosmotic homogenization buffer with 100× protease inhibitor cocktail (see Note 1).

2. ER precipitation solution: 8 mM CaCl2.

2.2 Immunoblot Analysis for ER Purity Assessment

1. Sodium dodecyl sulfate (SDS)-sample buffer: 60 mM Tris–HCl, pH 6.8, 2% SDS, 10% glycerol, and 5% 2-mercaptoethanol.

2. Blocking buffer: 20 mM Tris–HCl, pH 7.5, 500 mM NaCl, and 5% nonfat milk.

3. Washing buffer: 20 mM Tris–HCl, pH 7.5, 137 mM NaCl, and 0.1% Tween-20.

4. Filter paper: four pieces of filter paper (7.5 cm × 10 cm; Thermo Fisher Scientific, San Jose, CA, USA) are used to blot one membrane.

5. Polyvinylidene difluoride membrane (7.0 cm × 8.4 cm; Thermo Fisher Scientific).

6. Semidry transfer blotter: a semidry transfer blotter (Bio-Rad, Hercules, CA, USA) with a current of 1 mA/cm2 for 90 min is used for membrane blotting.


7. Primary antibodies: anti-ascorbate peroxidase [21], anti-calnexin [22], and anti-histone H3 (Abcam, Cambridge, UK).

8. Secondary antibody: goat anti-rabbit IgG conjugated with horseradish peroxidase (Bio-Rad).

9. Chem-Lumi One Super kit (Nacalai Tesque, Kyoto, Japan).

10. LAS-3000 (Fujifilm, Tokyo, Japan).

2.3 Enzymatic Analysis for ER Purity Assessment

1. Extraction buffer: 50 mM HEPES-NaOH, pH 7.5, 1 mM EDTA, 5 mM MgCl2, 2% polyvinylpyrrolidone-40, 1 mM phenylmethylsulfonyl fluoride, 1 mM dithiothreitol, and 0.1% Triton X-100.

2. Reaction buffer for alcohol dehydrogenase assay: 50 mM MES-NaOH, pH 7.5, 5 mM MgCl2, 0.1 mM NADH, 1 mM dithiothreitol, and 4% acetaldehyde.

3. Reaction buffer for glucose-6-phosphate dehydrogenase assay: 55 mM Tris–HCl, pH 7.8, 3.3 mM MgCl2, 0.2 mM NADP, and 3.3 mM glucose 6-phosphate.

4. Reaction buffer for fumarase assay: 70 mM KH2PO4-NaOH, pH 7.7, 50 mM L-malic acid, and 0.05% Triton X-100.

5. Reaction buffer for catalase assay: 50 mM Na2HPO4-KH2PO4, pH 7.0, and 15 mM H2O2.

6. Reaction buffer for NADH cytochrome c reductase assay: 20 mM Na2HPO4-KH2PO4, pH 7.2, 0.2 mM NADH, 0.02 mM cytochrome c, and 6 mM NaN3.

7. Beckman Coulter DU-370 UV/Vis spectrophotometer (Beckman, Coulter, CA, USA).

2.4 Concentration Measurement of Proteins

1. Bovine serum albumin: 20 mg/mL bovine serum albumin (Sigma-Aldrich, St. Louis, MO, USA).

2. Dye reagent: Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad) is diluted five times before use.

3. Pierce 660 nm Protein Assay Kit (Thermo Fisher Scientific) is used to determine the protein concentration of samples dissolved in SDS-sample buffer. One pack of Ionic Detergent Compatibility Reagent (Thermo Fisher Scientific) is dissolved in 20 mL of Pierce 660 nm Protein Assay Reagent.

2.5 Proteomic Analysis of ER Proteins

1. Lysis solution: 7 M urea, 2 M thiourea, 5% CHAPS, and 2 mM tributylphosphine.

2. Protein enrichment: methanol and chloroform.

3. Protein reduction: 250 mM dithiothreitol in 50 mM ammonium bicarbonate.

4. Protein alkylation: 300 mM iodoacetamide in 50 mM ammonium bicarbonate.


5. Suspension solution for alkylated proteins: 100 mM ammonium bicarbonate.

6. Protein digestion: trypsin (Wako, Tokyo, Japan) is dissolved in the medium provided by the kit to make a 0.1 μg/μL working solution. Lysyl endopeptidase (Wako) is dissolved in 50 mM ammonium bicarbonate to make a 0.1 μg/μL working solution.

7. Peptide acidification: 20% (v/v) formic acid.

8. Peptide desalting: C18-pipette tips (SPE C-TIP, Nikkyo Technos, Tokyo, Japan).

2.6 Search Engine, Software, and Database for Proteomic Analysis

1. Search engine: Mascot search engine (version 2.5.1; Matrix Science, London, UK).

2. Software: Xcalibur software (version 2.0.7; Thermo Fisher Scientific), Proteome Discoverer software (version 1.4.0.288; Thermo Fisher Scientific), SIEVE software (version 2.1.377; Thermo Fisher Scientific), MapMan software (version 3.6.0RC1), SUBA3 (http://suba3.plantenergy.uwa.edu.au/), MultiLoc2 (http://abi.inf.uni-tuebingen.de/Services/MultiLoc2), and WoLF PSORT (http://www.genscript.com/wolf-psort.html).

3. Database: soybean peptide database constructed from the soybean genome database (Phytozome version 12; http://www.phytozome.net/soybean) and Gmax_109_peptide (http://mapman.gabipd.org/web/guest/mapman).

3 Methods

3.1 Isolation of Total ER Fraction

The procedure is conducted on ice or at 4 °C in the cold room.

1. Fresh sample is ground in a glass mortar and pestle with isosmotic homogenization buffer (see Note 1).

2. The homogenate is transferred to a Falcon tube and centrifuged at 3000 × g for 10 min at 4 °C.

3. The pellet is collected as Fraction 1. The supernatant is collected and centrifuged at 12,000 × g for 15 min at 4 °C.

4. The pellet is collected as Fraction 2. The supernatant is collected and centrifuged at 90,000 × g for 60 min at 4 °C.

5. The supernatant is discarded and the pellet collected as total ER fraction (Fig. 1).
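The centrifugation steps above are specified as relative centrifugal force (× g), whereas many centrifuges are set in rpm. The following is a minimal sketch of the standard conversion RCF = 1.118 × 10⁻⁵ × r × rpm², with r the rotor radius in cm; the 10 cm radius used in the example is a hypothetical value and must be replaced by the radius of the rotor actually used.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF (x g) at radius_cm."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius_cm must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rpm_to_rcf(rpm, radius_cm):
    """RCF (x g) produced by a given rotor speed at radius_cm."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


# Speeds for the three spins of this section, assuming a hypothetical 10 cm rotor radius.
SPEEDS_RPM = {rcf: rcf_to_rpm(rcf, 10.0) for rcf in (3000, 12000, 90000)}
```

Because the required speed scales with the square root of the force, the 90,000 × g step needs a much faster rotor than the first two spins.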

3.2 Isolation of Rough ER Fraction

The procedure is conducted on ice or at 4 °C in the cold room.

1. Fresh sample is ground in a glass mortar and pestle with homogenization buffer (see Note 1).

2. The homogenate is transferred to a Falcon tube and centrifuged at 1000 × g for 10 min at 4 °C.

3. The pellet is collected as Fraction 1. The supernatant is collected and centrifuged at 10,000 × g for 15 min at 4 °C.

4. The pellet is collected as Fraction 2. The supernatant is collected and centrifuged at 12,000 × g for 15 min at 4 °C.

5. The supernatant is transferred to a glass beaker and kept on ice (see Note 2).

6. CaCl2 precipitation solution is added drop by drop to the supernatant while stirring the mixture for 15 min. Keep the CaCl2 on ice and slow down the stirring speed during the precipitation (see Note 3).

7. The mixture is transferred to a Falcon tube and centrifuged at 8000 × g for 10 min at 4 °C.

8. The supernatant is discarded and the pellet collected as rough ER fraction (Fig. 2).

3.3 Immunoblot Analysis for ER Purity Assessment of ER Fraction

1. Protein preparation: total or rough ER fractions are suspended in SDS-sample buffer for protein solubilization; vigorous vortexing favors dissolution. The mixture is centrifuged at 20,000 × g for 20 min at room temperature. The supernatant is collected and used for immunoblot analysis.

2. Protein concentration is determined with the Pierce 660 nm Protein Assay Kit with Ionic Detergent Compatibility Reagent (see Note 4).

Fig. 1 Procedure of total ER isolation. Fresh sample is ground with homogenization buffer. The homogenate is centrifuged at 3000 × g for 10 min at 4 °C and the pellet is collected as Fraction 1. The supernatant is centrifuged at 12,000 × g for 15 min at 4 °C. The pellet is collected as Fraction 2 and the supernatant is centrifuged at 90,000 × g for 60 min at 4 °C. This final pellet is collected as total ER fraction


To determine the concentration of protein samples for immunoblot analysis, a dilution series of bovine serum albumin (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, and 5.0 mg/mL) is freshly prepared to make a standard curve. The concentration of each protein sample is calculated from the standard curve.

3. Electrophoresis: protein samples (10 μg) are separated by gel electrophoresis on a 17% SDS–polyacrylamide gel (SDS-PAGE). Coomassie brilliant blue staining of the loaded proteins serves as a loading control.

4. A “sandwich” made of two pieces of filter paper (Bio-Rad), one piece of polyvinylidene difluoride membrane (Thermo Fisher Scientific), the 17% SDS–polyacrylamide gel, and two pieces of filter paper is used for blotting with a semidry transfer blotter (Bio-Rad) at a current of 1 mA/cm2 for 90 min (see Note 5).

Fig. 2 Procedure of rough ER isolation. Fresh sample is ground with homogenizing buffer. The homogenate is centrifuged at 1000 × g for 10 min at 4 °C and the pellet is collected as Fraction 1. The supernatant is collected and centrifuged at 10,000 × g for 15 min at 4 °C. The pellet is collected as Fraction 2 and the supernatant centrifuged at 12,000 × g for 15 min at 4 °C. The supernatant is collected for precipitation using 8 mM CaCl2 and centrifuged at 8000 × g for 10 min at 4 °C. The pellet is collected as rough ER fraction


5. Membrane blocking: the blotted membrane is blocked in blocking buffer overnight at 4 °C.

6. Membrane washing: the blocked membrane is washed three times using washing buffer at room temperature (see Note 6).

7. Reaction with the primary antibody: anti-ascorbate peroxidase antibody, anti-calnexin antibody, and anti-histone H3 antibody are used as the primary antibodies. Anti-ascorbate peroxidase antibody is diluted 10,000 times and used as a marker for cytosol. Anti-calnexin antibody is diluted 5000 times and used as a marker for ER. Anti-histone H3 antibody is diluted 9000 times and used as a marker for nucleus (see Note 7). The membrane is reacted with the primary antibody for 60 min at room temperature.

8. Membrane washing: the reacted membrane is washed three times as described above.

9. Reaction with the secondary antibody: goat anti-rabbit IgG conjugated with horseradish peroxidase (Bio-Rad) is used as the secondary antibody. The secondary antibody is diluted 3000 times for use (see Note 7). The membrane is reacted with the secondary antibody for 60 min at room temperature.

10. Membrane washing: the reacted membrane is washed three times using washing buffer at room temperature (see Note 6).

11. Signal detection: signals are detected using a Chem-Lumi One Super kit (Nacalai Tesque, Kyoto, Japan) and visualized with the luminescent image analyzer LAS-3000 (Fujifilm, Tokyo, Japan). The reaction using the Chem-Lumi One Super kit is performed according to the user’s manual. The kit provides Solution A and Solution B. Mix Solution A and Solution B in a one-to-one ratio to prepare the working solution, keep the mixed solution in a 1.5 mL Falcon tube, cover the tube with aluminum foil, and keep it at room temperature until use. Wipe the washing buffer off the membrane with paper towels, and put the membrane on plastic wrap spread on the desk. Cover the membrane completely with working solution, and incubate in the dark for 1 min. Wipe the working solution from the membrane carefully with paper towels and cover the membrane with new plastic wrap for signal detection. Detection using the luminescent image analyzer LAS-3000 is conducted according to the user’s manual. Start the computer connected to the LAS-3000, select the exposure type, set the interval time, put the membrane on the tray in a suitable position, and focus to acquire the image (see Note 8).
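The standard curve used in step 2 for protein quantification can be fitted with an ordinary least-squares line. The following is a minimal sketch assuming a linear response over the standard range; the absorbance readings must come from the actual assay, and the function names are illustrative.

```python
# BSA standards (mg/mL) listed for the immunoblot samples in step 2.
BSA_STANDARDS = [0.5 * i for i in range(1, 11)]  # 0.5 ... 5.0 mg/mL


def fit_line(x, y):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept):
    """Read a sample concentration (mg/mL) off the fitted standard curve."""
    return (absorbance - intercept) / slope
```

Samples whose absorbance falls outside the range of the standards should be diluted and measured again rather than extrapolated.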

3.4 Enzymatic Analysis for ER Purity Assessment of ER Fraction

1. Total or rough ER preparations are suspended, for protein solubilization, in the buffer corresponding to each enzyme (Subheading 3.4, steps 3–6); vigorous vortexing favors dissolution.


2. The mixture is sonicated in cold water for 40 min followed by centrifugation at 20,000 × g for 20 min at 4 °C. The supernatant is collected and used for enzymatic analysis.

3. Protein concentration is determined by the Bradford method [23]. Bovine serum albumin (Sigma-Aldrich) is used as the standard protein. A dilution series of bovine serum albumin (0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, and 2.0 mg/mL) is freshly prepared to make a standard curve. The Bio-Rad Protein Assay Dye Reagent Concentrate (Bio-Rad) is diluted five times to prepare the working solution. The concentration of each protein sample is calculated from the standard curve.

4. Activity of alcohol dehydrogenase or glucose-6-phosphate dehydrogenase: protein samples used for enzyme assay are diluted to 1.0 mg/mL using extraction buffer. A volume of 100 μL of protein sample is added to 900 μL of reaction buffer for alcohol dehydrogenase assay or for glucose-6-phosphate dehydrogenase assay. Mix immediately by inversion and measure the reaction for 5 min at 25 °C at 340 nm (EC340 = 6.23 mM⁻¹ cm⁻¹). The activity of alcohol dehydrogenase [24] or glucose-6-phosphate dehydrogenase [25] is calculated using the following formula: (ΔA340 × total volume × sample dilution factor)/(6.23 × sample volume).

5. Activity of fumarase: protein samples used for fumarase assay are diluted to 1.0 mg/mL using extraction buffer. A volume of 100 μL of protein sample is added to 900 μL of reaction buffer for enzyme assay. Mix immediately by inversion and measure the reaction for 5 min at 25 °C at 340 nm (EC340 = 2.55 mM⁻¹ cm⁻¹). The activity of fumarase is calculated using the formula: (ΔA340 × total volume × sample dilution factor)/(2.55 × sample volume) [26].

6. Activity of catalase: protein sample used for catalase assay is diluted to 1.0 mg/mL using extraction buffer. A volume of 100 μL of protein sample is added to 900 μL of reaction buffer for enzyme assay. Mix immediately by inversion and measure the reaction for 5 min at 25 °C at 240 nm (EC240 = 40 mM⁻¹ cm⁻¹). The activity of catalase is calculated with the formula: (ΔA240 × total volume × sample dilution factor)/(40 × sample volume) [27].

7. Activity of NADH cytochrome c reductase: protein sample used for NADH cytochrome c reductase assay is diluted to 1.0 mg/mL using extraction buffer. A volume of 100 μL of protein sample is added to 900 μL of reaction buffer for enzyme assay. Mix immediately by inversion and measure the reaction for 5 min at 25 °C at 550 nm (EC550 = 21.1 mM⁻¹ cm⁻¹). The activity of NADH cytochrome c reductase is calculated with the formula: (ΔA550 × total volume × sample dilution factor)/(21.1 × sample volume) [28, 29].
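The activity formula shared by steps 4–7 can be evaluated as follows. This is a minimal sketch that takes the extinction coefficients exactly as listed above and assumes that ΔA is expressed per minute and that the cuvette path length is 1 cm (neither is stated explicitly in the steps); the default volumes correspond to 100 μL of sample in a 1 mL reaction.

```python
# Extinction coefficients (mM^-1 cm^-1) as given in steps 4-7.
EXTINCTION_MM = {
    "alcohol dehydrogenase": 6.23,
    "glucose-6-phosphate dehydrogenase": 6.23,
    "fumarase": 2.55,
    "catalase": 40.0,
    "NADH cytochrome c reductase": 21.1,
}


def enzyme_activity(delta_a_per_min, enzyme, total_volume_ml=1.0,
                    sample_volume_ml=0.1, dilution_factor=1.0):
    """(delta A x total volume x dilution factor) / (EC x sample volume).

    With delta A per minute and a 1 cm path length (assumptions), the result
    is in micromol per min per mL of sample.
    """
    ec = EXTINCTION_MM[enzyme]
    return (delta_a_per_min * total_volume_ml * dilution_factor) / (ec * sample_volume_ml)
```

Dividing the result by the protein concentration of the sample (1.0 mg/mL after dilution) gives a specific activity per mg of protein, which makes marker enzyme activities comparable between fractions.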

3.5 Proteomic Analysis of ER Proteins

3.5.1 Preparation of Peptides for Gel-Free/Label-Free Proteomic Analysis

1. ER fraction (Subheading 3.1, step 5 and Subheading 3.2, step 8) is dissolved in lysis buffer followed by sonication in cold water for 20 min. The suspension is centrifuged at 20,000 × g for 20 min at 4 °C. The solubilized proteins are retained in the supernatant.

2. Protein concentration is determined as described in Subheading 3.4, step 3.

3. Proteins (100 μg) are added to 400 μL of methanol and mixed thoroughly before adding 100 μL of chloroform and 300 μL of water.

4. The mixed sample is centrifuged at 20,000 × g for 10 min at room temperature to achieve phase separation.

5. The upper aqueous phase is discarded and 300 μL of methanol is slowly added to the lower phase.

6. The mixture is centrifuged at 20,000 × g for 10 min at room temperature. The supernatant is discarded and the pellet is allowed to dry at room temperature.

7. The dried pellet is resuspended in 20 μL of 50 mM ammonium bicarbonate.

8. Proteins are reduced with 5 μL of 250 mM dithiothreitol in 50 mM ammonium bicarbonate for 30 min at 56 °C.

9. Proteins are alkylated with 5 μL of 300 mM iodoacetamide in 50 mM ammonium bicarbonate for 30 min at 37 °C in darkness.

10. Alkylated proteins are resuspended in 40 μL of 100 mM ammonium bicarbonate.

11. Proteins are digested with 10 μL of 0.1 μg/μL trypsin (Wako) and 10 μL of 0.1 μg/μL lysyl endopeptidase (Wako) for 16 h at 37 °C.

12. Peptides are acidified with 20 μL of 20% (v/v) formic acid (pH < 3) and centrifuged at 20,000 × g for 10 min at room temperature.

13. The supernatant is collected and the acidified peptides are desalted with C18-pipette tips (SPE C-TIP, Nikkyo Technos, Tokyo, Japan).

14. Desalted peptides are subjected to nano-liquid chromatography (LC)-MS/MS analysis.
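The working concentrations implied by the volumes in steps 7–11 can be checked with simple dilution arithmetic. The following is a minimal sketch that assumes volumes are additive and that nothing is lost between steps; the function names are illustrative.

```python
def final_conc_mm(stock_mm, added_ul, volume_before_ul):
    """Concentration (mM) of a reagent right after adding it to a sample."""
    return stock_mm * added_ul / (volume_before_ul + added_ul)


def protein_to_enzyme_ratio(protein_ug, enzyme_ul, enzyme_ug_per_ul):
    """Protein mass divided by protease mass (w/w)."""
    return protein_ug / (enzyme_ul * enzyme_ug_per_ul)


# Step 8: 5 uL of 250 mM dithiothreitol added to the 20 uL resuspended pellet.
DTT_MM = final_conc_mm(250, 5, 20)
# Step 9: 5 uL of 300 mM iodoacetamide added to the resulting 25 uL.
IAA_MM = final_conc_mm(300, 5, 25)
# Step 11: 10 uL of 0.1 ug/uL trypsin for 100 ug of protein.
TRYPSIN_RATIO = protein_to_enzyme_ratio(100, 10, 0.1)
```

Under these assumptions both reagents are at about 50 mM when added, and each protease is used at roughly 1:100 (w/w, enzyme to protein).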


3.5.2 Mass Spectrometry Analysis

Peptides are separated using an Ultimate 3000 nanoLC system (Dionex, Germering, Germany) equipped with a C18 PepMap trap column (300 μm ID × 5 mm; Dionex) equilibrated with 0.1% formic acid and eluted with a linear acetonitrile gradient (8–30% over 150 min) in 0.1% formic acid at a flow rate of 200 nL/min on a C18 Tip column (75 μm ID × 120 mm; Nikkyo Technos) with a spray voltage of 1.8 kV. Peptide ions are detected using a nanospray LTQ Orbitrap Discovery MS (Thermo Fisher Scientific) in data-dependent acquisition mode with Xcalibur software (version 2.0.7; Thermo Fisher Scientific). Full-scan mass spectra are acquired over 400–1500 m/z with a resolution of 30,000. A lock mass function is used to obtain high mass accuracy. Ions of C24H39O4+ (m/z 391.28429), C14H46NO7Si7+ (m/z 536.16536), and C16H52NO8Si8+ (m/z 610.18416) are used as lock mass standards [30]. The ten most intense precursor ions are selected for collision-induced fragmentation in the linear ion trap at a normalized collision energy of 35%. Dynamic exclusion is employed within 90 s to prevent repetitive selection of peptides [31].
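Mass accuracy relative to the lock mass standards listed above is usually expressed in parts per million. The following is a minimal sketch of that calculation only; it does not reproduce the instrument's internal recalibration, and the observed value in the usage note is invented for illustration.

```python
# Lock mass standards (m/z) listed in this subheading.
LOCK_MASSES = (391.28429, 536.16536, 610.18416)


def ppm_error(observed_mz, theoretical_mz):
    """Mass error of an observed m/z in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def nearest_lock_mass(observed_mz, lock_masses=LOCK_MASSES):
    """Return the lock mass closest to an observed m/z."""
    return min(lock_masses, key=lambda m: abs(m - observed_mz))
```

For instance, a hypothetical reading of 391.28820 for the first standard would correspond to an error of about 10 ppm, which is the precursor tolerance commonly applied in the subsequent database search.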

3.5.3 Protein Identification from Acquired Mass Spectrometry Data

Protein identification is conducted using the Mascot search engine (version 2.5.1; Matrix Science, London, UK) with a soybean peptide database constructed from the soybean genome database (Phytozome version 12; http://www.phytozome.net/soybean) [32]. Acquired raw files are processed using Proteome Discoverer software (version 1.4.0.288; Thermo Fisher Scientific). Parameters set in the Mascot search engine are as follows: carbamidomethylation of cysteine as fixed modification; oxidation of methionine as variable modification; trypsin as specific proteolytic enzyme; one missed cleavage allowed; peptide mass tolerance of 10 ppm; fragment mass tolerance of 0.8 Da; and peptide charge set at +2, +3, and +4. The peptide cutoff score is 10, and the S/N threshold (FT-only) is set at 1.5 for peak filtration. An automatic decoy database search is performed as part of the search. Mascot percolator is used to improve the accuracy and sensitivity of peptide identification [33]. False discovery rates for peptide identification of all searches are less than 0.01. Peptides with a percolator ion score of more than 13 (p < 0.05) are used for protein identification.

3.5.4 Analysis of Relative Protein Abundance Using Acquired Mass Spectrometry Data

Acquired Mascot results are exported into SIEVE software (version 2.1.377; Thermo Fisher Scientific) for quantitative comparison between the control and experimental groups. Chromatographic peaks detected by MS are aligned, and peptide peaks are detected as frames on all parent ions scanned by MS/MS, using a frame time width of 5 min and a frame m/z width of 10 ppm. Chromatographic peak areas within a frame are compared for each sample, and ratios between samples are determined for each frame. Frames with an MS/MS scan are matched to Mascot results. Peptide ratios between samples are determined from the variance-weighted average of the ratios of the frames whose MS/MS spectra match the peptide. Peptide ratios are further integrated to determine the ratios of the corresponding proteins. Total ion current is used to normalize the differential analysis of protein abundance. Ratio outliers are removed with the frame table filter based on frame area. The minimum requirement for protein identification is two matched peptides. Isoforms are removed manually according to protein ID. Changes in relative protein abundance between the control and experimental groups are considered significant at p < 0.05.
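To make the variance-weighted averaging step concrete, here is a minimal sketch using the textbook inverse-variance weighted mean. It is an assumption that this form resembles what SIEVE computes internally; the function name and the example values are hypothetical.

```python
def variance_weighted_mean(ratios, variances):
    """Inverse-variance weighted mean: frames with noisier ratios count less."""
    if not ratios or len(ratios) != len(variances):
        raise ValueError("ratios and variances must be non-empty and of equal length")
    weights = [1.0 / v for v in variances]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


# Two frames matched to the same peptide (illustrative numbers only).
frame_ratios = [2.0, 4.0]
frame_variances = [1.0, 3.0]
peptide_ratio = variance_weighted_mean(frame_ratios, frame_variances)
print(peptide_ratio)
```

With equal variances the result reduces to the plain arithmetic mean; here the second frame is three times noisier, so the peptide ratio is pulled toward the first frame.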

3.5.5 Analysis of Absolute Protein Amount Using Acquired Mass Spectrometry Data

XML files exported from Mascot are used to analyze absolute protein abundance. The exponentially modified protein abundance index (emPAI) is used as the measure of absolute protein amount. The emPAI value of each identified protein is divided by the sum of the emPAI values of all identified proteins and multiplied by 100, so that the absolute protein amount is expressed as a molar percentage (mol %) [34].
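The mol % calculation can be sketched as follows. The emPAI definition used here, 10 raised to the ratio of observed to observable peptides, minus 1, is the one given by Ishihama et al. [34]; the protein identifiers and values are hypothetical, and in practice the emPAI values would be read from the Mascot export rather than recomputed.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (observed peptides / observable peptides) - 1."""
    return 10 ** (n_observed / n_observable) - 1


def mol_percent(empai_by_protein: dict) -> dict:
    """Express each protein's emPAI as a percentage of the summed emPAI."""
    total = sum(empai_by_protein.values())
    return {protein: value / total * 100 for protein, value in empai_by_protein.items()}


# Hypothetical identifiers and emPAI values, for illustration only.
example = {"protein_A": 9.0, "protein_B": 1.0}
print(mol_percent(example))
```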

3.5.6 Visualization of Protein Abundance

Visualization of protein abundance is performed using MapMan software (version 3.6.0RC1). The software and the Gmax_109_peptide mapping files are downloaded from the MapMan website (http://mapman.gabipd.org/web/guest/mapman) [35].

3.5.7 Protein Localization Prediction

Protein localization is predicted using the intracellular targeting prediction programs SUBA3 (http://suba3.plantenergy.uwa.edu.au/) [36], MultiLoc2 (http://abi.inf.uni-tuebingen.de/Services/MultiLoc2) [37], and WoLF PSORT (http://www.genscript.com/wolf-psort.html) [38].

4 Notes

1. A 1.0 g portion of fresh plant sample is suitable for ER enrichment; this amount works well for soybean root tips. The sample is ground in grinding buffer consisting of 4 mL of 1× isosmotic homogenization buffer combined with 40 μL of 100× protease inhibitor cocktail. Grinding buffer is freshly prepared for each ER enrichment.

2. Record the volume of supernatant used for precipitation. Based on this record, the volume of CaCl2 added is 15 times that of the supernatant.

3. Use a Pasteur pipette to add CaCl2 drop by drop. A small stirring bar is recommended. Precipitation time can be optimized according to the volume of CaCl2.

4. Vortex vigorously to help the Ionic Detergent Compatibility Reagent dissolve in the Pierce 660 nm Protein Assay Reagent. Protect the mixture from light with aluminum foil.

5. Remove air bubbles between the membrane and the filter paper. The order of the “sandwich” is “anode–filter paper–membrane–gel–filter paper–cathode,” from bottom to top.

6. Three minutes is recommended for each membrane wash. Washing time can be optimized based on the intensity of the detected signals.

7. The dilution ratio is optimized based on the purity of the antibody. A high dilution ratio is recommended for highly purified antibodies.

8. Detection is best carried out as soon as possible after incubation with the working solution. To obtain a suitable signal intensity, the incubation time, exposure time, and concentration of proteins or antibodies can be optimized.

Acknowledgments

This work was supported by JSPS KAKENHI Grant Number 15H04445.

References

1. Healy SJ, Verfaillie T, Jager R et al (2012) Biology of the endoplasmic reticulum. In: Agostinis P, Samali A (eds) Endoplasmic reticulum stress in health and disease. Springer, Dordrecht, pp 3–22

2. Kleizen B, Braakman I (2004) Protein folding and quality control in the endoplasmic reticulum. Curr Opin Cell Biol 16:343–349

3. Howell SH (2013) Endoplasmic reticulum stress responses in plants. Annu Rev Plant Biol 64:477–499

4. Papp S, Dziak E, Michalak M, Opas M (2003) Is all of the endoplasmic reticulum created equal? The effects of the heterogeneous distribution of endoplasmic reticulum Ca2+-handling proteins. J Cell Biol 160:475–479

5. Liu L, Cui F, Li Q et al (2011) The endoplasmic reticulum-associated degradation is necessary for plant salt tolerance. Cell Res 21:957–969

6. Wang X, Komatsu S (2016) Gel-free/label-free proteomic analysis of endoplasmic reticulum proteins in soybean root tips under flooding and drought stresses. J Proteome Res 15:2211–2227

7. Chen X, Karnovsky A, Sans MD et al (2010) Molecular characterization of the endoplasmic reticulum: insights from proteomic studies. Proteomics 10:4040–4052

8. Maltman DJ, Gadd SM, Simon WJ et al (2007) Differential proteomic analysis of the endoplasmic reticulum from developing and germinating seeds of castor (Ricinus communis) identifies seed protein precursors as significant components of the endoplasmic reticulum. Proteomics 7:1513–1528

9. Qian D, Tian L, Qu L (2015) Proteomic analysis of endoplasmic reticulum stress responses in rice seeds. Sci Rep 5:14255

10. Barba-Espín G, Dedvisitsakul P, Hägglund P et al (2014) Gibberellic acid-induced aleurone layers responding to heat shock or tunicamycin provide insight into the N-glycoproteome, protein secretion, and endoplasmic reticulum stress. Plant Physiol 164:951–965

11. Komatsu S, Kuji R, Nanjo Y et al (2012) Comprehensive analysis of endoplasmic reticulum-enriched fraction in root tips of soybean under flooding stress using proteomics techniques. J Proteome 77:531–560

12. Graham JM (2002) Fractionation of Golgi, endoplasmic reticulum, and plasma membrane from cultured cells in a preformed continuous iodixanol gradient. Sci World J 2:1435–1439

13. Williamson CD, Wong DS, Bozidis P et al (2015) Isolation of endoplasmic reticulum, mitochondria, and mitochondria-associated membrane and detergent resistant membrane fractions from transfected cells and from human cytomegalovirus-infected primary fibroblasts. Curr Protoc Cell Biol 68:3.27.1–3.27.33

14. Shore GC, Tata JR (1977) Two fractions of rough endoplasmic reticulum from rat liver. I. Recovery of rapidly sedimenting endoplasmic reticulum in association with mitochondria. J Cell Biol 2:714–725

15. Coughlan SJ, Hastings C, Winfrey RJ Jr (1996) Molecular characterisation of plant endoplasmic reticulum. Identification of protein disulfide-isomerase as the major reticuloplasmin. Eur J Biochem 235:215–224

16. Maltman DJ, Simon WJ, Wheeler CH et al (2002) Proteomic analysis of the endoplasmic reticulum from developing and germinating seed of castor (Ricinus communis). Electrophoresis 23:626–639

17. Chanat E, Le Parc A, Lahouassa H et al (2016) Isolation of endoplasmic reticulum fractions from mammary epithelial tissue. J Mammary Gland Biol Neoplasia 21:1–8

18. Wang X, Li S, Wang H et al (2017) Quantitative proteomics reveal proteins enriched in tubular endoplasmic reticulum of Saccharomyces cerevisiae. eLife 6:e23816

19. Komatsu S, Hashiguchi A (2018) Subcellular proteomics: application to elucidation of flooding-response mechanisms in soybean. Proteomes 6:E13

20. Wang X, Komatsu S (2016) Plant subcellular proteomics: application for exploring optimal cell function in soybean. J Proteome 143:45–56

21. Komatsu S, Yamamoto A, Nakamura T et al (2011) Comprehensive analysis of mitochondria in roots and hypocotyls of soybean under flooding stress using proteomics and metabolomics techniques. J Proteome Res 10:3993–4004

22. Nouri MZ, Komatsu S (2010) Comparative analysis of soybean plasma membrane proteins under osmotic stress using gel-based and LC MS/MS-based proteomics approaches. Proteomics 10:1930–1945

23. Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72:248–254

24. Komatsu S, Nanjo Y, Nishimura M (2013) Proteomic analysis of the flooding tolerance mechanism in mutant soybean. J Proteome 79:231–250

25. Honjoh K, Mimura A, Kuroiwa E et al (2003) Purification and characterization of two isoforms of glucose 6-phosphate dehydrogenase (G6PDH) from Chlorella vulgaris C-27. Biosci Biotechnol Biochem 67:1888–1896

26. Huang S, Jacoby RP, Millar AH et al (2014) Plant mitochondrial proteomics. In: Jorrin Novo JV, Komatsu S, Weckwerth W, Wienkoop S (eds) Plant proteomics: methods and protocol. Springer, New York, pp 499–526

27. Kato M, Shimizu S (1987) Chlorophyll metabolism in higher plants. VII. Chlorophyll degradation in senescing tobacco leaves; phenolic-dependent peroxidative degradation. Botany 65:729–735

28. Hasinoff BB (1990) Inhibition and inactivation of NADH-cytochrome c reductase activity of bovine heart submitochondrial particles by the iron(III)-adriamycin complex. Biochem J 265:865–870

29. Gomez L, Chrispeels MJ (1994) Complementation of an Arabidopsis thaliana mutant that lacks complex asparagine-linked glycans with the human cDNA encoding N-acetylglucosaminyltransferase I. Proc Natl Acad Sci U S A 91:1829–1833

30. Olsen JV, de Godoy LM, Li G et al (2005) Parts per million mass accuracy on an Orbitrap mass spectrometer via lock mass injection into a C-trap. Mol Cell Proteomics 4:2010–2021

31. Zhang Y, Wen Z, Washburn MP et al (2009) Effect of dynamic exclusion duration on spectral count based quantitative proteomics. Anal Chem 81:6317–6326

32. Schmutz J, Cannon SB, Schlueter J et al (2010) Genome sequence of the palaeopolyploid soybean. Nature 463:178–183

33. Brosch M, Yu L, Hubbard T et al (2008) Accurate and sensitive peptide identification with Mascot Percolator. J Proteome Res 8:3176–3181

34. Ishihama Y, Oda Y, Tabata T et al (2005) Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics 4:1265–1272

35. Usadel B, Poree F, Nagel A et al (2009) A guide to using MapMan to visualize and compare Omics data in plants: a case study in the crop species, maize. Plant Cell Environ 32:1211–1229

36. Tanz SK, Castleden I, Hooper CM et al (2013) SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res 41:D1185–D1191

37. Blum T, Briesemeister S, Kohlbacher O (2009) MultiLoc2: integrating phylogeny and gene ontology terms improves subcellular protein localization prediction. BMC Bioinformatics 10:274

38. Horton P, Park KJ, Obayashi T et al (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Res 35:W585–W587

Chapter 10

Dimethyl Labeling-Based Quantitative Proteomics of Recalcitrant Cocoa Pod Tissue

Yoel Esteve-Sanchez, Jaime A. Morante-Carriel, Ascension Martinez-Marquez, Susana Selles-Marchart, and Roque Bru-Martinez

Abstract

Dimethyl labeling is a type of stable-isotope labeling suitable for creating isotopic variants of peptides, which can then be used in quantitative proteomics experiments. Labeling is achieved through a reductive amination/alkylation reaction using the low-cost reagents formaldehyde and cyanoborohydride, resulting in dimethylation of the free amine groups of Lys and N-termini. The availability of isotopomeric forms of these reagents allows the generation of up to six different isotopic variants. Here we describe the application of dimethylation to create two isotopic variants, light and heavy, differing by 4 Da, to label the total tryptic digest peptides of cocoa pod protein extracted from healthy pods of cultivars susceptible and resistant to the fungal disease called “frosty pod,” caused by Moniliophthora roreri.

Key words Dimethyl labeling, Stable-isotope labeling, Quantitative proteomics, Plant proteomics, Cocoa pod, Moniliophthora roreri, Fungal disease

1 Introduction

Stable-isotope labeling of proteins and peptides has become a popular approach in quantitative proteomics because mass spectrometry (MS) can distinguish between isotopic variants. Each isotopic variant can be associated with a type of sample, and the differently labeled samples can be pooled and subjected to a single run of liquid chromatography (LC) separation coupled to tandem mass spectrometry (MS/MS). The mass intensity ratio of isotopic variant pairs (and thus between samples) is used as a relative abundance measurement, while the MS/MS spectra of the fragmented peptide ions are used for protein identification, as in a standard shotgun proteomics identification experiment.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_10, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Currently, there is a plethora of labeling strategies and methods for quantitative proteomics. At first glance, there are two groups of methods according to the signal used for quantification. The first group uses the intensity of the parent peptide ions, so the isotopic variants are resolved at the MS scan level. In the second group, parent peptide ions are isobaric and thus indistinguishable at the MS level, but because they bear different combinations of isotopes, each variant releases a characteristic reporter ion upon fragmentation of the isobaric parent ion, and quantification is performed at the MS/MS level. A major concern when using the mass intensity signal for quantification is the retention time shift between isotopic variants that may occur during chromatographic separation, particularly when variants are created by substitution of hydrogen by deuterium [1]. Thus, instead of using the mass intensity ratio of a single MS scan, the issue is resolved by measuring the ratio of the peak areas of the extracted ion chromatograms of the isotopic variant pairs [2].
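The peak-area approach can be illustrated with a short sketch. It integrates two extracted ion chromatograms by the trapezoidal rule and takes the ratio of the areas; the data points are invented, the heavy trace is deliberately shifted to a slightly earlier retention time, and the function names are hypothetical rather than taken from any quantification package.

```python
def peak_area(times, intensities):
    """Trapezoidal area under an extracted ion chromatogram (XIC) peak."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (intensities[i] + intensities[i - 1]) * (times[i] - times[i - 1])
    return area


def heavy_light_ratio(light_xic, heavy_xic):
    """Ratio of heavy to light XIC peak areas; each XIC is (times, intensities)."""
    return peak_area(*heavy_xic) / peak_area(*light_xic)


# Invented traces: the heavy peak elutes 0.05 min earlier than the light one.
light_xic = ([10.00, 10.10, 10.20, 10.30, 10.40], [0, 100, 200, 100, 0])
heavy_xic = ([9.95, 10.05, 10.15, 10.25, 10.35], [0, 200, 400, 200, 0])
ratio = heavy_light_ratio(light_xic, heavy_xic)
print(round(ratio, 3))
```

Because each peak is integrated over its own elution window, the small retention time offset does not distort the ratio, whereas comparing intensities within a single MS scan would.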

The most popular labeling strategies are metabolic and chemical. Metabolic labeling [3, 4] relies on the incorporation of isotopic variants of amino acids (e.g., 13C6-Arg vs 12C6-Arg, or uniform incorporation of 15N vs 14N) during protein synthesis; thus, the strategy is mainly applicable to cell cultures. The metabolic variants so generated can be resolved at the MS level. Chemical labeling consists of the covalent binding to protein/peptide reactive groups (typically -SH in Cys; -COOH in Asp, Glu, and C-termini; -NH2 in Lys and N-termini) of chemical groups of which a number of isotopic variants are available. There are tens of stable-isotope chemical reagents, most of which are designed to attach to free amine or carboxyl groups depending on their chemistry (for a recent review see [5]), thus ensuring that each peptide generated by enzymatic protein digestion can be labeled. For chemical labeling there are both isobaric reagents [6–8] that release reporter ions at the MS/MS level (e.g., iTRAQ, available as fourplex and eightplex; TMT, available as duplex, sixplex, and tenplex) and isotopomeric reagents distinguishable at the MS level (e.g., ICPL, available as fourplex [9]) that are incorporated into peptides through a direct nucleophilic substitution reaction.

Dimethyl labeling [15] is a chemical labeling method that produces isotopic variants by adding two methyl groups to the free amino groups of Lys and N-termini through a reductive amination/alkylation reaction. Although the labeling method is applicable to peptide digests obtained with virtually any protease used in proteomics, trypsin digestion is a good choice: it is more robust and cheaper, and a high proportion of the peptides will have Lys at the C-terminus and thus carry a double dimethyl label. The exception occurs when Pro is the N-terminal residue of the peptide to be labeled; in that case only one methyl group is incorporated, because the amine is secondary. The use of trypsin as the digesting enzyme precludes the generation of such peptides. The labeling reaction uses formaldehyde as the carbon donor for alkylation and cyanoborohydride as the reducing agent of the imine intermediate. For each methyl group, the carbon and two hydrogens come from formaldehyde and the third hydrogen from the reducing agent, as depicted in Fig. 1; thus, by combining isotopomers of both reagents, a number of isotopic dimethyl variants can be generated. There are three commercial isotopomers of formaldehyde (CH2O, CD2O, and 13CD2O) and two of cyanoborohydride (NaBH3CN and NaBD3CN) useful for dimethyl labeling. Combinations of these reagents lead to six isotopic variants that bring about an increase in nominal mass per –NH2 site of 28, 30, 32, 34, 34, and 36 Da. The two +34 variants are in fact pseudo-isobaric, differing by 0.00584 Da [10]. This allows different multiplex methods for quantitative proteomics to be designed depending on the reagent combinations used, including the most commonly used duplex (+28, +32), triplex (+28, +32, +36) [11], fourplex (+28, +30, +32, +34) [12], and fiveplex (+28, +30, +32, +34, +36) [13]. This multiplexing ability of dimethyl labeling is comparable to that of the aforementioned chemical methods. Furthermore, its much lower cost per sample (as low as €0.10 for 25 × 3 μg protein triplex labeling [14] and much less for 25 × 2 μg duplex labeling) makes this strategy highly cost-effective compared with other labeling options.
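The six mass increments can be reproduced from the atom bookkeeping described above: per methyl group, one carbon and two hydrogens from formaldehyde plus one hydrogen from cyanoborohydride, minus the amine hydrogen that is replaced. The sketch below uses standard monoisotopic masses; the dictionary and function names are hypothetical.

```python
# Monoisotopic masses in Da (standard values).
H, D = 1.00782503207, 2.0141017778
C12, C13 = 12.0, 13.0033548378

# Each formaldehyde isotopomer donates one carbon and two hydrogens per methyl.
FORMALDEHYDE = {"CH2O": (C12, H), "CD2O": (C12, D), "13CD2O": (C13, D)}
# Each cyanoborohydride isotopomer donates the third hydrogen of the methyl.
CYANOBOROHYDRIDE = {"NaBH3CN": H, "NaBD3CN": D}


def dimethyl_shift(formaldehyde: str, reductant: str) -> float:
    """Mass added to one primary amine site by two methyl groups."""
    carbon, h_formaldehyde = FORMALDEHYDE[formaldehyde]
    h_reductant = CYANOBOROHYDRIDE[reductant]
    # Each methyl replaces one amine hydrogen, hence the "- H".
    per_methyl = carbon + 2 * h_formaldehyde + h_reductant - H
    return 2 * per_methyl


shifts = {(f, r): dimethyl_shift(f, r) for f in FORMALDEHYDE for r in CYANOBOROHYDRIDE}
for (f, r), shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{f:7s} + {r}: +{shift:.4f} Da")
```

The two +34 combinations (CD2O with NaBD3CN, and 13CD2O with NaBH3CN) come out about 0.0058 Da apart, and the light and heavy pair used in this chapter (CH2O vs CD2O, both with NaBH3CN) differs by about 4.0251 Da per labeled site.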

Dimethyl labeling has been shown to be compatible with the major chromatographic methods used in proteomics [15–18] and, except for extremely large peptides bearing a single dimethyl label, there is no significant overlap of isotopic envelopes between isotopic variants that might affect quantitation accuracy [19].

Dimethyl labeling has been widely applied in global protein abundance profiling (search for human disease biomarkers, basic research on cellular pathways in disease models) and in quantitative PTM analysis, including protein phosphorylation and glycosylation (reviewed in [20]). In microorganisms, dimethyl labeling has been applied rather scarcely: to basic research with the models E. coli [21] and yeast [22], and to studying the performance of microbial polysaccharide substrate degradation for biofuel production [23]. In plants, to the best of our knowledge, dimethyl labeling has not yet been applied to global protein abundance profiling.

Fig. 1 Dimethyl labeling reaction in a primary amine. Reductive amination/alkylation with formaldehyde as donor of the carbon and two hydrogens (in red) and sodium cyanoborohydride as donor of the third hydrogen (in green) of the linked methyl groups. There are three commercial isotopomers of formaldehyde, CH2O, CD2O, and 13CD2O, and two of cyanoborohydride, NaBH3CN and NaBD3CN, useful for dimethyl labeling. Combinations of these reagents lead to up to six isotopic variants that bring about an increase in nominal mass per –NH2 site of 28, 30, 32, 34, 34, and 36 Da

Cocoa pod is a highly recalcitrant tissue for protein extraction, and exhaustive tissue washing is necessary to extract protein of good quality and in good quantity [24]. It is affected by a major fungal disease known as “frosty pod” [25], caused by Moniliophthora roreri, which causes large economic losses to producers in tropical regions of the Americas and affects the economy of a large number of small farms [26]. Here we describe a method that applies dimethyl labeling to carry out global quantitative proteomic profiling of cocoa pods with different susceptibility phenotypes to this fungal disease.

2 Materials

2.1 Sample Preparation and Labeling

1. Liquid nitrogen.

2. 20% (w/v) trichloroacetic acid (TCA) in acetone (stored at −20 °C).

3. 20% (w/v) TCA in water (stored at 4 °C).

4. Tris-saturated phenol, pH 8.0.

5. SDS buffer: 30% (w/v) sucrose, 2% (w/v) SDS, 0.1 M Tris–HCl, pH 8.0, and 5% (w/v) 2-mercaptoethanol.

6. 0.1 M ammonium acetate in methanol.

7. 2% (w/v) sodium deoxycholate (DOC) (stored at 4 °C).

8. 24% (w/v) TCA in water (stored at 4 °C).

9. 80% (v/v) acetone (stored at −20 °C).

10. 6 M urea.

11. 1 M stock solution of triethylammonium bicarbonate (TEAB).

12. 0.2 M dithiothreitol (DTT) in 25 mM TEAB. Prepare fresh before use.

13. 0.2 M iodoacetamide (IAM) in 25 mM TEAB.

14. Trypsin mass spectrometric grade.

15. SpeedVac.

16. 4% (v/v) formaldehyde (CH2O).

17. 4% (v/v) formaldehyde (CD2O).

18. 0.6 M sodium cyanoborohydride (NaBH3CN).

19. 1% (v/v) ammonium (NH4+).

20. 5% (v/v) formic acid (FA).

21. PepClean™ C18 spin columns (Thermo Scientific).

22. Activation Solution: 50% (v/v) acetonitrile (ACN). 400 μl per sample.

23. Equilibration Solution: 0.5% (w/v) trifluoroacetic acid (TFA) in 5% (v/v) ACN. 400 μl per sample.

24. Sample Buffer: 2% (w/v) TFA in 20% (v/v) ACN; 1 μl for every 3 μl of sample.

25. Wash Solution: 0.5% (w/v) TFA in 5% (v/v) ACN. 400–800 μl per sample.

26. Elution Buffer: 70% (v/v) ACN. 40 μl per sample.

2.2 Sample Analysis by LC-MS/MS

1. Reverse-phase (RP) analytical column: AdvanceBio UHPLC column, 2.1 mm × 250 mm, 2.7 μm particle size (Agilent Technologies).

2. Agilent 6550 hybrid Q-TOF spectrometer equipped with a Jet Stream® source (see Note 1).

3. RP chromatography buffer A (RPB-A): 0.1% (v/v) FA and 5% (v/v) ACN in water.

4. RP chromatography buffer B (RPB-B): 0.1% (v/v) FA and 90% (v/v) ACN in water.

3 Methods

The procedure detailed here for high-quality protein extraction from a recalcitrant plant tissue has been used in our group with cocoa (Theobroma cacao L.) pod [24] and is adapted from Wang et al. [27] with some modifications. Digestion is performed according to Klammer and MacCoss [28] with modifications. Labeling steps are adapted from Boersema et al. [14]. The following protocol is used to assess proteomic differences between two samples in the same LC-MS/MS assay.

3.1 Sample Preparation and Labeling

3.1.1 Protein Extraction

1. Grind plant tissue (0.1–0.3 g) with mortar and pestle in liquidnitrogen to a fine powder and place it in 2 ml microtubes.

2. Resuspend samples in 1 ml of cold 20% TCA in acetone.

3. Vortex thoroughly for 30 s and centrifuge at 10,000 × g at 4 °C for 5 min.

4. Discard the supernatant and repeat the washing steps until the supernatant becomes colorless.

5. Wash the pellet twice with 1 ml of cold 20% TCA in water as in step 3.

6. Wash the pellet twice with 1 ml of cold 80% acetone as in step 3.

7. Dry the pellet at room temperature.

8. Resuspend the pellet in 800 μl of Tris-saturated phenol, pH 8.0, and 800 μl of SDS buffer.

9. Vortex thoroughly for 30 s and incubate with orbital shaking on ice for 1 h.

10. Centrifuge at 10,000 × g at 4 °C for 20 min and recover the upper phenol phase by pipetting it into a fresh microtube (see Note 2).

11. Reextract the remaining aqueous phase as in steps 9 and 10.

12. Precipitate proteins from the pooled phenol phases by adding 5 volumes of cold 0.1 M ammonium acetate in methanol and incubating at −20 °C overnight.

13. Collect precipitated proteins by centrifugation at 10,000 × g at 4 °C for 10 min.

14. Wash the pellet twice with 0.1 M ammonium acetate in methanol, centrifuging at 10,000 × g at 4 °C for 5 min each time.

15. Wash the pellet twice with 80% acetone as in step 14.

16. Dry the pellet at room temperature.

17. Dissolve the pellet in 25 μl of 6 M urea (see Note 3).

18. Quantify protein using a suitable protein assay kit (e.g., BCA or Bradford assay).

3.1.2 Sample Precipitation

Follow this stage if your proteins have already been extracted and quantified but are resuspended in a solution other than 6 M urea. Otherwise, skip to sample trypsin digestion (Subheading 3.1.3).

1. Aliquot 25 μg of protein from each sample in a new tube.

2. Add 0.5 volumes of 2% (w/v) DOC and incubate on ice for 15 min.

3. Add 0.5 volumes of 24% (w/v) TCA and incubate on ice for 20–30 min.

4. Centrifuge at 10,000 × g at 4 °C for 10 min. Remove the supernatant.

5. Wash the pellet twice with 80% acetone (−20 °C) as in step 4. Dry the pellet at room temperature.

6. Resuspend the pellet in 25 μl of 6 M urea (see Note 3).

3.1.3 Sample Trypsin Digestion

1. Reduce disulfide bridges by adding 0.2 volumes of 0.2 M DTT in 25 mM TEAB. Vortex and incubate at 37 °C for 1 h.

2. Alkylate cysteine thiol groups by adding 0.7 volumes of 0.2 M IAM in 25 mM TEAB. Vortex and incubate in the dark at room temperature for 1 h.

3. Add 2 volumes of 0.1 M TEAB and 0.4 volumes of 0.2 M DTT. If necessary, adjust the pH to 7–9 with 1 M TEAB to optimize the subsequent trypsin digestion.

4. Digest samples into peptides by adding trypsin at a 30:1 protein–trypsin ratio (i.e., 1 μg trypsin per 30 μg protein). Vortex and incubate at 37 °C overnight.

5. Complete digestion by adding trypsin at a 60:1 protein–trypsin ratio (i.e., 1 μg trypsin per 60 μg protein). Vortex and incubate at 37 °C for 3–5 h.

6. Dry samples in SpeedVac centrifuge.
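The reagent amounts in the digestion steps above can be worked out with a short calculation. This is a sketch under two stated assumptions: that "volumes" refer to the initial sample volume (25 μl of 6 M urea) and that the volume of the trypsin additions is negligible. The function and key names are hypothetical.

```python
def digestion_plan(sample_ul: float = 25.0, protein_ug: float = 25.0,
                   urea_molar: float = 6.0) -> dict:
    """Reagent amounts for the digestion, relative to the initial sample volume."""
    plan = {
        "dtt_reduction_ul": 0.2 * sample_ul,       # step 1: 0.2 volumes of 0.2 M DTT
        "iam_ul": 0.7 * sample_ul,                 # step 2: 0.7 volumes of 0.2 M IAM
        "teab_ul": 2.0 * sample_ul,                # step 3: 2 volumes of 0.1 M TEAB
        "dtt_second_ul": 0.4 * sample_ul,          # step 3: 0.4 volumes of 0.2 M DTT
        "trypsin_overnight_ug": protein_ug / 30.0, # step 4: 30:1 protein-trypsin
        "trypsin_second_ug": protein_ug / 60.0,    # step 5: 60:1 protein-trypsin
    }
    total_ul = (sample_ul + plan["dtt_reduction_ul"] + plan["iam_ul"]
                + plan["teab_ul"] + plan["dtt_second_ul"])
    plan["final_volume_ul"] = total_ul
    # Urea is diluted by every addition; trypsin volume is ignored (assumption).
    plan["final_urea_molar"] = urea_molar * sample_ul / total_ul
    return plan


plan = digestion_plan()
for name, value in plan.items():
    print(f"{name}: {value:.3f}")
```

For the default 25 μl and 25 μg sample, the additions bring the volume to 107.5 μl, which dilutes the urea to roughly 1.4 M before trypsin is added.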

3.1.4 Sample Dimethyl Labeling

1. Dissolve samples in 100 μl of 0.1 M TEAB with vortex.

2. Label one of the two samples with 4 μl of 4% (v/v) light formaldehyde (CH2O) and the other with 4 μl of 4% (v/v) heavy formaldehyde (CD2O).

3. Complete labeling by adding 4 μl of 0.6 M sodium cyanoborohydride. Vortex and incubate at room temperature with gentle shaking for 1 h.

4. Quench labeling by adding 16 μl of 1% (v/v) ammonium to each tube. Add this reagent in a fume hood. Vortex.

5. Add 8 μl of 5% (v/v) FA to both samples.

6. Finally, combine both samples in the same tube so that they can be run in a single LC-MS/MS assay.

3.1.5 Sample Cleaning

1. Desalt samples with PepClean™ C18 spin columns following the manufacturer’s instructions (Pierce®) (see Note 4). Briefly, prepare the resin with the Activation and Equilibration Solutions. Load the sample in Sample Buffer. Remove impurities with Wash Solution. Finally, elute peptides with Elution Buffer.

3.2 Sample Analysis by LC-MS/MS

The protocol described here is used with an Agilent 1200 UHPLC equipped with an AdvanceBio column (2.1 mm × 250 mm, 2.7 μm particle size) coupled to an Agilent 6550 hybrid Q-TOF spectrometer equipped with a Jet Stream® source.

1. Injections of 8 μl are programmed in the autosampler to ensure reproducible sample injection.

2. Peptides are separated on the aforementioned analytical column using a 140-min linear gradient from 3 to 40% RPB-B at a flow rate of 0.4 ml/min.

3. Peptides are introduced into the mass spectrometer from the LC using a Jet Stream source (Agilent Technologies) operating in positive-ion mode (3500 V) and in high sensitivity mode. Source parameters employed: gas temperature (250 °C), drying gas (14 L/min), nebulizer (35 psig), sheath gas temperature (250 °C), sheath gas flow (11 L/min), capillary voltage (3500 V), fragmentor (360 V), and OCT 1 RF Vpp (750 V).

4. The Q-TOF is operated in high sensitivity mode and Auto MS/MS mode, allowing detection of the 20 most intense precursors with charge 2–5 and above a threshold of 1000 counts in a 300–1700 m/z scan. MS/MS spectra (50–1700 m/z scans) are acquired until either 25,000 counts in total are collected or a maximum accumulation time of 333 ms is reached.
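The precursor selection rule in the last step can be sketched as a simple filter and sort. This only illustrates the stated criteria (charge 2–5, more than 1000 counts, 300–1700 m/z, 20 most intense); it is not the instrument's acquisition logic, and the peak records and function name are hypothetical.

```python
def select_precursors(peaks, top_n=20, min_counts=1000,
                      charges=(2, 3, 4, 5), mz_range=(300.0, 1700.0)):
    """Return the top_n most intense peaks that meet the selection criteria."""
    eligible = [
        p for p in peaks
        if p["counts"] > min_counts
        and p["charge"] in charges
        and mz_range[0] <= p["mz"] <= mz_range[1]
    ]
    return sorted(eligible, key=lambda p: p["counts"], reverse=True)[:top_n]


# Invented survey-scan peaks, for illustration only.
peaks = [
    {"mz": 500.0, "charge": 2, "counts": 5000},
    {"mz": 600.0, "charge": 1, "counts": 9000},  # rejected: singly charged
    {"mz": 250.0, "charge": 2, "counts": 8000},  # rejected: below 300 m/z
    {"mz": 700.0, "charge": 3, "counts": 900},   # rejected: below threshold
    {"mz": 800.0, "charge": 4, "counts": 7000},
]
selected = select_precursors(peaks, top_n=2)
print([p["mz"] for p in selected])
```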

3.3 Protein Identification and Quantitation

The protocol described here is based on the functionality of two software packages, Progenesis QI for proteomics (PQIp) (Nonlinear Dynamics, Waters) and Proteolabels (Omics Analytics). For detailed handling of the software, refer to the user guides and online help (http://www.nonlinear.com/progenesis/qi-for-proteomics/v3.0/user-guide/, http://www.omicanalytics.com/products/proteolabels/doc). PQIp provides a platform for MS and MS/MS data extraction, LC-MS alignment across runs, MS feature detection, MS signal intensity normalization, and management of MS/MS spectra-derived peptide and protein identifications. The PQIp output is imported into Proteolabels to carry out the quantitative analysis based on the heavy–light intensity ratio of MS feature pairs. Aligning runs in PQIp before fragment spectra identification allows identified fragment spectra to be propagated across runs, thus filling in identification gaps between runs; likewise, MS feature pairing in Proteolabels allows identities to be propagated within runs. As a result, this workflow (Fig. 2) increases the number of identified spectra and of quantified peptides and proteins.

3.3.1 LC-MS and MS/MS Processing with Progenesis QI for Proteomics

1. Import the .d files generated by Agilent MassHunter Workstation (i.e., the software used for data acquisition from the UHPLC-Q-TOF instrument) into PQIp (see Note 5). PQIp has its own tool for extracting MS and MS/MS raw data from .d files, thus generating peak list files.

2. Select one of the runs as the reference and align all other LC runs to it, making MS features overlap each other to a minimum alignment score of 80% (see Notes 6 and 7) (Fig. 3).

3. Review the peak picking automatically assigned by the software. Add, edit, or delete peak detections manually to define pairs of precursor MS spectra (light and heavy) (see Note 8) (Fig. 4).

4. Export MS/MS spectra as .mgf files (see Note 9).
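When reviewing light and heavy precursor pairs, it helps to know the m/z spacing to expect. With CH2O versus CD2O labeling, each labeled amine differs by four H-to-D substitutions (about 4.0251 Da), so the spacing is that value times the number of labeled sites, divided by the charge. The sketch below is a hypothetical pairing check, not part of PQIp or Proteolabels; the default tolerances mirror the 10 ppm and 0.15 retention time settings given later for Proteolabels, and the retention time unit (minutes) is an assumption.

```python
# Mass difference per labeled amine: four hydrogens replaced by deuterium.
LABEL_SHIFT_DA = 4 * (2.0141017778 - 1.00782503207)


def expected_heavy_mz(light_mz: float, charge: int, n_labels: int) -> float:
    """m/z at which the heavy partner of a light precursor should appear."""
    return light_mz + n_labels * LABEL_SHIFT_DA / charge


def is_pair(light: dict, heavy: dict, n_labels: int,
            tol_ppm: float = 10.0, rt_tol: float = 0.15) -> bool:
    """True if heavy matches the expected m/z and retention time of light."""
    if light["charge"] != heavy["charge"]:
        return False
    target = expected_heavy_mz(light["mz"], light["charge"], n_labels)
    ppm_error = abs(heavy["mz"] - target) / target * 1e6
    return ppm_error <= tol_ppm and abs(heavy["rt"] - light["rt"]) <= rt_tol


# Invented features: a doubly charged peptide carrying one dimethyl label.
light = {"mz": 638.3313, "charge": 2, "rt": 50.00}
heavy = {"mz": 640.3439, "charge": 2, "rt": 49.95}
print(round(expected_heavy_mz(638.3313, 2, 1), 4), is_pair(light, heavy, n_labels=1))
```

A tryptic peptide ending in Lys carries two labels (N-terminus and Lys side chain), so its doubly charged heavy partner would be expected about 4.0251 m/z units above the light one.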

[Fig. 2 diagram. Laboratory workflow: cocoa pod tissue sampling; protein extraction; trypsin digestion; stable-isotope dimethyl labeling; mixing of pairs of differentially labeled phenotypes; UHPLC-MS analysis (e.g., QTOF). Bioinformatic processing: chromatographic run alignment and feature detection (e.g., Progenesis QI); MS/MS spectra search with dimethyl as quantitation parameter (e.g., MASCOT); MS/MS-peptide sequence association (e.g., Progenesis QI); heavy/light ratio quantitation at peptide and protein levels (e.g., Proteolabels); ontology annotation (e.g., Blast2GO).]

Fig. 2 Laboratory workflow and subsequent bioinformatic data treatment. Once proteins have been extracted from plant tissue and digested, dimethyl labeling is carried out independently for each sample. Labeled samples are mixed in pairs so that each pair contains both the heavy and light dimethyl versions and the two phenotypes, and the mix is then analyzed in an LC-MS instrument in data-dependent acquisition mode. Bioinformatic processing comprises LC peak alignment across different runs and MS feature detection. Features belonging to light-labeled peptides keep the expected m/z difference from their respective heavy-labeled counterparts, and their MS/MS spectra are used for database search with dimethyl labeling selected as the quantitation parameter. After the search results are imported, MS/MS spectra are associated with peptide sequences. Quantitation by H/L ratio calculation at peptide and protein levels is the last quantitative workflow step. Identified proteins with differential abundance between experimental groups can be annotated with Gene Ontology terms. The software packages used for each task in this work are indicated in parentheses

Fig. 3 LC-MS map showing peak signals at the PQIp alignment stage. Refinement must focus on the central area, where most points are gathered. Signals in the bottom area of the map (i.e., highest retention time and lowest m/z ratio) may be due to nonpeptidic contaminant compounds. Hairpin-like alignment vectors are first seeded manually and then added automatically throughout the map to make the LC signals of the samples overlap each other as much as possible. The degree of suitability is ranked by a color code, in which green stands for good-quality alignment of the depicted area. High-quality alignment must be achieved at least in the central zone, where most of the peaks are present

3.3.2 Protein Identification

The exported peak list file is used for a fragment ion database search with an appropriate search engine. Here we have used MASCOT. If a different search engine is used, peak lists should be exported in the appropriate format. Provided the LC runs have been refined beforehand, this search detects dimethyl-labeled peptides correctly. Label assignment is essential for protein quantitation.

1. The exported .mgf dataset is searched against a cocoa genome-encoded peptide database (https://www.cacaogenomedb.org/databases.cacao11peptides_pub3i.aa.fasta) supplemented with contaminant proteins, selecting the following settings: trypsin as enzyme with up to three missed cleavages, quantitation with dimethylation, carbamidomethylation of cysteines as fixed modification, oxidation of methionine as variable modification, peptide tolerance of 20 ppm, MS/MS tolerance of 50 ppm, monoisotopic mass, and peptide charges of 2+, 3+, and 4+. Mascot generic is required as the data format.

2. Export the results as .xml files.

Fig. 4 Example of a pair of differentially dimethylated peptides shown in the PQIp interface. Top-left images depict intensity and m/z values for the isotopic envelope of the light precursor (m/z = 638.3313, z = 2) and its chromatographic peak. Top-right images depict intensity and m/z values for the isotopic envelope of the heavy precursor (m/z = 640.3313, z = 2) and its chromatographic peak. Bottom-right image depicts the complete signal map of the LC run to locate the feature shown. Bottom-right shows a zoomed area of the LC-MS map centered on the selected precursor. Light and heavy precursor doublets can be seen throughout. Each precursor in the doublet must have the same number of nonoverlapping isotopic signals detected (i.e., signals framed in the same strip of linked squares), ideally four including the monoisotopic one. Charge state is depicted by a color code (e.g., red peaks stand for doubly charged precursors). Reviewing must focus above all on peaks with high MS/MS counts and good-quality chromatograms that are placed in the central area of the map


3.3.3 Protein Identification Refinement with Progenesis QI for Proteomics

1. Import the search results generated as .xml files by MASCOT into PQIp to review peptide identities. Identifications with a score <20 are filtered out, as well as those from non-T. cacao species.

2. Resolve identification conflicts. Because each MS/MS spectrum has several amino acid sequence candidates, in case of a protein assignment conflict the winner is the candidate with the highest MASCOT score (see Note 10). Discard the remaining identification assignments.

3. Export data to Proteolabels directly from PQIp interface.
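
The filtering and conflict-resolution rules of steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of the logic, not the way PQIp implements it: the `Psm` record and its field names are hypothetical, and only the score cutoff of 20, the species filter, and the highest-score rule (with ties retained, as in Note 10) come from the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Psm:
    """Hypothetical peptide-spectrum match record (field names are assumptions)."""
    spectrum_id: str
    peptide: str
    protein: str
    species: str
    score: float


def refine_identifications(psms, min_score=20.0, species="Theobroma cacao"):
    """Drop low-scoring and off-species matches, then keep the top-scoring
    candidate(s) for each spectrum. Equal top scores are all retained."""
    kept = [p for p in psms if p.score >= min_score and p.species == species]
    best = {}
    for p in kept:
        # Track the highest score seen so far for this spectrum.
        best[p.spectrum_id] = max(best.get(p.spectrum_id, p.score), p.score)
    return [p for p in kept if p.score == best[p.spectrum_id]]
```

Applied to an exported identification table, a helper of this kind would reproduce the manual review outcome only approximately, since visual inspection in PQIp can take chromatogram quality into account as well.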

3.3.4 Protein Quantitation in Proteolabels

Quantification is performed in this software once the precursor LC-MS data have been correctly refined in PQIp and dimethyl labeling has been properly detected in MASCOT.

1. Import desired experiment from PQIp.

2. Experiment detection settings must include dimethyl labeling at the N-terminus and K, a mass shift of 4.025 Da, a mass tolerance of 10 ppm, and a retention time tolerance of 0.15.

3. As mentioned in Note 7, group together replicates (i.e., pairs of differentially labeled samples) belonging to the same experiment to obtain the heavy/light (H/L) ratio for relative quantitation (Fig. 5).

4. Select only peptides identified in a protein group.

5. Export results to generate a table with H/L quantitation ratios for each protein.
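
As an illustration of what the exported H/L table summarizes, the sketch below collapses peptide-level light and heavy abundances into one ratio per protein. The use of the median of log2(H/L) is an assumption made for this example; the protocol does not state how Proteolabels aggregates peptides, and the input tuple layout is hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def protein_log2_ratios(peptides):
    """Summarize peptide H/L ratios per protein.

    peptides: iterable of (protein, light_abundance, heavy_abundance).
    Peptides missing either signal are skipped. Returns a dict mapping each
    protein to the median log2(H/L) of its peptides (an assumed summary).
    """
    per_protein = defaultdict(list)
    for protein, light, heavy in peptides:
        if light > 0 and heavy > 0:
            per_protein[protein].append(math.log2(heavy / light))
    return {protein: median(values) for protein, values in per_protein.items()}
```

Working on the log2 scale keeps up- and down-regulation symmetric around zero, which is convenient when the four replicate ratios of one condition are later compared.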

Fig. 5 Experimental design setup stage in Proteolabels. Differentially labeled samples run in the same LC analysis must be placed in the same position in their respective Sample columns so that they are correctly assigned to one replicate. In this case, the only experiment shown (i.e., Condition 1) is made up of four replicates, each one consisting of differentially labeled peptides with light and heavy dimethyl labels. As an example, Sample A may be the control group and Sample B the treated group. Four H/L ratios would be the output


4 Notes

1. High-resolution spectrometers (e.g., Orbitraps) are suitable too.

2. The SDS complex often appears at the interphase. Care should be taken not to disturb this interphase when pipetting.

3. Do not overheat the sample; otherwise, urea may react with residues such as lysine and arginine. Dissolve the proteins completely, with vortexing and sonication if necessary. If the pellet does not dissolve well, overnight incubation at 4 °C with agitation in 6–8 M urea may work.

4. Prior to LC-MS/MS analysis, sample cleaning is important to eliminate interfering salts. These columns become saturated with 30 μg of protein. If a larger amount of sample must be analyzed, it may be useful to pass the wash fraction of one column through additional columns so that no sample is lost through overloading. The eluates of all columns can be pooled in the same tube so that the entire sample is analyzed in the same LC-MS/MS assay.

5. All datasets to be presented in the work can initially be imported at once, regardless of the experiment they belong to, since the refined data are assigned to a defined group of replicates later in the Proteolabels software. However, the subsequent alignment and LC peak review will then be less time-effective because of the large number of spectra handled.

6. Automatic alignment can be performed, but the results may not be suitable, in which case subsequent manual alignment is necessary.

7. If complete alignment throughout the map is not viable, focus mainly on the feather-like area present in the middle of the map.

8. This step is critical for proper quantitation. Every precursor isotope envelope must have the same number of isotopic variants detected as its labeled counterpart, which must also carry the same charge. It may be useful to take into account the theoretical Δm/z between monoisotopic variants to make pair assignment easier (e.g., the two members of a doubly charged precursor pair differ by 2 m/z units at the monoisotopic variants, because of the 4 Da mass difference between the dimethyl labels).

9. MS/MS spectra data can be exported in a wide variety of search formats. In this case, we use MASCOT as the search engine for protein identification with dimethyl-labeled peptides.


10. There may be peptides with the same identification score for a single protein. In this case, all of the equally scored peptide assignments are retained.
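
The spacing rule in Note 8 can be checked numerically. The sketch below assumes the 4.025 Da mass shift and 10 ppm tolerance entered in Proteolabels (Subheading 3.3.4); the function names and the `n_labels` parameter (for peptides carrying more than one dimethyl label, e.g., at the N-terminus and a lysine) are assumptions introduced for illustration.

```python
def expected_delta_mz(charge, n_labels=1, mass_shift_da=4.025):
    """Expected m/z spacing between light and heavy monoisotopic peaks."""
    return n_labels * mass_shift_da / charge


def is_light_heavy_pair(mz_light, mz_heavy, charge, n_labels=1,
                        mass_shift_da=4.025, tol_ppm=10.0):
    """True if mz_heavy lies within tol_ppm of the position predicted
    from mz_light for the given charge and number of labels."""
    expected = mz_light + expected_delta_mz(charge, n_labels, mass_shift_da)
    return abs(mz_heavy - expected) / expected * 1e6 <= tol_ppm
```

For a doubly charged precursor with one label the expected spacing is about 2.01 m/z units, which matches the "2 m/z units" rule of thumb given in Note 8.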

Acknowledgments

Work supported by grants from Senescyt-Government of Ecuador (UTEQ-Ambiental-9-FCAmb-IFOR-2014-FOCICYT002), MAEC-AECID (2014-2015), Spanish Ministry of Science and Innovation (BIO2017-82374-R), Spanish Ministry of Economy and Competitiveness (PEJ-2014-A-90762/PEJ-2014-P-00289), and European Funds for Regional Development (FEDER).

References

1. Hansen KC, Schmitt-Ulms G, Chalkley RJ et al (2003) Mass spectrometric analysis of protein mixtures at low levels using cleavable 13C-isotope-coded affinity tag and multidimensional chromatography. Mol Cell Proteomics 2:299–314

2. Boutilier JM, Warden H, Doucette AA et al (2012) Chromatographic behaviour of peptides following dimethylation with H2/D2-formaldehyde: implications for comparative proteomics. J Chromatogr B 908:59–66

3. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB et al (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 1:376–386

4. Conrads TP, Alving K, Veenstra TD et al (2001) Quantitative analysis of bacterial and mammalian proteomes using a combination of cysteine affinity tags and 15N-metabolic labeling. Anal Chem 73:2132–2139

5. Chahrour O, Cobice D, Malone J (2015) Stable isotope labeling methods in mass spectrometry-based quantitative proteomics. J Pharm Biomed Anal 113:2–20

6. Ross PL, Huang YN, Marchese JN (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics 3:1154–1169

7. Choe L, D’Ascenzo M, Relkin NM (2007) 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer’s disease. Proteomics 7:3651–3660

8. Dayon L, Hainard A, Licker V (2008) Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal Chem 80:2921–2931

9. Schmidt A, Kellermann J, Lottspeich F (2005) A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 5:4–15

10. Zhou Y, Shan Y, Wu Q et al (2013) Mass defect-based pseudoisobaric dimethyl labeling for proteome quantification. Anal Chem 85:10658–10663

11. Boersema PJ, Aye TT, van Veen TA et al (2008) Triplex protein quantification based on stable isotope labeling by peptide dimethylation applied to cell and tissue lysates. Proteomics 8:4624–4632

12. Hsu JL, Huang SY, Chen SH (2006) Dimethyl multiplexed labeling combined with microcolumn separation and MS analysis for time course study in proteomics. Electrophoresis 27:3652–3660

13. Wu Y, Wang F, Liu Z et al (2014) Five-plex isotope dimethyl labeling for quantitative proteomics. Chem Commun (Camb) 50:1708–1710

14. Boersema PJ, Raijmakers R, Lemeer S et al (2009) Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat Protoc 4:484–494

15. Hsu JL, Huang SY, Chow NH et al (2003) Stable-isotope dimethyl labeling for quantitative proteomics. Anal Chem 75:6843–6852

16. Di Palma S, Raijmakers R, Heck AJ et al (2011) Evaluation of the deuterium isotope effect in zwitterionic hydrophilic interaction liquid chromatography separations for implementation in a quantitative proteomic approach. Anal Chem 83:8352–8356


17. Wu CJ, Chen YW, Tai JH et al (2011) Quantitative phosphoproteomics studies using stable isotope dimethyl labeling coupled with IMAC-HILIC-nanoLC-MS/MS for estrogen induced transcriptional regulation. J Proteome Res 10:1088–1097

18. Xu B, Wang F, Song C et al (2014) Large-scale proteome quantification of hepatocellular carcinoma tissues by a three-dimensional liquid chromatography strategy integrated with sample preparation. J Proteome Res 13:3645–3654

19. Cappadona S, Munoz J, Spee WPE (2011) Deconvolution of overlapping isotopic clusters improves quantification of stable isotope labeled peptides. J Proteomics 74:2204–2209

20. Hsu J-L, Chen S-H (2016) Stable isotope dimethyl labeling for quantitative proteomics and beyond. Philos Trans R Soc A 374:20150364

21. Ji C, Li L (2005) Quantitative proteome analysis using differential stable isotopic labeling and microbore LC-MALDI MS and MS/MS. J Proteome Res 4:734–742

22. Synowsky SA, van Wijk M, Raijmakers R et al (2009) Comparative multiplexed mass spectrometric analyses of endogenously expressed yeast nuclear and cytoplasmic exosomes. J Mol Biol 385:1300–1313

23. Tolonen AC, Haas W, Chilaka AC et al (2011) Proteome wide systems analysis of a cellulosic biofuel-producing microbe. Mol Syst Biol 7:461

24. Martínez-Márquez A, Morante-Carriel JA, Bru-Martinez R (2017) A comparison of tissue preparation methods for protein extraction of cocoa (Theobroma cacao L.) pod. Acta Agron 66:248–253

25. Phillips-Mora W, Wilkinson MJ (2007) Frosty pod of cacao: a disease with a limited geographic range but unlimited potential for damage. Phytopathology 97:1644–1647

26. Phillips-Mora W, Aime M, Wilkinson M (2007) Biodiversity and biogeography of the cacao (Theobroma cacao) pathogen Moniliophthora roreri in tropical America. Plant Pathol 56:911–922

27. Wang W, Scali M, Vignani R et al (2003) Protein extraction for two-dimensional electrophoresis from olive leaf, a plant tissue containing high levels of interfering compounds. Electrophoresis 24:2369–2375

28. Klammer AA, MacCoss MJ (2006) Effects of modified digestion schemes on the identification of proteins from complex mixtures. J Proteome Res 5:695–700


Chapter 11

Quantitative Profiling of Protein Abundance and Phosphorylation State in Plant Tissues Using Tandem Mass Tags

Gaoyuan Song, Christian Montes, and Justin W. Walley

Abstract

Proteins produce or regulate nearly every component of cells. Thus, the ability to quantitatively determine protein abundance and posttranslational modification (PTM) state is critical to our understanding of biological processes. In this chapter, we describe methods to globally quantify protein abundance and phosphorylation state using isobaric labeling with tandem mass tags followed by phosphopeptide enrichment.

Key words Plant proteomics, Tandem mass tags (TMT), Phosphoproteomics, Protein extraction, Mass spectrometry

1 Introduction

Quantitative proteomics utilizing liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) represents the state-of-the-art approach for deep and quantitative analyses of protein abundance and posttranslational modification (PTM) levels. Sample preparation is a critical step in the quantitative proteomic workflow and is particularly challenging for plant tissues. Plants produce large amounts of interfering compounds such as phenolics, terpenes, pigments, organic acids, lipids, and polysaccharides, which make the generation of high-quality samples difficult [1, 2]. We recently evaluated a number of sample preparation methods and found that either a phenol- or urea-based extraction prior to protein digestion on molecular weight cutoff filters (filter-aided sample preparation, FASP) enables generation of high-quality peptides suitable for deep proteome profiling of plant samples [3].

PTMs vastly expand proteome complexity and are critical modifications that affect the regulatory activity, localization, and interaction of the protein with other molecules. Protein phosphorylation has been the most intensively studied PTM in plants, though proteomic studies reporting modifications such as lysine acetylation and ubiquitination are increasing [4–14]. Typically, deep coverage of PTMs requires an enrichment step prior to MS because of their low abundance. Metal oxide affinity enrichment is a common approach for phosphoproteomics that uses metal oxides or ions (TiO2, CeO2, Fe3+) to bind the negatively charged phosphate for enrichment [15–19].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_11, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Finally, there are a number of approaches available to quantify protein and PTM abundance. These methods include in vitro (i.e., isobaric labeling) and in vivo (i.e., stable isotope labeling by amino acids in cell culture, SILAC) labeling as well as label-free (ion-intensity or spectral counting) strategies [20]. Because of the challenges associated with in vivo metabolic labeling in plants, the majority of plant proteomic studies utilize either isobaric labeling or label-free quantification (LFQ) approaches for quantitative proteomics. LFQ does not require costly labeling reagents, and because each sample is run independently, the experimental design for analyzing a large number of samples (>11) is straightforward. However, because each sample is run independently, LFQ typically suffers from missing values (i.e., a protein/PTM is not identified in every MS run). Isobaric tagging using either isobaric tags for relative and absolute quantitation (iTRAQ [21]) or tandem mass tags (TMT [22]) enables multiplexing of up to 11 samples in a single run, and various strategies have been developed to enable experimental designs with more than 11 samples [23]. When quantified at the MS2 level, isobaric approaches have lower accuracy than LFQ because of ratio compression [24]. However, the higher precision provided by isobaric tagging has been demonstrated to enable identification of a larger number of significant differential regulation events than other quantification approaches [25]. When isobaric quantification is incorporated into PTM studies, the isobaric tagging of peptides is often done after PTM enrichment, because of the cost of the labeling reagent and the amount of peptides needed for enrichment. While this reduces experimental cost, technical variation in PTM enrichment is not controlled. However, modifications to the isobaric labeling reaction enable cost-effective labeling of large amounts of peptides, which makes labeling prior to phosphopeptide enrichment affordable.

Below we provide a current workflow for quantitative profiling of protein abundance and phosphorylation state that details the protein extraction, digestion, TMT labeling, and phosphopeptide enrichment steps (Fig. 1), as described in our recent publications [3, 26]. While these methods are optimized for plant samples, we have also used them to analyze a range of nonplant samples such as yeast and mouse [27].


[Fig. 1 workflow diagram: protein extraction; filter-assisted sample purification (FASP), reduction, alkylation, and Lys-C and trypsin digestion; TMT isobaric labelling (11-plex); combination of all labeled peptides; then either LC-MS/MS acquisition for identification/quantification of total peptides, or phosphopeptide enrichment followed by LC-MS/MS acquisition for identification/quantification of phosphorylated peptides]

Fig. 1 Quantitative proteomics workflow using tandem mass tags. Proteins from up to 11 samples are extracted and digested into peptides using phenol-FASP. Peptides from each sample are then independently labeled with a TMT reporter. Following labeling, the samples are pooled into a single tube. The pooled TMT-labeled samples can be used directly to quantify protein abundance. The pooled TMT-labeled samples can also be subjected to phosphopeptide (or other PTM) enrichment prior to LC-MS/MS

Quantitative Phosphoproteomics Using TMT 149

2 Materials

2.1 Protein Extraction

1. Tris buffered phenol, pH 8.

2. Protein extraction buffer: 50 mM Tris pH 7.5, 1 mM ethylenediaminetetraacetic acid (EDTA) pH 8, 0.9 M sucrose.

3. 50× Phosphatase inhibitor mix: 125 mM sodium fluoride (NaF), 12.5 mM sodium vanadate (NaVO4), 12.5 mM sodium pyrophosphate decahydrate (NaPyroPO4), and 12.5 mM glycerophosphate (glycerol-P) in H2O.

4. 0.1 M ammonium acetate in methanol.

5. 70% methanol.

6. Protein resuspension buffer: 8 M urea, 50 mM Tris pH 7, 5 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP).
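
The 50× phosphatase inhibitor mix in item 3 is used at 1× in the extraction buffer (Subheading 3.1, step 3). The short Python sketch below works out the resulting working concentrations and the stock volume needed for a given buffer volume. The dictionary keys and function names are illustrative only; the stock concentrations are those listed above.

```python
# Concentrations of the 50x stock in mM, as listed in item 3.
STOCK_50X_MM = {
    "sodium fluoride": 125.0,
    "sodium vanadate": 12.5,
    "sodium pyrophosphate": 12.5,
    "glycerophosphate": 12.5,
}


def working_concentrations(stock_mm, stock_fold=50.0, final_fold=1.0):
    """Concentration (mM) of each inhibitor after dilution to final_fold."""
    return {name: conc * final_fold / stock_fold for name, conc in stock_mm.items()}


def stock_volume_ul(final_volume_ml, stock_fold=50.0, final_fold=1.0):
    """Volume of stock (in microliters) needed for final_volume_ml of buffer."""
    return final_volume_ml * 1000.0 * final_fold / stock_fold
```

For example, 5 ml of extraction buffer at 1× requires 100 μl of the 50× stock and contains 2.5 mM sodium fluoride and 0.25 mM of each of the other inhibitors.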

2.2 Filter-Aided Sample Preparation (FASP) and on Filter Digestion

1. UA solution: 8 M urea in 0.1 M Tris–HCl pH 8.0. Prepare 16 ml per sample (see Note 1).

2. IAM solution: 0.05 M iodoacetamide (IAM) in UA solution.Prepare 2 ml per sample.

3. Trypsin, 1 μg/μl in water.

4. Lys-C, 0.1 μg/μl in water.

5. Ammonium bicarbonate (ABC) solution: 0.05 M NH4HCO3 in water. Prepare 7 ml per sample.

6. TCEP stock: 100 mM TCEP in 1 M Tris–HCl pH 7.3.

7. Amicon Ultracel-30K Centrifugal Filters, 4 ml (Millipore, UFC803008) (see Note 2).

8. Pierce™ Detergent compatible Bradford assay kit (ThermoFisher Scientific).

9. pH-indicator strips (Millipore, 1095840001 and 1095410001).

2.3 C18 Desalting

1. C18 column 100 mg (Waters C18 Cartridge).

2. 100% methanol.

3. H2O (LC-MS grade).

4. 20%, 40%, 80% Acetonitrile (ACN).

5. Vacuum manifold (Visiprep™ SPE Vacuum Manifold).

6. Speedvac system (ThermoFisher Scientific).

7. Pierce™ BCA protein assay kit (ThermoFisher Scientific).

2.4 TMT Labeling

1. TMT label reagents (ThermoFisher).

2. 0.5 M HEPES, pH 8.5 (Alfa Aesar).

3. DriSolv® ACN (Millipore, AX0143).

4. 50% hydroxylamine (ThermoFisher).


2.5 Phosphopeptide Enrichment

1. Titansphere Phos-TiO2 beads (GL Sciences).

2. Wash and binding buffer: 2 M lactic acid in 50% acetonitrile.

3. 50% ACN in 0.1% trifluoroacetic acid (TFA)

4. Elution buffers: 3% and 5% ammonium hydroxide.

5. 0.1% formic acid (FA).

3 Methods

3.1 Protein Extraction

1. Grind 0.1 g plant tissue in liquid nitrogen into a fine powder using a ceramic mortar and pestle for 15 min (see Note 3).

2. Add 5 volumes (tissue–buffer; w:v) of Tris buffered phenol pH 8 to the ground tissue and vortex for 1 min. For example, for 0.1 g tissue add 5 ml of buffer.

3. Add 5 volumes (tissue–buffer; w:v) of extraction buffer with 1× phosphatase inhibitor mix to the phenol–tissue solution and vortex for 1 min.

4. Centrifuge at 13,000 × g for 10 min at 4 °C (see Note 4).

5. Transfer the phenol solution (top layer) to a new tube, add the same volume of buffered phenol pH 8 as in step 2 to the aqueous phase, and vortex for 1 min.

6. Centrifuge at 13,000 × g for 10 min at 4 °C (see Note 4).

7. Transfer the phenol phase and combine it with the phenol phase from step 5.

8. Add 5 volumes of prechilled methanol with 0.1 M ammonium acetate to the phenol solution and vortex to mix well.

9. Incubate at -80 °C for 1 h.

10. Centrifuge at 4500 × g for 10 min at 4 °C.

11. Discard the supernatant and add the same volume of prechilled methanol with 0.1 M ammonium acetate as in step 8 to the tube.

12. Resuspend the pellet with probe sonication, then keep at -20 °C for 30 min.

13. Centrifuge at 4500 × g for 10 min at 4 °C.

14. Repeat steps 11–13 once.

15. Add 5 ml prechilled 70% methanol to the tube, resuspend the pellet with probe sonication, then keep at -20 °C for 30 min to overnight (see Note 5).

16. Centrifuge at 4500 × g for 10 min at 4 °C.

17. Discard the supernatant, remove the residual methanol by speedvac at room temperature, resuspend the pellet in resuspension buffer, and measure the protein concentration using the Bradford assay.


3.2 FASP

1. Add the protein (max 4 mg protein) resuspended in 4 ml UA buffer to the filter unit and centrifuge at 4000 × g for 20–40 min (see Note 6).

2. Add 4 ml UA and centrifuge at 4000 × g for 20–40 min.

3. Add 4 ml of UA with 2 mM TCEP to the filter unit and centrifuge at 4000 × g for 20–40 min.

4. Discard the flow-through from the collection tube.

5. Add 2 ml IAM solution, mix, and incubate without mixing for 30 min in the dark.

6. Centrifuge the filter units at 4000 × g for 20–40 min.

7. Add 2 ml of UA to the filter unit and centrifuge at 4000 × g for 20–40 min. Repeat this step once.

8. Add 2 ml of ABC to the filter unit and centrifuge at 4000 × g for 20–40 min. Repeat this step once.

9. Transfer the filter unit to a new collection tube.

10. Add 2 ml ABC with trypsin (enzyme to protein ratio 1:50) and mix well.

11. Incubate at 37 °C overnight.

12. Estimate the amount of undigested protein using a Bradford assay.

13. Add trypsin (enzyme to protein ratio 1:100) and an equal volume of Lys-C (0.1 μg/μl). Incubate for 2–4 h at 37 °C.

14. Centrifuge the filter units at 4000 × g for 20–40 min.

15. Add 1 ml ABC and centrifuge the filter unit at 4000 × g for 20–40 min. Repeat this step once.

16. Acidify samples to pH 3 with 100% formic acid (measure the pH using indicator paper) and store at -80 °C until further processing.
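
The enzyme amounts in steps 10 and 13 follow from the enzyme-to-protein ratios and the stock concentrations listed in Subheading 2.2. A minimal Python sketch of that arithmetic is given below. The function names are illustrative, and which protein amount to enter for the second digestion (the starting amount or the Bradford estimate from step 12) is left to the user, since the protocol does not specify it.

```python
def enzyme_stock_volume_ul(protein_ug, ratio_denominator, stock_ug_per_ul):
    """Stock volume giving a 1:ratio_denominator enzyme:protein mass ratio."""
    return protein_ug / ratio_denominator / stock_ug_per_ul


def digestion_plan(protein_ug, trypsin_stock=1.0, lysc_stock=0.1):
    """Volumes (microliters) for the overnight and second digestions.

    Stocks default to the concentrations in the Materials section
    (trypsin 1 ug/ul, Lys-C 0.1 ug/ul). Lys-C is added at the same
    volume as the second trypsin addition, as in step 13.
    """
    overnight = enzyme_stock_volume_ul(protein_ug, 50, trypsin_stock)
    second = enzyme_stock_volume_ul(protein_ug, 100, trypsin_stock)
    return {
        "overnight_trypsin_ul": overnight,
        "second_trypsin_ul": second,
        "lysc_ul": second,
        "lysc_ug": second * lysc_stock,
    }
```

For the maximum filter load of 4 mg protein, this gives 80 μl of trypsin stock for the overnight digestion, followed by 40 μl of trypsin and 40 μl of Lys-C (4 μg) for the second digestion.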

3.3 C18 Desalting

1. Set up a C18 column on the vacuum manifold and place a 15 ml conical tube in the vacuum chamber to collect the flow-through.

2. Rinse the column (for a 100 mg C18 column) with 1 ml MeOH followed by 1 ml water; repeat with another 1 ml of water.

3. Load the digested sample at a flow rate of less than 1 ml per min (see Note 7).

4. Wash the column with 1 ml water; repeat once with another 1 ml of water.

5. Elute peptides stepwise with 250 μl 20% acetonitrile, 250 μl 40% acetonitrile, and 500 μl 80% acetonitrile.

6. Speedvac the peptide solution until almost dry.

7. Resuspend the pellet with water to a protein concentration of ~1 mg/ml, then vortex and spin to dissolve well.

8. Measure the peptide amount with the BCA assay and store at -80 °C.

3.4 TMT Labeling

1. Resuspend 100 μg vacuum-dried peptides, per sample, with 100 μl of 0.2 M HEPES, pH 8.5 to a final concentration of 1 μg/μl. Vortex for 10 min to dissolve well (see Note 8).

2. Remove the TMT labels from the freezer and warm to room temperature.

3. Resuspend the TMT labels with 60 μl of dry acetonitrile, vortex, and leave at room temperature for 5 min. Then spin to collect the reagent at the bottom of the vial.

4. When labeling 100 μg of peptides, aliquot 10 μl of the resuspended TMT labeling reagent into a tube and add 30 μl of ACN. The remaining 50 μl of TMT labeling reagent can be used to label additional samples in parallel or vacuum dried for later use (see Note 9).

5. Add the TMT labels (i.e., 40 μl) to the resuspended peptides (100 μl) at a ratio of 0.4:1 (ACN–HEPES; v:v), vortex to mix well, and leave at room temperature for 1–2 h (see Note 10).

6. Add 8 μl of 5% hydroxylamine to each 140 μl TMT labeling solution to quench the reaction, vortex to mix well, and leave at room temperature for 15 min.

7. Pool the TMT-labeled peptides into a single tube, vortex well, and speedvac to almost dry (see Note 11).
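
The volumes in steps 1–6 are given for 100 μg of peptides. The Python sketch below scales them to other peptide amounts. Linear scaling is an assumption of this example (the protocol only spells out the 100 μg case), although it agrees with Note 10, in which one 0.8 mg tube labels up to 600 μg of peptides. The function and key names are illustrative.

```python
def tmt_labeling_plan(peptide_ug, tube_reagent_mg=0.8, tube_resuspension_ul=60.0):
    """Scale the 100 ug labeling recipe linearly to peptide_ug.

    Raises ValueError if more reagent is needed than one resuspended
    tube (60 ul by default) can provide.
    """
    scale = peptide_ug / 100.0
    reagent_ul = 10.0 * scale          # resuspended TMT reagent (step 4)
    if reagent_ul > tube_resuspension_ul:
        raise ValueError("one tube of TMT reagent is not enough for this amount")
    extra_acn_ul = 30.0 * scale        # additional dry ACN (step 4)
    hepes_ul = 100.0 * scale           # peptides at 1 ug/ul in HEPES (step 1)
    reagent_ug = tube_reagent_mg * 1000.0 * reagent_ul / tube_resuspension_ul
    return {
        "hepes_ul": hepes_ul,
        "tmt_reagent_ul": reagent_ul,
        "extra_acn_ul": extra_acn_ul,
        "acn_to_hepes_ratio": (reagent_ul + extra_acn_ul) / hepes_ul,
        "tmt_reagent_ug": reagent_ug,
        "reagent_to_peptide_w_w": reagent_ug / peptide_ug,
        "hydroxylamine_5pct_ul": 8.0 * scale,  # quench volume (step 6)
    }
```

With these numbers, each 100 μg of peptides receives roughly 133 μg of TMT reagent, and the ACN–HEPES ratio stays at 0.4:1 regardless of scale.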

3.5 Phosphopeptide Enrichment

1. Resuspend Titansphere Phos-TiO2 beads with wash and binding buffer (prepare 6 mg beads for each 1 mg peptides) by vortexing, centrifuge at 3000 × g for 1 min, and then remove the supernatant. Repeat this step for a total of three times (see Note 12).

2. Resuspend the pooled TMT-labeled peptides with wash and binding buffer to a final concentration of 1 μg/μl and vortex to dissolve well.

3. Transfer the resuspended peptides to the tube with TiO2 beads (4 mg beads for each 1 mg peptides) and rotate at room temperature for 1 h.

4. Centrifuge at 3000 × g for 1 min, save the beads, and transfer the supernatant to a new tube containing the second aliquot of TiO2 beads (2 mg beads for each 1 mg peptides). Rotate at room temperature for 1 h.

5. Centrifuge at 3000 × g for 1 min; save the beads (see Note 13).

6. Resuspend the TiO2 beads from steps 4 and 5 with wash and binding buffer, centrifuge at 3000 × g for 1 min, and discard the supernatant. Repeat this step once.

7. Resuspend the TiO2 beads with 50% acetonitrile in 0.1% TFA, centrifuge at 3000 × g for 1 min, and discard the supernatant. Repeat this step once.

8. Add 500 μl of 3% ammonium hydroxide to each tube of beads, vortex, and centrifuge at 3000 × g for 1 min.

9. Transfer the supernatant to a new tube.

10. Add 500 μl of 5% ammonium hydroxide to each tube of beads, vortex, and centrifuge at 3000 × g for 1 min. Remove the supernatant and combine it with the supernatant from step 9.

11. Speedvac the supernatant to almost dry, resuspend with 0.1% FA, measure the peptide amount with the BCA assay (optional), and store at -80 °C until LC-MS/MS analysis.
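
The bead and buffer amounts in steps 1–4 scale with the amount of pooled peptides. The helper below is a small sketch of that bookkeeping, using the 4 mg and 2 mg of beads per mg of peptides and the 1 μg/μl peptide concentration stated above; the function name and dictionary keys are illustrative, and Note 12 indicates that the bead amounts themselves may need to be optimized.

```python
def tio2_enrichment_plan(peptide_mg, first_ratio=4.0, second_ratio=2.0):
    """Bead amounts (mg) for the two sequential incubations and the
    binding buffer volume (microliters) for peptides at 1 ug/ul."""
    first = first_ratio * peptide_mg
    second = second_ratio * peptide_mg
    return {
        "first_incubation_beads_mg": first,
        "second_incubation_beads_mg": second,
        "total_beads_mg": first + second,
        "binding_buffer_ul": peptide_mg * 1000.0,
    }
```

For 2 mg of pooled TMT-labeled peptides, this corresponds to 8 mg of beads for the first incubation, 4 mg for the second, and 2 ml of wash and binding buffer.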

4 Notes

1. UA solution should be prepared fresh.

2. The maximum protein capacity of the filter is 4 mg.

3. The protein yield from different tissues or species varies. Thus, the amount of tissue necessary should be optimized.

4. For large-scale protein extractions using 15 ml or 50 ml tubes, centrifuge at 4500 × g for 15–20 min at 4 °C.

5. The volume used in this step influences the efficiency of protein precipitation, so for protein amounts lower than 500 μg, 1–2 ml of 70% methanol can be used.

6. The centrifugation time in each step may need to be adjusted for different samples. It is not necessary to completely remove the liquid above the filter; up to 10% of the volume can remain above the filter.

7. The pH should be around 2–3, which is critical for the binding activity of the C18 column. The initial flow-through can be reloaded on the column to increase peptide recovery.

8. We usually use at least 100–200 μg peptides (or more) per sample for TMT labeling. When carrying out TMT 10-plex experiments, this enables phosphoenrichment from 1 to 2 mg of pooled TMT-labeled peptides. Larger amounts of peptides enable identification of a greater number of phosphorylation sites. For reference, from 750 μg of TMT-labeled peptides we can identify ~10,000 phosphosites [26], whereas enrichment from 2 mg of peptides enables identification of ~17,000 phosphosites.

9. Unused TMT labeling reagent can be dried by speedvac and stored at -80 °C. Thaw and refreeze only once. Divide each label into multiple aliquots prior to vacuum drying if future small-scale labeling reactions are anticipated.

10. In our hands, each tube of TMT labels (0.8 mg) can label up to 600 μg peptides with a labeling efficiency >99%. A small-scale test LC-MS/MS run can be performed to confirm TMT labeling efficiency.

11. The peptides from this step are ready for the LC-MS/MS run of the TMT-labeled protein abundance assay. For phosphopeptide enrichment, desalt using C18 columns.

12. The amount of TiO2 beads used for each step of enrichment can be optimized for the tissue being studied.

13. The flow-through from this step can be saved and used to quantify protein abundance or used for enrichment of other proteins [28].

References

1. Wu X, Gong F, Wang W (2014) Protein extraction from plant tissues for 2DE and its application in proteomic analysis. Proteomics 14:645–658

2. Wu X, Xiong E, Wang W et al (2014) Universal sample preparation method integrating trichloroacetic acid/acetone precipitation with phenol extraction for crop proteomic analysis. Nat Protoc 9:362–374

3. Song G, Hsu PY, Walley JW (2018) Assessment and refinement of sample preparation methods for deep and quantitative plant proteome profiling. Proteomics 18:1800220

4. Finkemeier I, Laxa M, Miguet L et al (2011) Proteins of diverse function and subcellular location are lysine acetylated in Arabidopsis. Plant Physiol 155:1779–1790

5. Silva-Sanchez C, Li H, Chen S (2015) Recent advances and challenges in plant phosphoproteomics. Proteomics 15:1127–1141

6. Rao RSP, Thelen JJ, Miernyk JA (2014) Is Lys-N(ε)-acetylation the next big thing in post-translational modifications? Trends Plant Sci 19:550–553

7. Hartl M, Fußl M, Boersema PJ et al (2017) Lysine acetylome profiling uncovers novel histone deacetylase substrate proteins in Arabidopsis. Mol Syst Biol 13:949

8. Fang X, Chen W, Zhao Y et al (2015) Global analysis of lysine acetylation in strawberry leaves. Front Plant Sci 6:739

9. Xie X, Kang H, Liu W et al (2015) Comprehensive profiling of the rice ubiquitome reveals the significance of lysine ubiquitination in young leaves. J Proteome Res 14:2017–2025

10. Song G, Walley JW (2016) Dynamic protein acetylation in plant–pathogen interactions. Front Plant Sci 7:421

11. Aguilar-Hernandez V, Kim D-Y, Stankey RJ et al (2017) Mass spectrometric analyses reveal a central role for ubiquitylation in remodeling the Arabidopsis proteome during photomorphogenesis. Mol Plant 10:846–865

12. Liu S, Yu F, Yang Z et al (2018) Establishment of dimethyl labeling-based quantitative acetylproteomics in Arabidopsis. Mol Cell Proteomics 17:1010–1027

13. Walley JW, Shen Z, McReynolds MR et al (2018) Fungal-induced protein hyperacetylation in maize identified by acetylome profiling. Proc Natl Acad Sci 115:210–215

14. Kelley DR (2018) E3 ubiquitin ligases: key regulators of hormone signaling in plants. Mol Cell Proteomics 17:1047–1054

15. Pinkse MWH, Uitto PM, Hilhorst MJ et al (2004) Selective isolation at the femtomole level of phosphopeptides from proteolytic digests using 2D-NanoLC-ESI-MS/MS and titanium oxide precolumns. Anal Chem 76:3935–3943

16. Nakagami H, Sugiyama N, Mochida K et al (2010) Large-scale comparative phosphoproteomics identifies conserved phosphorylation sites in plants. Plant Physiol 153:1161–1174

17. Kettenbach AN, Gerber SA (2011) Rapid and reproducible single-stage phosphopeptide enrichment of complex peptide mixtures: application to general and phosphotyrosine-specific phosphoproteomics experiments. Anal Chem 83:7635–7644

18. Marcon C, Malik WA, Walley JW et al (2015) A high-resolution tissue-specific proteome and phosphoproteome atlas of maize primary roots reveals functional gradients along the root axes. Plant Physiol 168:233–246

19. Walley JW, Sartor RC, Shen Z et al (2016) Integration of omic networks in a developmental atlas of maize. Science 353:814–818

20. Bantscheff M, Schirle M, Sweetman G et al (2007) Quantitative mass spectrometry in proteomics: a critical review. Anal Bioanal Chem 389:1017–1031

21. Wiese S, Reidegeld KA, Meyer HE et al (2007) Protein labeling by iTRAQ: a new tool for quantitative mass spectrometry in proteome research. Proteomics 7:340–350

22. Thompson A, Schäfer J, Kuhn K et al (2003) Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem 75:1895–1904

23. Plubell DL, Wilmarth PA, Zhao Y et al (2017) Extended multiplexing of tandem mass tags (TMT) labeling reveals age and high fat diet specific proteome changes in mouse epididymal adipose tissue. Mol Cell Proteomics 16:873–890

24. Karp NA, Huber W, Sadowski PG et al (2010) Addressing accuracy and precision issues in iTRAQ quantitation. Mol Cell Proteomics 9:1885–1897

25. Hogrebe A, von Stechow L, Bekker-Jensen DB et al (2018) Benchmarking common quantification strategies for large-scale phosphoproteomics. Nat Commun 9:1045

26. Song G, Brachova L, Nikolau BJ et al (2018) Heterotrimeric G-protein-dependent proteome and phosphoproteome in unstimulated Arabidopsis roots. Proteomics 18:1800323

27. Abdulghani M, Song G, Kaur H et al (2019) Comparative analysis of the transcriptome and proteome during mouse placental development. J Proteome Res 18(5):2088–2099

28. Mertins P, Qiao JW, Patel J et al (2013) Integrated proteomic analysis of post-translational modifications by serial enrichment. Nat Methods 10:634–637

156 Gaoyuan Song et al.

Chapter 12

Optimizing Shotgun Proteomics Analysis for a Confident Protein Identification and Quantitation in Orphan Plant Species: The Case of Holm Oak (Quercus ilex)

Isabel Gomez-Galvez, Rosa Sanchez-Lucas, Bonoso San-Eufrasio, Luis Enrique Rodríguez de Francisco, Ana M. Maldonado-Alconada, Carlos Fuentes-Almagro, and Mari Angeles Castillejo

Abstract

The proteomics of orphan, unsequenced, and recalcitrant organisms is highly challenging. This is the case for Holm oak (Quercus ilex), a typical Mediterranean forest tree. Proteomics has moved quickly from classical 2DE-MS to shotgun (gel-free/label-free) approaches, which have a series of advantages over gel-based ones. Before proteomics data can be translated into biological knowledge, a few questions about the analysis technique itself have to be answered, including how confident the protein identification and quantification are. It is important to clearly differentiate a hit from an ortholog identification and from a gene product identification; the difference depends on the database and the confidence parameters (score, number of peptides, and coverage). With respect to quantification, and for comparative purposes, it is important to make sure that measurements fall within the linear dynamic range. For that, a calibration curve based on mass spectrometry analysis of a serial dilution of the extracts should be performed. Validating the data in this way improves the quality of the analysis and allows the results to be interpreted correctly. We show a method that aims to improve the confidence in protein identification and quantification in the orphan species Q. ilex using a shotgun proteomics approach.

Key words Holm oak, Orphan species, Plant proteomics, Shotgun, Confidence parameters

1 Introduction

Proteomics is changing in scale and focus, from its initial objective of identifying as many individual proteins as possible to analyzing the dynamics of the proteome [1]. Liquid chromatography coupled with mass spectrometry, also called “shotgun” or “gel-free,” is an alternative to 2-DE, which has been extensively used for protein separation, although both can be combined in a single experiment [2, 3]. The gel-free methods, essential for bottom-up MS analysis, have a series of advantages over the gel-based ones (top-down), such as allowing for a greater coverage of the proteome when working with total protein extracts, faster processing, and less sample handling. The resulting peptides are analyzed by liquid chromatography coupled to mass spectrometry equipment. In addition, if prefractionation steps such as SDS-PAGE electrophoresis are introduced, a higher resolution can be achieved by eliminating impurities and concentrating the sample [4].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_12, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Protein identification relies on software that searches a database. The database chosen is of extreme importance, as it will, to a great extent, determine the number and quality of the identifications. In nonmodel species, such as Quercus ilex, the best choice is an initial transcriptome analysis using deep sequencing to generate the species-specific database with which to compare the proteins [5], so that those identified will correspond to gene products of that species. Otherwise, when a non–species-specific database is employed, identifications will be closest to orthologous gene products, and from them we would only be able to hypothesize or speculate on the protein function [5]. Besides identification, shotgun techniques permit peptide and protein quantification, and for that, two main approaches are used: peak area and spectral count [6, 7].

These new proteomic analysis tools have been used very little in forest species, as is the case for Quercus ilex. Therefore, workflow optimization is necessary in order to improve the quality of the analysis and to interpret the results correctly. For this purpose a few questions on the analysis technique itself have to be answered. For instance, what are we identifying and how confident is the identification? Does the identified protein correspond to an allelic, variant, or isogenic protein species? How confident is the quantification? Here we present a method that is aimed at improving confidence in protein identification in the orphan species Q. ilex. For this purpose we have carried out a shotgun proteomic study based on a serial dilution of protein extracts obtained from a mixture of different holm oak plant organs (embryo, seed, root, leaf). Our study permits the determination of the confidence range when interpreting the results from a shotgun proteomics experiment in an orphan species such as Q. ilex.

2 Material

2.1 Plant Material

In the present experiment, a mixture of different organs (embryo, cotyledons, leaves, and roots) from Q. ilex is used (see Note 1).

2.2 Protein Extraction

If not specified, reagents are of analytical grade. Solutions are prepared in distilled water. Stock solutions are stored at −20 °C, unless otherwise stated.

1. Solution 1: 10% (w/v) trichloroacetic acid (TCA) in acetone.

2. Solution 2: 0.1 M ammonium acetate in 80% methanol.

3. Solution 3: 80% (v/v) acetone.

4. SDS buffer: 0.1 M Tris–HCl pH 8, 30% (w/v) sucrose, 2% (w/v) SDS, 5% (v/v) β-mercaptoethanol. Store at 4 °C.

5. Solution 4: Tris–HCl-saturated phenol, pH 8 (Sigma)/SDS buffer (1:1).

6. Precipitation solution: 0.1 M ammonium acetate/methanol.

7. Solubilization solution: 7 M urea, 2 M thiourea, 4% (w/v) CHAPS, 0.5% (v/v) Triton X-100, 20 mM dithiothreitol (DTT).

2.3 SDS Polyacrylamide Gel

1. Running buffer: for a 10× stock mix 30.2 g of Tris base, 144 g of glycine, and 1 g of SDS; add H2O up to 1 L.

2. Resolving gel: amount required for one mini-Protean (Bio-Rad) 12% acrylamide gel: mix 3 mL of 40% (w/v) acrylamide–bisacrylamide solution (19:1), 2.5 mL of 1.5 M Tris–HCl, pH 8.8, 100 μL of 10% (w/v) SDS, and 4.35 mL of H2O. Add 50 μL of 10% (w/v) APS and 5 μL of TEMED to start the polymerization. Add the mix to the 7 cm gel cassette (see Note 2). Let the gels polymerize for 1 h.

3. Stacking gel: amount required for one mini-Protean (Bio-Rad) 4% acrylamide gel: mix 0.25 mL of 40% (w/v) acrylamide–bisacrylamide solution (19:1), 0.63 mL of 0.5 M Tris–HCl, pH 6.8, 25 μL of 10% (w/v) SDS, and 1.59 mL of H2O. Add 50 μL of 10% (w/v) APS and 5 μL of TEMED to start the polymerization (see Note 3).

4. Laemmli buffer (5×): 0.3 M Tris–HCl (pH 6.8), 10% SDS, 25% β-mercaptoethanol, 0.005% bromophenol blue, 50% glycerol.

5. Coomassie stain solution: 40% (v/v) methanol, 10% (v/v) acetic acid, 0.1% (w/v) Coomassie R-250, in distilled water (see Note 4).

6. Destaining solution: 40% (v/v) methanol, 10% (v/v) acetic acid in distilled water.
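As a quick plausibility check on the recipes above, the following sketch recomputes the final acrylamide percentage of both gels and the Tris and glycine molarity of the 10× running buffer. The molar masses are standard values that are not given in the chapter, and the helper names are illustrative only.

```python
# Standard molar masses in g/mol (assumed; not stated in the protocol).
MOLAR_MASS = {"Tris base": 121.14, "glycine": 75.07}


def molarity(grams, molar_mass, litres):
    """Molar concentration of a solute dissolved to a final volume."""
    return grams / molar_mass / litres


def final_percent(stock_percent, stock_ml, other_ml):
    """Final % of a stock after mixing it with the other listed volumes."""
    total_ml = stock_ml + sum(other_ml)
    return stock_percent * stock_ml / total_ml


# 10x running buffer: 30.2 g Tris base and 144 g glycine in 1 L.
tris_10x = molarity(30.2, MOLAR_MASS["Tris base"], 1.0)
glycine_10x = molarity(144.0, MOLAR_MASS["glycine"], 1.0)

# Gel mixes: 40% acrylamide stock plus buffer, SDS, water, APS, TEMED (mL).
resolving = final_percent(40, 3.0, [2.5, 0.1, 4.35, 0.05, 0.005])
stacking = final_percent(40, 0.25, [0.63, 0.025, 1.59, 0.05, 0.005])

print(f"1x Tris: {tris_10x / 10 * 1000:.1f} mM")
print(f"1x glycine: {glycine_10x / 10 * 1000:.1f} mM")
print(f"resolving gel: {resolving:.2f}% acrylamide")
print(f"stacking gel: {stacking:.2f}% acrylamide")
```

Under these assumptions the two gel mixes come out close to the nominal 12% and 4%, and the running buffer dilutes to roughly 25 mM Tris and 192 mM glycine at 1×.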

2.4 Protein Digestion

1. Washing solution 1: 0.1 M ammonium bicarbonate.

2. Washing solution 2: 0.1 M ammonium bicarbonate–acetonitrile (1:1) (v/v).

3. Acetonitrile 100%.

4. Reduction and alkylation solutions: reduction with 20 mM DTT in 0.1 M ammonium bicarbonate; alkylation with 55 mM iodoacetamide in 0.1 M ammonium bicarbonate. Keep the solution in the dark.

5. Trypsin solution: dilute the stock solution of trypsin with trypsin buffer (25 mM NH4HCO3, 10% acetonitrile, 5 mM CaCl2) to a final trypsin concentration of 12.5 ng/μL. Keep it at 4 °C until digestion.

2.5 Peptide Desalting

1. Column cartridges, 60 Å C18 (Scharlau).

2. Activation solution: 70% acetonitrile–0.1% trifluoroacetic acid.

3. Washing solution: 0.1% trifluoroacetic acid.

4. Elution solution: 70% acetonitrile.

5. 0.1% trifluoroacetic acid.

2.6 Solutions for LC-MS/MS

1. Mobile phase A: 0.1% (v/v) formic acid in water.

2. Mobile phase B: 0.1% (v/v) formic acid in 80% acetonitrile.

2.7 Equipment and Software

1. Proteome Discoverer (version 2.1, Thermo Scientific), using the SEQUEST algorithm to perform the search.

2. Viridiplantae database obtained from UniProtKB and a species-specific database of Quercus ilex developed from the transcriptome [8].

3. MERCATOR software for protein classification into MapMan functional plant categories (http://www.plabipd.de/portal/mercator-sequence-annotation), complemented with the use of the KEGG database (https://www.genome.jp/kegg/pathway.html).

3 Methods

3.1 Protein Extraction by TCA/Acetone/Phenol

The proteins are extracted using the method of Wang et al. [9] with some modifications.

1. Add 1 mL of solution 1 (precooled at −20 °C) to 200 mg of plant material previously pulverized in liquid nitrogen.

2. Sonicate for 10 min at maximum speed (P Selecta Ultrasons) (see Note 5).

3. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

4. Add 1 mL of solution 2 precooled at −20 °C and solubilize the pellet.

5. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

6. Add 1 mL of solution 3 precooled at −20 °C and solubilize the pellet.

7. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

8. Air-dry the pellet at room temperature to remove residual acetone.

9. Add 1 mL of solution 4 precooled at 4 °C (under the hood). Mix thoroughly and incubate for 5 min at 4 °C (see Note 6).

10. Centrifuge at 14,000 × g for 10 min (4 °C). Transfer the upper phenol phase into a new tube.

11. Add 1 mL of precipitation solution precooled at −20 °C, mix well, and incubate for precipitation overnight at −20 °C.

12. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

13. Wash the pellet once with 100% methanol (precooled at −20 °C) and disperse.

14. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

15. Wash the pellet once with 80% (v/v) acetone (precooled at −20 °C) and disperse.

16. Centrifuge at 14,000 × g for 10 min (4 °C) and discard the supernatant.

17. Air-dry the pellet at room temperature.

18. Solubilize the pellet with solubilization solution at room temperature (see Note 7).

19. Quantify proteins using the Bradford method [10].

3.2 SDS-PAGE Electrophoresis

1. Prepare a mixture of extracts from different holm oak plant organs (embryo, seed, root, leaf) with 300 μg of protein each.

2. Prepare a serial dilution of proteins in the range of 1–200 μg BSA equivalents, mix with Laemmli buffer, and heat to 95 °C for 5 min.

3. Perform SDS-PAGE in a 12% acrylamide gel. Before loading the protein, mark a line 1 cm below the stacking–resolving interface. Load the samples and run the electrophoresis at a constant 80 V until the bromophenol blue reaches the marked line.

4. Stop the electrophoresis and immediately transfer the gel to a plate with Coomassie staining solution [11]. Incubate in an orbital shaker for 1 h.

5. Destain the gels in destaining solution for 80 min. Finish destaining with distilled water.

6. Excise the single band from each lane, cutting all of them in the same way. Transfer gel pieces to individual 1.5 mL tubes and cover them with distilled water. At this point the gel pieces can be stored at 4 °C.

3.3 Sample Preparation for MS Analysis

3.3.1 Protein Digestion

1. Cut the gel bands with a scalpel into small fragments (around 1 mm³). Transfer the gel pieces to a 1.5 mL low binding tube.

2. Add washing solution 2 (0.1 M ammonium bicarbonate/acetonitrile, 1:1 (v/v)), prepared by mixing equal volumes of washing solution 1 and acetonitrile. Stir for 30 min at 37 °C. Repeat this step.

3. Remove the supernatant and add acetonitrile. Incubate for 5 min at room temperature and remove the supernatant.

4. For the reduction and alkylation of the proteins, add 20 mM DTT/100 mM ammonium bicarbonate. Subsequently add 55 mM iodoacetamide/100 mM ammonium bicarbonate. Incubate for 30 min at room temperature in each solution.

5. Wash twice with 25 mM ammonium bicarbonate and with 25 mM ammonium bicarbonate/acetonitrile 50%.

6. Digest with the trypsin solution (12.5 ng/μL trypsin) and incubate overnight with shaking at 37 °C.

3.3.2 Peptides Extraction

1. Spin the tubes and transfer the supernatant to a new tube “A”.

2. Add 150 μL (enough volume to cover the gel) of 20% acetonitrile/1% formic acid and incubate for 5 min at room temperature. Sonicate for 3 min. Transfer the peptide solution to tube A.

3. Repeat step 2 two more times, using 150 μL of 50% and then 90% acetonitrile/1% formic acid. Transfer the final solutions to tube A.

4. Dry in a SpeedVac. Keep at −20 °C, or at −80 °C for long-term storage.

5. Resuspend the samples in 100 μL of 0.1% formic acid with sonication.

3.3.3 Peptides Desalting

1. Activate the C18 column with 0.4 mL of 70% acetonitrile/0.1% trifluoroacetic acid. Then wash it with 0.5 mL of 0.1% trifluoroacetic acid.

2. Add the sample to the column and keep it for 5 min at RT. Collect the flow-through and repeat this step twice (see Note 8).

3. Wash the columns with 0.5 mL of 0.1% trifluoroacetic acid (see Note 9).

4. Elute the peptides with 100 μL of 70% acetonitrile three times to a final volume of 300 μL and dry in a SpeedVac.

3.4 nLC-MS/MS

1. Prepare a serial dilution of the recovered peptides from the concentrations of protein loaded in the gel (see Note 10).

2. Load the samples onto a nano-LC-MS-UHPLC Ultimate 3000 using a flow of 300 nL/min and a gradient of B in A from 4 to 35% (120 min), 35–55% (6 min), and 55–90% (3 min). Finally, elute the column with 90% of B over 8 min before wash and reequilibration, for a total chromatography time of 150 min.

3. The eluent from the column is introduced into the electrospray ionization source of an MS/MS instrument (Orbitrap Fusion, Thermo Fisher Scientific) operating in positive ion mode.

4. Perform survey scans of peptide precursors from 400 to 1500 m/z at 120 K resolution (at 200 m/z) with a 4 × 10⁵ ion count target. Perform tandem MS by isolation at 1.2 Da with the quadrupole, CID fragmentation with a normalized collision energy of 35, and rapid scan MS analysis in the ion trap.
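The gradient in step 2 can be written down as a timetable, which makes it easy to check that the segments add up and to look up the nominal composition at any time point. This is a minimal sketch that assumes the gradient starts at t = 0 and that each ramp is linear; the composition during the final wash and reequilibration is not specified in the protocol, so the function returns None there.

```python
# (start_min, end_min, %B at start, %B at end), taken from step 2.
GRADIENT = [
    (0, 120, 4, 35),
    (120, 126, 35, 55),
    (126, 129, 55, 90),
    (129, 137, 90, 90),  # hold at 90% B for 8 min
]
TOTAL_RUN_MIN = 150


def percent_b(t_min):
    """Nominal %B at time t_min, assuming linear ramps (an assumption)."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            return b_start + (b_end - b_start) * (t_min - start) / (end - start)
    return None  # wash/reequilibration: composition not given in the text


# Time left for wash and reequilibration after the programmed segments.
wash_and_reequilibration_min = TOTAL_RUN_MIN - GRADIENT[-1][1]

print(percent_b(60), percent_b(123), wash_and_reequilibration_min)
```

With these numbers the programmed segments end at 137 min, which leaves 13 min of the 150 min run for wash and reequilibration.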

3.5 Protein Identification

1. Use the MS2 spectra for identification, using the SEQUEST algorithm with the Proteome Discoverer software (version 2.1, Thermo Scientific). Set the following parameters: theoretical tryptic digestion allowing up to one missed cleavage, carbamidomethylation of cysteines as a fixed modification, and oxidation of methionine as a variable modification. Use a precursor mass tolerance of 10 ppm and a product ion tolerance of 0.1 Da. Validate peptide spectral matches (PSM) using Percolator based on q-values at a 1% FDR. To group peptide identifications into proteins, use the law of parsimony and filter to 1% FDR and a minimum XCorr of 2.

2. Use two databases for protein identification: one generic, the Viridiplantae database from UniprotKB, and a custom species-specific Q. ilex database developed from the transcriptome [8].

3. Assign functional categorization of proteins with the MERCATOR software and the KEGG database.
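To make the search thresholds in step 1 concrete, the sketch below converts the 10 ppm precursor tolerance into an absolute window and applies the 1% q-value and XCorr ≥ 2 cut-offs to a few match records. It only illustrates the arithmetic; it is not how Proteome Discoverer or Percolator work internally, and the record fields and example values are hypothetical.

```python
def ppm_window(mz, tol_ppm=10.0):
    """Lower and upper m/z bounds for a relative tolerance in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def passes_filters(psm, max_q=0.01, min_xcorr=2.0):
    """Keep a match only if it meets the 1% FDR and XCorr >= 2 cut-offs."""
    return psm["q_value"] <= max_q and psm["xcorr"] >= min_xcorr


# Hypothetical peptide-spectrum matches, for illustration only.
psms = [
    {"peptide": "PEPTIDE_1", "q_value": 0.002, "xcorr": 3.1},
    {"peptide": "PEPTIDE_2", "q_value": 0.030, "xcorr": 2.8},
    {"peptide": "PEPTIDE_3", "q_value": 0.005, "xcorr": 1.4},
]
accepted = [p["peptide"] for p in psms if passes_filters(p)]
low, high = ppm_window(500.0)

print(accepted, low, high)
```

At m/z 500, a 10 ppm tolerance corresponds to ±0.005, which shows how narrow the precursor window is compared with the 0.1 Da product ion tolerance.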

3.6 Protein Quantification

1. Protein quantification by peak area: use the peak area values given for each protein after normalizing the data to the total sum of the peak area values in each sample.

2. Protein quantification by proteotypic peptides: use the proteotypic peptides (specific to one protein) from different proteins and plot their peak area intensity in each of the serial dilutions studied (see Note 11).
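The total-sum normalization described in item 1 amounts to dividing each protein's peak area by the summed peak area of its sample. A minimal sketch, with hypothetical protein names and peak areas, could look like this:

```python
def normalize_by_total(peak_areas):
    """Divide each protein's peak area by the sample's total peak area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {protein: area / total for protein, area in peak_areas.items()}


# Hypothetical peak areas for one sample (not data from the chapter).
sample = {"protein_A": 4.0e6, "protein_B": 1.0e6, "protein_C": 5.0e6}
normalized = normalize_by_total(sample)

print(normalized)
```

After this step the values within a sample sum to 1, so samples with different total signal can be compared on a relative scale.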


4 Anticipated Results

1. The number of identifications (proteins/peptides) depends on the amount of protein loaded, with values from 3206 peptides/1141 proteins to 6745 peptides/1636 proteins in the linear range of 100–600 ng (see Fig. 1a and b).

2. The number of identifications depends on the database. In the case of orphan species, such as Q. ilex, it is highly recommended to use a species-specific database derived from the transcriptome. Such a database yields gene products of the species itself rather than the orthologs returned when generic databases are used. In our system the number of peptides and proteins found with the specific database was significantly greater than that found with the generic one (see Table 1, Fig. 1c and d).

3. The identification confidence (% coverage, score, and number of peptides) mostly depends on the database (species-specific vs general). Using the Q. ilex database, for most of the proteins identified (up to 70–80%) we found values ranging from 1 to 20% of coverage, 2–20 of score value, and 1–5 peptides per protein. Compared to the Viridiplantae database, a larger proportion of proteins showed higher confidence values using the Q. ilex database (see Fig. 2).

[Figure 1. Panels a and b: number of peptides and number of proteins plotted against the total amount of protein (ng) for the Viridiplantae and Quercus databases. Panels c and d: Venn diagrams for the Quercus ilex and UniprotKB-Viridiplantae databases. Peptides (c): 5539 Quercus ilex only, 1206 shared, 723 UniprotKB-Viridiplantae only. Proteins (d): 1256 Quercus ilex only, 380 shared, 521 UniprotKB-Viridiplantae only.]

Fig. 1 Number of peptides (a) and proteins (b) identified, using the UniprotKB-Viridiplantae and Quercus ilex databases. The values correspond to the different amounts of proteins used, in the range of 10–1000 ng. Venn diagram showing the number of peptides (c) and proteins (d) identified from the two databases used (UniprotKB-Viridiplantae and Quercus ilex), corresponding to the 600 ng protein sample dilution

4. The relative protein quantification was reliable in the 100–1000 ng range of protein, as revealed by the peak area of several proteotypic peptides (see Fig. 3). For that, six proteotypic peptides belonging to different proteins were randomly selected. The peak area values were plotted against the serial dilution of protein. We observed a limit of detection around 100 ng of protein, below which the quantification is not reliable. The linear dynamic range lay between 100 and 800 ng for the six peptides analyzed.
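One way to judge the linear dynamic range from such a dilution series is to fit a straight line to peak area against protein amount and compare the fit over different windows. The sketch below does this with an ordinary least-squares fit. The peak areas are invented for illustration (flat below 100 ng, linear from 100 to 800 ng, levelling off at 1000 ng) and are not data from this chapter.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns slope, intercept, and R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


def fit_window(amounts, areas, low, high):
    """Fit only the points whose amount lies within [low, high]."""
    pairs = [(x, y) for x, y in zip(amounts, areas) if low <= x <= high]
    return fit_line([x for x, _ in pairs], [y for _, y in pairs])


# Dilution points follow the chapter; peak areas are hypothetical.
amounts_ng = [10, 50, 100, 200, 400, 600, 800, 1000]
peak_areas = [2.0e4, 2.5e4, 1.0e6, 2.0e6, 4.0e6, 6.0e6, 8.0e6, 8.3e6]

slope_full, _, r2_full = fit_line(amounts_ng, peak_areas)
slope_win, _, r2_win = fit_window(amounts_ng, peak_areas, 100, 800)

print(f"full series R2 = {r2_full:.4f}; 100-800 ng R2 = {r2_win:.4f}")
```

A clearly higher R² inside a window than over the full series is one simple indication that the points outside the window fall below the detection limit or above the linear response.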

5 Notes

1. Q. ilex acorns from Cordoba (Spain) are sterilized [12]. For germination, the pericarps are removed from the acorns, which are then sown in 0.5 L pots with perlite and grown in a greenhouse (35/19 °C day/night and relative humidity below 43%). Plants are irrigated weekly to field capacity with Hoagland solution [13]. The embryo and cotyledons are obtained from germinated seeds, and the roots and leaves from 4-month-old seedlings (10-leaf developmental stage). Each organ (embryo, cotyledon, root, and leaves) is individually frozen in liquid nitrogen and stored at −80 °C until analysis.

2. Avoid bubbles during casting and quickly cover the acrylamide with 2-propanol.

Table 1
Number of peptides and proteins identified. The values correspond to a serial dilution in the range of 10–1000 ng of protein. The UniprotKB (Viridiplantae) and Quercus ilex databases were used, showing orthologs for the first and gene products for the second

                 UniprotKB-Viridiplantae database            Quercus ilex database
Protein (ng)     Peptides  Proteins  Orthologous products    Peptides  Proteins  Gene products
10               996       556       85                      3206      1141      1141
50               1110      563       95                      3643      1058      1244
100              987       495       73                      3234      993       1168
200              1363      659       100                     2455      1180      1425
400              1678      812       116                     6001      1511      1851
600              1929      901       135                     6745      1636      2028
800              1678      793       122                     6589      1587      1955
1000             1821      869       124                     7034      1604      1992

3. Before adding the APS and TEMED, discard the 2-propanol layer covering the resolving gels and briefly rinse with water. Then pour the stacking solution containing APS and TEMED, and carefully place the comb, preventing bubbles.

Fig. 2 Relative number of proteins identified in the sample dilution of 600 ng, grouped according to the confidence parameter ranges (% coverage, score value, and number of peptides)

4. Dissolve the Coomassie in methanol and then add the other components.

5. Keep the samples on ice, since sonication generates heat.

6. Shake the mixture frequently as both phases (phenol and SDS buffer) tend to separate.

7. It is recommended to use the minimum amount of buffer in which the pellet is completely dissolved to obtain a high protein concentration.

8. Pass the sample very slowly through the column to facilitate the maximum binding of peptides.

9. After this step, change the tubes to low-bind tubes to recover the eluted peptides.

10. For MS analysis, a serial dilution ranging from 10 to 1000 ng of protein (BSA equivalents, as determined with the Bradford assay) was prepared.

11. In shotgun experiments, complex mixtures of peptides are usually used and some of them may be present in more than one protein. For this reason, only by using proteotypic peptides can we make a better estimation of the amount of a given protein in the sample.
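The idea behind Note 11 can be sketched in a few lines: a peptide is treated as proteotypic if, within the set of proteins considered, it maps to exactly one protein. The protein names and peptide sequences below are hypothetical placeholders, and a real analysis would test uniqueness against the whole search database rather than a handful of entries.

```python
from collections import defaultdict


def proteotypic_peptides(protein_to_peptides):
    """Map each peptide found in exactly one protein to that protein."""
    owners = defaultdict(set)
    for protein, peptides in protein_to_peptides.items():
        for peptide in peptides:
            owners[peptide].add(protein)
    return {
        peptide: next(iter(proteins))
        for peptide, proteins in owners.items()
        if len(proteins) == 1
    }


# Hypothetical digest results, for illustration only.
digests = {
    "protein_A": {"AAAPEPTIDEK", "SHAREDPEPK"},
    "protein_B": {"SHAREDPEPK", "GGGPEPTIDER"},
}
unique = proteotypic_peptides(digests)

print(unique)
```

Here the shared peptide is dropped, and only the two peptides that point unambiguously to one protein remain available for quantification.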

Acknowledgments

The authors thank the University of Cordoba, Spain (UCO-CeiA3) and the staff of the Central Service for Research Support (SCAI) for their technical support in the bioinformatics data analysis. This research was funded by the grant ENCINOMICA BIO2015-64737-R from the Spanish Ministry of Economy and Competitiveness. M.A.C. is grateful for the contract “Ramon y Cajal (RYC-2017-23706) program” of the Spanish Ministry of Science, Innovation, and Universities.

[Figure 3. Peak area (intensity) plotted against ng of protein (0–1000 ng) for six peptides, listed from top to bottom in the legend: [K].AEYDESGPSIVHR.[K], [K].AGEDADTLGLTGHER.[Y], [K].AGIVASLDELVK.[E], [K].GAPVVAAPAK.[E], [K].ILDGPPGTAER.[A], [K].VGNFLNR.[F].]

Fig. 3 Peak area for several proteotypic peptides, determined at the different protein dilutions. The selected peptides corresponded to the following proteins, from top to bottom: actin-97, aconitate hydratase, disulfide-isomerase A6, elongation factor 1-beta 1, flowering locus K homology domain, and UTP-glucose-1-phosphate uridylyltransferase

References

1. Barbier-Brygoo H, Joyard J (2004) Focus on plant proteomics. Plant Physiol Biochem 42:913–917

2. Canovas FM, Dumas-Gaudot E, Recorbet G et al (2004) Plant proteome analysis. Proteomics 4:285–298

3. Jorrín-Novo JV (2014) Plant proteomics: methods and protocols. In: Jorrin-Novo JV, Komatsu S, Weckwerth W, Wienkoop S (eds) Methods in molecular biology, vol 1072. Humana Press, Totowa, pp 3–13

4. Valledor L, Weckwerth W (2014) Standardization of data processing and statistical analysis in comparative plant proteomics experiment. In: Jorrin-Novo JV, Komatsu S, Weckwerth W, Wienkoop S (eds) Plant proteomics: methods and protocols. Methods in molecular biology, vol 1072. Humana Press, Totowa, pp 347–358

5. Romero-Rodriguez MC, Pascual J, Valledor L, Jorrin-Novo J (2014) Improving the quality of protein identification in non-model species. Characterization of Quercus ilex seed and Pinus radiata needle proteomes by using SEQUEST and custom databases. J Proteomics 105:85–91

6. Zhu W, Smith JW, Huang CM (2010) Mass spectrometry-based label-free quantitative proteomics. J Biomed Biotechnol:1–6. https://doi.org/10.1155/2010/840518

7. Xie F, Liu T, Qian WJ et al (2011) Liquid chromatography-mass spectrometry-based quantitative proteomics. J Biol Chem 286(29):25443–25449

8. Guerrero-Sanchez VM, Maldonado-Alconada AM, Amil-Ruiz F, Jorrin-Novo J (2017) Holm oak (Quercus ilex) transcriptome. De novo sequencing and assembly analysis. Front Mol Biosci 4:70

9. Wang W, Vignani R, Scali M, Cresti M (2006) A universal and rapid protocol for protein extraction from recalcitrant plant tissues for proteomic analysis. Electrophoresis 27:2782–2786

10. Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72:248–254

11. Neuhoff V, Arold N, Taube D, Ehrhardt W (1988) Improved staining of proteins in polyacrylamide gels including isoelectric focusing gels with clear background at nanogram sensitivity using Coomassie Brilliant Blue G-250 and R-250. Electrophoresis 9:255–262

12. Bonner FT, Vozzo JA (1987) Seed biology and technology of Quercus. General technical report, SO-66. U.S. Dept. of Agriculture, Forest Service, Southern Forest Experiment Station, New Orleans, LA, p 21

13. Hoagland DR, Arnon DI (1950) The water-culture method for growing plants without soil. California Agricultural Experiment Station, Circular-347

Chapter 13

Combining Targeted and Untargeted Data Acquisition to Enhance Quantitative Plant Proteomics Experiments

Gene Hart-Smith

Abstract

Most quantitative proteomics experiments either target a limited number of selected proteins for quantification or quantify proteins on a broad scale in an untargeted manner. However, we recently demonstrated that experiments that have both targeted and untargeted components can be particularly advantageous. Using a combined targeted and untargeted liquid chromatography–tandem mass spectrometry data acquisition strategy termed TDA/DDA (shorthand for targeted data acquisition/data-dependent acquisition), which we applied to a model quantitative plant proteomics experiment performed on Arabidopsis, we demonstrated improved quantification of both targeted and untargeted proteins relative to purely untargeted experiments performed using conventional data-dependent acquisition (Hart-Smith et al. Front Plant Sci 8:1669, 2017). This suggests that many quantitative proteomics datasets earmarked for collection using data-dependent acquisition are likely to benefit from the use of TDA/DDA instead.

This chapter describes how TDA/DDA liquid chromatography–tandem mass spectrometry methods can be created on commonly used mass spectrometric instrument platforms. It describes how, using freely available software, tandem mass spectrometry inclusion lists designed to target proteins of hypothesized interest can be generated. Best practice implementation of these inclusion lists in TDA/DDA strategies is then described. Relative to conventional data-dependent acquisition, the liquid chromatography–tandem mass spectrometry methods created using these guidelines increase the chances of quantifying targeted proteins and can produce widespread improvements in the reproducibility of untargeted protein quantification, without compromising the total numbers of proteins quantified. They are compatible with different quantitative proteomics methodologies, including metabolic labeling, chemical labeling, and label-free approaches, and can be used to create tailored assay libraries to aid the interpretation of quantitative proteomics data collected using data-independent acquisition.

Key words Quantitative proteomics, Shotgun proteomics, Inclusion lists, Targeted data acquisition (TDA), Data-dependent acquisition (DDA)

1 Introduction

Quantitative proteomic studies are expected to play a critical role in the burgeoning field of plant molecular systems biology. These studies, which quantify proteins using peptide ions identified in liquid chromatography (LC)–tandem mass spectrometry (MS/MS) experiments, have traditionally been categorized as either hypothesis driven or non–hypothesis driven [2]. Hypothesis driven studies can be highly selective and sensitive toward individual proteins [3–5]. They quantify specific proteins using targeted LC-MS/MS data acquisition strategies such as selected reaction monitoring (SRM) [6], parallel reaction monitoring (PRM) [7], or targeted data acquisition (TDA) [8–11], or extract quantitative data for proteins of interest from LC-MS/MS datasets collected using data-independent acquisition (DIA) [12]. In contrast, non–hypothesis driven studies facilitate exploratory analyses by quantifying proteins on a broad scale in an untargeted manner. This is generally achieved using data-dependent acquisition (DDA), an LC-MS/MS data acquisition strategy that selects a number of peptide ions for MS/MS per MS/MS scan cycle, with relative ion abundances used as a means to prioritize selections [13].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_13, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Despite this traditional segregation of quantitative proteomics into hypothesis driven or non–hypothesis driven research, these two categories are not mutually exclusive. For example, we recently employed a combined targeted and untargeted LC-MS/MS data acquisition strategy, termed TDA/DDA, in quantitative proteomics analyses of wild-type Arabidopsis thaliana plants relative to plants mutant for the proteins DOUBLE STRANDED RNA BINDING PROTEIN1 (DRB1) and DRB2 [14, 15] using the metabolic 15N-labeling approach [16]. In this context, TDA/DDA enabled both hypothesis driven [14] and non–hypothesis driven [15] insights into miRNA-guided translation inhibition.

TDA/DDA operates as follows (Fig. 1): within each MS/MS scan cycle performed over the course of an LC-MS/MS experiment, (1) targeted peptide ions are first selected for MS/MS via TDA using an m/z inclusion list employed with an open retention time window; and (2) after a set number of inclusion list-triggered MS/MS scans, or if all peptide ions matching those in the inclusion list are selected for MS/MS, additional peptide ions are then selected for MS/MS in an untargeted manner using DDA.
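The two-stage selection described above can be illustrated with a short sketch of one scan cycle. It is a simplified model, not instrument control code: the m/z matching tolerance, the choice to rank inclusion-list matches by intensity, and the omission of dynamic exclusion and charge-state filters are all assumptions made for illustration. The default of five TDA and five DDA events mirrors the example in Fig. 1.

```python
def select_precursors(survey_peaks, inclusion_mz, max_tda=5, max_dda=5,
                      tol_ppm=10.0):
    """Pick precursors for one scan cycle: inclusion-list hits, then DDA.

    survey_peaks is a list of (mz, intensity) tuples from the survey scan.
    """
    def on_inclusion_list(mz):
        return any(abs(mz - target) / target * 1e6 <= tol_ppm
                   for target in inclusion_mz)

    by_intensity = sorted(survey_peaks, key=lambda peak: peak[1], reverse=True)

    # Stage 1 (TDA): peaks matching the inclusion list, up to max_tda.
    tda = [peak for peak in by_intensity if on_inclusion_list(peak[0])][:max_tda]

    # Stage 2 (DDA): most intense remaining peaks, up to max_dda.
    already_chosen = set(tda)
    dda = [peak for peak in by_intensity if peak not in already_chosen][:max_dda]
    return tda, dda


# Hypothetical survey scan and inclusion list.
peaks = [(400.0, 1e5), (500.0, 5e3), (600.0, 9e4), (700.0, 8e4), (800.0, 1e3)]
tda_hits, dda_hits = select_precursors(peaks, [500.0, 800.0],
                                       max_tda=1, max_dda=2)

print(tda_hits, dda_hits)
```

In this toy example the low-intensity ion at m/z 500 is fragmented because it is on the inclusion list, even though plain DDA with two events would only have chosen the ions at m/z 400 and 600.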

We recently conducted an in-depth evaluation of the utility of TDA/DDA for combined hypothesis driven and non–hypothesis driven quantitative proteomics. This study demonstrated that, relative to conventional DDA, TDA/DDA is capable of not only enhancing the quantification of targeted proteins; surprisingly, it can also enhance the broad scale quantification of untargeted proteins [1]. These unexpected improvements in non–hypothesis driven protein quantification stem from untargeted peptide ions with m/z values serendipitously matching those in inclusion lists, which are repeatedly identified across replicate experiments. This enhanced experimental reproducibility can lead to widespread improvements in the identification of statistically significant changes in protein abundance.

The benefits of TDA/DDA relative to DDA will be sample and instrument specific. However, using the guidelines described in this chapter, TDA/DDA can be expected to match or outperform DDA in most non–hypothesis driven quantitative proteomics experiments performed on complex peptide samples (>25 K peptide features), while concomitantly allowing specific proteins to be quantified in a targeted manner. We therefore suggest that TDA/DDA should be considered for use in all broad scale quantitative plant proteomics experiments traditionally assigned for collection using DDA, particularly those that could also benefit from the targeted quantification of proteins of known or hypothesized biological interest.

2 Materials

2.1 Creation of Inclusion Lists and TDA/DDA LC-MS/MS Methods

1. Benchtop computer with Skyline [17] installed (see Note 1).

2. Benchtop computer with mass spectrometry instrument software (e.g., Xcalibur if using Thermo Scientific equipment) installed (see Note 2).

3 Methods

3.1 Creation

of Inclusion Lists

1. Create lists of proteins of known or hypothesized biologicalinterest to target for quantification. Create one list per peptidemixture to be subjected to LC-MS/MS. These lists can rangefrom several proteins to over one hundred proteins per peptidemixture (see Note 3).

Fig. 1 TDA/DDA LC-MS/MS methods employ TDA inclusion lists for the hypothesis driven selection of peptides for MS/MS (green arrows), followed by DDA for the non–hypothesis driven selection of peptides for MS/MS (blue arrows). An illustrative MS survey scan is shown with green signals representing peptides derived from targeted proteins, and black signals representing other peptides. In this example, up to 5 TDA events and 5 DDA events are employed per MS/MS scan cycle

Targeted and Untargeted Quantitative Proteomics 171

2. For each list of targeted proteins, import the amino acid sequences of these proteins into Skyline. If using Skyline v4, this can be done, for example, by importing FASTA entries (see Note 4) for individual proteins as follows: Edit → Insert → FASTA...

3. Skyline will automatically perform an in silico digestion for each imported protein, creating a list of theoretical peptides. Ensure that these digestions are performed using parameters appropriate for the peptide mixture to be analyzed.

If using Skyline v4, click: Settings → Peptide Settings...

Ensure that the enzyme used to create the peptide mixture to be analyzed is correctly specified. This can be done under the “Digestion” tab (e.g., select “Trypsin [KR | P]”).

If a metabolic labeling experiment has been performed, ensure that the correct isotope modifications (e.g., “Label:15N”) have been listed and are checked. This can be done under the “Modifications” tab.

It is also recommended that in silico digestions should produce peptides 7–15 amino acids in length. This can be specified under the “Filter” tab.

Generally, in silico digestions that produce up to two missed cleavages (specified under the “Digestion” tab) and consider methionine oxidation as a structural modification (specified under the “Modifications” tab) are also recommended, as elaborated on below. If alkylation of cysteine residues was performed during sample preparation, this should also be specified when selecting structural modifications.

4. Appropriate m/z values need to be calculated for the theoretical peptides generated by Skyline. It is recommended that these should fall in the range m/z 350–1500, and be associated with peptide ions of charge state +2, +3 or +4.

If using Skyline v4, click: Settings → Transition Settings...

Precursor charges (e.g., “2, 3, 4”) can be specified under the “Filter” tab, and m/z ranges under the “Instrument” tab.

5. The inclusion list incorporating these m/z values needs to be of an appropriate size (see Note 5). In Skyline v4, the inclusion list size is shown on the bottom right corner of the main GUI window (i.e., the denominator next to “prec”).

If the inclusion list size needs to be reduced, the in silico digestion parameters should be refined. It is recommended that the following parameters, accessed as above, should be considered for alteration in the following order: (1) remove methionine oxidation as a structural modification; (2) reduce the number of missed proteolytic cleavages from 2 to 1; (3) specify precursor ion charges of +2 and +3 only; (4) reduce the number of missed proteolytic cleavages from 1 to 0.

If the inclusion list size still needs to be reduced after making the above alterations, consider placing limits on the number of m/z values associated with individual large proteins, or refining the list of targeted proteins.

6. Export the inclusion list. Ensure that the list is in an appropriate file type and format for the mass spectrometric instrument platform to be used during LC-MS/MS (see Note 6), and that retention time windows are left open (see Note 7).

If using Skyline v4, click: File → Export → Isolation List...

If relevant, select the instrument platform to be used for LC-MS/MS data collection before clicking “OK.” This will ensure that the list is saved in a correct file type and format for the selected instrument platform. It is nonetheless recommended that exported files are manually checked to ensure correct formatting.

If the instrument platform to be used for LC-MS/MS data collection is not available, export the inclusion list using any instrument type and manually format the list.
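The in silico digestion and m/z calculation performed by Skyline in steps 3 and 4 can be illustrated with a minimal Python sketch. This is not Skyline's implementation; the function names are hypothetical, and the sketch assumes unmodified peptides (no methionine oxidation, cysteine alkylation or 15N labeling), standard monoisotopic residue masses, and the "Trypsin [KR | P]" rule of cleaving after K or R unless the next residue is P.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added to the residue sum
PROTON = 1.007276   # mass of one charge-carrying proton


def cleave_trypsin(sequence):
    """Split after K or R, except when the next residue is P."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])
    return pieces


def digest(sequence, missed_cleavages=2, min_len=7, max_len=15):
    """Return peptides with up to `missed_cleavages`, filtered by length."""
    pieces = cleave_trypsin(sequence)
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


def precursor_mz(peptide, charge):
    """m/z of the protonated, unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def inclusion_list(sequences, charges=(2, 3, 4), mz_min=350.0, mz_max=1500.0):
    """Collect unique (m/z, charge) pairs within the recommended m/z range."""
    values = set()
    for sequence in sequences:
        for peptide in digest(sequence):
            for z in charges:
                mz = precursor_mz(peptide, z)
                if mz_min <= mz <= mz_max:
                    values.add((round(mz, 4), z))
    return sorted(values)
```

The length of the list returned by `inclusion_list` corresponds to the inclusion list size discussed in step 5; lowering `missed_cleavages` or restricting `charges` to `(2, 3)` mirrors the reduction order recommended there.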

3.2 Creation of TDA/DDA LC-MS/MS Methods

1. Open the mass spectrometry instrument software and navigate to the method editor. For example, if using Thermo Scientific equipment, open Xcalibur and click: Roadmap View → Instrument Setup.

2. Open a preoptimized DDA LC-MS/MS method to use as the starting point for the new TDA/DDA LC-MS/MS method. This preoptimized DDA LC-MS/MS method should contain appropriate parameters for the following: survey scans, precursor ion selection, and MS/MS (elaborated on in step 3, below).

3. Create a TDA component to the LC-MS/MS method, which should be prioritized over the DDA component. Specify how many TDA and DDA events to perform per MS/MS scan cycle (see Note 8). The steps required to perform these actions will be dependent on the mass spectrometry instrument software.

For many instruments it will be possible to perform these actions by modifying the DDA LC-MS/MS method opened in step 2. This is, for example, possible on Thermo Scientific Fusion (see Note 9) or Q Exactive (see Note 10) series instruments.

For other instruments, a new method will need to be created. This may, for example, be necessary on LTQ Orbitrap series instruments (see Note 11). Ensure that any such new method is populated with parameters found in the preoptimized DDA LC-MS/MS method (see Note 12).

4. Import the inclusion list created in Subheading 3.1 and specify that the list should be used with a 10 ppm mass tolerance. The steps required to take these actions will be dependent on the mass spectrometry instrument software (see Note 13).

5. Save the new method and use it for LC-MS/MS data collection in place of the DDA LC-MS/MS method opened in step 2.
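The logic of a single TDA/DDA scan cycle (TDA events prioritized over DDA events, with inclusion list matching at a 10 ppm tolerance) can be sketched conceptually as follows. This is an illustration only, not the behavior of any instrument software: the function names are hypothetical, and real methods also apply dynamic exclusion, charge state filters and intensity thresholds, which are omitted here. The defaults of 5 TDA and 5 DDA events follow the example in Fig. 1.

```python
def within_ppm(observed_mz, target_mz, ppm=10.0):
    """True if the observed m/z lies within +/- ppm of the target m/z."""
    return abs(observed_mz - target_mz) <= target_mz * ppm / 1e6


def select_precursors(peaks, inclusion_mz, n_tda=5, n_dda=5, ppm=10.0):
    """Pick precursors for one MS/MS scan cycle.

    peaks: list of (m/z, intensity) tuples from one survey scan.
    inclusion_mz: list of targeted m/z values.
    Returns (tda_precursors, dda_precursors).
    """
    by_intensity = sorted(peaks, key=lambda peak: peak[1], reverse=True)

    # TDA events: the most intense survey scan signals that match the list.
    matches = [peak for peak in by_intensity
               if any(within_ppm(peak[0], target, ppm) for target in inclusion_mz)]
    tda = matches[:n_tda]

    # DDA events: the most intense signals not already selected by TDA.
    remaining = [peak for peak in by_intensity if peak not in tda]
    dda = remaining[:n_dda]
    return tda, dda
```

Any untargeted peptide ion whose m/z happens to fall within the tolerance of an inclusion list entry is selected by the TDA branch, which is the source of the reproducibility gain for untargeted proteins described in the Introduction.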

4 Notes

1. Although this chapter describes the use of Skyline v4 for in silico protein digestions, different versions of Skyline and numerous other utilities (for example, the online utility MS-Digest, UC San Francisco) are also capable of performing this task.

2. The software required to create TDA/DDA methods will depend on the mass spectrometer used for LC-MS/MS data collection. This chapter will provide detailed descriptions of the creation of TDA/DDA methods for Thermo Scientific LTQ Orbitrap series, Q Exactive series or Fusion series mass spectrometers using Xcalibur versions 2, 3 and 4, respectively. However, steps similar to those described here can be applied to the creation of TDA/DDA methods on other instrument platforms.

3. Extremely long lists of targeted proteins may limit the number of theoretical peptide ions that can be targeted per protein, which may diminish the efficacy of hypothesis driven protein quantification. Whether or not this may be an issue will be apparent following step 5 of Subheading 3.1. (See also Note 5, below).

4. There are numerous ways to access FASTA entries of individual proteins. For example, for proteins listed in UniProt [18], FASTA entries can be accessed online from each protein’s web page by clicking → FASTA (Sequence data in FASTA format).

5. The advantages of TDA/DDA relative to DDA can be expected to hold true across a range of inclusion list sizes. It can be expected that large inclusion lists comprising thousands of values should offer particular advantages. This is because, even when using small inclusion lists (e.g., ~100 values) and high resolution mass analyzers, substantial redundancy between inclusion list and untargeted peptide ion m/z values can be expected when analysing complex peptide samples [1]. It is this redundancy in m/z values that improves the reproducibility of untargeted protein quantification when using TDA/DDA. This redundancy in m/z values can be expected to increase with inclusion list size.

Despite these expected advantages of large inclusion lists, inclusion list sizes are, nonetheless, capped by mass spectrometry instrument software. These maximum inclusion list sizes will be software specific, and for most software platforms, will be indicated in error messages if exceeded. For example, if creating TDA/DDA methods for an LTQ Orbitrap instrument using Xcalibur v2.2, inclusion lists will be capped at 2000 values when using open retention time windows.

6. If creating inclusion lists for Thermo Scientific LTQ Orbitrap, Q Exactive or Fusion series instruments using Xcalibur version 2, 3, or 4, respectively, inclusion lists should be created as either tab delimited text (.txt) or comma separated values (.csv) files.

For LTQ Orbitrap series instruments, three columns are required per m/z value specifying the following: m/z value (to four decimal places), retention start time (in min), retention end time (in min). These lists must be formatted such that there is no redundancy in m/z values within each specified retention time window. It is therefore recommended that any duplicate m/z values are removed prior to entering retention time values (see Note 7).

For Q Exactive series instruments, five columns are required per m/z value specifying the following: m/z value (to four decimal places), molecular formula (entries can be left blank), targeted species (entries can be left blank), peptide ion charge state (“2,” “3,” etc.), polarity of the peptide ion (“Positive” if using positive ion mode electrospray ionization).

For Fusion series instruments, three columns are required per m/z value specifying the following: m/z value (to four decimal places), peptide ion charge state (“2,” “3,” etc.), name of targeted species (these names will not affect how inclusion lists are implemented). In addition, the following case-sensitive column headers are required: “m/z,” “z,” “Name.”

7. If creating inclusion lists for Q Exactive or Fusion series instruments using Xcalibur version 3 or 4 respectively, the creation of open retention time windows simply involves leaving the columns for retention start and end times out, as per Note 6.

If creating inclusion lists for LTQ Orbitrap series instruments using Xcalibur v2, this will involve inputting retention start and end times covering the entire period of peptide elution. From our experience with Xcalibur v2.1 and v2.2, specifying a single broad retention time window per m/z value (e.g., 14.00–50.00 min) leads to faulty implementations of inclusion lists by the instrument software. To remedy this, we input each m/z value multiple times using a series of 6-min retention time windows, with each window separated by a minimal retention time difference (14.00–20.00 min, 20.01–26.00 min, 26.01–32.00 min, etc.).


8. When creating a TDA/DDA method, it is recommended that the total number of dependent scan events per MS/MS scan cycle is the same as the preoptimized DDA LC-MS/MS method opened in Subheading 3.2, step 2. It is recommended that the first half of these dependent scan events should be allocated to TDA, with the latter half allocated to DDA.

9. If using a Fusion series instrument, adding a TDA component to the preoptimized DDA LC-MS/MS method opened in Subheading 3.2, step 2 involves the following.

Click on the “Scan Parameters” tab to open the data acquisition workflow associated with the DDA LC-MS/MS method. Drag and drop a new “ddMS2” scan node into the workflow and ensure that its MS/MS parameters match those of the existing ddMS2 scan node. Ensure that the new scan node has Scan Priority = 1 (listed under “Data-Dependent MSn Scan Properties”) and change the Scan Priority of the existing ddMS2 scan node to 2. Drag and drop a “Targeted Mass” filter node above the new ddMS2 scan node.

Following this, navigate to “Data Dependent Properties” (e.g., by clicking on the node specifying “# sec” or “# scans”) and ensure that the data-dependent mode is set to “Scans Per Outcome.” This will allow the number of TDA and DDA events per MS/MS scan cycle to be specified.

10. If using a Q Exactive series instrument, adding a TDA component to the preoptimized DDA LC-MS/MS method opened in Subheading 3.2, step 2 involves the following.

Navigate to “Properties of the method” in the DDA LC-MS/MS method and ensure that User Role = Advanced. Navigate to “Properties of Full MS/dd-MS2 (TopN)” and ensure that Inclusion = on, and that If idle = pick others.

11. If using an LTQ Orbitrap series instrument, it is possible to create an LC-MS/MS method with both TDA and DDA components as follows.

After opening the method editor in Xcalibur (Subheading 3.2, step 1), click “Data dependent MS/MS” to create a new method. Specify the number of Scan Events following Note 8 and ensure that the “Dependent scan” checkbox is marked for all Scan Events other than Scan Event 1.

Following this, edit the parameters for the first half of the dependent scans by clicking on “Settings...”. For each of these Scan Events, navigate to “Parent Mass List” and ensure that the “Use global mass lists” box is checked. After this, navigate to the parameters listed under “Current Scan Event.” Ensure that each mass is determined from Scan Event 1 and that “Nth most intense from list” is specified, starting from 1 and increasing by 1 for each subsequent Scan Event.

For the latter half of the dependent scans, click on “Settings...” and navigate to the parameters listed under “Current Scan Event.” Ensure that each mass is determined from Scan Event 1 and that “Nth most intense ion” is specified, starting from 1 and increasing by 1 for each subsequent Scan Event.

12. For survey scans, it is particularly important to define the following parameters appropriately: AGC target values, maximum injection times, mass analyzer used, and mass analyzer resolution.

For precursor ion selection, it is particularly important to define the following parameters appropriately: minimum ion counts required to trigger MS/MS events, charge states capable of triggering MS/MS events, whether or not monoisotopic precursor ion selection is enabled, and parameters associated with dynamic exclusion.

For MS/MS, different dissociation methods will require different parameters to be specified. For example, if using HCD or CID, it is particularly important to define the following parameters appropriately: collision energies, AGC target values, activation times, mass analyzer used, and mass analyzer resolution.

13. If using an LTQ Orbitrap series instrument, import the inclusion list by clicking on the “Mass Lists” tab. Ensure that “Parent Masses” is selected in the “Mass List” pull-down menu before importing the inclusion list. Following this, click on the “MS Detector Setup” tab. Click on “Settings...” for any dependent scan and navigate to “Mass Widths.” Under “Parent mass width,” specify low and high mass tolerances of 10 ppm.

If using a Q Exactive series instrument, click on “Inclusion” (under “Global Lists”) to import the inclusion list. Following this, under “Properties of the method” navigate to “Customized Tolerances (±)” and specify mass tolerances of 10 ppm.

If using a Fusion series instrument, click on the “Targeted Mass” node and specify the mass list type as “m/z & z” before importing the inclusion list. After this, specify low and high mass tolerances of 10 ppm.
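The inclusion list layouts described in Notes 6 and 7 can be sketched in Python as follows. The function names, the default placeholder name "target", and the default elution period of 14.00–50.00 min are illustrative assumptions; only the column contents, the Fusion column headers, and the 6-min window scheme come from the notes above. Exported files should still be checked against the instrument software, as recommended in Subheading 3.1, step 6.

```python
import csv
import io


def ltq_rows(mz_values, start=14.00, end=50.00, window=6.00, gap=0.01):
    """LTQ Orbitrap: m/z, retention start, retention end (min).

    Duplicate m/z values are removed first, then each value is repeated
    over consecutive 6-min windows (14.00-20.00, 20.01-26.00, ...).
    """
    rows = []
    for mz in sorted(set(round(value, 4) for value in mz_values)):
        boundary, first = start, True
        while boundary < end:
            upper = min(boundary + window, end)
            lower = boundary if first else boundary + gap
            rows.append((f"{mz:.4f}", f"{lower:.2f}", f"{upper:.2f}"))
            boundary, first = upper, False
    return rows


def qexactive_rows(entries):
    """Q Exactive: m/z, formula (blank), species (blank), charge, polarity."""
    return [(f"{mz:.4f}", "", "", str(z), "Positive") for mz, z in entries]


def fusion_rows(entries, name="target"):
    """Fusion: case-sensitive header row, then m/z, charge, name."""
    rows = [("m/z", "z", "Name")]
    rows.extend((f"{mz:.4f}", str(z), name) for mz, z in entries)
    return rows


def to_csv_text(rows):
    """Serialize rows as comma separated values."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

For example, `to_csv_text(fusion_rows([(400.6872, 2)]))` yields a header line followed by one line per targeted m/z value, which can be saved with a .csv extension.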

References

1. Hart-Smith G, Reis RS, Waterhouse PM et al (2017) Improved quantitative plant proteomics via the combination of targeted and untargeted data acquisition. Front Plant Sci 8:1669

2. Domon B, Aebersold R (2010) Options and considerations when selecting a quantitative proteomics strategy. Nat Biotechnol 28:710–721

3. Gillet LC, Leitner A, Aebersold R (2016) Mass spectrometry applied to bottom-up proteomics: entering the high-throughput era for hypothesis testing. Annu Rev Anal Chem 9:449–472

4. Picotti P, Bodenmiller B, Mueller LN et al (2009) Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell 138:795–806

5. Schmidt A, Claassen M, Aebersold R (2009) Directed mass spectrometry: towards hypothesis-driven proteomics. Curr Opin Chem Biol 13:510–517

6. Picotti P, Aebersold R (2012) Selected reaction monitoring–based proteomics: workflows, potential, pitfalls and future directions. Nat Methods 9:555–566

7. Peterson AC, Russell JD, Bailey DJ et al (2012) Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Mol Cell Proteomics 11:1475–1488

8. Domon B, Bodenmiller B, Carapito C et al (2009) Electron transfer dissociation in conjunction with collision activation to investigate the Drosophila melanogaster phosphoproteome. J Proteome Res 8:2633–2639

9. Hart-Smith G, Low JK, Erce MA et al (2012) Enhanced methylarginine characterization by post-translational modification-specific targeted data acquisition and electron-transfer dissociation mass spectrometry. J Am Soc Mass Spectrom 23:1376–1389

10. Savitski MM, Fischer F, Mathieson T et al (2010) Targeted data acquisition for improved reproducibility and robustness of proteomic mass spectrometry assays. J Am Soc Mass Spectrom 21:1668–1679

11. Schmidt A, Gehlenborg N, Bodenmiller B et al (2008) An integrated, directed mass spectrometric approach for in-depth characterization of complex peptide mixtures. Mol Cell Proteomics 7:2138–2150

12. Gillet LC, Navarro P, Tate S et al (2012) Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics 11:O111.016717

13. Kalli A, Smith GT, Sweredoski MJ et al (2013) Evaluation and optimization of mass spectrometric settings during data-dependent acquisition mode: focus on LTQ-Orbitrap mass analyzers. J Proteome Res 12:3071–3086

14. Reis RS, Hart-Smith G, Eamens AL, Wilkins MR, Waterhouse PM (2015) Gene regulation by translational inhibition is determined by Dicer partnering proteins. Nat Plants 1:1–6

15. Reis RS, Hart-Smith G, Eamens AL et al (2015) MicroRNA regulatory mechanisms play different roles in Arabidopsis. J Proteome Res 14:4743–4751

16. Arsova B, Kierszniowska S, Schulze WX (2012) The use of heavy nitrogen in quantitative proteomics experiments in plants. Trends Plant Sci 17:102–112

17. MacLean B, Tomazela DM, Shulman N et al (2010) Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26:966–968

18. Consortium U (2014) UniProt: a hub for protein information. Nucleic Acids Res 43:D204–D212


Chapter 14

A Phosphoproteomic Analysis Pipeline for Peels of Tropical Fruits

Janet Juarez-Escobar, Jose M. Elizalde-Contreras, Víctor M. Loyola-Vargas, and Eliel Ruiz-May

Abstract

Phosphorylation is a posttranslational reversible modification related to signaling and regulatory mechanisms. Protein phosphorylation is linked to structural changes that modulate protein activity, interaction, or localization and therefore the cell signaling pathways. The use of techniques for phosphoprotein enrichment along with mass spectrometry has become a powerful tool for the characterization of signal transduction in model organisms. However, limited efforts have focused on the establishment of protocols for the analysis of the phosphoproteome in nonmodel organisms such as tropical fruits. This chapter describes a potential pipeline for sample preparation and enrichment of phosphorylated proteins/peptides before MS analysis of peels of some species of tropical fruits.

Key words Peptide enrichment, Phosphorylation, Phosphoproteome, Tropical fruit

1 Introduction

Arabidopsis thaliana has been the plant model for studying several biological processes, including the application of omics tools and the massive profiling of posttranslational modifications (PTM) [1]. The establishment of proteomics pipelines in A. thaliana represents an invaluable tool in the study of molecular mechanisms of plants. In most cases, however, the extrapolation of these proteomics protocols to other plant species is not feasible. In fact, nonmodel plant species, such as tropical fruits with recalcitrant tissue and limited genomic information, greatly increase the complexity of proteomics protocols. Working with fruit tissues such as peels has some pitfalls due to the presence of a thick cuticle, cell wall, lining, proteases, storage polysaccharides, phenolic compounds, lipids, and secondary metabolites, and the high dynamic range of the proteome. Therefore, a protocol should be optimized for each plant species and tissue.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139,https://doi.org/10.1007/978-1-0716-0528-8_14, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Plant proteome complexity increases due to posttranslational modifications (PTM). Phosphorylation is considered one of the most important posttranslational modifications pertaining to a plant’s response to external stimuli and other cellular processes, such as signal transduction, cell proliferation, differentiation, apoptosis, and metabolism [2, 3]. At least one-third of all proteins are phosphorylated at any given time, as the result of a kinase reaction, so phosphorylation data can be a measure of signaling activity [3–6]. Phosphorylation occurs on tyrosine, serine, or threonine residues, although six other amino acids can be phosphorylated: cysteine, arginine, lysine, aspartate, glutamate, and histidine [7]. Protein phosphorylation is highly dynamic, spatially regulated in the cell, and inherently of low stoichiometry [5], therefore of relatively low abundance (see Note 1). Hence, we need to implement effective protocols for enrichment; moreover, phosphoproteins represent a small proportion of total proteins present in the initial cell lysate.

Over the past several decades, protein phosphorylation has been visualized on 1D and 2D gels, by 32P labeling, by Western blotting with phosphosite-specific antibodies, or using phosphoprotein stains that specifically bind to phosphate moieties of phosphoproteins, such as Pro-Q Diamond Phosphoprotein stain (Pro-Q DPS). However, these techniques are not entirely reliable due to the generation of false-positive results. It is worth noting that there is no standard method for sample processing, but a typical phosphoproteomics workflow includes cell lysis, protein extraction, reduction/alkylation, trypsin digestion, prefractionation, enrichment, and MS analysis (Fig. 1). Although there are many strategies to follow, a highly efficient protein extraction process with protease and phosphatase inhibitor cocktails is strongly suggested as the first step (see Note 2), followed by fast and reliable digestion required to inhibit any protease and phosphatase activity (see Note 3). In addition, a fractionation prior to an enrichment step is suggested for several reasons: (a) to reduce sample complexity; (b) to improve the sensitivity to the moderate and/or low-abundance proteins, especially where high-abundance proteins might dim the presence of the less abundant, or when protein concentrations vary through a wide range [8]; (c) to remove nonphosphorylated moieties since they are often expressed in low abundance; (d) the low stoichiometry of phosphorylation yields a small number of phosphopeptides; and (e) generally many regulatory phosphoproteins have low expression levels.

1.1 Fractionation Methods

Fractionation is oriented to reduce complexity in each sample and enhance the subsequent enrichment stage [3]. It is usually carried out after protein digestion. Fractionation methods include hydrophilic or ion exchange resins, electrostatic repulsion hydrophilic interaction chromatography (ERLIC), polymer-based metal ion affinity capture (PolyMAC), and hydroxyapatite chromatography.

180 Janet Juarez-Escobar et al.

Some examples of commercially available methods of fractionation are as follows:

• HILIC (hydrophilic interaction liquid chromatography):
– SeQuant® ZIC®-HILIC column (Merck).
– TSKgel Amide-80 column (Tosoh).

• SAX (strong anion exchange):
– POROS™ XQ strong anion exchange resin (Thermo).
– Pierce spin columns (Thermo).
– III pro analysis (Merck).
– Amberlite IRA-402 (Merck).
– Amberlite IRA-410 (Merck).
– Amberjet 4200 CL (Merck).
– Dowex 1 × 8 (Merck).

Fig. 1 Schematic representation of a phosphoproteomics workflow. Strategies often combine two orthogonal separation modes or multiple enrichment techniques to enhance phosphoproteome coverage. Colored diagram is the representation of the methodology presented in this chapter

Phosphoproteomic in Peels 181

• SCX (strong cation exchange; please note that this may require a further desalting step):
– POROS™ XS strong cation exchange resin (Thermo).
– POROS™ XS resin (Thermo).
– Pierce SCX columns (Thermo).
– Amberlite IR-120 (Merck).
– Dowex 50 WX-8 (Merck).
– Amberlyst 15 (Merck).
– Dowex 50 WX-4 (Merck).
– Polysulfoethyl aspartamide.
– RESOURCE S column (GE Healthcare, Sweden).

HILIC offers a highly efficient separation of polar molecules, so phosphorylated peptides are retained more strongly in the column. HILIC has the highest degree of orthogonality to RP of all common separation methods. When using HILIC fractionation, a further enrichment with Fe3+-IMAC is suggested since it has been observed that HILIC fractionation improves the selectivity of IMAC to greater than 90% [9]. ERLIC includes the selectivity of HILIC with an additional electrostatic repulsion given by the functional groups attached to the stationary phase. It was observed that using ERLIC, the number of phosphopeptides was tripled compared to SCX-IMAC [10].

First step | Second step | Rationale
SCX or SAX | IMAC | Phosphopeptides elute earlier than their nonphosphorylated counterparts; Fe3+-IMAC then enriches the collected fraction [15]
SCX or HILIC | TiO2 | The phosphoryl group allows the enrichment of negatively charged peptides [16]
TiO2 | SCX | Improves efficiency and reproducibility in large-scale quantitative profiling [17, 18]
Immunoprecipitation of pTyr peptides | Fe(III)-NTA column | Estimates the level of tyrosine phosphorylation and allows the recovery of a large number of peptides [19]

1.2 Enrichment Methods

The most efficient enrichment strategy is carried out after protein digestion. It should be considered that not all proteins can later be identified using the fragmented peptides, since the nonphosphorylated “part” is lost during the enrichment step [13]. Phosphoprotein enrichment alone (without prefractionation) is less used because of the complexity of plant samples; it is preferred when working with proteins associated with subcellular fractions or isolated organelles [14]. Phosphopeptide enrichment strategies are based on chemical modifications, affinity chromatography, and immunological techniques (Table 1). Alternative affinity methods are Phos-Tag chromatography, polymer-based metal ion affinity capture (PolyMAC), and hydroxyapatite chromatography. It is usual to find coupled orthogonal chromatographic strategies that produce nonoverlapping separation but an increased identification of low-abundance peptides [14]. It is also possible to perform two sequential enrichment steps to reduce sample complexity and increase the phosphoproteome coverage, for example, SCX or SAX followed by IMAC, SCX or HILIC followed by TiO2, TiO2 followed by SCX, or immunoprecipitation of pTyr peptides followed by an Fe(III)-NTA column [15–19]. For a more detailed description of enrichment techniques, review the following references [20–32].

1.3 Quantification Strategies

The most used techniques comprise label-free quantification (LFQ), stable isotope labeling by amino acids in cell culture (SILAC), and isobaric tandem mass tags (iTRAQ or TMT), with LFQ and SILAC being the most accurate techniques [33].

A cost-effective alternative to the commercial iTRAQ and TMT [34], which has proved successful in maize leaves [35] and Arabidopsis [36], is the stable isotope dimethyl labeling of peptides at their α- and ε-amino groups before enrichments using SCX and IMAC.

Chemical derivatization strategies can be used to incorporate sample labeling for quantification (e.g., after β-elimination of the phosphate group). Weckwerth et al. [37] added ethanethiol and ethane-d5-thiol to create two different isotopic labels to make a quantitative comparison of the samples. Goshe et al. [38] incorporated a phosphoprotein isotope-coded affinity tag (PhIAT), a biotinylated tag that allowed for high-specificity affinity purification and isotopic labeling to perform a relative quantification.

Table 1 Affinity chromatography-based techniques commercially available

Stationary phase | Method | Component | Examples
Metals | IMAC | Fe, Ga, Al, Zr, Ni | HisPur™ Cobalt Superflow Agarose, by Thermo Fisher; Ni-NTA spin column (P/N 31014), Qiagen
Metals | MOAC | TiO2 | Titansphere Phos-TiO kits are available from GL Sciences Inc. (Torrance, CA, USA)
Antibodies | Immunoprecipitation | Phosphotyrosine peptides | pTyr-100 (Cell Signaling Technology)


1.4 Plant Phosphoproteome Tools and Databases

There are many tools developed as predictors for phosphorylation sites classified under kinase-specific or non–kinase-specific queries. For example, in the case of kinase-specific tools, users should provide the protein sequence and the type of kinase under consideration. Databases of phosphorylation sites and prediction tools are summarized in Table 2.

Table 2Prediction tools and databases of phosphorylation sites in plant proteomics

Name Description Link Ref.

Databases

PhosPhAt 4.0 Arabidopsisphosphorylation sitesidentified by massspectrometry in large-scale experiments bydifferent researchgroups.

http://phosphat.uni-hohenheim.de

[39]

dbPPT A curated database fromliterature inconsistency with otherdatabases.

http://dbppt.biocuckoo.org

[40]

P3DB 3.0 Provides informationand annotationregarding geneontology homologs,three-dimensionalstructure, kinase/phosphatase families,protein domains.

http://www.p3db.org/ [41]

RIPP-DB RIKEN PlantPhosphoproteomeDatabase: informationobtained by LC-MS/MS shotgunphosphoproteomics ofArabidopsis and rice

http://metadb.riken.jp/metadb/db/SciNetS_ria102i)

[42]

MedicagoPhosphoProteindatabase

Phosphoproteomic datafrom Medicago rootsand phosphorylationsites on proteinsinvolved in symbioticsignaling.

http://www.phospho.medicago.wisc.edu/db/index.php

[43]

(continued)

184 Janet Juarez-Escobar et al.

Table 2(continued)

Name Description Link Ref.

ProMEX Mass spectral referencedatabase of trypticpeptide fragmentationderived from plants.

http://promex.pph.univie.ac.at/promex/

[44]

AtProteome Data of the high-density,organ-specificproteome map forArabidopsis. All theinformation about theprotein identificationsis shown together witha proteogenomicmapping of thepeptides onto thegenome, togetherwith links to otherdatabases.

http://fgcz-atproteome.uzh.ch

[45]

Pep2pro Proteome informationon Arabidopsis basedon 2.6 million peptidespectra, provides theinformation about theprotein identificationswith a proteogenomicmapping of thepeptides onto thegenome andannotation in organ-specific processes;allows the user aspecific peptide search.

http://fgcz-pep2pro.uzh.ch

[46]

Phospho.ELM Curated database ofexperimentally verifiedphosphorylation sitesin eukaryotic proteins,linked to literaturereferences; alsoincorporates sitescontained in universaldatabases such asUniProt (www.uniprot.org).

http://phospho.elm.eu.org/index.html

[47]

(continued)

Phosphoproteomic in Peels 185

Table 2(continued)

Name Description Link Ref.

Predictors

NetPhos 3.1 server Kinase-specificprediction ofphosphorylation sitesfor the following17 kinases: ATM, CKI,CKII, CaM-II,DNAPK, EGFR,GSK3, INSR, PKA,PKB, PKC, PKG,RSK, SRC, cdc2, cdk5,and p38MAPK

http://www.cbs.dtu.dk/services/NetPhos-3.1/

[48]

PHOSFER(PHOsphorylationSite FindER)

Predicts phosphorylationsites in soybeanproteins

http://saphire.usask.ca/saphire/phosfer/

[49]

KinasePhos 2.0 Integrates SVM (support vector machines). This tool exhibits approximately 91% accuracy for prediction of phosphorylated Ser, Thr, Tyr, and histidine residues.

http://kinasephos2.mbc.nctu.edu.tw/

[18, 50]

Scansite 4.0 Prediction of phosphorylation sites is based on a matrix of selectivity values for amino acids for every probable site of phosphorylation.

https://scansite4.mit.edu/4.0/#home

[51]

PhosphoRice Meta-predictor of rice-specific phosphorylation sites, constructed by integrating the phosphorylation site predictors NetPhos2.0, NetPhosK, Kinasephos, Scansite, Disphos, and Predphosphos, with parameters selected by restricted grid search and random search

https://github.com/PEHGP/PhosphoRice

[52]

MUsite 1.0 Pretrained model for prediction of kinase-specific protein phosphorylation sites for A. thaliana and five other eukaryotic models. Provides a unique functionality for training customized prediction models (including condition-specific models) from users' own data.

http://musite.sourceforge.net

[53]

PlantPhos Prediction of potential phosphorylation sites from catalytic kinase motifs generated using maximal dependence decomposition (MDD)

http://csb.cse.yzu.edu.tw/PlantPhos/index.html

[54]

PhosK3D Web server for identifying kinase-specific phosphorylation sites on protein sequences and three-dimensional structures.

http://csb.cse.yzu.edu.tw/PhosK3D/

[55]

2 Materials

Prepare all solutions using purified deionized water and analytical grade reagents. Prepare and store all reagents at room temperature (unless indicated otherwise).

Note: mention of specific companies does not represent an endorsement by the authors. Reagents are purchased from Sigma, unless otherwise noted. Prepare all solutions using double-deionized water (MilliQ) and analytical HPLC-grade reagents.

2.1 Total Protein Extraction

1. Mortar and pestle.

2. Liquid nitrogen.

3. Polyvinylpolypyrrolidone (PVPP).

4. Extraction buffer: 150 mM Trizma base pH 8, 100 mM KCl, 1.4 M sucrose, 1% Triton X-100, 5% (v/v) β-mercaptoethanol. Freshly add protease inhibitor cocktail (e.g., Sigma Plant Protease Inhibitor Cocktail, 100 μL for every 5 g of tissue) and 1 mM phenylmethylsulfonyl fluoride (PMSF, previously prepared with absolute ethanol; see Note 2). Then, add the following phosphatase inhibitors: 10 mM sodium pyrophosphate dibasic (Na2H2P2O7), 1 mM sodium orthovanadate (Na3VO4), 10 mM β-glycerolphosphate, 50 mM sodium fluoride (NaF) (see Notes 3 and 4).

5. Phenol, Tris-saturated pH 8.0.

6. Protein quantification reagents (e.g., Thermo Scientific PierceBCA Protein Assay kit).

7. 1.5 mL tubes.
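The inhibitor concentrations listed for the extraction buffer can be converted to weighed amounts for a given buffer volume. The sketch below is only illustrative: the molar masses are assumed values for the anhydrous compounds (they are not given in this chapter and should be checked against the reagent label), β-glycerolphosphate is left out because its hydration state varies between suppliers, and the function names are hypothetical. PMSF is included for completeness, although it is normally added from an ethanol stock.

```python
# Molar masses in g/mol. These are assumed values for the anhydrous
# compounds; check the hydration state printed on the reagent label.
MOLAR_MASS = {
    "Na2H2P2O7": 221.94,
    "Na3VO4": 183.91,
    "NaF": 41.99,
    "PMSF": 174.19,
}

# Final concentrations in mM, as listed for the extraction buffer.
FINAL_MM = {"Na2H2P2O7": 10.0, "Na3VO4": 1.0, "NaF": 50.0, "PMSF": 1.0}


def mass_mg(conc_mM, volume_mL, molar_mass):
    """Mass in mg needed to reach conc_mM in volume_mL of buffer."""
    # mM * mL = micromol; micromol * g/mol = microgram; /1000 gives mg
    return conc_mM * volume_mL * molar_mass / 1000.0


def inhibitor_masses(volume_mL):
    """Mass (mg) of each inhibitor for the requested buffer volume."""
    return {
        name: round(mass_mg(FINAL_MM[name], volume_mL, MOLAR_MASS[name]), 2)
        for name in FINAL_MM
    }


if __name__ == "__main__":
    # 6 mL is the buffer volume used per 3 g of tissue in Subheading 3.1.
    for name, mg in inhibitor_masses(6.0).items():
        print(f"{name}: {mg} mg in 6 mL")
```

For small volumes the weighed masses are only a few milligrams, so preparing concentrated stocks and diluting them may be more practical.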

2.2 SDS–Polyacrylamide Gel Electrophoresis

1. SDS protein sample buffer: To make 1 mL of a 4× stock, mix the following: 0.1 g sodium dodecyl sulfate (SDS), 0.4 g sucrose, 50 μL 1 M Tris–HCl pH 6.8, 10 μL 100 mM EDTA, 400 μL water, 200 μL 14.7 M β-mercaptoethanol, and bromophenol blue.

2. 10× Laemmli electrophoresis running buffer: Dissolve the following in 1000 mL water: 30.0 g Tris, 144.0 g glycine, and 10.0 g SDS. Store the running buffer at room temperature.

3. Separating buffer: 1.5 M Tris–HCl, pH 8.8, 0.4% SDS.

4. Stacking buffer: 0.5 M Tris–HCl, pH 6.8; 0.4% SDS.

5. Fresh 10% ammonium persulfate (APS) water solution.

6. Casting of two 14% acrylamide separation mini-gels with 6 M urea: 3.6 g urea, 1.18 mL water, 2.5 mL separating buffer, 3.5 mL of 40% acrylamide, 6 μL N,N,N′,N′-tetramethylethylenediamine (TEMED), and 60 μL 10% APS.

7. Casting of 6% acrylamide stacking gel for two mini-gels: 3.6 g urea, 3.6 mL water, 2.5 mL stacking buffer, 1.5 mL acrylamide, 8 μL TEMED, and 80 μL 10% APS.

8. Prestained protein molecular weight markers (e.g., Thermo Scientific).

9. Gel casting and electrophoresis system (e.g., Mini-PROTEAN® Tetra Handcast System with Mini-PROTEAN® Tetra Cell, Bio-Rad).
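The gel recipes in items 6 and 7 can be checked with two simple relations: the mass of urea needed for a target molarity and the final acrylamide percentage after dilution of the stock. This is a minimal sketch that assumes a final gel volume of about 10 mL for two mini-gels (the chapter does not state the final volume) and a 40% acrylamide stock for both gels; the function names are hypothetical.

```python
UREA_MOLAR_MASS = 60.06  # g/mol


def urea_mass_g(molarity, final_volume_mL):
    """Grams of urea needed for the given molarity in the final volume."""
    return molarity * final_volume_mL / 1000.0 * UREA_MOLAR_MASS


def final_percent(stock_percent, stock_volume_mL, final_volume_mL):
    """Final acrylamide percentage after diluting the stock."""
    return stock_percent * stock_volume_mL / final_volume_mL


if __name__ == "__main__":
    final_mL = 10.0  # assumed final volume for two mini-gels
    print(f"urea for 6 M: {urea_mass_g(6, final_mL):.1f} g")
    print(f"separation gel: {final_percent(40, 3.5, final_mL):.0f}% acrylamide")
    print(f"stacking gel: {final_percent(40, 1.5, final_mL):.0f}% acrylamide")
```

Under these assumptions the listed amounts agree with the stated 6 M urea, 14% separation gel, and 6% stacking gel.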

2.3 Reduction, Alkylation, and Digestion

1. 50 mM ammonium bicarbonate (NH4HCO3) in 100 mL water (prepare just before use).

2. Reduction buffer (prepare just before use): 10 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP) in 50 mM NH4HCO3.

3. 30 mM iodoacetamide in 50 mM NH4HCO3.


4. 30 mM 1,4-dithiothreitol (DTT) in 50 mM NH4HCO3.

5. Acetone, MS grade.

6. Triethylammonium bicarbonate (TEAB).

7. Trypsin, MS grade.

8. Trypsin enhancer (e.g., ProteaseMAX™ Surfactant, Promega).

2.4 Direct Phosphopeptide Enrichment

1. Iron-chelate resin in spin columns (e.g., Pierce™ HiSelect™ Fe-NTA Phosphopeptide Enrichment Kit, Thermo Scientific™).

2.5 High pH Reversed-Phase (RP) Fractionation

1. Reversed-phase fractionation resin, triethylamine (0.1%) (e.g., Pierce™ High pH Reversed-Phase Peptide Fractionation Kit, Thermo Scientific™).

2.6 SCX-RP Fractionation and Enrichment

1. SCX diluent: 25% ACN/water, pH 3.0.

2. SCX “A” buffer: 10 mM KH2PO4/25% acetonitrile, pH 3.0.

3. SCX “B” buffer: 10 mM KH2PO4/1 M KCl/2% acetonitrile (1:1:1), pH 3.0.

4. Equilibration/loading/rinse buffer: SCX diluent and SCX “A” buffer (1:1).

5. Elution buffer: SCX “A” buffer and SCX “B” buffer (1:1).

6. Strong cation exchanger cartridges (e.g., HyperSep Strong Cation Exchanger (SCX) SPE Cartridges [Thermo Scientific]).

7. RP equilibrium/load buffer: 0.1% TFA.

8. RP desalting buffer: 5% MeOH/0.1% TFA (1:1).

9. RP elution buffer: 50% ACN/0.1% TFA (1:1).

10. Formic acid, MS grade.

2.7 Other Materials

1. Gel documentation system (e.g., Gel Doc™ XR System, Bio-Rad®, and Image Lab™ software).

2. Centrifugal vacuum concentrator (e.g., CentriVap, LABCONCO®).

3. Liquid chromatography–mass spectrometry (LC-MS) system.

3 Methods

Carry out all procedures at ice-cold temperature. Add phosphatase inhibitors to all buffer solutions except the SDS buffer.

3.1 Tissue Protein Extraction

1. Grind approximately 3 g of fruit tissue to a fine powder in liquid nitrogen, adding PVPP (1:10, w/w) while grinding. Transfer the powder to a 50-mL tube. If the tissue is not processed immediately after sampling, store it at −80 °C until analysis.

2. Add 6 mL of extraction buffer to every 3 g of fruit tissue.

3. Add 6 mL Tris-saturated phenol pH 8.0 and incubate the samples with agitation in crushed ice for 30 min.

4. Centrifuge at 10,000 × g for 30 min at 4 °C and transfer each upper phenolic phase to a new centrifuge tube.

5. Add 4 volumes of ice-cold acetone with 0.07% (v/v) β-mercaptoethanol and precipitate the soluble protein at −20 °C overnight.

6. Centrifuge at 3000 × g for 30 min at 4 °C.

7. Remove the supernatant after centrifugation, and let the pellet dry under a laboratory fume hood.

8. Resuspend the dried pellet in 350 μL of 1× phosphate-buffered saline (PBS; Sigma) containing 1% SDS. Vortex and sonicate for 20 min.

9. Centrifuge at 15,000 × g for 10 min at RT. Transfer the supernatant to a new tube.

10. Measure protein concentration of the extract.

11. Store at −80 °C until use.
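When the amount of starting tissue differs from 3 g, the volumes in steps 1 to 5 scale linearly. The following sketch shows one way to tabulate them; it assumes that the PVPP ratio of 1:10 (w/w) refers to PVPP relative to tissue, and the function and key names are hypothetical.

```python
def extraction_plan(tissue_g, phenolic_phase_mL=None):
    """Scale the reagent amounts of Subheading 3.1 to the tissue mass."""
    plan = {
        "pvpp_g": tissue_g / 10.0,                # 1:10 (w/w), assumed PVPP:tissue
        "extraction_buffer_mL": 2.0 * tissue_g,   # 6 mL per 3 g of tissue
        "phenol_mL": 2.0 * tissue_g,              # 6 mL per 3 g of tissue
    }
    if phenolic_phase_mL is not None:
        # Step 5: 4 volumes of ice-cold acetone per volume of phenolic phase
        plan["acetone_mL"] = 4.0 * phenolic_phase_mL
    return plan


if __name__ == "__main__":
    # Example: 3 g of tissue and 5 mL of recovered phenolic phase
    for item, amount in extraction_plan(3.0, phenolic_phase_mL=5.0).items():
        print(f"{item}: {amount}")
```

The acetone volume depends on how much phenolic phase is actually recovered, so it is passed in separately rather than derived from the tissue mass.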

3.2 Subject the Extract to SDS–Polyacrylamide Gel Electrophoresis (SDS-PAGE) According to Laemmli [56]

1. Run the gels at 10 mA/gel through the stacking gel and increase to 25 mA/gel when the samples have entered the separation gel.

3.3 Reduction, Alkylation, and Digestion

1. To 100 μg of protein extract add water to a final volume of 100 μL.

2. Add 10 mM TCEP and incubate for 45 min at 60 °C.

3. To alkylate the proteins, add 30 mM IAM and incubate in the dark for 60 min at room temperature (21 °C). Add 30 mM DTT and incubate for 10 min at room temperature.

4. Add 1 mL ice-cold acetone and incubate overnight at −20 °C for protein precipitation.

5. Centrifuge at 10,000 × g for 15 min at 4 °C. Discard the supernatant and let the pellet dry in a vacuum concentrator.

6. Resuspend the dried pellet in 150 μL 50 mM TEAB + 0.1% SDS. Sonicate for 15 min.

7. Measure protein concentration of the reduced/alkylated extract.

8. Add ProteaseMAX™ Surfactant (Promega) and trypsin at a (0.01:1:30) ratio (trypsin–protein). Incubate for 3 h at 37 °C (see Note 5).
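The volumes for step 1 and the trypsin amount for step 8 follow directly from the measured protein concentration. The sketch below assumes that the last two terms of the ratio in step 8 mean 1 part trypsin to 30 parts protein (w/w); the function name and its parameters are hypothetical. Because the protein is quantified again in step 7, the trypsin amount is best recalculated from that second measurement.

```python
def digestion_setup(conc_ug_per_uL, protein_ug=100.0, final_uL=100.0,
                    protein_per_trypsin=30.0):
    """Extract volume, water volume, and trypsin mass for one digestion."""
    extract_uL = protein_ug / conc_ug_per_uL
    if extract_uL > final_uL:
        raise ValueError("extract too dilute for the requested final volume")
    return {
        "extract_uL": extract_uL,
        "water_uL": final_uL - extract_uL,
        # assumed 1:30 (w/w) trypsin:protein
        "trypsin_ug": protein_ug / protein_per_trypsin,
    }


if __name__ == "__main__":
    # Example: extract measured at 4 ug/uL
    setup = digestion_setup(4.0)
    print(setup)
```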

3.4 Direct Fe-NTA Enrichment

1. Lyophilize digested sample in a centrifugal vacuum concentrator.

2. Follow the manufacturer’s instructions (Thermo Scientific) for Fe-NTA-based phosphopeptide enrichment.

(a) Resuspend digested peptides in binding/wash buffer (see Note 6).

(b) Equilibrate the spin column in binding/wash buffer and centrifuge at 1000 × g for 30 s.

(c) Bind phosphopeptides to spin column and gently rock for 30 min at room temperature, then centrifuge as indicated in the previous step.

(d) Wash column thrice with binding/wash buffer, discarding each flow-through after centrifugation at 1000 × g. Repeat washing once with water.

(e) Elute the phosphopeptides with elution buffer and centrifuge at 1000 × g; repeat one more time (see Note 7).

3.5 Fractionation Prior to Fe-NTA Enrichment

3.5.1 High-pH RP Fractionation and Enrichment

1. Dry digested sample in a centrifugal vacuum concentrator.

2. Follow the manufacturer’s instructions (Thermo Scientific) for high-pH reversed-phase fractionation (see Note 8).

3.5.2 SCX-RP Fractionation and Enrichment

1. Reconstitute lyophilized digested sample in 1 mL equilibration buffer. Adjust to pH 2.5–3.0 with formic acid if necessary.

2. Wet HyperSep SPE cartridge with 2 mL of Milli-Q water.

3. Pass 1 mL of elution buffer at 1–2 drops/s.

4. Wash with 2 mL of Milli-Q water.

5. Equilibrate with 5 mL of equilibration buffer at 1–2 drops/s.

6. Load the sample slowly, no faster than 1 drop/s. Collect the effluent.

7. Rinse the cartridge with 3 mL of equilibration buffer and collect the effluent in the same tube.

8. Elute the sample with 1.5 mL of elution buffer with increasing concentrations of KCl: 75, 250, 500 mM, no faster than 1 drop/s. Collect the effluents in clean tubes.

9. Dry the effluents in a centrifugal vacuum concentrator.

10. Reconstitute dried effluents with 1 mL 0.1–0.5% TFA. Adjust to pH 2.5–3.0 with formic acid if necessary.

11. Wet a new HyperSep SPE cartridge with 2 mL of RP equilibrium/load buffer.

12. Load the sample slowly, followed by 1 mL of RP equilibrium/load buffer.

13. Pass 1 mL of RP desalting buffer slowly.

14. Elute the sample with 1 mL of RP elution buffer. Collect the effluents and let them dry in a centrifugal vacuum concentrator.

15. Fe-NTA enrichment is carried out as described in Subheading 3.4.
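The KCl steps in step 8 (75, 250, and 500 mM) can be prepared by blending SCX “A” and SCX “B” buffers. The sketch below assumes that buffer A contains no KCl and that buffer B contains 1 M KCl as its final concentration; the materials list is ambiguous on this point, so the value is exposed as a parameter. The function name is hypothetical, and the small change in acetonitrile content between blends is ignored.

```python
def kcl_step(target_mM, total_mL=1.5, b_kcl_mM=1000.0):
    """Volumes (mL) of SCX A and SCX B giving target_mM KCl in total_mL."""
    if not 0 <= target_mM <= b_kcl_mM:
        raise ValueError("target must lie between 0 and the KCl level of buffer B")
    vol_b = total_mL * target_mM / b_kcl_mM
    vol_a = total_mL - vol_b
    return vol_a, vol_b


if __name__ == "__main__":
    for target in (75, 250, 500):
        vol_a, vol_b = kcl_step(target)
        print(f"{target} mM KCl: {vol_a:.4f} mL A + {vol_b:.4f} mL B")
```

With these assumptions the 500 mM step corresponds to the 1:1 mixture of A and B listed as elution buffer in Subheading 2.6.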

3.6 LC-MS/MS Analysis

Mass spectrometric analysis is performed according to the methods available to the user (see Note 9): neutral-loss scanning, multistage activation, or MS2 fragmentation.

We use an UltiMate RSLC nano system (Dionex, Sunnyvale, CA) and an Orbitrap Fusion™ Tribrid™ (Thermo Fisher Scientific, San Jose, CA) mass spectrometer with electrospray ionization in positive mode at 3.5 kV. Each sample is reconstituted with 50 μL of 0.1% formic acid. Twenty microliters is injected onto a C18 precolumn (2 cm × 3 μm ID, 75 μm OD, Dionex, Sunnyvale, CA) at a flow rate of 3 μL min−1. Peptides are separated on an EASY-Spray column (25 cm × 75 μm ID), PepMap RSLC C18 2 μm, at a flow rate of 300 nL min−1 for 100 min. Solvent A (0.1% formic acid) and solvent B (0.1% formic acid in 90% ACN) are used to establish an elution gradient: solvent A for 10 min, solvent B from 7% to 20% for 35 min, solvent B (20–25%) for 15 min, solvent B (25–95%) for 20 min, and solvent A for 8 min. Calibration is performed with caffeine, Met-Arg-Phe-Ala (MRFA), and Ultramark 1621.
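The gradient described above can be written as a timetable, which makes it easier to transfer to instrument software and to check the total run time. This is only a sketch: the segment labels are paraphrased from the text and the function name is hypothetical. The listed segments add up to 88 min, so the remainder of the stated 100-min run is not itemized in the text.

```python
# (label, duration in min), in the order given in the text
SEGMENTS = [
    ("solvent A", 10),
    ("solvent B 7-20%", 35),
    ("solvent B 20-25%", 15),
    ("solvent B 25-95%", 20),
    ("solvent A", 8),
]


def timetable(segments):
    """Return (start_min, end_min, label) for each gradient segment."""
    rows = []
    start = 0
    for label, duration in segments:
        rows.append((start, start + duration, label))
        start += duration
    return rows


if __name__ == "__main__":
    for start, end, label in timetable(SEGMENTS):
        print(f"{start:3d}-{end:3d} min  {label}")
```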

A typical MS2 analysis shows a, b, and y ions with multiple charges and the neutral loss of the phosphate (Fig. 2).
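The position of a neutral-loss peak depends on the charge state of the ion. As a minimal illustration, the sketch below computes the expected m/z after loss of phosphoric acid (H3PO4, 97.97690 Da) or of HPO3 (79.96633 Da, the phosphorylation mass quoted in Fig. 2). The function name and the example precursor value are hypothetical.

```python
H3PO4 = 97.97690  # monoisotopic mass of phosphoric acid, Da
HPO3 = 79.96633   # monoisotopic mass of the phosphate modification, Da


def neutral_loss_mz(ion_mz, charge, loss=H3PO4):
    """m/z of an ion after a neutral loss, at an unchanged charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return ion_mz - loss / charge


if __name__ == "__main__":
    precursor = 650.0  # hypothetical precursor m/z
    for z in (1, 2, 3):
        print(f"z={z}: {neutral_loss_mz(precursor, z):.4f}")
```

For a triply charged ion, as in Fig. 2, the loss of H3PO4 shifts the peak by roughly 32.66 m/z units.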

4 Notes

1. In phosphoproteomics, data-independent acquisition (DIA) is preferable to data-dependent acquisition (DDA). In this way, even low-abundance peptide ions will not be lost because of their low intensities. DIA helps to overcome the poor ionization and low stoichiometry of phosphopeptides in the samples.

2. PMSF is unstable in aqueous solutions, and the buffer should be used as soon as possible after the addition of PMSF.

3. Any given extraction protocol has its physicochemical limitations. In our experience with several tropical fruits, we obtained optimal results with a protocol based on phenol and acetone precipitation.

4. To preserve phosphorylation, it is essential that extract buffers and SDS-PAGE sample buffers contain high concentrations of phosphatase inhibitors, for example, sodium fluoride and β-glycerolphosphate used together at 50 and 100 mM, respectively [57].

5. To maintain the efficiency of ProteaseMAX™ Surfactant, keep the pH at 7.8.

6. We suggest carrying out a prior C18 or SCX prefractionation as a way to enhance phosphopeptide recovery. Be sure to keep pH < 3 during this procedure and to gently mix the sample with the resin.

7. Dry immediately to avoid modification of the phosphopeptides due to the extreme acidity of the elution buffer.

8. When using this kit, special care should be taken when mixing the sample and buffer to avoid resin slurrying.

9. We used an Orbitrap Fusion™ Tribrid™ with MS1 detection in the Orbitrap and ddMS2 in the ion trap (IT) when using electron transfer dissociation (ETD), higher-energy C-trap dissociation (HCD), and collision-induced dissociation (CID). The Orbitrap Fusion™ Tribrid™ method settings were as follows.

Experiment 1. In a cycle time of 3 s, master scan in the Orbitrap detector at resolution 120,000, with quadrupole isolation in scan range 350–1500 m/z and maximum injection time of 50 ms. Include charge states: 2–8. Scan event type 1: condition charge states 3–4, range 300–1600 m/z. Scan event type 2: condition charge states 2, 3, 4, 5, range 400–1600 m/z. Scan event type 3: condition charge states 2–5.

ddMS2 ETD. MS2, isolation mode: quadrupole. ETD activation type, collision energy 10%. Ion trap scan rate: rapid. First mass: 120 m/z. Maximum injection time: 100 ms. Orbitraps and Q-TOF instruments have a mechanism of fragmentation that preserves the PTMs.

ddMS2 HCD. MS2, isolation mode: quadrupole. HCD activation type, collision energy 30%. Detector type: ion trap; scan range mode: auto: m/z normal. Ion trap scan rate: rapid. First mass (m/z): 100. Maximum injection time (ms): 50.

ddMS2 CID. MS2, isolation mode: quadrupole. CID activation type, collision energy 30%. Activation Q: 0.25. Ion trap scan rate: rapid. AGC target: 1.0e4. Maximum injection time (ms): 50.

Fig. 2 Phosphorylated amino acids: T1, T16, S22 (79.96633 Da), charge: +3. Identified with: Mascot (v1.36); fragments used for search: a, a-H2O, a-NH3; b, b-H2O, b-NH3; y-H2O, y-NH3

Acknowledgments

This work was supported by the National Council of Science and Technology (FS-1515, Fordecyt 292399, INFR-2015-01-255045, and INFR-2017-01-280898 to V.M.L. and U0004-PROCEDYT_2015-1_259915 to E.R.M.).

References

1. Friso G, van Wijk KJ (2015) Posttranslational protein modifications in plant metabolism. Plant Physiol 169:1469–1487

2. Mann M, Ong SE, Gronborg M et al (2002) Analysis of protein phosphorylation using mass spectrometry: deciphering the phosphoproteome. Trends Biotechnol 20:261–268

3. Kumar V, Khare T, Sharma M et al (2018) Engineering crops for the future: a phosphoproteomics approach. Curr Protein Pept Sci 19:413–426

4. de la Fuente van Bentem S, Roitinger E, Anrather D et al (2006) Phosphoproteomics as a tool to unravel plant regulatory mechanisms. Physiol Plant 126:110–119

5. Macek B, Mann M, Olsen JV (2009) Global and site-specific quantitative phosphoproteomics: principles and applications. Annu Rev Pharmacol Toxicol 49:199–221

6. Cutillas PR, Timms JF (2010) Approaches and applications of quantitative LC-MS for proteomics and activitomics. In: Cutillas PR, Timms JF (eds) LC-MS/MS in proteomics. Springer, New York, pp 3–17

7. Schweighofer A, Meskiene I (2015) Phosphatases in plants. In: Schulze WX (ed) Plant phosphoproteomics: methods and protocols. Springer, New York, pp 25–46

8. Yang Z, Li N (2015) Absolute quantitation of protein posttranslational modification isoform. In: Schulze WX (ed) Plant phosphoproteomics: methods and protocols. Springer, New York, pp 105–119

9. McNulty DE, Annan RS (2008) Hydrophilic interaction chromatography reduces the complexity of the phosphoproteome and improves global phosphopeptide isolation and detection. Mol Cell Proteomics 7:971–980

10. Gan CS, Guo T, Zhang H et al (2008) A comparative study of electrostatic repulsion-hydrophilic interaction chromatography (ERLIC) versus SCX-IMAC-based methods for phosphopeptide isolation/enrichment. J Proteome Res 7:4869–4877

11. Beltran L, Cutillas PR (2012) Advances in phosphopeptide enrichment techniques for phosphoproteomics. Amino Acids 43:1009–1024

12. Silva-Sanchez C, Li H, Chen S (2015) Recent advances and challenges in plant phosphoproteomics. Proteomics 15:1127–1141

13. Fíla J, Honys D (2012) Enrichment techniques employed in phosphoproteomics. Amino Acids 43:1025–1047

14. Ito J, Taylor NL, Castleden I et al (2009) A survey of the Arabidopsis thaliana mitochondrial phosphoproteome. Proteomics 9:4229–4240

15. Villen J, Gygi SP (2008) The SCX/IMAC enrichment approach for global phosphorylation analysis by mass spectrometry. Nat Protoc 3:1630

16. Batth TS, Francavilla C, Olsen JV (2014) Off-line high-pH reversed-phase fractionation for in-depth phosphoproteomics. J Proteome Res 13:6176–6186

17. Wu J, Warren P, Shakey Q et al (2010) Integrating titania enrichment, iTRAQ labeling, and Orbitrap CID-HCD for global identification and quantitative analysis of phosphopeptides. Proteomics 10:2224–2234

18. Ren L, Li C, Shao W et al (2017) TiO2 with tandem fractionation (TAFT): an approach for rapid, deep, reproducible, and high-throughput phosphoproteome analysis. J Proteome Res 17:710–721

19. Adelmant GO, Cardoza JD, Ficarro SB et al (2011) Affinity and chemical enrichment for mass spectrometry-based proteomics analyses. In: Ivanov A, Lazarev A (eds) Sample preparation in biological mass spectrometry. Springer, Dordrecht, pp 437–486

20. Reinders J, Sickmann A (2005) State-of-the-art in phosphoproteomics. Proteomics 5:4052–4061

21. Li W, Backlund PS, Boykins RA et al (2003) Susceptibility of the hydroxyl groups in serine and threonine to β-elimination/Michael addition under commonly used moderately high-temperature conditions. Anal Biochem 323:94–102

22. Warthaka M, Karwowska-Desaulniers P, Pflum MK (2006) Phosphopeptide modification and enrichment by oxidation-reduction condensation. ACS Chem Biol 1:697–701

23. Pinkse MW, Uitto PM, Hilhorst MJ et al (2004) Selective isolation at the femtomole level of phosphopeptides from proteolytic digests using 2D-NanoLC-ESI-MS/MS and titanium oxide precolumns. Anal Chem 76:3935–3943

24. Kweon HK, Hakansson K (2006) Selective zirconium dioxide-based enrichment of phosphorylated peptides for mass spectrometric analysis. Anal Chem 78:1743–1749

25. Rivera JG, Choi YS, Vujcic S et al (2009) Enrichment/isolation of phosphorylated peptides on hafnium oxide prior to mass spectrometric analysis. Analyst 134:31–33

26. Ye J, Zhang X, Young C et al (2010) Optimized IMAC-IMAC protocol for phosphopeptide recovery from complex biological samples. J Proteome Res 9:3561–3573

27. Larsen MR, Thingholm TE, Jensen ON et al (2005) Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol Cell Proteomics 4:873–886

28. Aryal UK, Ross AR (2010) Enrichment and analysis of phosphopeptides under different experimental conditions using titanium dioxide affinity chromatography and mass spectrometry. Rapid Commun Mass Spectrom 24:219–231

29. Salomon AR, Ficarro SB, Brill LM et al (2003) Profiling of tyrosine phosphorylation pathways in human cells using mass spectrometry. Proc Natl Acad Sci U S A 100:443–448

30. Rush J, Moritz A, Lee KA et al (2005) Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nat Biotechnol 23:94

31. Bergstrom Lind S, Molin M, Savitski MM et al (2008) Immunoaffinity enrichments followed by mass spectrometric detection for studying global protein tyrosine phosphorylation. J Proteome Res 7:2897–2910

32. Zhang G, Neubert TA (2006) Use of detergents to increase selectivity of immunoprecipitation of tyrosine phosphorylated peptides prior to identification by MALDI quadrupole-TOF MS. Proteomics 6:571–578

33. Hogrebe A, von Stechow L, Bekker-Jensen DB et al (2018) Benchmarking common quantification strategies for large-scale phosphoproteomics. Nat Commun 9:1045

34. Boersema PJ, Aye TT, van Veen TA et al (2008) Triplex protein quantification based on stable isotope labeling by peptide dimethylation applied to cell and tissue lysates. Proteomics 8:4624–4632

35. Bonhomme L, Valot B, Tardieu F et al (2012) Phosphoproteome dynamics upon changes in plant water status reveal early events associated with rapid growth adjustment in maize leaves. Mol Cell Proteomics 11:957–972

36. Boex-Fontvieille E, Daventure M, Jossier M et al (2013) Photosynthetic control of Arabidopsis leaf cytoplasmic translation initiation by protein phosphorylation. PLoS One 8:e70692

37. Weckwerth W, Willmitzer L, Fiehn O (2000) Comparative quantification and identification of phosphoproteins using stable isotope labeling and liquid chromatography/mass spectrometry. Rapid Commun Mass Spectrom 14:1677–1681

38. Goshe MB, Veenstra TD, Panisko EA et al (2002) Phosphoprotein isotope-coded affinity tags: application to the enrichment and identification of low-abundance phosphoproteins. Anal Chem 74:607–616

39. Heazlewood JL, Durek P, Hummel J et al (2007) PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor. Nucleic Acids Res 36(Database):D1015–D1021

40. Cheng H, Deng W, Wang Y et al (2014) dbPPT: a comprehensive database of protein phosphorylation in plants. Database (Oxford) 2014:bau121

41. Yao Q, Ge H, Wu S et al (2013) P3DB 3.0: from plant phosphorylation sites to protein networks. Nucleic Acids Res 42(Database issue):D1206–D1213

42. Nakagami H, Sugiyama N, Mochida K et al (2010) Large-scale comparative phosphoproteomics identifies conserved phosphorylation sites in plants. Plant Physiol 153:1161–1174

43. Rose CM, Venkateshwaran M, Grimsrud PA et al (2012) Medicago phosphoprotein database: a repository for Medicago truncatula phosphoprotein data. Front Plant Sci 3:122

44. Hummel J, Niemann M, Wienkoop S et al (2007) ProMEX: a mass spectral reference database for proteins and protein phosphorylation sites. BMC Bioinformatics 8:216

45. Baerenfaller K, Grossmann J, Grobei MA et al (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320:938–941

46. Hirsch-Hoffmann M, Gruissem W, Baerenfaller K et al (2012) pep2pro: the high-throughput proteomics data processing, analysis, and visualization tool. Front Plant Sci 3:123

47. Dinkel H, Chica C, Via A et al (2011) Phospho.ELM: a database of phosphorylation sites. Nucleic Acids Res 39:D261–D267

48. Blom N, Sicheritz-Ponten T, Gupta R et al (2004) Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4:1633–1649

49. Trost B, Kusalik A (2013) Computational phosphorylation site prediction in plants using random forests and organism-specific instance weights. Bioinformatics 29:686–694

50. Wong YH, Lee TY, Liang HK et al (2007) KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res 35:W588–W594

51. Obenauer JC, Cantley LC, Yaffe MB (2003) Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res 31:3635–3641

52. Que S, Li K, Chen M et al (2012) PhosphoRice: a meta-predictor of rice-specific phosphorylation sites. Plant Methods 8:5

53. Gao J, Thelen JJ, Dunker AK et al (2010) Musite: a tool for global prediction of general and kinase-specific phosphorylation sites. Mol Cell Proteomics 9:2586–2600

54. Lee TY, Bretana NA, Lu CT (2011) PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity. BMC Bioinformatics 12:261

55. Su MG, Lee TY (2013) Incorporating substrate sequence motifs and spatial amino acid composition to identify kinase-specific phosphorylation sites on protein three-dimensional structures. BMC Bioinformatics 14:S2

56. Laemmli UK (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680–685

57. Dephoure N, Gould KL, Gygi SP et al (2013) Mapping and analysis of phosphorylation sites: a quick guide for cell biologists. Mol Biol Cell 24:535–542


Chapter 15

Label-Free Quantitative Phosphoproteomics for Algae

Megan M. Ford, Sheldon R. Lawrence II, Emily G. Werth, Evan W. McConnell, and Leslie M. Hicks

Abstract

The unicellular alga Chlamydomonas reinhardtii is a model photosynthetic organism for the study of microalgal processes. Along with genomic and transcriptomic studies, proteomic analysis of Chlamydomonas has led to an increased understanding of its metabolic signaling as well as a growing interest in the elucidation of its phosphorylation networks. To this end, mass spectrometry-based proteomics has made great strides in large-scale protein quantitation as well as analysis of posttranslational modifications (PTMs) in a high-throughput manner. An accurate quantification of dynamic PTMs, such as phosphorylation, requires high reproducibility and sensitivity due to the substoichiometric levels of modified peptides, which can make depth of coverage challenging. Here we present a method using TiO2-based phosphopeptide enrichment paired with label-free LC-MS/MS for phosphoproteome quantification. Three technical replicate samples in Chlamydomonas were processed and analyzed using this approach, quantifying a total of 1775 phosphoproteins with a total of 3595 phosphosites. With a median CV of 21% across quantified phosphopeptides, implementation of this method for differential studies provides highly reproducible analysis of phosphorylation events. While the culturing and extraction methods used are specific to facilitate coverage in algal species, this approach is widely applicable and can easily extend beyond algae to other photosynthetic organisms with minor modifications.

Key words Phosphorylation, Quantitative proteomics, Mass spectrometry, Chlamydomonas reinhardtii, Label-free, Algae

1 Introduction

The unicellular alga Chlamydomonas reinhardtii is a model organism for the study of microalgal processes, particularly photosynthesis due to its photoheterotrophic growth [1]. More recently, Chlamydomonas research has expanded to include the utilization of microalgae for biofuel production due to their ability to produce large amounts of triacylglycerol while having rapid growth potential and tolerance to environmental conditions [2]. Along with genomic and transcriptomic studies [3–5], proteomic analysis of Chlamydomonas has led to an increased understanding of its metabolic signaling as well as a growing interest in the elucidation of its phosphorylation networks, particularly those related to biofuel production [6, 7].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_15, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-1-0716-0528-8_15) contains supplementary material, which is available to authorized users.

Protein phosphorylation is a posttranslational modification (PTM) that serves as a rapid and reversible means of modulating protein activity and signal transduction in the cell. This modification involves the addition of a phosphate group to an amino acid by a protein kinase, which together with phosphatases, can act as a molecular switch to regulate complex signaling networks. Protein phosphorylation has been extensively studied for more than 60 years due to its widespread prevalence and its critical involvement in the regulation of nearly all basic cellular processes [8, 9]. Dynamic protein phosphorylation plays a central role in cell proliferation, metabolism, signaling, and survival, emphasizing the need for an efficient and selective method of analysis. However, studying these events remains an analytical challenge.

One important challenge stems from the labile nature of phosphorylation. As a PTM that is tightly linked to protein function, the phosphorylation status of proteins continually changes in response to specific conditions and stimuli. Thus, understanding phosphorylation requires detection and quantification of the same phosphoprotein(s) in multiple states, or proteoforms, across different conditions while using sample preparation techniques, such as flash freezing and the use of phosphatase inhibitors, to ensure the signal being analyzed is answering the biological question of interest. An additional challenge arises from the large dynamic range of phosphorylation events in the cell, which is dependent on the abundance of the protein in the cell, that can span many orders of magnitude [10], and the occupancy of the phosphorylation site, which is generally low at any given time [10]. Also, while phosphorylation occurs on thousands of proteins, many of them share little sequence homology, increasing the difficulty in identifying dynamic changes in phosphorylation across an entire phosphoproteome [11].

To date, several enrichment approaches have been employed to address the challenges in assessing protein phosphorylation [12]. Among these, titanium dioxide metal oxide affinity chromatography (TiO2-MOAC) is one of the most common shotgun enrichment methods for phosphopeptides from complex biological samples [13–15]. TiO2-based enrichments have been shown to be more selective [15], and are less sensitive to interferents such as salts and detergents than immobilized metal affinity chromatography [16, 17]. However, they show preference to singly phosphorylated peptides over those with multiple phosphosites, potentially due to stronger interactions between TiO2 and multiphosphorylated peptides making elution of these peptides challenging [18]. At acidic pH, TiO2 has a high affinity for phosphorylated species, forming a bidentate bond with the titanium surface and two of the oxygen atoms [18]. To minimize copurification of acidic peptides, the use of organic acids, such as phthalic, 2,5-dihydroxybenzoic, or lactic acid, as an additive for binding enhances the overall selectivity of this enrichment method [12].

LC-MS/MS offers highly reproducible and accurate systems-level analysis that can be paired with enrichment for the study of large-scale protein phosphorylation [19]. For quantification, a label-free approach can provide advantages over label-based techniques, primarily in experimental design flexibility [20]. Label-free quantitation (LFQ), with a number of software programs available to aid in data analysis [20], allows for rapid, straightforward, and cost-effective measurements of a wide range of protein abundances. Typically, LFQ is employed via one of two approaches: changes in ion intensity from LC-peak areas [i.e., area under the curve (AUC)] of the peptides, or based on spectral counting of peptides from MS2 analysis. The latter approach is limited in its ability to quantify proteins of low abundance [6] due in part to the variability in spectral count response for each peptide making it necessary to observe many spectra for a given protein to assume a linear response between counts and abundance. Additionally, many experiments employ a dynamic exclusion of ions already selected for fragmentation, making accurate quantitation with this method challenging [21]. In phosphoproteomics, quantitation is performed on a single peptide for each phosphorylation site, making AUC quantitation generally preferable for these studies. However, AUC requires highly reproducible chromatography and high mass accuracy because it relies on accurate peak alignment and mass measurement for quantification.

Here we present a method to quantify the phosphoproteome of Chlamydomonas that uses a combination of efficient extraction, TiO2-based phosphopeptide enrichment, and LFQ to provide in-depth coverage of the phosphoproteome (Fig. 1). Using this method, analysis of replicate samples resulted in the quantification of 3595 phosphosites on 1775 phosphoproteins. Assessment of the reproducibility of this method shows that the technical replicates are highly similar, with a 21% median CV. These results are comparable to those of previous studies that used identical sample preparation and LC separation with a different make/model of mass spectrometer [19, 22]. While our quantitative breadth of coverage is extensive, qualitative studies have shown that the global phosphoproteome is still drastically larger than can be obtained in a shotgun LFQ approach. A previous study [23], which used two enrichment methods and additional fractionation to create a total of 60 samples subjected to LC-MS/MS, identified nearly 16,000 phosphosites on over 4500 phosphoproteins, showing that there is room for improvement in the depth of coverage obtained in these phosphoproteomic studies. Implementation of an orthogonal fractionation prior to analysis would help improve this depth of coverage, but at the cost of increased instrument time and variability from the added sample preparation. Although providing moderate depth of coverage, the method outlined here provides an accurate and high-throughput approach for analyzing algal phosphoproteomic samples.
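The reproducibility measure quoted above (median %CV across technical replicates) can be illustrated with a short calculation. The following is a minimal Python sketch, not the authors' R script: the site names and abundance values are invented for illustration, and the use of the sample standard deviation (n − 1) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, median, stdev

def percent_cv(values):
    """Percent coefficient of variation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical abundances for three technical replicates per phosphosite.
abundances = {
    "ProteinA_S15": [1000.0, 1200.0, 1100.0],
    "ProteinB_T42": [500.0, 450.0, 700.0],
    "ProteinC_Y7": [80.0, 82.0, 78.0],
}

# One %CV per phosphosite, then the median over all sites.
cv_by_site = {site: percent_cv(vals) for site, vals in abundances.items()}
median_cv = median(cv_by_site.values())
```

A median is used rather than a mean so that a few poorly reproduced sites do not dominate the summary.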

2 Materials

2.1 Cell Culture

1. Hutner’s Trace Elements stock [24]. This can be purchased as a stock solution or prepared in the lab (see Note 1).

2. TRIS–Acetate–Phosphate (TAP) media: 20 mM TRIS base, 17.5 mM acetic acid, 1.65 mM K2HPO4, 945 μM KH2PO4, 287 μM CaCl2, 405 μM MgSO4, 7.01 mM NH4Cl, and Hutner’s Trace Elements. Stock solutions can be made for easy preparation of TAP media (see Note 2).

3. TAP agar media plates, 1.5% agar: To TAP media (see Subheading 2.1, item 2), add Bacto Agar and autoclave. Cool media to 52 °C and pour about 10 mL per plate into 100 × 15 mm petri dishes in a biosafety cabinet. Let plates solidify overnight, seal each plate with Parafilm, and store at 4 °C.

4. Chlamydomonas reinhardtii, strain CC-2895 (6145c mt-).

5. 100 μE m⁻² s⁻¹ white light source.

6. Platform shaker.

7. Liquid nitrogen, 0.5 L.

Fig. 1 Phosphoproteomic workflow for Chlamydomonas reinhardtii cells. Briefly, Chlamydomonas cultures are harvested, resuspended in lysis buffer, and sonicated. The lysate is collected and soluble proteins are reduced, alkylated, and digested with trypsin. Phosphopeptides are enriched using a titanium dioxide (TiO2)-based enrichment before being subjected to LC-MS/MS analysis. For the data reported here, samples were pooled after resuspension and aliquoted into three technical replicates to remove any biological variation


2.2 Protein Extraction

1. Lysis buffer: 100 mM TRIS, pH 8.0, 1% sodium dodecyl sulfate (SDS), 1× cOmplete protease inhibitor cocktail (Roche, Risch-Rotkreuz, Switzerland), and 1× PhosSTOP phosphatase inhibitor cocktail (Roche). Stock solutions can be made for easy preparation of lysis buffer (see Note 3). When preparing the lysis buffer, stir slowly while dissolving the contents to minimize agitation and bubble formation from the SDS.

2. Covaris 2 mL milliTUBE tubes and 24 Place milliTUBE rack.

3. 100 mM ammonium acetate in methanol (MeOH).

4. 70% ethanol (EtOH).

5. 100 mM TRIS, pH 8.0. Using a 1 M TRIS stock (see Note 3) is recommended for ease of buffer preparation.

6. Resuspension buffer: 8 M urea, 100 mM TRIS, pH 8.0.

7. CB-X Protein Assay Kit (G-Biosciences, St. Louis, MO, USA) or equivalent protein quantification assay.

2.3 Reduction, Alkylation, and Digestion

1. Reduction buffer: 500 mM dithiothreitol (DTT) in 100 mM TRIS, pH 8.0.

2. Alkylation buffer: 500 mM iodoacetamide (IAM) in 100 mM TRIS, pH 8.0. Make fresh for each experiment and cover the tube with aluminum foil or keep the buffer in the dark to prevent degradation of the light-sensitive IAM solution.

3. Trypsin resuspension buffer: 50 mM acetic acid.

4. Promega (Madison, WI, USA) Trypsin Gold, Mass Spectrometry grade.

5. 20% trifluoroacetic acid (TFA).

2.4 Desalting

1. Waters (Milford, MA, USA) Sep-Pak C18 1 cc Vac Cartridge, 50 mg, 55–105 μm particle size.

2. 0.1% TFA (LC-MS grade).

3. 80% acetonitrile (ACN, LC-MS grade), 0.1% TFA (LC-MS grade).

4. Vacuum manifold with 24-port cover (Phenomenex, Torrance, CA, USA) or equivalent setup.

2.5 Phosphopeptide Enrichment

1. Wash Buffer: 80% ACN (LC-MS grade), 1% TFA (LC-MS grade).

2. Resuspension Buffer: 80% ACN (LC-MS grade), 1% TFA (LC-MS grade), 25 mg/mL phthalic acid. This can be made by adding phthalic acid to the Wash Buffer.

3. Elution Buffer: 20% ACN (LC-MS grade), 5% aqueous ammonia.

4. TiO2 phosphopeptide enrichment tips, 3 mg. Titansphere™ Phos-TiO Spin Columns (GL Sciences, Torrance, CA, USA) recommended.

5. Spin column centrifuge adaptors.

2.6 Sample Purification

1. 1% formic acid (FA, LC-MS grade), 2% ACN (LC-MS grade).

2. 0.1% FA (LC-MS grade).

3. 60% ACN (LC-MS grade), 0.1% FA (LC-MS grade).

4. Millipore (Burlington, MA, USA) C18 ZipTips.

2.7 LC-MS/MS

1. 5% ACN (LC-MS grade), 0.1% TFA (LC-MS grade).

2. LC-MS Total Recovery Vials.

3. Symmetry C18 trap column (100 Å, 5 μm, 180 μm × 20 mm; Waters).

4. HSS T3 C18 column (100 Å, 1.8 μm, 75 μm × 250 mm; Waters). Mobile Phase A: 0.1% FA. Add 1 mL of Optima LC-MS grade FA to 1 L of Optima LC-MS grade water.

5. Mobile Phase B: 0.1% FA in ACN (LC-MS grade).

6. NanoAcquity UPLC system (Waters).

7. Q Exactive HF-X Hybrid Quadrupole Orbitrap mass spectrometer (ThermoFisher, Waltham, MA, USA).

2.8 Data Analysis

1. Progenesis QI for Proteomics v2.0 (Nonlinear Dynamics, Durham, NC, USA).

2. Mascot Daemon v3.5.1 (Matrix Science, Boston, MA, USA).

3. R script for processing phosphoproteome data. The code used for processing these data is available on GitHub (https://github.com/hickslab/QuantifyR).

3 Methods

3.1 Culturing

1. Maintain the Chlamydomonas strain on TAP agar plates under continuous light, streaking a fresh plate from a single colony on a previous plate every 1–2 weeks.

2. Grow a 100 mL starter culture of Chlamydomonas using TAP media in a 250 mL flask. In a biosafety cabinet, select a single colony from a TAP agar plate and suspend it in the TAP media. Grow the culture for 4–5 days, shaking at 120 rpm under continuous light, until a growth density of OD750 0.4–0.5 is reached.

3. Prepare 6 × 350 mL liquid cultures of Chlamydomonas in TAP media. Transfer 3.5 mL of a starter culture to fresh TAP media. Use a 1 L flask for 350 mL of culture to provide sufficient room for consistent mixing. Shake at 120 rpm with 100 μmol m⁻² s⁻¹ white light at room temperature. Grow for 3–4 days until an OD750 of 0.4–0.5 is reached (see Note 4).

4. Centrifuge each culture for 5 min at 6000 × g at 4 °C in a 1 L centrifuge bottle to harvest the Chlamydomonas.

5. Decant the supernatant from each culture while not disturbing the cell pellet in the centrifuge bottle.

6. Resuspend the Chlamydomonas pellets in 10 mL of fresh TAP media and transfer each solution to a 15 mL conical centrifuge tube.

7. Centrifuge each culture for 2 min at 3200 × g at 4 °C.

8. Decant the supernatant from each culture while not disturbing the cell pellet in the conical centrifuge tube.

9. Place the conical centrifuge tubes containing cell pellets in liquid nitrogen until fully frozen. Store at −80 °C until performing plant-based protein extraction.

3.2 Protein Extraction

1. Resuspend cell pellets in 4 mL lysis buffer (see Note 5) and transfer to Covaris 2 mL tubes. Keep samples on ice during resuspension.

2. Sonicate samples in a 4 °C water bath for 3 min at 200 cycles/burst, 100 W power, and 13% duty cycle using an E220 focused ultrasonicator (Covaris, Woburn, MA, USA).

3. Transfer samples from Covaris tubes to 2 mL centrifuge tubes, keeping the samples on ice.

4. Centrifuge cell lysates at 16,000 × g for 10 min at 4 °C and collect the supernatant into a 50 mL conical tube.

5. Add 1 mL of fresh lysis buffer to the pelleted cell debris and vortex.

6. Centrifuge this sample again at 16,000 × g for 10 min at 4 °C. Collect the supernatant and combine with the first extraction in the 50 mL conical tube.

7. Precipitate proteins by adding 5 volumes (about 30 mL) of cold 100 mM ammonium acetate in MeOH. Incubate samples overnight at −80 °C.

8. Collect the protein pellet by centrifuging for 5 min at 2000 × g. Decant the supernatant without disturbing the pellet.

9. Perform two additional washes with 30 mL fresh 100 mM ammonium acetate in MeOH followed by a wash with 30 mL 70% EtOH. For each wash, resuspend the pellet by vortexing before centrifuging for 5 min at 2000 × g.

10. Allow protein pellets to dry for 5 min in a fume hood at room temperature.

11. Resolubilize the pellets in a minimal volume (1–2 mL) of resuspension buffer. Incubate for 1 h to ensure the protein is fully dissolved.

12. Use a 10 μL aliquot of each replicate to perform protein quantification using the CB-X Protein Assay. Complete the assay using the manufacturer’s protocol (see Note 6).

13. Normalize each replicate to 4 mg/mL and use a 0.5 mL aliquot (2 mg) of each sample to continue through the remaining steps in the protocol.
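The normalization in step 13 is a simple dilution calculation. As a minimal Python sketch (the function name and the example concentration and volume are hypothetical, chosen only for illustration), the volume of resuspension buffer to add can be computed from the measured protein concentration:

```python
def diluent_volume_ml(measured_mg_per_ml, sample_volume_ml, target_mg_per_ml=4.0):
    """Volume of buffer to add so the sample reaches the target concentration."""
    if measured_mg_per_ml < target_mg_per_ml:
        raise ValueError("sample is already below the target concentration")
    total_protein_mg = measured_mg_per_ml * sample_volume_ml
    final_volume_ml = total_protein_mg / target_mg_per_ml
    return final_volume_ml - sample_volume_ml

# Hypothetical example: 1.5 mL of extract measured at 6.0 mg/mL.
extra_buffer_ml = diluent_volume_ml(6.0, 1.5)  # buffer to add to reach 4 mg/mL
aliquot_protein_mg = 4.0 * 0.5                 # protein in a 0.5 mL aliquot
```

A sample that measures below 4 mg/mL cannot be normalized by dilution, which is why the sketch raises an error in that case rather than returning a negative volume.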

3.3 Reduction, Alkylation, and Digestion

1. Reduce samples using 10 mM DTT. Add 10 μL reduction buffer to each sample. Incubate for 30 min at room temperature while shaking (500–850 rpm).

2. Alkylate samples using 40 mM IAM (see Note 7). Add 40 μL of alkylation buffer to each sample. Incubate for 45 min in the dark at room temperature while shaking.

3. Following alkylation, dilute the samples fivefold using 100 mM TRIS, pH 8.0, so that the concentration of urea is <2 M, which is a requirement for effective tryptic protein digestion. For 0.5 mL samples, add 2 mL of 100 mM TRIS, pH 8.0.

4. Perform overnight digestion using mass spectrometry-grade trypsin (Trypsin Gold from Promega is recommended) at a protease-to-protein ratio of 1:50 at 25 °C. For 2 mg lysate, 40 μg trypsin is needed. Gently invert or shake the samples during digestion.

5. Following digestion, quench the reaction by adding 20% TFA to the samples until their pH is less than 3 when measured with a pH test strip. Usually 0.2–0.4% final volume TFA, or 5–10 μL for 2.5 mL samples, is sufficient.

6. Freeze samples at −80 °C following digestion until desalting using 50 mg SepPak (Waters) cartridges is performed.
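The reagent amounts in steps 1–4 follow from the stock concentrations listed in Subheading 2.3. The Python sketch below reproduces the arithmetic for a 0.5 mL sample. The helper function is hypothetical, and volumes are assumed to be additive. Once the added volumes are accounted for, the nominal 10 mM DTT and 40 mM IAM come out slightly lower, and the urea concentration falls below the 2 M limit.

```python
def final_conc(stock_conc, stock_volume, other_volume):
    """Concentration after mixing stock_volume of stock into other_volume."""
    return stock_conc * stock_volume / (stock_volume + other_volume)

sample_ul = 500.0  # 0.5 mL sample in 8 M urea resuspension buffer

dtt_mm = final_conc(500.0, 10.0, sample_ul)         # 10 uL of 500 mM DTT
iam_mm = final_conc(500.0, 40.0, sample_ul + 10.0)  # 40 uL of 500 mM IAM
# Urea after adding DTT, IAM, and 2 mL (2000 uL) of 100 mM TRIS.
urea_m = final_conc(8.0, sample_ul, 10.0 + 40.0 + 2000.0)
trypsin_ug = 2000.0 / 50.0                          # 1:50 ratio for 2 mg (2000 ug)
```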

3.4 Desalting

1. Thaw samples on ice and centrifuge them for 5 min at 10,000 × g to pellet any undigested protein. Separate the soluble peptide mixture from the pellet to avoid clogging the cartridges.

2. Set up one cartridge for each sample on a vacuum manifold using test tubes to collect the flow through from the cartridges.

3. Wet cartridges by adding 1 mL of 80% ACN, 0.1% TFA (see Note 8).

4. Equilibrate cartridges using 2 mL of 0.1% TFA.

5. Load peptide samples onto the cartridge and recover the flow through in a new test tube.

6. Reapply this flow through to the cartridge.

7. After the flow through has passed through, switch to a new test tube and pass 2 mL of 0.1% TFA through the cartridge to remove salts.

8. Elute desalted peptides into a new 2 mL tube by adding 1.5 mL of 80% ACN, 0.1% TFA to the cartridge. Once the eluent has flowed all the way through the cartridge, apply vacuum for about 5 s to collect the remaining solvent from the packed bed.

9. Following peptide elution, freeze the samples and vacuum centrifuge to dryness.

3.5 Phosphopeptide Enrichment

1. Each sample uses one TiO2 tip placed in a microcentrifuge tube using an adaptor. Preelute the tips using 100 μL of elution buffer (see Note 9).

2. Condition each tip with 100 μL of wash buffer twice, for a total of 200 μL, followed by three washes using 100 μL of resuspension buffer.

3. Resuspend the dried peptides in 150 μL of resuspension buffer. Centrifuge the samples at 10,000 × g for 5 min to prevent clogging and load onto the tips. Use a new centrifuge tube to recover the sample flow through.

4. Reapply the flow through five times.

5. Following binding, using a new centrifuge tube, wash the tips twice with 100 μL of resuspension buffer and then three times with 100 μL of wash buffer.

6. Using a new centrifuge tube to collect the buffer, elute the phosphopeptide-enriched samples using two aliquots of 100 μL of elution buffer, combining them for a total of 200 μL of eluate.

7. Flash-freeze the eluate with liquid nitrogen and vacuum centrifuge to dryness with the concentrator set to room temperature.

3.6 Sample Purification

1. Resuspend phosphopeptide-enriched samples in 15 μL 1% FA, 2% ACN.

2. Centrifuge the samples at 15,000 × g for 5 min and transfer to a new tube, taking care not to disturb the pellet if present, to remove any insoluble portion of the sample.

3. Aliquot 15 μL 60% ACN, 0.1% FA for each sample into its own tube to elute samples from the ZipTip.

4. Perform a C18 ZipTip purification on each sample, using a new tip each time (see Note 10).

5. Attach a ZipTip to a 10 μL pipette. With the pipette set to 10 μL, draw up LC-MS grade ACN to wet the tip. Discard the ACN while keeping the resin wet. Repeat twice for a total of three preelution steps.

6. Equilibrate the ZipTip by pipetting 0.1% FA three times, discarding the solvent each time while keeping the resin wet.

7. Pipet the sample 10 times to load the peptides onto the ZipTip.

8. Wash six times with 0.1% FA.

9. Elute the peptides by pipetting 10 times using the elution solvent aliquoted in step 3, expelling all of the solvent from the pipette tip.

10. Dry down all of the eluted peptide samples.

3.7 LC-MS/MS

1. Resuspend phosphopeptide samples in 20 μL and whole cell samples in 40 μL of 5% acetonitrile, 0.1% TFA and transfer to a Total Recovery Vial (Waters).

2. Inject 5 μL of each sample and perform LC-MS/MS analysis on each sample using a NanoAcquity UPLC system (Waters) coupled to a Q Exactive HF-X Hybrid Quadrupole Orbitrap mass spectrometer (ThermoFisher) via a Nanospray Flex Ion Source (ThermoFisher). Load the peptide mixture onto a Symmetry C18 trap column (100 Å, 5 μm, 180 μm × 20 mm; Waters) with a flow rate of 5 μL/min for 3 min using 99% A and 1% B, then separate on an HSS T3 C18 column (100 Å, 1.8 μm, 75 μm × 250 mm; Waters) using a gradient of increasing mobile phase B at a flow rate of 300 nL/min for 120 min total. Increase mobile phase B from 5% to 35% in 90 min, ramp to 85% in 5 min, hold for 5 min, return to 5% mobile phase B in 2 min, and reequilibrate for 13 min.

3. Use the following MS parameters: Use a tune file set with positive polarity, 2.2 kV spray voltage, 325 °C capillary temperature, and 40 S-lens RF level. In the instrument method, include lock masses (best) of 371.10124 and 445.12003 for background polysiloxane ions. Select the full MS/DD-MS2 scan type, and set the method duration to 120 min and the default charge state to 2. Perform the MS survey scan in profile mode across 350–1600 m/z at 120,000 resolution until the 50 ms maximum IT or the 3 × 10⁶ AGC target is reached. Select the top 20 features above 5000 counts, excluding ions with unassigned, +1, or >+8 charge states. Collect MS2 scans at 45,000 resolution with NCE at 32 until the 100 ms maximum IT or the 1 × 10⁵ AGC target is reached. Set the dynamic exclusion window for precursor m/z to 10 s and the isolation window to 0.7 m/z. Check the system’s performance every 8 h using an injection of BSA tryptic digest run with the same instrument method.
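The gradient in step 2 can be written down as a table of breakpoints, which makes it easy to check the composition at any time point. The Python sketch below is illustrative only: it assumes the analytical gradient starts at 5% B at time zero and that every ramp is linear, neither of which the text states explicitly. The segments as listed add up to 115 min; how the remainder of the 120 min method is spent is not itemized.

```python
# (time in min, %B) breakpoints taken from the gradient description;
# the starting point at t = 0 and linear ramps are assumptions.
GRADIENT = [(0, 5), (90, 35), (95, 85), (100, 85), (102, 5), (115, 5)]

def percent_b(t_min):
    """Linearly interpolate %B at time t_min within the listed segments."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the listed gradient segments")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

listed_minutes = GRADIENT[-1][0]  # total duration of the itemized segments
```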


3.8 Data Analysis

1. Upload acquired spectral files (*.raw) into Progenesis QI for Proteomics (Nonlinear Dynamics). Use the automatically assigned reference spectrum to align the total ion chromatograms to minimize run-to-run differences in retention time, and normalize peak abundances. Design the experiment so that replicates are grouped together as one subject. Export a combined peak list (*.mgf).

2. Upload the peak list and determine peptide sequence and protein inference using Mascot (Matrix Science). Use the following search parameters: Search against the database containing the proteome for the organism of interest, in this case the Phytozome Chlamydomonas proteome appended with the NCBI mitochondrial and chloroplast databases, along with the sequences for common laboratory contaminants (www.thegpm.org/cRAP; 116 entries). Use a target decoy MS/MS search with trypsin protease specificity with up to two missed cleavages, a peptide mass tolerance of 15 ppm, and a fragment mass tolerance of 0.1 Da. Set a fixed modification of carbamidomethylation at cysteine and include the following variable modifications: acetylation at the protein N-terminus, oxidation at methionine, and phosphorylation at serine, threonine, and tyrosine. After the search is complete, adjust the false discovery rate of the significant peptide identifications to be less than 1% using the embedded Percolator algorithm. Export matches (*.xml) and reupload data to Progenesis.

3. From Progenesis, export the “Peptide Measurements” from the “Review Proteins” tab (Table S1). These data can be used to determine the number of phosphosites and phosphoproteins identified in each replicate (Fig. 2a) and to assess the reproducibility (Fig. 3).

4. The proteomics data have been deposited to the ProteomeXchange Consortium (www.proteomexchange.org) via the PRIDE partner repository [25] with the dataset identifier PXD012261.

5. Parse data using the custom R script found on GitHub (https://github.com/hickslab/QuantifyR) or a similar parsing technique. This script groups together features matched with identical sequence, modifications, and score but differing protein accessions, representing them by the protein accession with the highest number of unique peptides and largest confidence score assigned by Progenesis. Features duplicated by multiple peptide identifications are reduced to a single peptide with the highest Mascot ion score. The results are then limited to only peptides with one or more phosphosites. Identifiers are made by joining the protein accession of each feature with the single-letter amino acid code of the modified residue and the location of the modification. The data are then reduced to unique identifiers by summing the abundance of all contributing features (charge states, missed cleavages, etc.). Each identifier group is represented in the final dataset by the peptide with the highest Mascot score (Table S2). Using these parsed results, the total number of phosphosites, phosphoproteins, and %CV can be calculated for the three replicates (Fig. 2a and b).
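The last part of the parsing logic in step 5 (filtering to phosphopeptides, building site identifiers, summing abundances, and keeping the top-scoring peptide) can be sketched as follows. This is an illustrative Python sketch, not the QuantifyR script itself: the record fields, accessions, peptide strings, and values are all hypothetical, and each feature is assumed to carry at most one phosphosite to keep the example short.

```python
from collections import defaultdict

# Hypothetical feature records; field names are illustrative and are not
# the column names of a Progenesis export.
features = [
    {"accession": "ProtA", "peptide": "AAS[ph]PEK", "site": ("S", 15),
     "score": 42.0, "abundance": [100.0, 110.0, 90.0]},
    {"accession": "ProtA", "peptide": "KAAS[ph]PEK", "site": ("S", 15),
     "score": 55.0, "abundance": [20.0, 25.0, 15.0]},
    {"accession": "ProtB", "peptide": "LLTEK", "site": None,
     "score": 60.0, "abundance": [300.0, 310.0, 305.0]},
]

def summarize_sites(features):
    """Sum abundances per phosphosite identifier; keep the top-scoring peptide."""
    groups = defaultdict(list)
    for f in features:
        if f["site"] is None:  # drop peptides without a phosphosite
            continue
        residue, position = f["site"]
        groups[f"{f['accession']}_{residue}{position}"].append(f)
    result = {}
    for identifier, members in groups.items():
        # Sum replicate-wise over all features sharing the identifier.
        summed = [sum(vals) for vals in zip(*(m["abundance"] for m in members))]
        best = max(members, key=lambda m: m["score"])
        result[identifier] = {"peptide": best["peptide"], "abundance": summed}
    return result

sites = summarize_sites(features)
```

Here the two features covering the same site (for example, a fully cleaved peptide and a missed-cleavage form) collapse into one identifier whose abundance is their replicate-wise sum.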

4 Notes

1. Hutner’s Trace Elements stock preparation was taken from the Chlamydomonas Resource Center (www.chlamycollection.org). Stock preparation is extensively described on the Chlamydomonas Resource Center website and by Hutner et al. [24].

Fig. 2 Summary of quantitation results between three replicate samples. A. Number of phosphopeptides, phosphoproteins, and statistics for each individual replicate and combined data with filtered and imputed data. B. Histogram of the % CV for quantitated phosphosites

Fig. 3 Plots comparing the log2 transformed abundances between replicate samples

2. TAP Salts Stock (40×): Add 15.00 g of NH4Cl, 4.00 g MgSO4·7H2O, and 2.00 g CaCl2·2H2O to 1 L water. Stir until dissolved and autoclave. TAP Phosphate Stock (1000×): Add 288.00 g K2HPO4 and 144.00 g KH2PO4 to 1 L of water. Stir until dissolved and autoclave. TAP Acetate Stock, pH 7.0 (50×): Add 121.00 g TRIS base and 50 mL of glacial acetic acid to 950 mL water. Stir to dissolve and filter sterilize. For 1 L of media, combine the following amounts of stock solutions and autoclave: 25 mL TAP Salts Stock, 1 mL TAP Phosphate Stock, 20 mL of TAP Acetate Stock, and 1 mL of Hutner’s Trace Elements.

3. 1 M TRIS Stock (10×), pH 8.0: Dissolve 121.10 g of TRIS base in 800 mL of water, adjust the pH to 8.0 by adding concentrated HCl, and add water to a final volume of 1 L. 20% SDS Stock: Add 20.00 g SDS to 80 mL water, mix slowly to dissolve (keeping the speed low to prevent frothing and heating if needed to no higher than 68 °C), and adjust to a final volume of 100 mL with water. For 10 mL of buffer, add 1 mL 1 M TRIS Stock solution, 1 protease inhibitor tablet, 1 phosphatase inhibitor tablet, and 0.5 mL 20% SDS stock solution to 8.5 mL of distilled water.

4. An OD750 of 0.4–0.5 was identified as mid-log phase growth for this strain of Chlamydomonas based on the known growth patterns [22]. Growth curves should be measured and used to identify the optical density where mid-log growth occurs when using this method to study other strains or algal species. This ensures that the cells are actively growing, there is no shortage of any nutrients, and enough material is harvested for each sample to perform phosphoproteomic analysis.

5. Three Chlamydomonas cultures were harvested, resuspended in lysis buffer, combined, and realiquoted into three technical replicates to assess the reproducibility of this method and normalize any biological variability in the samples. When using this method for differential studies, each culture should be a biological replicate, with no recombination step.

6. For the CB-X protein assay, take a 10 μL aliquot of the protein sample and perform the assay according to the manufacturer’s instructions. Briefly, add 1 mL of CB-X reagent and vortex. Centrifuge the sample at 15,000 × g for 5 min. Remove the supernatant without disturbing the pelleted protein. Add 50 μL of Solubilization Buffer 1 and 50 μL of Solubilization Buffer 2, and pipet to resuspend the pellet. Incubate for 1 min before adding 1 mL CB-X Assay Dye. Incubate for 5 min before measuring the absorbance of the sample at 595 nm.

7. IAM in solution is unstable and light sensitive. Keep IAM solution in the dark before and during alkylation to prevent degradation. Covering the tubes or mixer with aluminum foil works well for this.

8. When using C18 SepPak cartridges, a manifold can be used to apply vacuum to the samples to increase the flow rate through the cartridges. Vacuum can be used for all of the steps in the procedure except for the initial loading of the peptides onto the cartridge and the elution of the peptides. Flow rate should not exceed 1 mL/min when vacuum is used. The bed of the cartridges should stay wet throughout the procedure by keeping a small amount of solvent above the packed bed at all times.

9. For each step in the enrichment, centrifuge the tips at 1000 × g at room temperature to pass buffer through the column. For steps using 100 μL and 150 μL buffer, centrifuge the tips for 3 min and 5 min, respectively.

10. ZipTips work by drawing solvent through the resin using a micropipette to aspirate up and down. It is important that the resin remains wet throughout the purification by leaving a small amount of solvent visible above the resin bed at all times until the sample is ready for elution.
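The stock volumes given in Notes 2 and 3 follow from the usual dilution relation C1·V1 = C2·V2. The short Python sketch below (the helper name is hypothetical) reproduces the lysis buffer volumes from Note 3 and the per-liter volumes of the TAP stocks from their stated fold concentrations.

```python
def stock_volume_ml(stock_conc, final_conc, final_volume_ml):
    """C1*V1 = C2*V2 solved for V1 (stock and final conc in the same units)."""
    return final_conc * final_volume_ml / stock_conc

# Lysis buffer (Note 3): 10 mL at 100 mM TRIS and 1% SDS.
tris_ml = stock_volume_ml(1000.0, 100.0, 10.0)  # from 1 M (1000 mM) TRIS stock
sds_ml = stock_volume_ml(20.0, 1.0, 10.0)       # from 20% SDS stock
water_ml = 10.0 - tris_ml - sds_ml

# TAP media (Note 2): volume of each "N x" stock for 1 L (1000 mL) of 1x media.
tap_folds = {"salts": 40, "phosphate": 1000, "acetate": 50}
tap_stock_ml = {name: 1000.0 / fold for name, fold in tap_folds.items()}
```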

Acknowledgments

This research was supported by a National Science Foundation CAREER award (MCB-1552522) awarded to L.M.H. NSF MRI (CHE-1726291) supported the purchase of the Q-Exactive HF-X mass spectrometer, and we thank Dr. Brandie Ehrmann for training on the HF-X instrument.

References

1. Harris EH (2001) Chlamydomonas as a model organism. Annu Rev Plant Physiol Plant Mol Biol 52:363–406

2. Hu Q, Sommerfeld M, Jarvis E et al (2008) Microalgal triacylglycerols as feedstocks for biofuel production: perspectives and advances. Plant J 54(4):621–639

3. Merchant SS, Prochnik SE, Vallon O et al (2007) The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318:245–250

4. Zones JM, Blaby IK, Merchant SS et al (2015) High-resolution profiling of a synchronized diurnal transcriptome from Chlamydomonas reinhardtii reveals continuous cell and metabolic differentiation. Plant Cell 27:2743–2769

5. Miller R, Wu G, Deshpande RR et al (2010) Changes in transcript abundance in Chlamydomonas reinhardtii following nitrogen deprivation predict diversion of metabolism. Plant Physiol 154:1737–1752

6. Wang H, Alvarez S, Hicks LM (2012) Comprehensive comparison of iTRAQ and label-free LC-based quantitative proteomics approaches using two Chlamydomonas reinhardtii strains of interest for biofuels engineering. J Proteome Res 11:487–501

7. Roustan V, Bakhtiari S, Roustan P-J et al (2017) Quantitative in vivo phosphoproteomics reveals reversible signaling processes during nitrogen starvation and recovery in the biofuel model organism Chlamydomonas reinhardtii. Biotechnol Biofuels 10:280. https://doi.org/10.1186/s13068-017-0949-z

8. Krebs EG, Fischer EH (1955) Phosphorylase activity of skeletal muscle extracts. J Biol Chem 216:113–120

9. Fischer EH, Krebs EG (1955) Conversion of phosphorylase b to phosphorylase a in muscle extracts. J Biol Chem 216:121–132

10. Eriksson J, Fenyo D (2010) Modeling experimental design for proteomics. Methods Mol Biol 673:223–230

11. Blackburn K, Goshe MB (2009) Challenges and strategies for targeted phosphorylation site identification and quantification using mass spectrometry analysis. Brief Funct Genomic Proteomic 8:90–103

12. Dunn JD, Reid GE, Bruening ML (2010) Techniques for phosphopeptide enrichment prior to analysis by mass spectrometry. Mass Spectrom Rev 29:29–54

13. Kokubu M, Ishihama Y, Sato T et al (2005) Specificity of immobilized metal affinity-based IMAC/C18 tip enrichment of phosphopeptides for protein phosphorylation analysis. Anal Chem 77:5144–5154

14. Ruprecht B, Koch H, Medard G et al (2015) Comprehensive and reproducible phosphopeptide enrichment using iron immobilized metal ion affinity chromatography (Fe-IMAC) columns. Mol Cell Proteomics 14:205–215

15. Larsen MR, Thingholm TE, Jensen ON et al (2005) Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol Cell Proteomics 4:873–886

16. Tsai C-F, Wang Y-T, Chen Y-R et al (2008) Immobilized metal affinity chromatography revisited: pH/acid control toward high selectivity in phosphoproteomics. J Proteome Res 7:4058–4069

17. Ye J, Zhang X, Young C et al (2010) Optimized IMAC protocol for phosphopeptide recovery from complex biological samples. J Proteome Res 9:3561–3573

18. Aryal UK, Ross ARS (2010) Enrichment and analysis of phosphopeptides under different experimental conditions using titanium dioxide affinity chromatography and mass spectrometry. Rapid Commun Mass Spectrom 24:219–231

19. Werth EG, McConnell EW, Lianez IC et al (2019) Investigating the effect of target of rapamycin kinase inhibition on the Chlamydomonas reinhardtii phosphoproteome: from known homologs to new targets. New Phytol 221:247–260

20. Neilson KA, Ali NA (2011) Less label, more free: approaches in label-free quantitative mass spectrometry. Proteomics 11:535–553

21. Bantscheff M, Schirle M, Sweetman G et al (2007) Quantitative mass spectrometry in proteomics: a critical review. Anal Bioanal Chem 389:1017–1031

22. Werth EG, McConnell EW, Gilbert TSK et al (2017) Probing the global kinome and phosphoproteome in Chlamydomonas reinhardtii via sequential enrichment and quantitative proteomics. Plant J 89:416–426

23. Wang H, Gau B, Slade WO et al (2014) The global phosphoproteome of Chlamydomonas reinhardtii reveals complex organellar phosphorylation in the flagella and thylakoid membrane. Mol Cell Proteomics 13:2337–2353

24. Hutner SH, Provasoli L, Schatz A et al (1950) Some approaches to the study of the role of metals in the metabolism of microorganisms. Proc Am Philos Soc 94:152–170

25. Vizcaíno JA, Cote RG, Csordas A et al (2013) The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res 41:D1063–D1069


Chapter 16

Targeted Quantification of Phosphopeptides by Parallel Reaction Monitoring (PRM)

Sara Christina Stolze and Hirofumi Nakagami

Abstract

Parallel reaction monitoring (PRM) is a liquid chromatography–mass spectrometry (LC-MS)-based targeted peptide/protein quantification method that was initially implemented for Orbitrap mass spectrometers. Here, we describe detailed workflows that utilize the freely available MaxQuant and Skyline software packages to target peptides of interest, primarily focusing on phosphopeptides.

Key words Parallel reaction monitoring (PRM), Targeted quantification, Orbitrap mass spectrometer, Phosphopeptide, Phosphorylation, Posttranslational modification (PTM)

1 Introduction

Deducing the functions of gene products from genomic and transcriptomic information alone is difficult; thus, determination of protein abundance and posttranslational modification (PTM) status is crucial [1–3]. Recent developments in MS-based targeted proteomics methods, including parallel reaction monitoring (PRM), have paved the way for the sensitive detection and accurate quantification of peptides/proteins of interest in complex samples [4–6]. These MS-based methods are complementary to classical Western blotting but have not yet been widely utilized and/or accepted in plant research fields. However, MS-based techniques should in fact be the methods of choice for peptide/protein quantification due to their higher sensitivity and the limited availability of specific antibodies for Western blotting [7].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_16, © Springer Science+Business Media, LLC, part of Springer Nature 2020

PRM is an alternative to selected reaction monitoring (SRM) and was developed to take advantage of high-resolution and accurate mass analyzers incorporated in Orbitrap mass spectrometers [6, 8]. SRM is performed on triple quadrupole mass spectrometers and utilizes the third quadrupole to detect a single isolated fragment ion derived from a precursor ion. PRM, on the other hand, utilizes a high-resolution and accurate mass analyzer, Orbitrap or time-of-flight (TOF), to detect all fragment ions derived from a precursor ion in parallel. PRM has several advantages over SRM. First, the identity of quantified ions can easily be confirmed because full MS/MS spectra are available for database search. Second, the quality of quantification can be reliably assessed from relative abundance distributions for expected and measured fragment ions. Third, PRM has superior selectivity and sensitivity because of the use of a high-resolution and accurate mass analyzer. Fourth, and most importantly, PRM does not require the selection of fragment ions for the construction of a data acquisition method; hence, it is possible to target peptides of interest without acquiring MS/MS spectra for the peptides in advance. For successful PRM analysis, a list of target precursor ions is needed, which can be derived from a data-dependent acquisition (DDA) or alternatively be generated from an in silico digest of target proteins.
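To make the idea of an in silico digest concrete, the following Python sketch cleaves a protein sequence with trypsin-like rules and computes precursor m/z values for unmodified and phosphorylated forms. It is a simplified illustration, not the procedure implemented in Skyline or MaxQuant. The example sequence is invented, the "no cleavage before proline" rule and the absence of missed cleavages are assumptions, and cysteine alkylation is not included in the residue masses.

```python
import re

# Monoisotopic residue masses (Da); cysteine is listed without alkylation.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
           "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
           "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
           "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
           "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER, PROTON, PHOSPHO = 18.01056, 1.00728, 79.96633  # HPO3 adds 79.96633 Da

def tryptic_peptides(protein):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def precursor_mz(peptide, charge, n_phospho=0):
    """m/z of the [M + zH]z+ ion for a peptide with n_phospho phosphate groups."""
    mass = sum(RESIDUE[aa] for aa in peptide) + WATER + n_phospho * PHOSPHO
    return (mass + charge * PROTON) / charge

# Hypothetical target list: doubly charged, singly phosphorylated S/T/Y peptides.
targets = [(pep, round(precursor_mz(pep, 2, 1), 4))
           for pep in tryptic_peptides("MAGSKPTRLLSEK")
           if any(aa in "STY" for aa in pep)]
```

In practice a target list would also carry charge states, retention time windows, and the position of the modification, which this sketch leaves out.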

We here describe workflows for the setup of PRM methods with two distinct approaches. The described methods focus especially on constructing a target list for PRM data acquisition on Orbitrap mass spectrometers with the use of the freely available MaxQuant and Skyline software packages [9–11]. The first approach (Method A) utilizes MS data acquired in DDA mode to construct a target list. The second approach (Method B) explains how to construct a target list with protein sequence information alone and without MS data for the targeted peptides. The described protocols were developed for targeting phosphopeptides, but they can be easily adapted for nonmodified peptides or peptides with other modifications simply by adjusting the parameters of the protocol.

2 Materials

MaxQuant software (https://maxquant.org) [9, 10]. Version 1.5.7.4 was used for this protocol.

Skyline (64-bit) 4.2.019009 software (https://skyline.ms) [11] was used for this protocol.

3 Methods

Method A: Construction of a PRM-based targeted method usingdata-dependent acquisition (DDA) data.

3.1 Sample Preparation for DDA and PRM Measurements

Prepare samples containing proteins of interest in the phosphorylated form for DDA measurements in order to construct a PRM method for targeted quantification. To identify and then to target phosphopeptides/phosphosites from plant material, phosphopeptide enrichment protocols need to be optimized/established

214 Sara Christina Stolze and Hirofumi Nakagami

beforehand to ensure good reproducibility. Details of the titanium dioxide (TiO2)-based phosphopeptide enrichment protocol that we use have been described previously [12]. For analysis of phosphopeptides/phosphosites of in vitro–treated recombinant proteins, we recommend analyzing digested peptides without further phosphopeptide enrichment. Parallel analysis of negative controls in which the phosphorylated forms of the proteins of interest are low-abundant or absent can help to define good targets for PRM.

3.2 DDA Measurement Using an Orbitrap Mass Spectrometer

Measure phosphopeptide-enriched samples in DDA mode (see Notes 1 and 2).

3.3 DDA Data Processing with MaxQuant Software

After DDA measurement, analyze the derived RAW data using MaxQuant software.

1. Download, install, and open the MaxQuant software (https://maxquant.org).

2. Go to the “Raw files” tab, and click “Load” to import Thermo RAW files. If more than two files are going to be analyzed, specify the experimental design in the same tab.

3. Go to the “Group-specific parameters” tab, and choose the “Modifications” option. Select phosphorylation of serine, threonine, and tyrosine as variable modifications besides the default settings of alkylation (e.g., carbamidomethylation) of cysteine residues as fixed and oxidation of methionine residues and protein N-terminal acetylation as variable modifications.

4. Choose the “Digestion” option and select an enzyme. We usually keep the default setting for “Max. missed cleavages,” which is “2.”

5. (Optional for quantification). Choose the “Label-free quantification” option and select the “LFQ.” Set the “LFQ min. ratio count” parameter to “1,” and deselect the “Fast LFQ.” Quantification is optional and not needed for constructing a PRM method. We often select this option to evaluate the reproducibility of the replicates.

6. Go to the “Global parameters” tab, and choose the “Sequences” option. Click “Add file” to import a FASTA-formatted protein database that is suitable for the analyzed samples.

7. (Optional for quantification). Choose the “Adv. identification” option, and enable the “match between runs.”

8. (Optional for quantification). Choose the “Label-free quantification” option, and enable the “iBAQ.”

9. Click “Start” to run the analysis.
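Once the search has finished, the MaxQuant output “msms.txt” table lists the identified MS/MS spectra. As a rough illustration of how that table could be screened for phosphopeptide precursors before moving on to Skyline, the following Python sketch keeps the best-scoring identification per modified sequence and charge state. The column names (“Raw file”, “Modified sequence”, “Charge”, “Retention time”, “PEP”, “Score”), the “(ph)” tag for phosphorylation, the PEP cutoff of 0.01, and the four example rows are assumptions made for illustration and are not part of the protocol; check them against the output of your own MaxQuant version.

```python
import csv
import io

# Hypothetical excerpt of a tab-separated "msms.txt" table. Column names and
# the "(ph)" phosphorylation tag are assumed; verify them in your own output.
SAMPLE_MSMS = (
    "Raw file\tModified sequence\tCharge\tRetention time\tPEP\tScore\n"
    "rep1\t_DELS(ph)PEPTIDER_\t2\t35.2\t0.0001\t120.5\n"
    "rep2\t_DELS(ph)PEPTIDER_\t2\t35.4\t0.0002\t98.1\n"
    "rep1\t_AAAYGGKPLLR_\t2\t41.0\t0.0001\t110.0\n"
    "rep1\t_MAST(ph)KDELSPEPTIDER_\t3\t28.7\t0.2\t40.3\n"
)


def best_phospho_precursors(handle, max_pep=0.01):
    """Return the best-scoring row per (modified sequence, charge).

    Only rows whose modified sequence carries a "(ph)" tag and whose
    posterior error probability (PEP) is at most max_pep are considered.
    """
    best = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if "(ph)" not in row["Modified sequence"]:
            continue  # not a phosphopeptide
        if float(row["PEP"]) > max_pep:
            continue  # identification too uncertain for this sketch
        key = (row["Modified sequence"], int(row["Charge"]))
        if key not in best or float(row["Score"]) > float(best[key]["Score"]):
            best[key] = row
    return best


if __name__ == "__main__":
    hits = best_phospho_precursors(io.StringIO(SAMPLE_MSMS))
    for (sequence, charge), row in hits.items():
        print(sequence, charge, row["Raw file"], row["Retention time"])
```

For a real file, the same function could be called with an opened file handle instead of the in-memory example. The actual target selection is still done in Skyline as described in the next subheading.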

PRM of Phosphopeptides 215

3.4 Target List Construction with Skyline Software Using the MaxQuant Output File

1. Download, install, and open the Skyline software (https://skyline.ms).

2. Open a “Blank Document.”

3. If you have used Skyline before, go to the “Settings” menu, and click “Default” to restore the default settings. Skyline automatically restores the settings that were used on the previous occasion, and it is therefore advisable to clear them first.

4. Go to the “Settings” menu, and open the “Peptide Settings” form.

5. Go to the “Digestion” tab, and select an enzyme. For “Max missed cleavages,” set the value that was defined for the MaxQuant search.

6. Go to the “Filter” tab, and set the “Min length” option to “7” and the “Max length” option to “25,” which are the default settings for the MaxQuant search. If you have changed these parameters for the MaxQuant search, match the parameters accordingly. Set the “Exclude N-terminal AAs” option to “0.”

7. Go to the “Settings” menu, and open the “Transition Settings” form.

8. Go to the “Filter” tab, and set the “Precursor charges” option to “2, 3, 4”, the “Ion charges” option to “1,” and the “Ion types” option to “p” (precursor). Set the “To:” option (of the “Product ion selection”) to “last ion.”

9. Go to the “Instrument” tab, and set the “Min m/z” and “Max m/z” options to the values that were used for the DDA measurement.

10. Go to the “Full-Scan” tab, and set parameters for the “MS1 filtering.” Set the “Isotope peaks included” option to “Count” and the “Precursor mass analyzer” option to “Orbitrap.” Set the “Resolving power: At:” option to the values that were used for the DDA measurement. Then, close the “Transition Settings” form.

11. Save the document with the adjusted settings.

12. Import the MaxQuant output file. Go to the “File” menu, and click “Peptide Search” under the “Import” option to open the “Import Peptide Search” form.

13. Select “DDA with MS1 filtering” for the “Workflow” option. Click “Add Files” to select the MaxQuant output “msms.txt” file, then click “Next.” The “mqpar” file for the MaxQuant search is also needed for building a spectral library and has to be stored in the same location as the “msms.txt” file.

14. Skyline will build a spectral library from the “msms.txt” file and automatically search for the Thermo RAW files that were used for the MaxQuant search. If the Thermo RAW files are stored in a different location, manually select the files, then click “Next.”

15. Select the option of adding all modifications and click “Next” (see Note 3).

16. Click “Browse” to import the FASTA-formatted protein database that was used for the MaxQuant search. If you intend to make a target list for proteins of interest, import a database with the selected proteins only. Alternatively, the FASTA-formatted database can be copy-pasted into the box from a text document. Click “Finish” to start importing the data.

17. If you want to extract peptides that are unique to the proteins of interest, select the “Remove duplicate peptides” option and click “OK.” The uniqueness will only be assessed within the imported database, and, therefore, extra evaluation is needed to ensure uniqueness within the samples you plan to measure.

18. Arrange the Skyline window for selecting targets. We usually arrange the window with the following panels, “Peak Areas - Replicate Comparison”, “Retention Times - Replicate Comparison”, and “Retention Times - Scheduling”, as shown in Fig. 1. These panels can be opened from the “View” menu.

19. Inspect the data for precursors of interest. Good results can be achieved with sharp, defined peaks without coelution. An “idotp” value of >0.95 is recommended. High reproducibility is also a good indicator. In some cases, the software does not pick the correct peak for all replicates. If necessary, reintegrate peaks manually. You can add notes to a target upon right-clicking and choosing the “Edit note” option.

Fig. 1 Example of precursor selection from MS1 scanning in Skyline

20. Keep precursors that you would like to target by PRM, and delete all others from the “Targets.”

21. Export an isolation list. Go to the “File” menu, and click “Isolation List” under the “Export” option to open the “Export Isolation List” form.

22. Select “Thermo Q Exactive” for “Instrument type.” See Note 4 for additional recommendations. Click “OK” to create an isolation list and store it.

23. Save the document to be used as a template for the subsequent PRM data analysis. See Note 5 for an alternative option.
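Step 19 relies on the “idotp” value to judge how well the measured precursor isotope peaks agree with the expected isotope distribution. The Python sketch below illustrates the general idea with a plain cosine similarity between two intensity vectors. This is an assumption made for illustration only, and Skyline's own calculation may be normalized differently. The function name and all intensity values are hypothetical.

```python
import math


def cosine_similarity(expected, observed):
    """Cosine of the angle between two equally long intensity vectors."""
    if len(expected) != len(observed):
        raise ValueError("vectors must have the same length")
    dot = sum(e * o for e, o in zip(expected, observed))
    norm = math.sqrt(sum(e * e for e in expected)) * math.sqrt(
        sum(o * o for o in observed)
    )
    return dot / norm if norm else 0.0


# Hypothetical relative abundances of the M, M+1 and M+2 isotope peaks.
EXPECTED = [0.55, 0.30, 0.15]
# Hypothetical measured peak areas: one clean peak, one with an
# interfering signal on the M+2 isotope.
CLEAN = [1100.0, 620.0, 280.0]
INTERFERED = [1100.0, 620.0, 1500.0]

if __name__ == "__main__":
    for label, areas in (("clean", CLEAN), ("interfered", INTERFERED)):
        score = cosine_similarity(EXPECTED, areas)
        verdict = "keep" if score > 0.95 else "inspect or drop"
        print(f"{label}: {score:.3f} -> {verdict}")
```

In this toy example, the coeluting signal on one isotope peak pulls the score below the 0.95 threshold recommended in step 19, whereas the clean peak stays above it.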

3.5 PRM Data Acquisition

1. Create a PRM method using the Thermo Xcalibur software. In our case, with a Thermo Q-Exactive Plus, a method combining one full scan followed by n PRM (n = number of targeted precursors) events was used to acquire PRM data (see Note 6). Figure 2 shows the setup of the instrument method on a Q-Exactive Plus.

2. Measure samples with the PRM method.
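Because one full scan is followed by n PRM events, the cycle time grows with the number of targeted precursors, and Note 4 suggests about 10 scans across a chromatographic peak as an orientation mark. The following Python sketch turns that rule of thumb into a rough upper bound for n in an unscheduled method. The function name and the example peak width and scan durations are hypothetical; real scan durations depend on the resolution and injection time settings and should be taken from your own instrument.

```python
import math


def max_unscheduled_targets(peak_width_s, full_scan_s, prm_scan_s, points_per_peak=10):
    """Estimate how many precursors fit into one acquisition cycle.

    One cycle = one full scan + n PRM scans. The cycle must be short enough
    to sample a peak of width peak_width_s at least points_per_peak times.
    """
    allowed_cycle_s = peak_width_s / points_per_peak
    time_for_prm_s = allowed_cycle_s - full_scan_s
    if time_for_prm_s <= 0:
        return 0  # the full scan alone already exceeds the allowed cycle time
    return math.floor(time_for_prm_s / prm_scan_s)


if __name__ == "__main__":
    # Hypothetical values: 30 s wide peaks, 0.25 s full scan, 0.12 s per PRM scan.
    print(max_unscheduled_targets(30.0, 0.25, 0.12))
```

If the estimate is smaller than the number of precursors you wish to target, a scheduled method or narrower target list may be needed, as discussed in Note 4.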

3.6 PRM Data Analysis

1. Process acquired RAW data using the MaxQuant software, as stated above (A). Alternatively, import the Thermo RAW file into the Skyline software for analysis (B). See Note 7 for limitations of option B. Data inspection with the Skyline software is basically the same for both approaches (C).

A1. MaxQuant analysis: Analyze the PRM data using the MaxQuant software, applying the same settings as stated under Subheading 3.3.

A2. Skyline analysis: Prepare a document for data analysis. Open the document that was generated as stated in Subheading 3.4, step 23.

Fig. 2 Setup of a PRM method on a Q-Exactive Plus system (Thermo)


A3. Adjust the “Transition settings.” Go to the “Filter” tab, and set the “Ion charges” option to “1, 2” and the “Ion types” option to “p, b, y”. Set the “From:” option (of the “Product ion selection”) to “ion 1” and the “To:” option (of the “Product ion selection”) to “last ion.”

A4. Set parameters for the “MS/MS filtering” in the “Full-Scan” tab. Set the “Acquisition method” option to “Targeted” and the “Product mass analyzer” option to “Orbitrap.” Set the “Resolving power: At:” option to the values that were used for the PRM measurement.

A5. Save the document with the adjusted settings.

A6. Import the MaxQuant output file. Go to the “File” menu, and click “Peptide Search” under the “Import” option to open the “Import Peptide Search” form.

A7. Select the “Filter for document peptides” option, and select “PRM” for the “Workflow” option. Click “Add Files” to select the MaxQuant output “msms.txt” file. The “mqpar” file for the MaxQuant search is also needed for building a spectral library, and has to be stored in the same location as the “msms.txt” file. Click “Next” to proceed.

A8. Select the Thermo RAW files to be analyzed, and click “Next.”

A9. Select the option of adding all modifications, and click “Next.”

A10. Define the number of product ions to be quantified using the “Pick” option, if needed. We usually keep “5”, which is the default setting. Click “Next” to proceed (see Note 8).

A11. Click “Finish” to start importing the data.

B1. Skyline analysis: Prepare the document for data analysis. Follow the procedures A2–A5.

B2. Import the Thermo RAW files. Go to the “File” menu, and click “Results” under the “Import” option to open the “Import Results” form.

B3. Select “Collision Energy” from the “Optimizing” options.

B4. Click “OK” and choose the Thermo RAW files to be imported.

C1. Analyze the results in Skyline. Arrange the Skyline window for data analysis. To simultaneously display MS1 and MS2 data, go to the “View” menu, and select “Split Graph” under the “Transitions” option. Alternatively, you can right-click a panel, and select the “Split Graph” under the “Transitions” option as shown in Fig. 3.

C2. The results of the Skyline analysis can be exported using the “Report” option under the “Export” option in the “File” menu. The different report options can be further customized to user requirements.

Method B: Construction of the PRM-based targeted method based on in silico digest.

For possibilities and limitations of this approach refer to Note 9.

3.7 Target List Construction with the Skyline Software by In Silico Digest

1. Open a “Blank Document,” and save the document.

2. Adjust the “Peptide settings.” Go to the “Digestion” tab, and set appropriate “Enzyme” and “Max missed cleavages.”

3. Go to the “Filter” tab, and set the “Min length” option to “7” and the “Max length” option to “25,” which are the default settings for the MaxQuant search. You can also attempt to target shorter or longer peptides, and later adjust the parameters for the MaxQuant search. If you want to target the site in the N-terminal region, set the “Exclude N-terminal AAs” option to “0.”

4. Go to the “Modifications” tab, and click “Edit list” for the “Structural modifications.” Add “Phospho (ST)” and “Phospho (Y)” as variable modifications. Other possible modifications, namely alkylation (e.g., carbamidomethylation) of cysteine residues as fixed, oxidation of methionine residues and protein N-terminal acetylation as variable modifications, also need to be added.

Fig. 3 Example of a split-graph analysis of PRM data in Skyline


5. Adjust the “Transition settings.” Go to the “Filter” tab, and set the “Precursor charges” option to “2, 3, 4”, the “Ion charges” option to “1,” and the “Ion types” option to “p.” Set the “From:” option (of the “Product ion selection”) to “m/z > precursor” and the “To:” option (of the “Product ion selection”) to “last ion.”

6. Import a FASTA file containing the proteins of interest. Go to the “File” menu, and click “FASTA” under the “Import” option to select the FASTA file. This will create a list of all (theoretically) possible precursors with the given parameters (Fig. 4).

7. Choose precursors of interest and delete all other transitions from the “Targets” list (see Note 10).

8. Export an isolation list. Go to the “File” menu, and click the “Isolation List” under the “Export” option to open the “Export Isolation List” form.

Fig. 4 Transition list created by in silico digest from a FASTA file containing a protein of interest


9. Select the “Thermo Q Exactive” for the “Instrument type.” See Note 4 for additional recommendations. Then click “OK.”

10. Save the document to be used as a template for the PRM data analysis.
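Skyline performs the in silico digest itself, so no scripting is needed for Method B. Purely to illustrate what the digest with “Max missed cleavages” and the “Min length”/“Max length” filter produces, a minimal Python sketch is given here. The cleavage rule (after K or R, but not before P) is an assumed simplification of trypsin specificity and may differ from the enzyme definition selected in Skyline; the function names and the protein sequence are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a protein after every K or R that is not followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(protein, max_missed=2, min_len=7, max_len=25):
    """List peptides with up to max_missed missed cleavages within the length limits."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for first in range(len(fragments)):
        last_allowed = min(first + max_missed, len(fragments) - 1)
        for last in range(first, last_allowed + 1):
            peptide = "".join(fragments[first:last + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Hypothetical protein sequence, used only to show the behavior.
    for peptide in tryptic_peptides("MASTKDELSPEPTIDERAAAYGGKPLLR"):
        print(peptide)
```

The number of candidate precursors grows quickly with the number of allowed missed cleavages and variable modifications, which is one reason why the target list usually needs to be narrowed down as described in step 7.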

4 Notes

1. We will refer to phosphopeptide-enriched samples throughout this method, but the method is neither limited to analysis of phosphopeptides nor enriched samples. With appropriate adaptations, the protocol may also be used to target peptides with other modifications or without any modifications in different types of samples.

2. In addition to phosphopeptide enrichment protocol optimization, we also recommend optimizing the data acquisition method on the mass spectrometer for phosphopeptide identification. An increased MS/MS injection time often provides better sensitivity and identification numbers. We recommend measurement of at least three replicates for each condition to account for the technical variation that can be introduced during sample preparation and measurement.

3. The modifications box will contain all modifications that were detected in the MaxQuant search. Alternatively, desired modifications can be selected during the adjustment of the document before import under the “Peptide Settings” form in the “Modifications” tab. If this option was used, but all modifications found in the MaxQuant search were not added at that point, the modifications box during the import will notify the user about additional modifications that can be added by selection.

4. The number of precursors to be targeted in a PRM experiment depends on the mass spectrometer’s specifications as well as on the chromatographic width of the peak. The larger the number of precursors that are targeted, the longer the cycle time that is needed for one full scan + MS/MS. Prolonged cycle times result in inferior coverage of chromatographic peaks; 10 scans across a peak should serve as an orientation mark. The number of precursors for targeting can be increased by using a scheduled method; this, however, requires very stable and reproducible chromatography. A scheduled PRM isolation list can only be created from a document containing spectral information and not from a document created by importing a FASTA file as stated below in Note 5.

5. Alternatively, a template for analysis can be created by importing a FASTA file containing the proteins of interest into a blank Skyline document applying the settings specified in Subheading 3.4, steps 4–10. From this list, only the precursors that are to be targeted are retained and all others are deleted. Save the document to be used as an analysis template and/or to create an isolation list.

6. Using the combination of one full scan and PRM will enable quantification on both MS1 and MS2 levels. For a Q-Exactive system, make sure to adjust the “loop count” to the number of targeted precursors when setting up the method.

7. (a) This option, which uses the analysis template created by uploading a FASTA file as described in Note 5, will not generate a spectral library. Therefore, the MS2 peak area will not display a dot product (dotp) score. This option will also not provide information about the numbers of scans obtained per peak. (b) This option will generate a list of all detected MS2 ions that were derived from a precursor. It is, however, recommended to keep only ions that were robustly detected (indicated by a green traffic light symbol). Undesired transitions can be removed by opening the transition list of a precursor and unchecking the respective ions. The transition list can be opened by hovering over the m/z value for the precursor in the target list and clicking the downward arrow that appears.

8. You can subsequently change the setting under the “Transition Settings” form in the “Library” tab. However, you can only reduce the number for filtering. It is not possible to increase the number to have data for additional product ions because the data for the additional ions will not be imported for analysis.

9. If difficulties are encountered in obtaining spectra for phosphopeptides of interest, a target list can also be generated without the DDA data. In contrast to the abovementioned method that can be regarded as an ab initio approach, for optimal results, the alternative method should be based on a priori data: an example of a priori information would be a phosphorylation substrate in which one or multiple sites of phosphorylation are known but for which no MS data are available. In this case, an in silico digest of the target protein can be used to generate a list of theoretical precursors that can be targeted in a PRM method without the need for prior acquisition of DDA spectra. For best results, in the first instance, all theoretical precursors generated in the in silico digest should be targeted (see Note 10). If several rounds of analysis are planned, the target list can be adjusted to the precursors with good identifications in the previous runs.


10. Miscleavages should also be considered when choosing a precursor. Any information from prior MS experiments with the proteins of interest may facilitate identification of good targets. If a precursor contains multiple phosphorylatable sites, for example, a serine and a threonine, it is sufficient to target only one precursor to cover all (putative) phosphorylation site isomers because they all have the same m/z value. Analysis of MS/MS fragmentation data created in the PRM measurement can enable a precise localization of the phospho group if fragments containing the phosphorylation have been observed. However, to enable this distinction between possible isomers, it is necessary to include all of them in the template for the PRM analysis.
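Note 10 states that positional phosphorylation isomers of one peptide share the same m/z value. A minimal Python sketch of the underlying arithmetic is given here, using standard monoisotopic masses and treating carbamidomethylation of cysteine as a fixed modification, as in the search settings of this protocol. The function name and the example peptide are hypothetical, and the sketch ignores other variable modifications.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565            # H2O added for the peptide termini
PROTON = 1.007276            # mass of one charge-carrying proton
PHOSPHO = 79.966331          # HPO3 added per phosphorylated residue
CARBAMIDOMETHYL = 57.021464  # fixed modification on cysteine


def precursor_mz(sequence, phospho_sites, charge):
    """Monoisotopic m/z of a peptide phosphorylated at 1-based positions."""
    for position in phospho_sites:
        if sequence[position - 1] not in "STY":
            raise ValueError(f"position {position} is not S, T or Y")
    mass = sum(RESIDUE_MASS[residue] for residue in sequence) + WATER
    mass += PHOSPHO * len(phospho_sites)
    mass += CARBAMIDOMETHYL * sequence.count("C")
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "DELSPEPTIDER"  # hypothetical peptide with S at 4 and T at 8
    for charge in (2, 3, 4):
        on_serine = precursor_mz(peptide, [4], charge)
        on_threonine = precursor_mz(peptide, [8], charge)
        print(charge, round(on_serine, 4), on_serine == on_threonine)
```

Because the position of the phospho group does not enter the mass calculation, one isolation window covers all singly phosphorylated isomers; only the fragment ions can distinguish them.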

Acknowledgments

This work was supported by the Max-Planck-Gesellschaft. We thank Neysan Donnelly for editing the manuscript.

References

1. Walley JW, Sartor RC, Shen Z et al (2016) Integration of omic networks in a developmental atlas of maize. Science 353:814–818

2. Marx H, Minogue CE, Jayaraman D et al (2016) A proteomic atlas of the legume Medicago truncatula and its nitrogen-fixing endosymbiont Sinorhizobium meliloti. Nat Biotechnol 34:1198–1205

3. Seaton DD, Graf A, Baerenfaller K et al (2018) Photoperiodic control of the Arabidopsis proteome reveals a translational coincidence mechanism. Mol Syst Biol 14:e7962

4. Vidova V, Spacil Z (2017) A review on mass spectrometry-based quantitative proteomics: targeted and data independent acquisition. Anal Chim Acta 964:7–23

5. Arsova B, Watt M, Usadel B (2018) Monitoring of plant protein post-translational modifications using targeted proteomics. Front Plant Sci 9:1168

6. Bourmaud A, Gallien S, Domon B (2016) Parallel reaction monitoring using quadrupole-Orbitrap mass spectrometer: principle and applications. Proteomics 16:2146–2159

7. Lehmann U, Wienkoop S, Tschoep H et al (2008) If the antibody fails--a mass western approach. Plant J 55:1039–1046

8. Peterson AC, Russell JD, Bailey DJ et al (2012) Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Mol Cell Proteomics 11:1475–1488

9. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26:1367–1372

10. Tyanova S, Temu T, Cox J (2016) The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc 11:2301–2319

11. MacLean B, Tomazela DM, Shulman N et al (2010) Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26:966–968

12. Nakagami H (2014) StageTip-based HAMMOC, an efficient and inexpensive phosphopeptide enrichment method for plant shotgun phosphoproteomics. Methods Mol Biol 1072:595–607


Chapter 17

Enrichment of N-Linked Glycopeptides and Their Identification by Complementary Fragmentation Techniques

Eduardo Antonio Ramirez-Rodriguez and Joshua L. Heazlewood

Abstract

N-linked glycans are a ubiquitous posttranslational modification and are essential for correct protein folding in the endoplasmic reticulum of plants. However, this likely represents a narrow functional role for the diverse array of glycan structures currently associated with N-glycoproteins in plants. The identification of N-linked glycosylation sites and their structural characterization by mass spectrometry remains challenging due to their size, relative abundance, structural heterogeneity, and polarity. Current proteomic workflows are not optimized for the enrichment, identification and characterization of N-glycopeptides. Here we describe a detailed analytical procedure employing hydrophilic interaction chromatography enrichment, high-resolution tandem mass spectrometry employing complementary fragmentation techniques (higher-energy collisional dissociation and electron-transfer dissociation) and a data analytics workflow to produce an unbiased high confidence N-glycopeptide profile from plant samples.

Key words N-linked glycans, Glycoproteomics, HILIC, Higher-energy collisional dissociation, Electron-transfer dissociation

1 Introduction

Asparagine (N)-linked glycosylation is a covalent posttranslational modification that is found across all eukaryotes. The modification has been linked to numerous important functions such as enzyme activity, protein–protein interactions, protein folding, and sorting [1]. In mammals, N-linked glycans have been connected with a variety of cellular functions and are implicated in various diseases [2]. In plants, N-linked glycans are also essential for protein folding as part of the endoplasmic reticulum (ER) quality control (ERQC) system [3]. However, roles for N-glycans in processes such as influencing enzyme activity in plants are less clear [4]. Until recently it was unclear whether N-linked glycans formed in the Golgi apparatus of plants were associated with any obvious function [5]. However, most studies were performed in Arabidopsis thaliana, while recent efforts in Oryza sativa (rice) have identified that

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_17, © Springer Science+Business Media, LLC, part of Springer Nature 2020


N-linked glycan biosynthetic mutants are severely affected in growth, are sterile, and are affected in their adaptation to low-temperature environments [6].

N-linked glycans are comprised of a highly conserved core of N-acetylglucosamine (GlcNAc) and mannose (Man) that is shared between eukaryotes. Differences in glycan extensions, decorations and processing of this core define the N-glycan diversities observed across Eukaryota. The structure of plant N-linked glycans ranges from high-mannose structures (HexNAc2Hex9) to complex biantennary oligosaccharides comprising a variety of glycans (HexNAc4Hex5Fuc3Pent1). The sequential N-glycan biosynthetic process results in a diversity of structures at a given N-glycan site. The resultant microheterogeneity complicates data analysis and protein quantitation, and makes it challenging to identify and directly attribute function due to the macroheterogeneity at the protein level [7]. As a result, few studies have examined the function of specific N-glycoproteoforms. Thus, the characterization of N-linked glycoproteins in plants has generally occurred through the application of proteomic methods, which have been challenging due to the size and physicochemical properties of N-glycopeptides, poor fragmentation during collision-induced dissociation and the variations due to microheterogeneity of structures found at a given site. N-glycan profiles from plant proteins were initially obtained through the enzymatic removal of structures and profiling by mass spectrometry. Initial efforts to identify N-glycoproteins employed affinity methods and profiling by mass spectrometry with N-glycan sites inferred by informatic techniques [8–11], but findings were not associated with a polypeptide sequence. To reduce identification issues caused by the presence of the glycan, studies employed endoglycosidases (PNGase A/F) to remove N-glycans from enriched preparations prior to identification by tandem mass spectrometry. Such approaches have identified nearly 3000 sites from over 1600 proteins from the reference plant Arabidopsis [12, 13]. However, the use of N-glycosidases to improve identification by mass spectrometry results in data that lack any N-glycan structural information. A handful of studies in plants have now applied high-resolution tandem mass spectrometry on enriched N-glycopeptide fractions revealing the extent of microheterogeneity in plant N-glycoproteins [7, 14, 15]. These studies have provided high-resolution information on the variation of N-glycan structures at a given site from around 500 N-glycoproteins from Arabidopsis.
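The compositions quoted above, such as HexNAc2Hex9, translate directly into the mass that a glycan adds to its peptide. As a small illustration, the Python sketch below sums standard monoisotopic residue masses for a composition string. The function name and the simple string format are assumptions for this example; dedicated search software handles glycan masses internally.

```python
import re

# Monoisotopic masses (Da) of glycosidically linked monosaccharide residues.
RESIDUE_MASS = {
    "HexNAc": 203.079373,  # N-acetylhexosamine, e.g., GlcNAc
    "Hex": 162.052824,     # hexose, e.g., mannose
    "Fuc": 146.057909,     # fucose (deoxyhexose)
    "Pent": 132.042259,    # pentose, e.g., xylose
}


def glycan_mass(composition):
    """Mass added to a peptide by a glycan written like 'HexNAc2Hex9'."""
    # "HexNAc" is listed before "Hex" so that the longer name is matched first.
    tokens = re.findall(r"(HexNAc|Hex|Fuc|Pent)(\d+)", composition)
    if "".join(name + count for name, count in tokens) != composition:
        raise ValueError(f"unrecognized composition: {composition!r}")
    return sum(RESIDUE_MASS[name] * int(count) for name, count in tokens)


if __name__ == "__main__":
    for composition in ("HexNAc2Hex9", "HexNAc4Hex5Fuc3Pent1"):
        print(composition, round(glycan_mass(composition), 4))
```

Mass differences of this size, and their variation between glycoforms at one site, are part of why intact N-glycopeptides are harder to identify than their deglycosylated counterparts.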

As with most posttranslational modifications, the identification of N-linked glycopeptides from a complex protein lysate by tandem mass spectrometry is infrequent. Consequently, some enrichment strategy is necessary prior to mass spectrometry. Initial enrichment methods in plants employed complex enrichment and fractionation techniques utilizing cation exchange chromatography and gel filtration [16]. However, these methods have generally been applied for profiling N-glycan structures by mass spectrometry [17]. The use of lectin affinity for the enrichment of N-glycopeptides has been commonly applied in plant samples [9]. Indeed, the first plant glycoproteomic study employed a lectin mixture comprising concanavalin A (ConA), wheat germ agglutinin (WGA), and Ricinus communis (castor bean) agglutinin (RCA120) to enrich N-glycopeptides [13]. A solid phase enrichment approach using hydrazide beads has also been employed to capture N-glycopeptides from plant samples [12]. Recently the WGA enrichment method was applied to Arabidopsis samples and intact N-glycopeptides analyzed by high resolution tandem mass spectrometry [15]. However, the authors reported an abundance of N-glycopeptides harboring single N-linked GlcNAc residues, an N-glycan not commonly reported in Arabidopsis samples [10, 17]. These findings indicate a potential structural bias when enriching with WGA [7].

An alternative enrichment strategy employs hydrophilic interaction liquid chromatography (HILIC) for the enrichment of N-glycopeptides and was first applied to human plasma [18]. Although N-glycans were also removed with PNGase A prior to mass spectrometry, the enrichment method appeared unbiased, as it was not selecting a glycan type. Recently, the enrichment of N-glycopeptides from plant samples using HILIC has been conducted and intact N-glycopeptides characterized by tandem mass spectrometry [7, 14]. The proportions of N-glycans identified in these studies reflected those previously found in a variety of plant species, including tobacco, Arabidopsis, and Lotus japonicus [7], reflecting the unbiased nature of the HILIC enrichment method. Here, we outline a strategy to analyze enriched N-glycopeptides from plant microsomal preparations using HILIC enrichment and high-resolution tandem mass spectrometry employing complementary fragmentation techniques (HCD and ETD) to produce an unbiased N-glycopeptide profile. A recent study utilizing this approach in Arabidopsis identified over 1000 distinct N-glycopeptides from over 300 glycoproteins with an FDR <1% [7].

2 Materials

Prepare all solutions using ultrapure water (18 MΩ-cm at 25 °C) and analytical grade reagents. Prepare and store all reagents at room temperature (unless indicated otherwise). Follow institutional regulations when disposing of waste materials.

2.1 Microsomal Preparation

1. Approximately 1 g plant tissue (fresh weight) (see Note 1).

2. Ceramic mortar and pestle (medium size).

N-linked Glycoproteomics in Plants 227

3. Microsome Extraction Buffer: 50 mM HEPES-KOH (pH 6.8), 0.4 M sucrose, 1 mM dithiothreitol (DTT), 5 mM MnCl2 and 5 mM MgCl2.

4. Proteinase inhibitor cocktail, such as cOmplete EDTA-free proteinase inhibitor cocktail tablet (Roche).

5. Miracloth (EMD Millipore).

6. Funnel (glass), 80 mm.

7. Preparative centrifuge tubes, 10 mL.

8. Preparative centrifuge, with fixed angle rotor for 10 mL tubes and capable of 10,000 × g.

9. Ultracentrifuge tubes, 12 mL.

10. Ultracentrifuge with fixed angle rotor for 12 mL tubes capable of 100,000 × g for pelleting microsomes.

11. Protein Quantification Assay, such as Pierce™ BCA Protein Assay Kit.

2.2 Digestion and Hydrophilic Interaction Liquid Chromatography (HILIC) Enrichment of N-Glycopeptides

1. Denaturing Buffer: 7 M urea in 100 mM ammonium bicarbonate (see Note 2).

2. 100 mM ammonium bicarbonate.

3. 1 M dithiothreitol (DTT) (see Note 3).

4. 37 °C incubator.

5. 1 M iodoacetamide (IAA) (see Note 4).

6. Trypsin, sequencing grade (see Note 5).

7. Acetic acid.

8. C18 Solid phase extraction (C18 SPE), such as Sep-Pak plus C18 cartridges (Waters Corporation).

9. 1 mL syringe.

10. 10 mL syringe.

11. Centrifuge with rotor to handle 2 mL microfuge tubes capable of 13,000 × g.

12. C18 SPE Buffer 1: 0.1% formic acid.

13. C18 SPE Buffer 2: 80% acetonitrile and 0.1% formic acid.

14. SpeedVac concentrator.

15. Hydrophilic interaction liquid chromatography SPE (HILIC SPE) spin columns, such as MacroSpin Columns HILIC (#SMM HIL, The Nest Group) (see Note 6).

16. HILIC Loading Buffer: 80% acetonitrile, 1% trifluoroacetic acid.

17. HILIC Elution Buffer 1: 70% acetonitrile, 1% trifluoroacetic acid.

18. HILIC Elution Buffer 2: 60% acetonitrile, 1% trifluoroacetic acid.

19. HILIC Elution Buffer 3: 50% acetonitrile, 1% trifluoroacetic acid.

2.3 Identification of N-Glycopeptides by Tandem Mass Spectrometry

1. ZipTip C18 Pipette Tips (Millipore).

2. ZipTip Buffer 1: 80% acetonitrile, 0.1% formic acid.

3. ZipTip Buffer 2: 0.1% formic acid.

4. SpeedVac concentrator.

5. Nano-flow liquid chromatograph with tandem mass spectrometer capable of triggered electron-transfer dissociation (ETD), such as an Orbitrap Fusion™ Lumos™ Tribrid™ Mass Spectrometer (Thermo Fisher Scientific) with an Ultimate 3000 RSLC nano-flow HPLC (Thermo Fisher Scientific) (see Note 7).

6. C18 nano-trap column (100 Å, 75 μm × 2 cm).

7. C18 analytical column (100 Å, 75 μm × 50 cm).

8. MS Loading Buffer: 3% (v/v) acetonitrile and 0.1% (v/v) formic acid.

9. MS Buffer B: 100% acetonitrile and 1% formic acid (v/v).

2.4 Spectral Data

Interrogation

and Matching

1. Byonic™ (Protein Metrics) (see Note 8).

2. Plant species specific database in FASTA format.

3. Microsoft Excel.

3 Methods

Carry out all procedures at room temperature, unless otherwiseindicated. A workflow of the methods is outlined in Fig. 1.

3.1 Preparation of Microsomal Fraction and Peptide Digestion

1. Harvest 1 g fresh weight of plant material (see Note 1).

2. Place the material in 8 mL of Microsomal Extraction Buffer in a prechilled mortar and grind with a pestle on ice until the tissue is homogenized.

3. Place two layers of Miracloth into a funnel and filter the homogenate into a 10 mL preparative centrifuge tube on ice. Gently squeeze the Miracloth to extract as much homogenate as possible (see Note 9).

4. Centrifuge the homogenate at 3000 × g for 10 min at 4 °C.

5. Carefully transfer the supernatant into prechilled 12 mL ultracentrifuge tubes.

6. Centrifuge the supernatant at 100,000 × g for 30 min.

7. Discard the supernatant, being careful not to disturb the pellet.

8. Resuspend the pellet (microsomal fraction) in residual Extraction Buffer.

9. Quantify the amount of protein using a protein quantification assay (see Note 10).

3.2 Digestion and N-Glycopeptide Enrichment

1. Take around 500 μg of microsomal protein and make up to 100 μL in Denaturing Buffer.

2. Add DTT to a final concentration of 10 mM and incubate the samples at 60 °C for 60 min (see Note 3).

3. Allow the sample to cool to room temperature, add IAA to a final concentration of 100 mM, and incubate at room temperature for 45 min (see Note 4).

4. Dilute the sample to 1 M urea with 100 mM ammonium bicarbonate (see Note 11).

5. Add trypsin at a 1:25 trypsin–protein ratio (20 μg trypsin) and incubate overnight at 37 °C (see Note 12).

6. Add acetic acid to a concentration of 1% (v/v).

7. Desalt and purify tryptic peptides with a C18 SPE cartridge.

8. Wash the C18 SPE cartridge with 10 mL of C18 SPE Buffer 2 using a 10 mL syringe.

9. Precondition the C18 SPE cartridge with 10 mL of C18 SPE Buffer 1 using a 10 mL syringe; repeat this step.

10. Load the peptide sample (approximately 1 mL) onto the preconditioned C18 SPE cartridge using a 1 mL syringe.

Fig. 1 Graphical summary of the HILIC-based N-glycan peptide analysis workflow from plant material

11. Wash the peptides with 10 mL of C18 SPE Buffer 1 using a 10 mL syringe; repeat this wash step.

12. Elute the peptides with 2 mL of C18 SPE Buffer 2.

13. Concentrate the peptides using a SpeedVac concentrator (see Note 13).

14. Resuspend the peptides in 500 μL of HILIC Loading Buffer.

15. For HILIC enrichment of N-glycopeptides, first condition the HILIC SPE column (placed in a 2 mL microfuge tube) with 500 μL of ultrapure water by centrifugation at 200 × g for 3 min.

16. Precondition the HILIC SPE column with 500 μL of HILIC Loading Buffer by centrifugation at 200 × g for 3 min; repeat this step.

17. Load the peptides (500 μL) onto the HILIC SPE column and centrifuge at 200 × g for 3 min.

18. Wash the sample by adding 500 μL of HILIC Loading Buffer and centrifuging at 200 × g for 3 min; repeat this step.

19. Elute N-glycopeptides with 200 μL of HILIC Elution Buffer 1 (centrifuge at 200 × g for 3 min) and save the eluate.

20. Elute N-glycopeptides with 200 μL of HILIC Elution Buffer 2 (centrifuge at 200 × g for 3 min) and save the eluate.

21. Elute N-glycopeptides with 200 μL of HILIC Elution Buffer 3 (centrifuge at 200 × g for 3 min), save the eluate, and combine the three eluates (approximately 600 μL) (see Note 14).

22. Remove acetonitrile and concentrate the peptides from the combined eluate using a SpeedVac concentrator (see Note 13).
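The dilution in step 4 and the trypsin amount in step 5 follow from simple arithmetic. The sketch below assumes the Denaturing Buffer contains 7 M urea (as implied by Note 11) and ignores the small volumes of DTT and IAA added in steps 2 and 3; the function names are illustrative and not part of the protocol.

```python
def dilution_volume(v_start_ul, c_start, c_final):
    """Return (final volume, diluent to add) in uL, from C1 * V1 = C2 * V2."""
    v_final = v_start_ul * c_start / c_final
    return v_final, v_final - v_start_ul


def trypsin_amount(protein_ug, ratio=25):
    """Mass of trypsin (ug) for a 1:ratio trypsin-protein ratio (w/w)."""
    return protein_ug / ratio


# Assumption: 100 uL of sample in 7 M urea (see Note 11), diluted to 1 M urea.
v_final, v_add = dilution_volume(100, 7.0, 1.0)
print(f"Final volume {v_final:.0f} uL (add {v_add:.0f} uL of 100 mM ammonium bicarbonate)")

# 500 ug of microsomal protein at a 1:25 ratio, as in step 5.
print(f"Trypsin required: {trypsin_amount(500):.0f} ug")
```

For the nonstabilized trypsin mentioned in Note 12, calling `trypsin_amount(500, ratio=10)` gives the larger amount needed at a 1:10 ratio.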

3.3 Identification of N-Glycopeptides by Tandem Mass Spectrometry

1. Prior to analysis by tandem mass spectrometry, N-glycopeptides are harvested using ZipTipC18 Pipette Tips to ensure consistent loading onto the C18 nano-trap column (see Note 15).

2. Wash the ZipTipC18 Pipette Tip by aspirating 10 μL of ZipTip Buffer 1 three times, then expel and discard the liquid.

3. Condition the ZipTipC18 Pipette Tip by aspirating 10 μL of ZipTip Buffer 2 three times, then expel and discard the liquid.

4. Resuspend the concentrated N-glycopeptides in 10 μL of ZipTip Buffer 2.

5. Load the resuspended N-glycopeptides onto the conditioned ZipTipC18 by performing 10–15 aspiration–dispensing cycles.

6. Wash the ZipTipC18 by aspirating 10 μL of ZipTip Buffer 2, dispense to waste, and repeat four more times.

7. Elute the N-glycopeptides using 5 μL of ZipTip Buffer 1: aspirate a few times and dispense into a clean tube. Perform five elutions in total, resulting in a final volume of 25 μL.

8. Remove acetonitrile using a SpeedVac concentrator and resuspend the nearly dried peptides in 8–10 μL of MS Loading Buffer (see Note 13).

9. Load about 6 μL of the purified glycopeptide mix onto the C18 nano-trap column of the nano-flow liquid chromatography tandem mass spectrometry (LC-MS/MS) system, using the MS Loading Buffer at an isocratic flow of 5 μL min⁻¹.

10. Elute the N-glycopeptides into the tandem mass spectrometer using a gradient of 3% MS Buffer B to 20% over 95 min, followed by 20% MS Buffer B to 40% in 10 min, then 40% MS Buffer B to 80% over 5 min. Maintain at 80% MS Buffer B for 5 min before equilibration to 3% MS Buffer B over 10 min (see Note 16).

11. Operate the MS in positive ion mode at a resolution of 120,000 in full scan mode, using data-dependent acquisition in HCD triggered ETD MS/MS analysis mode (see Note 17).

12. For HCD triggered ETD, the MS2 is operated in HCD mode with a resolution of 30,000, AGC target of 50,000, Activation Q of 0.25, EThcD (False), and Collision Energy of 30% for ions above 50,000 with a charge state between 3 and 8 (see Note 18).

13. ETD was conducted at a resolution of 30,000 using charge-dependent reaction times of 11.59 ms (+6), 16.69 ms (+5), 26.08 ms (+4), and 46.37 ms (+3) (see Note 19).

14. ETD with an AGC target of 300,000 for the precursor ion was triggered when one of the following ions was detected among the top 20 ions in the HCD fragment spectra: 138.0545 (GlcNAc, fragment 1), 163.06 (Hex), 186.076 (GlcNAc, fragment 2), 204.0967 (GlcNAc), or 366.1396 (ManGlcNAc) (see Note 17).
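The trigger logic in step 14 is applied by the instrument software, but it can be mirrored offline, for example to check exported HCD spectra. The following is a minimal sketch under stated assumptions: the peak list format, the function name, and the 20 ppm matching tolerance are illustrative choices and are not taken from the protocol or the instrument method.

```python
# Signature ions listed in step 14 (m/z values as given in the protocol).
SIGNATURE_IONS = {
    "GlcNAc fragment 1": 138.0545,
    "Hex": 163.06,
    "GlcNAc fragment 2": 186.076,
    "GlcNAc": 204.0967,
    "ManGlcNAc": 366.1396,
}


def triggers_etd(peaks, top_n=20, tol_ppm=20.0):
    """Return the signature ions found among the top_n most intense peaks.

    peaks: iterable of (mz, intensity) tuples from one HCD MS2 spectrum.
    tol_ppm: matching tolerance in ppm (an assumption, not a protocol value).
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    hits = []
    for name, target in SIGNATURE_IONS.items():
        tol = target * tol_ppm / 1e6
        if any(abs(mz - target) <= tol for mz, _ in top):
            hits.append(name)
    return hits


# Hypothetical peak list: two signature ions, one unrelated peak, one near miss.
spectrum = [(204.0969, 9.1e5), (366.1400, 4.2e5), (512.30, 1.0e5), (138.10, 50.0)]
print(triggers_etd(spectrum))  # a non-empty list means ETD would be triggered
```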

3.4 Spectral Data Interrogation

1. Spectral data were interrogated using Byonic™ (Protein Metrics) against a plant specific protein database in FASTA format (see Note 20).

2. Default search parameters were employed with the following changes: Precursor mass tolerance: 5 ppm; Fragmentation type: “Both: HCD and ETD”; Fragment mass tolerance (HCD): 10 ppm; Fragment mass tolerance (ETD): 20 ppm; Fixed modifications: Carbamidomethyl (Cys); Variable modification: Oxidation (M); Charge states applied to unassigned spectra: 3, 4, 5; Precursor isotope off by x: Too high (wide).

3. A collection of plant specific modifications is added to the Glycan option, using the custom glycan text field and employing the fine control format. These inputs are outlined in Table 1 under the columns “N-glycan composition” and “Byonic format” (see Note 21).

4. Use the following format for the list of N-glycan structures in Table 1 to generate a custom plant glycan database in Byonic™:

HexNAc(4)Hex(3)Fuc(1)Pent(1) @ NGlycan | common1

5. After spectral assignment and matching, high confidence PSMs were obtained by filtering data in Microsoft Excel to only include peptides with a glycan modification and a log probability (|Log Prob|) of >4 for HCD spectra (p < 0.0001) or >2 for ETD spectra (p < 0.01) (see Note 22).
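The filter in step 5 is carried out in Microsoft Excel in this protocol; the same rule can be expressed in a few lines of code. The sketch below is an illustration only: the dictionary keys (`glycan`, `fragmentation`, `log_prob`) and the example records are hypothetical and do not correspond to the column names of a Byonic™ export.

```python
# |Log Prob| cutoffs from step 5; |log10(p)| > 4 corresponds to p < 0.0001.
MIN_ABS_LOG_PROB = {"HCD": 4.0, "ETD": 2.0}


def is_high_confidence(psm):
    """Apply the step 5 rule to one PSM record (a dict with hypothetical keys)."""
    if not psm.get("glycan"):
        return False  # keep only peptides carrying a glycan modification
    cutoff = MIN_ABS_LOG_PROB.get(psm["fragmentation"])
    if cutoff is None:
        return False  # unknown fragmentation type
    return abs(psm["log_prob"]) > cutoff


# Hypothetical example records.
psms = [
    {"id": "psm1", "glycan": "HexNAc(2)Hex(8)", "fragmentation": "HCD", "log_prob": -6.3},
    {"id": "psm2", "glycan": "HexNAc(2)Hex(8)", "fragmentation": "HCD", "log_prob": -3.1},
    {"id": "psm3", "glycan": "HexNAc(2)Hex(3)", "fragmentation": "ETD", "log_prob": -2.7},
    {"id": "psm4", "glycan": "", "fragmentation": "ETD", "log_prob": -9.0},
]
kept = [p["id"] for p in psms if is_high_confidence(p)]
print(kept)
```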

4 Notes

1. This method has been successfully used to enrich N-glycopeptides from leaf, stem, and floral material from Arabidopsis thaliana. The approach should readily work on most plant species, assuming protein extraction can be undertaken on the tissue being studied [19].

2. Urea can readily degrade to ammonium and cyanate in solution, and this decomposition will accelerate if the solution is heated or old. The solution should be made fresh and maintained at room temperature.

3. A stock solution of 1 M DTT can be stored in aliquots at −20 °C.

4. A stock solution of 1 M IAA can be stored in aliquots at −20 °C. IAA alkylates the thiol group on cysteine residues after reduction with DTT. This step and the DTT step can be omitted; however, it is virtually impossible to detect cysteine-containing peptides unless controlled alkylation is undertaken. IAA is light and heat sensitive and should be stored in the dark.

5. Most high-grade sources of trypsin can be employed; however, some suppliers of trypsin for proteomics have produced a stabilized enzyme where lysine residues have been modified by reductive methylation, making the enzyme resistant to autolytic digestion [20].

6. MacroSpin columns have a capacity of around 300 μg of protein/peptide.

7. N-glycopeptides fragmented only using HCD or CID will often result in complex MS/MS spectra that are difficult to match. This is due to the retention and/or partial fragmentation of the N-glycan during HCD/CID. The benefit of ETD as a complement to HCD/CID is that the N-glycan remains intact on the peptide backbone and the subsequent ETD fragmentation spectra contain c- and z-series fragment ions for optimal spectral matching (Fig. 2).

Table 1 N-glycan structures used to generate plant glycan database for MS/MS data interrogation

N-glycan composition          Type          Mass (amu)  Relative abundance (%)  Byonic format
HexNAc(2)Hex(11)              Immature      2188.74     0.0                     @ NGlycan | rare1
HexNAc(2)Hex(10)              Immature      2026.69     0.7                     @ NGlycan | common1
HexNAc(2)Hex(9)               High mannose  1864.63     3.1                     @ NGlycan | common1
HexNAc(2)Hex(8)               High mannose  1702.58     7.7                     @ NGlycan | common1
HexNAc(2)Hex(7)               High mannose  1540.53     4.9                     @ NGlycan | common1
HexNAc(2)Hex(6)               High mannose  1378.48     3.5                     @ NGlycan | common1
HexNAc(2)Hex(5)               High mannose  1216.42     4.4                     @ NGlycan | common1
HexNAc(2)Hex(4)Pent(1)        Hybrid        1186.41     1.1                     @ NGlycan | common1
HexNAc(2)Hex(4)Fuc(1)         Hybrid        1200.43     0.2                     @ NGlycan | rare1
HexNAc(2)Hex(4)Fuc(1)Pent(1)  Hybrid        1332.49     1.0                     @ NGlycan | common1
HexNAc(2)Hex(4)Fuc(2)Pent(1)  Hybrid        1478.53     0.1                     @ NGlycan | rare1
HexNAc(2)Hex(5)Pent(1)        Hybrid        1348.47     0.2                     @ NGlycan | rare1
HexNAc(2)Hex(5)Fuc(1)         Hybrid        1362.48     0.1                     @ NGlycan | rare1
HexNAc(2)Hex(5)Fuc(1)Pent(1)  Hybrid        1494.52     0.2                     @ NGlycan | rare1
HexNAc(2)Hex(5)Fuc(2)Pent(1)  Hybrid        1640.62     0.1                     @ NGlycan | rare1
HexNAc(2)Hex(6)Pent(1)        Hybrid        1510.52     0.1                     @ NGlycan | rare1
HexNAc(2)Hex(6)Fuc(1)         Hybrid        1524.55     0.1                     @ NGlycan | rare1
HexNAc(2)Hex(6)Fuc(1)Pent(1)  Hybrid        1656.58     0.3                     @ NGlycan | rare1
HexNAc(3)Hex(4)               Hybrid        1257.45     0.6                     @ NGlycan | common1
HexNAc(3)Hex(4)Pent(1)        Hybrid        1389.49     0.9                     @ NGlycan | common1
HexNAc(3)Hex(4)Fuc(1)Pent(1)  Hybrid        1535.57     1.8                     @ NGlycan | common1
HexNAc(3)Hex(4)Fuc(2)Pent(1)  Hybrid        1681.61     0.0                     @ NGlycan | rare1
HexNAc(3)Hex(5)               Hybrid        1419.50     0.3                     @ NGlycan | rare1
HexNAc(3)Hex(5)Pent(1)        Hybrid        1551.54     0.3                     @ NGlycan | rare1
HexNAc(3)Hex(5)Fuc(1)Pent(1)  Hybrid        1697.62     0.2                     @ NGlycan | rare1
HexNAc(3)Hex(6)               Hybrid        1581.59     0.1                     @ NGlycan | rare1
HexNAc(3)Hex(7)               Hybrid        1743.65     0.1                     @ NGlycan | rare1
HexNAc(3)Hex(3)               Complex       1095.40     0.3                     @ NGlycan | rare1
HexNAc(3)Hex(3)Pent(1)        Complex       1227.44     5.5                     @ NGlycan | common1
HexNAc(3)Hex(3)Fuc(1)         Complex       1241.45     0.4                     @ NGlycan | rare1
HexNAc(3)Hex(3)Fuc(1)Pent(1)  Complex       1373.52     11.1                    @ NGlycan | common1
HexNAc(4)Hex(3)               Complex       1298.48     0.3                     @ NGlycan | rare1
HexNAc(4)Hex(3)Pent(1)        Complex       1430.52     2.3                     @ NGlycan | common1
HexNAc(4)Hex(3)Fuc(1)         Complex       1444.53     0.5                     @ NGlycan | rare1
HexNAc(4)Hex(3)Fuc(1)Pent(1)  Complex       1576.60     20.4                    @ NGlycan | common1
HexNAc(4)Hex(4)               Complex       1460.53     0.1                     @ NGlycan | rare1
HexNAc(4)Hex(4)Pent(1)        Complex       1592.57     0.5                     @ NGlycan | rare1
HexNAc(4)Hex(4)Fuc(1)         Complex       1606.59     0.1                     @ NGlycan | rare1
HexNAc(4)Hex(4)Fuc(1)Pent(1)  Complex       1738.65     0.1                     @ NGlycan | rare1
HexNAc(4)Hex(4)Fuc(2)Pent(1)  Complex       1884.73     0.1                     @ NGlycan | rare1
HexNAc(4)Hex(5)Fuc(3)Pent(1)  Complex       2192.86     0.1                     @ NGlycan | rare1
HexNAc(5)Hex(3)Pent(1)        Complex       1633.60     1.7                     @ NGlycan | common1
HexNAc(2)Hex(3)               Paucimannose  892.32      0.8                     @ NGlycan | common1
HexNAc(2)Hex(3)Pent(1)        Paucimannose  1024.36     2.8                     @ NGlycan | common1

(continued)

8. It is possible to employ other search engines, but we have found that Byonic™ is simple to use and is well suited for matching N-glycopeptide spectra from either HCD or ETD. The software employs standard glycan strings to enable customization of many glycan structures as variable modifications in the search parameters (Table 1). Note: these programs cannot distinguish isomers, nor can they identify the branching structure of the glycopeptide.

9. Be cautious when squeezing Miracloth as it can easily split. If dealing with small volumes and intense squeezing is necessary, a vinyl mesh support can be employed.

Table 1 (continued)

N-glycan composition          Type          Mass (amu)  Relative abundance (%)  Byonic format
HexNAc(2)Hex(3)Fuc(1)         Paucimannose  1038.38     0.6                     @ NGlycan | common1
HexNAc(2)Hex(3)Fuc(1)Pent(1)  Paucimannose  1170.44     16.4                    @ NGlycan | common1
HexNAc(2)Hex(3)Fuc(2)Pent(2)  Paucimannose  1448.52     0.0                     @ NGlycan | rare1
HexNAc(2)Hex(4)               Truncated     1054.37     1.1                     @ NGlycan | common1
HexNAc(2)Hex(2)               Truncated     730.27      0.2                     @ NGlycan | rare1
HexNAc(2)Hex(2)Fuc(1)Pent(1)  Truncated     1008.38     1.1                     @ NGlycan | common1
HexNAc(2)Hex(2)Pent(1)        Truncated     862.31      0.2                     @ NGlycan | rare1
HexNAc(2)Hex(2)Fuc(1)         Truncated     876.34      0.2                     @ NGlycan | rare1
HexNAc(2)Hex(1)Fuc(1)         Truncated     714.29      0.0                     @ NGlycan | rare1
HexNAc(2)Hex(1)               Truncated     568.22      0.2                     @ NGlycan | rare1
HexNAc(2)                     Truncated     406.17      0.1                     @ NGlycan | rare1
HexNAc(1)                     Truncated     203.09      0.6                     @ NGlycan | common1

The relative abundance was calculated from high confidence N-glycopeptide matches from Arabidopsis thaliana [7]. The Byonic Format can be used to create a plant N-glycan database. The most common plant N-glycan structures are shown as bold
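The entries in Table 1 can also be generated programmatically. The sketch below parses a composition string, sums monoisotopic residue masses, and writes a line in the format shown in Subheading 3.4, step 4. The residue masses are standard monoisotopic values and not numbers taken from this chapter; computed masses may differ from the tabulated values by a few hundredths of a dalton. The function names and the use of a 0.5% cutoff (from Note 21) as a default argument are illustrative assumptions.

```python
import re

# Standard monoisotopic residue masses (Da); not taken from Table 1.
RESIDUE_MASS = {"HexNAc": 203.0794, "Hex": 162.0528, "Fuc": 146.0579, "Pent": 132.0423}


def parse_composition(composition):
    """'HexNAc(2)Hex(3)' -> {'HexNAc': 2, 'Hex': 3}"""
    return {name: int(n) for name, n in re.findall(r"([A-Za-z]+)\((\d+)\)", composition)}


def glycan_mass(composition):
    """Sum of residue masses for the given composition string."""
    return sum(RESIDUE_MASS[name] * n for name, n in parse_composition(composition).items())


def byonic_line(composition, abundance_percent, rare_cutoff=0.5):
    """Build a custom glycan line; structures at or below the cutoff are 'rare1'."""
    label = "rare1" if abundance_percent <= rare_cutoff else "common1"
    return f"{composition} @ NGlycan | {label}"


print(f"{glycan_mass('HexNAc(2)Hex(3)'):.2f}")
print(byonic_line("HexNAc(4)Hex(3)Fuc(1)Pent(1)", 20.4))
```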

10. A smaller volume than recommended by these assay kits can be employed to reduce waste (e.g., 5 μL instead of the 10 μL suggested in the Pierce™ BCA Protein Assay Kit).

11. A solution of 7 M urea will readily denature proteins; however, the activity of trypsin is dramatically reduced in 7 M urea. Dilution to 1 M urea before adding trypsin will enable digestion of proteins by trypsin.

12. Adding trypsin to protein at a ratio of 1:25 or 1:50 is generally recommended. When using stabilized trypsin, a ratio of 1:25 should be suitable in most cases. However, if using a nonstabilized form of trypsin, increase the trypsin–protein ratio to 1:10.

13. Dry the sample down until a few microliters of liquid remain. This reduces the chance of peptides “sticking” to the plastic microfuge tube. If these peptide concentration steps are affecting sample yield, it is possible to employ surface siliconization of microfuge tubes; alternatively, the addition of 1% bovine serum albumin (BSA) to the sample has been shown to significantly improve recovery [21].

[Figure 2 image: annotated fragmentation spectra (panels a and b) of the N-glycopeptide INATGVVAPVGFK; x-axis m/z (500–1500), y-axis intensity (cps); monosaccharide symbol key: Fuc, GlcNAc, Man, Xyl, Glc, Gal]

Fig. 2 An example of HCD triggered ETD fragmentation spectra. (a) Peaks of 204.1 m/z (GlcNAc) and 366.1396 m/z (ManGlcNAc) in the HCD fragmentation spectra triggered ETD fragmentation of this precursor ion 815.05 [M+3H]3+. (b) Resultant ETD spectra (see Notes 18 and 19)

14. The sequential concentrations of acetonitrile employed for elution of plant N-glycopeptides from the HILIC SPE columns were empirically tested, and this range (70–50% acetonitrile) was found to be optimal for the selective elution of peptides harboring a range of expected N-glycan structures. Lower concentrations of acetonitrile for elution from HILIC SPE will result in a plethora of unmodified hydrophilic peptides.

15. After elution and concentration of N-glycopeptides from the HILIC SPE column, the sample is theoretically ready for analysis by LC-MS/MS. However, the total amount of peptides in this fraction can vary considerably. The use of ZipTipC18 Pipette Tips at this step enables a fixed peptide amount to be loaded onto the C18 nano-trap column. A typical ZipTipC18 Pipette Tip has a peptide binding capacity of around 1 μg, although the supplier indicates it can bind up to 5 μg.

16. The elution profile has been optimized for the separation of hydrophilic N-glycopeptides with an elongated ramp to 20% acetonitrile.

17. During HCD, the N-glycan structures on N-glycopeptides will also be fragmented. This will result in the generation of N-glycan signature ions that can be used to trigger ETD of the same precursor ion. The following fragment ions are common during HCD fragmentation of plant N-glycopeptides: 138.0545 m/z (GlcNAc, fragment 1), 163.06 m/z (Hex), 186.076 m/z (GlcNAc, fragment 2), 204.0967 m/z (GlcNAc), and 366.1396 m/z (ManGlcNAc).

18. HCD triggered ETD will generate two classes of fragmentation spectra. HCD fragmentation spectra will contain some y- and b-series ions, N-glycan fragments, y- and b-series ions harboring glycans (usually a HexNAc or two), as well as the charged precursor ion (peptide) without the N-glycan. Thus, the HCD fragmentation spectra enable confirmation of an N-glycan (N-glycan fragments) and an accurate estimation of the mass of the N-glycan structure and the peptide. ETD fragmentation spectra usually contain z- and c-series ions, enabling confident assignment when using a search engine. Note: only about 10% of MS/MS spectra from an HCD triggered ETD analysis will comprise ETD spectra (Fig. 2).

19. While it is possible to undertake ETD-only analysis of samples in conjunction with HCD-only or HCD triggered ETD to obtain important ETD fragmentation spectra for unique N-glycopeptides, the current generation of instruments generates few MS/MS spectra.

20. A Protein Metrics Byonic Viewer is freely available to view result files (.byrslt) generated after data interrogation.

21. Based on only high confidence N-glycopeptides matched from our analysis of eight Arabidopsis thaliana samples [7], we have generated a plant N-glycan database suitable for Byonic™ that includes a Fine Control parameter (common or rare) based on the frequency that the particular structure was detected in our samples. Those N-glycan structures identified in ≤0.5% of all N-glycopeptides identified were classified as “rare.” This will greatly accelerate data processing, resulting in a 10 min analysis time for a 1 gigabyte raw data file on a system with 24 cores at 3000 MHz.

22. By employing stringent log probability cutoffs, we estimated an FDR < 1% for all PSMs (FDR 2D). While some of the reported N-glycan structures did not conform to expected compositions, for example, HexNAc(5)Hex(3)Pent(1), using multiple replicates eliminated many of these unexpected structures.

References

1. Hebert DN, Lamriben L, Powers ET et al (2014) The intrinsic and extrinsic effects of N-linked glycans on glycoproteostasis. Nat Chem Biol 10:902–910

2. Stanley P, Taniguchi N, Aebi M (2015) N-Glycans. In: Varki A, Cummings RD et al (eds) Essentials of glycobiology. Cold Spring Harbor Laboratory Press, Cold Spring Harbor (NY), pp 99–111

3. Liu Y, Li J (2014) Endoplasmic reticulum-mediated protein quality control in Arabidopsis. Front Plant Sci 5:162

4. Rips S, Bentley N, Jeong IS et al (2014) Multiple N-glycans cooperate in the subcellular targeting and functioning of Arabidopsis KORRIGAN1. Plant Cell 26:3792–3808

5. Strasser R (2016) Plant protein glycosylation. Glycobiology 26:926–939

6. Fanata WI, Lee KH, Son BH et al (2013) N-glycan maturation is crucial for cytokinin-mediated development and cellulose synthesis in Oryza sativa. Plant J 73:966–979

7. Zeng W, Ford KL, Bacic A et al (2018) N-linked glycan micro-heterogeneity in glycoproteins of Arabidopsis. Mol Cell Proteomics 17:413–421

8. Henquet M, Lehle L, Schreuder M et al (2008) Identification of the gene encoding the alpha1,3-mannosyltransferase (ALG3) in Arabidopsis and characterization of downstream N-glycan processing. Plant Cell 20:1652–1664

9. Elbers IJW, Stoopen GM, Bakker H et al (2001) Influence of growth conditions and developmental stage on N-glycan heterogeneity of transgenic immunoglobulin G and endogenous proteins in tobacco leaves. Plant Physiol 126:1314–1322

10. Strasser R, Stadlmann J, Svoboda B et al (2005) Molecular basis of N-acetylglucosaminyltransferase I deficiency in Arabidopsis thaliana plants lacking complex N-glycans. Biochem J 387:385–391

11. Pedersen CT, Loke I, Lorentzen A et al (2017) N-glycan maturation mutants in Lotus japonicus for basic and applied glycoprotein research. Plant J 91:394–407

12. Song W, Mentink RA, Henquet MG et al (2013) N-glycan occupancy of Arabidopsis N-glycoproteins. J Proteome 93:343–355

13. Zielinska DF, Gnad F, Schropp K et al (2012) Mapping N-glycosylation sites across seven evolutionarily distant species reveals a divergent substrate proteome despite a common core machinery. Mol Cell 46:542–548

14. Ma J, Wang D, She J et al (2016) Endoplasmic reticulum-associated N-glycan degradation of cold-upregulated glycoproteins in response to chilling stress in Arabidopsis. New Phytol 212:282–296

15. Xu SL, Medzihradszky KF, Wang ZY et al (2016) N-glycopeptide profiling in Arabidopsis inflorescence. Mol Cell Proteomics 15:2048–2054

16. Wilson IB, Zeleny R, Kolarich D et al (2001) Analysis of Asn-linked glycans from vegetable foodstuffs: widespread occurrence of Lewis a, core alpha1,3-linked fucose and xylose substitutions. Glycobiology 11:261–274

17. Strasser R, Schoberer J, Jin C et al (2006) Molecular cloning and characterization of Arabidopsis thaliana Golgi alpha-mannosidase II, a key enzyme in the formation of complex N-glycans in plants. Plant J 45:789–803

18. Hagglund P, Bunkenborg J, Elortza F et al (2004) A new strategy for identification of N-glycosylated proteins and unambiguous assignment of their glycosylation sites using HILIC enrichment and partial deglycosylation. J Proteome Res 3:556–566

19. Ford KL, Zeng W, Heazlewood JL et al (2015) Characterization of protein N-glycosylation by tandem mass spectrometry using complementary fragmentation techniques. Front Plant Sci 6:674

20. Rice RH, Means GE, Brown WD (1977) Stabilization of bovine trypsin by reductive methylation. Biochim Biophys Acta 492:316–321

21. Goebel-Stengel M, Stengel A, Tache Y (2011) The importance of using the optimal plasticware and glassware in studies involving peptides. Anal Biochem 414:38–46


Chapter 18

High-Resolution Lysine Acetylome Profiling by Offline Fractionation and Immunoprecipitation

Jonas Giese, Ines Lassowskat, and Iris Finkemeier

Abstract

Acetylation of lysine side chains at their ε-amino group is a reversible posttranslational modification (PTM), which can affect diverse protein functions. Lysine acetylation was first described on histones and is now gaining increasing attention because of its more general occurrence in proteomes and its possible crosstalk with other protein modifications. Here we describe a workflow to investigate the acetylation of lysine-containing peptides on a large scale. For this high-resolution lysine acetylome analysis, dimethyl-labeled peptide samples are pooled and offline-fractionated using hydrophilic interaction liquid chromatography (HILIC). The offline fractionation is followed by immunoprecipitation and liquid chromatography–tandem mass spectrometry (LC-MS/MS) for data acquisition and subsequent data analysis.

Key words Lysine acetylation, HILIC, Offline fractionation, Dimethyl labeling, MaxQuant

1 Introduction

Plants are exposed to ever changing environmental conditions [1]. The proper development and growth of plants relies on metabolic acclimation to such conditions [2]. Fast acclimation is realized by signaling networks that restore metabolic homeostasis after disturbance [1, 3]. Cell signaling networks are often mediated by posttranslational protein modifications (PTMs) through phosphorylation, redox regulation, acetylation, and other modifications, which then regulate gene expression and protein turnover [4, 5]. The chemical modification can have several consequences for the proteins, such as altered stabilization, degradation, localization, and interactions with other proteins and metabolites, as well as the regulation of enzyme activities [6].

The acetylation of lysine ε-amino groups was initially described on histones, where it regulates chromatin and gene expression [7]. The addition of an acetyl group to lysine side chains neutralizes the positive charge of the amino group [8]. Usually the modification is catalyzed by lysine acetyltransferases using acetyl-CoA as substrate [9–11]. However, under specific conditions, such as a high cellular pH and large amounts of substrate, lysine acetylation can also happen nonenzymatically [12, 13]. These conditions can be met during respiration in the mitochondrial matrix or in the chloroplast stroma while photosynthesis takes place [14]. In contrast to N-terminal acetylation, lysine acetylation is reversible and can be removed by various types of deacetylases [15–18]. Detailed investigations of lysine acetylomes were delayed for a long time compared to the analyses of phosphoproteomes due to several technical limitations [14]. Technical advances in high-resolution mass spectrometry and the development of more efficient antibodies for the enrichment of modified peptides via immunoprecipitation (IP) led to huge leaps in acetylation research [19]. The numbers of identified acetylation sites in numerous organisms are continuously increasing [11, 13, 20–26]. While the first acetylomes of Arabidopsis only comprised about 100 acetylation sites [27, 28], over 2100 acetylation sites on more than 1000 proteins could be identified with the enhanced workflow presented here, which also led to the discovery of new target proteins of histone deacetylases in the nucleus as well as in chloroplasts [29, 30]. Lysine-acetylated proteins were found to be especially abundant in plant mitochondria, chloroplasts, and nuclei [13, 27–29, 31–34]. Furthermore, it was discovered that lysine acetylation negatively regulates the activity of the RubisCO enzyme as well as the ADP-sensitivity of RubisCO activase [27, 29].

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_18, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Since the occupancy of acetylation on lysine side chains is generally rather low, an enrichment of the acetylated peptides by IP is necessary for their detection by mass spectrometry [35]. For the quantification of lysine acetylation changes between samples, a labeling and pooling step of the different samples is required prior to the IP to reduce technical error caused by the enrichment. Depending on the organism and type of tissue, different isotopic labeling techniques can be used. If the label cannot be incorporated into the proteome of the living organism, a peptide-based labeling approach is preferred. Here we utilize dimethyl labeling of the peptide amino groups, as it allows for triplexing and is affordable in comparison to commercial labeling reagents. Isotopic dimethyl labeling results in labeling efficiencies of more than 99% [36]. Using a bottom-up proteomics approach, trypsin-digested, fractionated, and immunoprecipitated peptide samples are analyzed on a nano-LC MS/MS setup coupled via an ESI source.

The workflow consists of six methods that are utilized to obtain a quantitative high-resolution lysine acetylome profile of Arabidopsis thaliana leaves (Fig. 1).

1. Denaturing extraction of proteins using detergents to solubilize membrane proteins to achieve a higher proteome coverage. Sodium dodecyl sulfate (SDS) is used as the detergent.

2. Alkylation and trypsin digestion are done by application of a modified filter-aided sample preparation (FASP) method [37]. In this step SDS is removed from the sample by washing with a urea buffer, which is followed by alkylation with chloroacetamide (CAA) and subsequent trypsin digestion.

3. Isotopic dimethyl labeling of peptides on C18 columns is used for accurate quantification and reduction of technical error [38]. Up to three different samples can be combined (triplexing). Within replicates the different labels should be swapped to avoid a labeling bias due to rare retention time shifts of deuterated peptides.

4. Samples from leaf extracts usually have a high complexity. To enable better coverage of the lysine acetylome, an additional fractionation step prior to MS/MS analysis is performed. This offline fractionation is executed using a hydrophilic interaction liquid chromatography (HILIC) column (e.g., Sequant ZIC-HILIC column, Merck) using a reversed-phase buffer system (Fig. 2) on an HPLC or FPLC system.

Fig. 1 Workflow for quantitative lysine acetylome profiling. The workflow comprises six crucial steps: (1) Experimental setup to investigate up to three different conditions/genotypes or treatments. (2) Protein extraction from harvested leaf tissue under denaturing and reducing conditions with subsequent alkylation and trypsin digestion using a modified FASP protocol. (3) Desalting and isotopic dimethyl labeling, followed by pooling of the labeled samples in equal amounts of peptide. (4) Offline fractionation of peptides utilizing a ZIC-HILIC column on an HPLC system collecting seven peptide fractions. (5) Enrichment of lysine-acetylated peptides by immunoprecipitation (IP) using an anti-acetyl lysine antibody. (6) Measuring desalted samples (enriched and IP input for whole proteome analysis) on a nano-LC MS/MS system. Repeat analysis for at least four biological replicates

5. Enrichment of lysine-acetylated peptides. As lysine-acetylated peptides are underrepresented in the whole peptide population, enrichment by IP is crucial for identification and a good coverage of acetylation sites. Bead-bound anti-acetyl lysine antibodies are available from various suppliers.

6. Samples are desalted and subsequently measured on a nano-LC MS/MS setup.

2 Materials

2.1 Protein Extraction

1. Liquid nitrogen.

2. SDT-extraction buffer: 4% (w/v) sodium dodecyl sulfate (SDS) in 100 mM Tris–HCl pH 7.6 containing 100 mM dithiothreitol (DTT). Prepare 2.5 volumes (v/w) relative to the sample material.

3. Pierce 660 nm Protein Assay with ionic detergent compatibility reagent (Thermo Scientific).

4. Mortar and pestle.

5. Heat block.

6. Ultrasonic bath.

7. Benchtop centrifuge for microcentrifuge tubes, capable of at least 15,000 × g.

[Figure 2 image: chromatogram of the ZIC-HILIC offline fractionation showing UV absorbance (mAU, 0–150) and % buffer B (10–70) across pooled fractions 1–7]

Fig. 2 Stepped-linear gradient and peptide distribution of ZIC-HILIC offline fractionation. The gradient was set up with a flow rate of 0.5 mL per min for 22 column volumes (CV, 1 CV = 2.493 mL), which corresponds to 115 min. The gradient started with 10 % buffer BZH in buffer AZH for the duration of three CV. The concentration of buffer BZH was then increased to 58 % with a linear gradient over 12 CV. Another three CV ran at 79 % buffer BZH. The gradient was then reset to 10 % for another three CV. Twenty-two fractions of peptides were collected during the first 19 CV; twenty-one of those fractions were pooled into seven fractions with about equal peptide amounts. The last fraction was discarded, as it contained no peptides
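The column volume quoted in the caption matches the geometric volume of a 150 × 4.6 mm column (the dimensions listed in Subheading 2.4). The sketch below shows that calculation and converts the gradient segments named in the caption from column volumes to minutes at 0.5 mL per min. It is an illustration only: the function names are hypothetical, and the geometric volume ignores the space taken up by the packing material, so the times are nominal.

```python
import math


def column_volume_ml(length_mm, inner_diameter_mm):
    """Geometric (empty cylinder) volume of a column in mL."""
    radius_mm = inner_diameter_mm / 2
    return math.pi * radius_mm ** 2 * length_mm / 1000.0  # mm^3 -> mL


def minutes_for(cv_count, cv_ml, flow_ml_per_min):
    """Time needed to pump cv_count column volumes at the given flow rate."""
    return cv_count * cv_ml / flow_ml_per_min


cv = column_volume_ml(150, 4.6)
print(f"1 CV = {cv:.3f} mL")

# Segments as described in the caption, in column volumes.
segments = [("10 % B hold", 3), ("linear gradient to 58 % B", 12),
            ("79 % B hold", 3), ("reset to 10 % B", 3)]
for label, n_cv in segments:
    print(f"{label}: {n_cv} CV = {minutes_for(n_cv, cv, 0.5):.1f} min")
```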

2.2 Filter-Aided Sample Preparation (FASP)

1. Urea buffer: 8 M urea in 100 mM Tris–HCl pH 8.5. Prepare 25 mL per sample. The buffer should always be prepared fresh (see Note 1).

2. Chloroacetamide (CAA) solution: 55 mM CAA in urea buffer. Prepare 1 mL per sample. The solution should always be prepared fresh (see Note 1).

3. ABC buffer: 50 mM ammonium bicarbonate in dH2O. Prepare 20 mL per sample.

4. ABC buffer containing 10 % (v/v) acetonitrile (ACN).

5. Proteomics-grade trypsin dissolved in 50 mM acetic acid to a final concentration of 1 μg/μL (e.g., Trypsin MS approved by SERVA).

6. Centrifugal filter device (CFD): Amicon Ultra-4 30k MWCO (Merck Millipore) or similar (see Note 2).

7. Laboratory film (e.g., Parafilm M, Bemis Company).

8. Benchtop centrifuge, accommodating 15 mL conical-bottomed tubes.

9. Microvolume UV-VIS spectrophotometer (e.g., NanoDrop 2000, Thermo Scientific).

2.3 Desalting and Dimethyl Labeling of Peptides

1. Methanol.

2. Buffer AC18: 0.5 % (v/v) formic acid (FA) in ultrapure water.

3. Buffer BC18: 80 % (v/v) ACN (LC-MS-grade), 0.5 % (v/v) FA in ultrapure water.

4. 50 mM NaH2PO4 (monohydrate).

5. 50 mM Na2HPO4 (dihydrate).

6. 4 % (v/v) CH2O (light label), CD2O (intermediate label), or 13CD2O (heavy label) in water (LC-MS-grade).

7. 0.6 M NaBH3CN (light and intermediate label) or NaBD3CN (heavy label) in water.

8. 10 % (v/v) trifluoroacetic acid (TFA) (LC-MS-grade) in ultrapure water.

9. Sep-Pak C18 classic cartridge, 360 mg sorbent per cartridge, 55–105 μm particle size (Waters).

10. 5 mL plastic syringe.

11. Vacuum concentrator (e.g., Concentrator 5301, Eppendorf).

2.4 ZIC-HILIC Offline Fractionation

1. Buffer AZH: 95 % (v/v) ACN (LC-MS-grade) with 2 % (v/v) FA and 5 mM ammonium acetate.

2. Buffer BZH: 0.07 % (v/v) FA and 5 mM ammonium acetate.

3. ZIC-HILIC column (e.g., Sequant ZIC-HILIC column, 150 × 4.6 mm, 3.5 μm, 200 Å, Merck).

4. High pressure liquid chromatography (HPLC) unit with a UV/VIS A280 nm detector and fraction collector (e.g., LC-20A Prominence HPLC with SPD-20A UV/VIS detector and FRC-10A fraction collector).

5. Ultrasonic bath.

6. Benchtop centrifuge for microcentrifuge tubes, reaching speeds up to 12,000 × g.

7. Vacuum concentrator (e.g., Concentrator 5301, Eppendorf).

2.5 Enrichment of Lysine-Acetylated Peptides

1. TBS Buffer: 50 mM Tris–HCl with 150 mM NaCl, pH 7.6, sterile-filtered.

2. 5 M NaOH.

3. dH2O.

4. 10 % (v/v) TFA.

5. 1 % (v/v) TFA.

6. 5 % (v/v) ACN, 1 % (v/v) TFA.

7. Anti-acetyl-lysine antibody bound to agarose beads (e.g., Anti-Acetyl Lysine, Agarose, 10 mg/mL in 1 mL glycerol slurry, Immunechem).

8. pH test paper.

9. 1.5 mL low-binding reaction tubes.

10. Rolling wheel in fridge or cold room.

11. Refrigerated benchtop centrifuge for microcentrifuge tubes.

2.6 Desalting of Peptides on SDB-RPS Stop-and-Go-Extraction Tips (Stage Tips)

1. 30 % (v/v) MeOH, 1 % (v/v) TFA.

2. 0.2 % (v/v) TFA.

3. 80 % (v/v) ACN, 5 % (v/v) ammonia.

4. Buffer A∗: 2 % ACN, 0.1 % TFA.

5. Styrene divinylbenzene reversed-phase sulfonate (SDB-RPS) solid-phase extraction disks (Empore).

6. Stage Tip Adapter (Sonation) or Stage tipping centrifuge (e.g., STC-V2, Sonation).

7. 2 mL low-binding reaction tubes.

8. 0.5 or 1.5 mL low-binding reaction tubes.

9. Benchtop centrifuge for microcentrifuge tubes, reaching speeds of at least 15,000 × g.

10. Vacuum concentrator (e.g., Concentrator 5301, Eppendorf).

3 Methods

3.1 Protein Extraction

The following protocol is optimized for Arabidopsis leaves (see Note 3). Unless indicated otherwise, all steps are carried out at room temperature (see Note 4).

1. Leaves are harvested and immediately frozen in liquid nitrogen. Material is then ground under liquid nitrogen to a fine powder. The required amount of material for protein extraction is weighed into a precooled reaction tube (2 mL for up to 300 mg or a 15 mL tube) (see Note 5).

2. Heat a water bath to 95 °C and preheat SDT-extraction buffer to 95 °C (see Note 6).

3. Mix the samples with 2.5 volumes (v/w) of hot SDT-extraction buffer. Vortex immediately to resuspend the powder in the buffer. Incubate samples for 5 min at 95 °C in the water bath, vortexing twice for 15–20 s during the incubation.

4. Place samples in an ultrasonic bath and sonicate for 15 min.

5. Centrifuge extract for 30 min in a benchtop centrifuge at Vmax (15,000–21,000 × g) (see Note 7).

6. Transfer the supernatant to a new tube without disturbing the pelleted material.

7. Repeat steps 5 and 6.

8. Determine protein concentration using the Pierce 660 nm protein assay with ionic detergent compatibility reagent according to the manufacturer’s instructions.
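The buffer volume in step 3 follows directly from the weighed tissue mass. The following is a minimal sketch of that arithmetic, assuming that 2.5 volumes (v/w) means 2.5 μL of buffer per mg of ground tissue; the function names are hypothetical and not part of the protocol.

```python
def sdt_buffer_volume_ul(tissue_mg, volumes_per_weight=2.5):
    """Return the SDT-extraction buffer volume in µL for a tissue mass in mg.

    Assumption: 2.5 volumes (v/w) is read as 2.5 µL buffer per mg tissue.
    """
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return tissue_mg * volumes_per_weight


def choose_tube(tissue_mg):
    """Tube size suggested in step 1: 2 mL for up to 300 mg, otherwise 15 mL."""
    return "2 mL" if tissue_mg <= 300 else "15 mL"


if __name__ == "__main__":
    for mass in (100, 300, 1000):
        print(mass, "mg ->", sdt_buffer_volume_ul(mass), "µL in a", choose_tube(mass), "tube")
```

For 300 mg of leaf powder this gives 750 μL of hot buffer.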

3.2 Filter-Aided Sample Preparation (FASP)

Centrifugation steps are done at 4000 × g unless indicated otherwise. CFDs should be centrifuged until the supernatant is at least tenfold concentrated. If sufficient concentration is not achieved during the indicated centrifugation times, longer centrifugation may be necessary. A peptide yield after elution of 50–70 % compared to the initial protein amount can be expected.

1. Add 2 mL urea buffer on the CFD and centrifuge for 5 min for conditioning of the membrane (see Note 8).

2. Dilute the sample eightfold with urea buffer to decrease SDS concentration below 0.5 %. Load the diluted sample onto the CFD (see Note 9).

3. Centrifuge samples for 15 min.

4. Discard flow-through and add 4 mL urea buffer to CFD. Repeat step 3.

5. Discard flow-through. Add 1 mL CAA solution and gently mix the solution for 1 min by pipetting up and down. Be careful not to damage the membrane. For alkylation, incubate the samples for 30 min at room temperature in the dark.

6. Centrifuge the CFD for 15 min and discard the flow-through.

7. Add 4 mL of urea buffer to the CFD and centrifuge for 15 min. Discard the flow-through. Repeat this step twice.

8. Add 4 mL ABC buffer to the CFD and centrifuge for 15 min. Discard the flow-through and repeat this step twice.

9. Place CFD into new collection tubes.

10. Fill the CFD with ABC buffer until the membrane is covered. Add trypsin at a 1:100 enzyme to protein ratio (e.g., 10 μg trypsin to 1 mg protein). Gently mix the sample using a pipette. Be careful not to damage the membrane (see Note 10).

11. Close the cap of the tube and secure it with laboratory film. Then incubate the CFD at 37 °C overnight.

12. To elute the peptides from the CFD, centrifuge for 15 min.

13. Rinse the filters by adding 500 μL ABC buffer to the CFD and centrifuge.

14. Repeat step 13.

15. Repeat step 13 with 1 mL ABC buffer containing 10 % ACN.

16. Determine the peptide concentration with a microvolume UV-VIS photometer at a wavelength of 280 nm (see Note 11).

17. Add TFA to a final concentration of 1 % to acidify the sample for storage or further processing.
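The quantities used in this subheading (eightfold dilution, 4 mL CFD capacity, 1:100 trypsin ratio, 50–70 % expected yield) can be collected in one small planning helper. This is only an illustrative sketch; the function name and the dictionary keys are hypothetical, and the defaults are the values stated in Subheadings 2.1, 2.2, and 3.2 and in Note 9.

```python
import math


def fasp_plan(extract_ml, protein_mg, sds_percent=4.0, dilution_factor=8,
              cfd_capacity_ml=4.0, trypsin_ratio=100, trypsin_ug_per_ul=1.0):
    """Summarize volumes and amounts for one FASP sample.

    extract_ml  -- volume of SDT extract to be processed (mL)
    protein_mg  -- protein amount in that extract (mg)
    """
    diluted_ml = extract_ml * dilution_factor            # step 2: eightfold dilution
    trypsin_ug = protein_mg * 1000 / trypsin_ratio       # step 10: 1:100 enzyme to protein
    return {
        "sds_percent_after_dilution": sds_percent / dilution_factor,
        "diluted_volume_ml": diluted_ml,
        "loadings": math.ceil(diluted_ml / cfd_capacity_ml),  # Note 9: 4 mL per loading
        "trypsin_ug": trypsin_ug,
        "trypsin_ul": trypsin_ug / trypsin_ug_per_ul,          # 1 µg/µL stock
        "expected_peptide_mg": (0.5 * protein_mg, 0.7 * protein_mg),
    }


if __name__ == "__main__":
    print(fasp_plan(extract_ml=0.75, protein_mg=1.0))
```

With 0.75 mL of extract containing 1 mg protein, the sketch reports 6 mL of diluted sample (two loadings), an SDS concentration of 0.5 % after dilution, and 10 μg of trypsin.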

3.3 Desalting and Dimethyl Labeling of Peptides on C18 Columns

Labeling reagent should be prepared shortly before use as shown in Table 1. Solutions must be kept cold until use to prevent unwanted side-reactions.

1. Sep-Pak C18 (360 mg) cartridges are assembled with a 5 mL syringe (without piston) in a rack above a tray to collect flow-through.

2. Flush Sep-Paks with 3 mL methanol to condition the C18 matrix (see Note 12).

3. Repeat step 2 once with 3 mL buffer BC18 and afterward with 3 mL buffer AC18.

4. Load peptide sample on Sep-Paks (see Note 13).

5. Add 1 mL buffer AC18.

6. Add 3 mL buffer AC18.

7. Add dimethyl-labeling reagent on Sep-Paks (see Notes 13 and 14).

8. Add 1 mL buffer AC18.

9. Add 3 mL buffer AC18.

10. Transfer Sep-Paks to 2 mL reaction tubes to collect eluate. Elute samples twice by loading 700 μL buffer BC18 (see Note 13).

11. Determine yield of desalted peptides using a microvolume UV-VIS photometer at 280 nm (see Note 15).

12. Corresponding differentially labeled samples are combined in a 1:1:1 ratio in relation to their total peptide amount.

13. Dry peptides using a vacuum concentrator.

3.4 ZIC-HILIC Offline Fractionation

1. Dissolve dried peptides in 34 μL buffer AZH and 900 μL buffer BZH. Sonicate for 4 min in an ultrasonic bath and vortex briefly. Centrifuge for 1 min at 12,000 × g.

2. Transfer supernatant into a new 2 mL reaction tube. Dissolve pellet in 34 μL buffer AZH and 500 μL buffer BZH. Sonicate for 4 min in an ultrasonic bath and vortex briefly. Centrifuge for 1 min at maximum speed.

3. Combine supernatants and repeat step 2 if a pellet persists.

4. Collect supernatant and combine with previous supernatants (see Note 16).

5. Samples are fractionated on an HPLC unit using a 3.5 μm ZIC-HILIC column. Up to 10 mg can be loaded on the column. A segmented linear gradient from 10 to 58 % buffer BZH and a flow rate of 500 μL per minute is used (Fig. 2).

6. Twenty-two fractions are collected and combined into seven fractions in 1.5 mL reaction tubes aiming at equally distributed peptide amounts.

7. Fractions are then dried in a vacuum centrifuge.
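Step 6 asks for pools with roughly equal peptide amounts, but the protocol does not prescribe how to choose the pooling boundaries. One possible way, shown here purely as an assumption, is to estimate the peptide amount per collected fraction (for example from the A280 trace) and assign consecutive fractions to pools according to their cumulative share of the total. The function name and the input values are hypothetical.

```python
def pool_fractions(amounts, n_pools=7):
    """Group consecutive fractions into n_pools with roughly equal summed amounts.

    amounts -- estimated peptide amount per collected fraction, in elution order
    Returns a list of pools, each a list of 1-based fraction numbers.
    A fraction goes to the pool in which the midpoint of its cumulative
    amount falls, so pools stay contiguous; a very large single fraction
    can leave a neighboring pool empty.
    """
    total = sum(amounts)
    if total <= 0:
        raise ValueError("total amount must be positive")
    pools = [[] for _ in range(n_pools)]
    running = 0.0
    for number, amount in enumerate(amounts, start=1):
        midpoint = running + amount / 2
        index = min(int(midpoint / total * n_pools), n_pools - 1)
        pools[index].append(number)
        running += amount
    return pools


if __name__ == "__main__":
    # Hypothetical A280-derived amounts for the 21 peptide-containing fractions
    example = [1, 2, 4, 6, 8, 9, 9, 8, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1]
    for i, pool in enumerate(pool_fractions(example), start=1):
        print("pool", i, "-> fractions", pool)
```

With equal amounts in all 21 fractions, the sketch simply returns seven pools of three consecutive fractions each.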

Table 1 Scheme to prepare dimethyl labeling reagent. 5 mL reagent per sample is used

Reagent            Light label   Intermediate label   Heavy label
50 mM NaH2PO4      1 mL          1 mL                 1 mL
50 mM Na2HPO4      3.5 mL        3.5 mL               3.5 mL
4% (v/v) CH2O      0.25 mL       –                    –
4% (v/v) CD2O      –             0.25 mL              –
4% (v/v) 13CD2O    –             –                    0.25 mL
0.6 M NaBH3CN      0.25 mL       0.25 mL              –
0.6 M NaBD3CN      –             –                    0.25 mL
Volume in total    5 mL          5 mL                 5 mL

The dimethyl-labeling of peptides at unmodified lysine residues and N-termini introduces a specific mass shift which is 28, 32, or 36 Da for the light, intermediate, and heavy labels, respectively
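Because the label is added at every unmodified lysine and at the peptide N-terminus, the total shift of a peptide scales with its number of labeled sites. A minimal sketch of this bookkeeping follows; it uses the nominal shifts from the table footnote (not exact monoisotopic masses), and the function name is hypothetical. Acetylated lysines are not counted, since only unmodified lysines are labeled.

```python
# Nominal mass shift per labeled site (Da), taken from the Table 1 footnote
LABEL_SHIFT_DA = {"light": 28, "intermediate": 32, "heavy": 36}


def dimethyl_mass_shift(free_lysines, label, n_term_free=True):
    """Nominal mass shift (Da) of a peptide after dimethyl labeling.

    free_lysines -- number of unmodified lysines (acetylated lysines excluded)
    label        -- "light", "intermediate", or "heavy"
    n_term_free  -- False if the N-terminus is blocked (e.g., N-terminally acetylated)
    """
    if free_lysines < 0:
        raise ValueError("number of lysines cannot be negative")
    sites = free_lysines + (1 if n_term_free else 0)
    return sites * LABEL_SHIFT_DA[label]


if __name__ == "__main__":
    # Peptide with one acetylated and one unmodified lysine: N-terminus + 1 lysine
    for label in LABEL_SHIFT_DA:
        print(label, dimethyl_mass_shift(free_lysines=1, label=label), "Da")
```

For such a peptide with two labeled sites, the light and heavy forms are 16 Da apart (56 versus 72 Da nominal shift).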

3.5 Enrichment of Lysine-Acetylated Peptides

All steps regarding the anti-acetyl-lysine antibody bound to agarose beads should be performed on ice, as the antibody is sensitive to temperature changes. All buffers should be precooled.

1. Resuspend samples in a maximum volume of 1 mL TBS buffer (see Note 17).

2. The samples should be adjusted to pH 7–8.

3. Prepare an aliquot of 10 μg of peptide sample for the total proteome analysis and acidify with 10 % (v/v) TFA to a final concentration of 1 % (v/v).

4. Use 25–50 μL of antibody–bead slurry per milligram of peptide (Fig. 3). Pipetting of the antibody should be done using cut tips. Up to 500 μL of antibody can be pooled into one 1.5 mL low-binding reaction tube for equilibration (see Note 18).

5. Add 1 mL precooled TBS and incubate the antibody for 5 min on a rolling wheel in the fridge or a cold room. Centrifuge afterward for 1 min at 1000 × g, 4 °C.

6. Carefully remove the supernatant (see Note 19).

7. Repeat steps 5 and 6 twice.

8. Distribute corresponding amounts of the washed antibody beads into fresh tubes and add the peptides.

9. Incubate overnight on a rolling wheel at 4 °C.

10. Centrifuge samples for 2 min at 1000 × g, 4 °C.

[Figure 3 plots: bar charts for 10, 25, and 50 μL AB/mg. (A) Number of KAc sites (y-axis 0–750). (B) Number of protein groups (y-axis 0–7500).]

Fig. 3 Optimization of antibody bead amount for lysine acetylome analysis. (a) The number of identified lysine-acetylated peptides (Kac) differs between different antibody bead (AB) amounts used for the IP (10, 25, 50 μL antibody solution, respectively, with 1 mg peptide). (b) The nonenriched total proteome samples showed identical numbers of protein groups (mean ± SD, n = 4). Peptides and protein groups were identified with MaxQuant with settings reported in Hartl et al. [29]

11. Transfer supernatant to a new reaction tube and store it as flow-through.

12. Add 1 mL precooled TBS and incubate the antibody beads for 5 min on a rolling wheel in the fridge or a cold room. Centrifuge afterward for 1 min at 1000 × g, 4 °C.

13. Carefully remove the supernatant.

14. Repeat steps 12 and 13 three times.

15. Add 1 mL dH2O and incubate the antibody for 5 min on a rolling wheel in the fridge or a cold room. Centrifuge afterward for 1 min at 1000 × g, 4 °C.

16. Carefully remove the supernatant.

17. Repeat steps 15 and 16.

18. Elute with one bead volume of 1 % (v/v) TFA (e.g., 25 μL antibody beads corresponds to 25 μL 1 % TFA). Incubate the antibody beads for 5 min on a rolling wheel in the fridge or a cold room. Centrifuge afterward for 1 min at 1000 × g, 4 °C.

19. Transfer supernatant into a new reaction tube and keep it as enriched sample.

20. Repeat step 18. Transfer supernatant to the enriched sample.

21. Elute with one bead volume of 5 % (v/v) ACN, 1 % (v/v) TFA. Incubate the antibody beads for 5 min on a rolling wheel in the fridge or a cold room. Centrifuge afterward for 1 min at 1000 × g, 4 °C.

22. Transfer supernatant to the enriched sample.
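The bead amounts and elution volumes above all scale with the peptide input. The following sketch shows this scaling under the values given in steps 4, 18, 20, and 21; the function names are hypothetical, and the bead volume is taken to equal the slurry volume pipetted, as in the example in step 18.

```python
import math


def bead_slurry_range_ul(peptide_mg, low_ul_per_mg=25, high_ul_per_mg=50):
    """Recommended antibody-bead slurry range (µL) for a peptide amount in mg."""
    return (peptide_mg * low_ul_per_mg, peptide_mg * high_ul_per_mg)


def equilibration_tubes(total_slurry_ul, max_per_tube_ul=500):
    """Number of 1.5 mL tubes needed if at most 500 µL slurry is pooled per tube."""
    return math.ceil(total_slurry_ul / max_per_tube_ul)


def elution_volumes_ul(bead_ul):
    """Three elutions of one bead volume each (steps 18, 20, and 21)."""
    return {
        "1 % TFA, first": bead_ul,
        "1 % TFA, second": bead_ul,
        "5 % ACN, 1 % TFA": bead_ul,
    }


if __name__ == "__main__":
    low, high = bead_slurry_range_ul(2)  # 2 mg peptide per IP
    print("slurry per IP:", low, "to", high, "µL")
    print("tubes for 600 µL slurry:", equilibration_tubes(600))
    print("combined eluate for 25 µL beads:", sum(elution_volumes_ul(25).values()), "µL")
```

For 25 μL of beads, the combined enriched sample is therefore about 75 μL.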

3.6 Desalting of Peptides on SDB-RPS Stop-and-Go-Extraction Tips (Stage Tips)

1. Stack three layers of SDB-RPS matrix on top of each other. Punch out one disk consisting of the three stacked layers and push it into a 200 μL pipette tip (see Note 20).

2. Assemble the Stage Tip with a 2 mL reaction tube, joined by an adaptor.

3. Add 100 μL ACN onto the Stage Tip and centrifuge tips at 1500 × g (approximately 1–2 min).

4. Discard flow-through.

5. Add 100 μL 30 % (v/v) MeOH, 1 % (v/v) TFA onto the Stage Tip and centrifuge the tips at 1500 × g (approximately 1–2 min).

6. Discard flow-through.

7. Add 100 μL 0.2 % (v/v) TFA onto the Stage Tip and centrifuge the tips at 1500 × g (approximately 1–2 min).

8. Discard flow-through.

9. Load each sample (enriched for acetylome and input for total proteome analyses) onto a Stage Tip and centrifuge tips at 650 × g until the sample is loaded (approximately 5–10 min). Multiple loading steps may be necessary. Transfer the Stage Tip into the 2 mL reaction tubes used in steps 3–8 (see Note 21).

10. Add 100 μL 0.2 % (v/v) TFA onto the Stage Tip and centrifuge the tips at 1500 × g (approximately 1–2 min).

11. Discard flow-through.

12. Transfer Stage Tips to fresh 2 mL reaction tubes to elute desalted peptides.

13. For elution add 60 μL 80 % (v/v) ACN, 5 % (v/v) ammonia to the tips and centrifuge at 650 × g (see Note 21).

14. Transfer the eluate to a smaller reaction tube (0.5 or 1.5 mL) and dry the peptides in a vacuum centrifuge.

15. Samples are resuspended in 10 μL buffer A∗ and peptide concentration is determined using a microvolume UV-VIS photometer at 280 nm. Dilute peptides to a final concentration of 0.1–0.2 μg/μL.
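The dilution in step 15 is a simple C1·V1 = C2·V2 calculation. The sketch below assumes, as stated in Notes 11 and 15, that an A280 of 1 corresponds to 1 mg/mL (that is, 1 μg/μL) of peptide; the function names are hypothetical.

```python
def peptide_conc_ug_per_ul(a280):
    """Peptide concentration from A280, assuming A280 = 1 equals 1 mg/mL (1 µg/µL)."""
    return a280 * 1.0


def diluent_to_add_ul(volume_ul, conc_ug_per_ul, target_ug_per_ul):
    """Volume of buffer A* (µL) to add to reach the target concentration."""
    if target_ug_per_ul <= 0:
        raise ValueError("target concentration must be positive")
    if target_ug_per_ul > conc_ug_per_ul:
        raise ValueError("sample is already more dilute than the target")
    final_volume = volume_ul * conc_ug_per_ul / target_ug_per_ul
    return final_volume - volume_ul


if __name__ == "__main__":
    conc = peptide_conc_ug_per_ul(0.5)  # measured A280 of 0.5
    print("add", round(diluent_to_add_ul(10, conc, 0.2), 1), "µL buffer A* to 10 µL sample")
```

A 10 μL sample with an A280 of 0.5 would need about 15 μL of additional buffer A∗ to reach 0.2 μg/μL.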

3.7 Guidelines for LC-MS/MS Analysis

1. High-resolution nano-UHPLC-MS/MS setup with 15–20 cm (75 μm diameter) C18 reversed-phase capillary columns (e.g., 1.9 μm ReproSil-Pur C18-AQ, Dr. Maisch GmbH) should be used.

2. Column oven should be set to 50 °C.

3. At maximum, 0.5–1 μg of peptides is injected.

4. Separation is done at 300 nL/min flow.

5. Nano-LC Gradient:

(a) Buffer A: 0.1 % (v/v) formic acid.

(b) Buffer B: 80 % (v/v) ACN, 0.1 % (v/v) formic acid.

(c) 5 min linear gradient, at 5 % buffer B in buffer A.

(d) 60 min linear gradient, to 20 % buffer B.

(e) 25 min linear gradient, to 35 % buffer B.

(f) 10 min linear gradient, to 55 % buffer B.

(g) 5 min linear gradient, to 98 % buffer B.

(h) 10 min linear gradient at 98 % buffer B.

6. Detection of peptides on Q-Exactive HF MS (Thermo Scientific).

(a) Positive mode.

(b) Mass range 300–1750 m/z at resolution 60,000.

(c) AGC target value 3e6.

(d) Lock mass enabled (445.12003).

(e) MS2: Top 15 selected, resolution 15,000, isolation window 1.3 m/z, AGC target 1e5, dynamic exclusion of fragmented peptides.

(f) Charge states of +1, >8, and unassigned charges are excluded from fragmentation.

7. Data analysis.

(a) Evaluation of raw data using MaxQuant [39].

(b) Search against Araport 11 database (www.araport.org).

(c) Trypsin is set as protease with two (total proteome samples) or four (enriched samples) missed cleavages, since acetylation will lead to a miscleavage of trypsin.

(d) PSM and Protein FDR is 1 %.

(e) Multiplicity is set to 2 or 3, respectively, for light (Dimethyl 0), intermediate (Dimethyl 4) and heavy (Dimethyl 8) label.

(f) Fixed modification: Carbamidomethylation.

(g) Variable modifications: methionine oxidation, N-terminal acetylation, lysine acetylation (only for enriched samples).

(h) Match between runs and requantify are enabled.
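The nano-LC gradient in step 5 can be written down as a list of segments, which makes it easy to check the total run time and the buffer B percentage at any time point. This is an illustrative sketch only: the data structure and function names are hypothetical, and it assumes the run starts at 5 % buffer B at time zero.

```python
# (duration in min, % buffer B reached at the end of the segment), from step 5 (c)-(h)
GRADIENT = [(5, 5), (60, 20), (25, 35), (10, 55), (5, 98), (10, 98)]


def total_runtime_min(segments=GRADIENT):
    """Sum of all segment durations in minutes."""
    return sum(duration for duration, _ in segments)


def percent_b_at(t_min, segments=GRADIENT, start_percent_b=5.0):
    """Buffer B percentage at time t_min, by linear interpolation within segments."""
    if t_min < 0:
        raise ValueError("time must not be negative")
    seg_start, b_start = 0.0, start_percent_b
    for duration, b_end in segments:
        seg_end = seg_start + duration
        if t_min <= seg_end:
            fraction = (t_min - seg_start) / duration
            return b_start + fraction * (b_end - b_start)
        seg_start, b_start = seg_end, b_end
    raise ValueError("time is beyond the end of the gradient")


if __name__ == "__main__":
    print("total run time:", total_runtime_min(), "min")
    for t in (0, 35, 65, 90, 100, 105, 115):
        print(t, "min ->", percent_b_at(t), "% B")
```

The segments add up to 115 min; halfway through the 60-min segment (at 35 min) the sketch returns 12.5 % buffer B.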

4 Notes

1. Urea and CAA solutions must be freshly prepared and cannot be stored for a prolonged period. Preparation of the urea buffer should be started in advance, as urea dissolves only slowly. Urea-containing solutions should not be heated above room temperature; otherwise isocyanate can form, which leads to carbamylation of proteins. The CAA solution has to be kept in darkness until it is used.

2. Up to 2 mg protein can be loaded on Amicon Ultra-4 CFDs.

3. This protocol for protein extraction is optimized for Arabidopsis leaf material, but it should also work on other kinds of plant tissue, as long as the tissue can be ground to a fine powder.

4. Placing SDS- or urea-containing buffers on ice or at 4 °C leads to precipitation.

5. Samples should not thaw after harvest, to avoid protein degradation and unwanted modifications.

6. A glass beaker placed on a magnetic stirrer with heating is sufficient. Use magnetic stirring to distribute heat evenly.

7. If you use 15 mL reaction tubes, briefly spin down the samples and transfer the supernatant with a cut tip to 2 mL reaction tubes.

8. After approximately one minute, stop the centrifuge and check whether the majority of the buffer is still above the filter. In rare cases, abnormally fast flow-through can occur, which indicates a broken filter unit. These CFDs must be replaced to avoid losing the sample during the procedure. Broken columns are also indicated in later centrifugation steps by a green flow-through when the flow-through should already be colorless.

9. As 4 mL is the maximum capacity of the CFD, multiple loadings might be required.

10. An additional digest with LysC can be done to achieve a more complete digestion. We usually add LysC at a 1:100 enzyme to protein ratio and incubate samples on CFDs for 2 h at room temperature. Afterward, trypsin is added as described in Subheading 3.2, step 10.

11. For the determination of peptide concentration, it is assumed that an absorption of 1 equals a peptide concentration of 1 mg/mL. Expected yields of the FASP would be approximately 50–70 % of the initial amount of protein.

12. A 1 mL pipette with a cut tip can be used at all conditioning and washing steps to moderately increase flow. One drop every few seconds is a convenient rate.

13. Load and elute the sample by gravity flow if possible. Use a pipette only if the flow stops.

14. Exchange the tray below the rack to collect the toxic flow-through separately and discard it accordingly.

15. As high amounts of ACN affect the concentration measurement through fast evaporation, take 10 μL of the eluted sample and dry it in a vacuum centrifuge. Afterward, dissolve the pellet in 10 μL 2 % ACN, 0.1 % TFA to determine the peptide concentration. It is assumed that an absorption of 1 equals a peptide concentration of 1 mg/mL.

16. Sample is now dissolved in 95 % Buffer AZH and 5 % Buffer BZH.

17. Sometimes it can be difficult to resuspend the samples. Adjusting the pH to a range between 7 and 8 can help. The pH can be set by adding a few μL of 5 M NaOH. Short sonication in an ultrasonic water bath can also help to dissolve pellets.

18. Be sure to mix the antibody slurry thoroughly but gently.

19. Be careful not to disturb the antibody pellet. Using gel-loader tips can be advantageous.

20. Three layers of SDB-RPS can be used to desalt up to 25 μg peptide.

21. If sample is not flowing through the matrix, centrifugation speed may be increased in small steps of 100 × g for 2 min until the sample is completely loaded/eluted.

Acknowledgments

We gratefully acknowledge the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for financial support through the project grants FI1655/3-1; FI1655/4-1; FI1655/6-1; and the infrastructure grant INST211/744-1. This work was carried out within the ERA-CAPS program “KatNat.”

References

1. Calfapietra C, Penuelas J, Niinemets Ü (2015) Urban plant physiology: adaptation-mitigation strategies under permanent stress. Trends Plant Sci 20:72–75

2. Nunes-Nesi A, Fernie AR, Stitt M (2010) Metabolic and signaling aspects underpinning the regulation of plant carbon nitrogen interactions. Mol Plant 3:973–996

3. Dietz K-J (2015) Efficient high light acclimation involves rapid processes at multiple mechanistic levels. J Exp Bot 66:2401–2414

4. Hartl M, Finkemeier I (2012) Plant mitochondrial retrograde signaling: post-translational modifications enter the stage. Front Plant Sci 3:1–7

5. Johnova P, Skalak J, Saiz-Fernandez I et al (2016) Plant responses to ambient temperature fluctuations and water-limiting conditions: a proteome-wide perspective. Biochim Biophys Acta 1864:916–931

6. Huber SC, Hardin SC (2004) Numerous post-translational modifications provide opportunities for the intricate regulation of metabolic enzymes at multiple levels. Curr Opin Plant Biol 7:318–322

7. Allfrey VG, Faulkner R, Mirsky AE (1964) Acetylation and methylation of histones and their possible role in the regulation of RNA synthesis. Proc Natl Acad Sci U S A 51:786–794

8. Yang X-J, Seto E (2008) Lysine acetylation: codified crosstalk with other posttranslational modifications. Mol Cell 31:449–461

9. Kleff S, Andrulis ED, Anderson CW et al (1995) Identification of a gene encoding a yeast histone H4 acetyltransferase. J Biol Chem 270:24674–24677

10. Drazic A, Myklebust LM, Ree R et al (2016) The world of protein acetylation. Biochim Biophys Acta 1864:1372–1401

11. Koskela MM, Brunje A, Ivanauskaite A et al (2018) Chloroplast acetyltransferase NSI is required for state transitions in Arabidopsis thaliana. Plant Cell 30(8):1695–1709

12. Wagner GR, Payne RM (2013) Widespread and enzyme-independent Nε-acetylation and Nε-succinylation of proteins in the chemical conditions of the mitochondrial matrix. J Biol Chem 288:29036–29045

13. Konig A-C, Hartl M, Boersema PJ et al (2014) The mitochondrial lysine acetylome of Arabidopsis. Mitochondrion 19:252–260

14. Hosp F, Lassowskat I, Santoro V et al (2017) Lysine acetylation in mitochondria: from inventory to function. Mitochondrion 33:58–71

15. Alinsug MV, Yu C-W, Wu K (2009) Phylogenetic analysis, subcellular localization, and expression patterns of RPD3/HDA1 family histone deacetylases in plants. BMC Plant Biol 9:37

16. Shen Y, Wei W, Zhou D-X (2015) Histone acetylation enzymes coordinate metabolism and gene expression. Trends Plant Sci 20:614–621

17. Pandey R, Muller A, Napoli CA et al (2002) Analysis of histone acetyltransferase and histone deacetylase families of Arabidopsis thaliana suggests functional diversification of chromatin modification among multicellular eukaryotes. Nucleic Acids Res 30:5036–5055

18. Konig A, Hartl M, Pham PA et al (2014) The Arabidopsis class II sirtuin is a lysine deacetylase and interacts with mitochondrial energy metabolism. Plant Physiol 164:1401–1414

19. Choudhary C, Mann M (2010) Decoding signalling networks by mass spectrometry-based proteomics. Nat Rev Mol Cell Biol 11:427–439

20. Zhang K, Zheng S, Yang JS et al (2013) Comprehensive profiling of protein lysine acetylation in Escherichia coli. J Proteome Res 12:844–851

21. Henriksen P, Wagner SA, Weinert BT et al (2012) Proteome-wide analysis of lysine acetylation suggests its broad regulatory scope in Saccharomyces cerevisiae. Mol Cell Proteomics 11:1510–1522

22. Lundby A, Lage K, Weinert B et al (2012) Proteomic analysis of lysine acetylation sites in rat tissues reveals organ specificity and subcellular patterns. Cell Rep 2:419–431

23. Weinert BT, Wagner SA, Horn H et al (2011) Proteome-wide mapping of the drosophila acetylome demonstrates a high degree of conservation of lysine acetylation. Sci Signal 4:ra48

24. Svinkina T, Gu H, Silva JC et al (2015) Deep, quantitative coverage of the lysine acetylome using novel anti-acetyl-lysine antibodies and an optimized proteomic workflow. Mol Cell Proteomics 14:2429–2440

25. Zhou H, Finkemeier I, Guan W et al (2018) Oxidative stress-triggered interactions between the succinyl- and acetyl-proteomes of rice leaves. Plant Cell Environ 41:1139–1153

26. Walley JW, Shen Z, McReynolds MR et al (2018) Fungal-induced protein hyperacetylation in maize identified by acetylome profiling. Proc Natl Acad Sci U S A 115:210–215

27. Finkemeier I, Laxa M, Miguet L et al (2011) Proteins of diverse function and subcellular location are lysine acetylated in Arabidopsis. Plant Physiol 155:1779–1790

28. Wu X, Oh M-H, Schwarz EM et al (2011) Lysine acetylation is a widespread protein modification for diverse proteins in Arabidopsis. Plant Physiol 155:1769–1778

29. Hartl M, Fußl M, Boersema PJ et al (2017) Lysine acetylome profiling uncovers novel histone deacetylase substrate proteins in Arabidopsis. Mol Syst Biol 13:949

30. Fußl M, Lassowskat I, Nee G et al (2018) Beyond histones: new substrate proteins of lysine deacetylases in Arabidopsis nuclei. Front Plant Sci 9:461

31. He D, Wang Q, Li M et al (2016) Global proteome analyses of lysine acetylation and succinylation reveal the widespread involvement of both modification in metabolism in the embryo of germinating rice seed. J Proteome Res 15:879–890

32. Smith-Hammond CL, Hoyos E, Miernyk JA (2014) The pea seedling mitochondrial N-ε-lysine acetylome. Mitochondrion 19:154–165

33. Xiong Y, Peng X, Cheng Z et al (2016) A comprehensive catalog of the lysine-acetylation targets in rice (Oryza sativa) based on proteomic analyses. J Proteome 138:20–29

34. Zhang Y, Song L, Liang W et al (2016) Comprehensive profiling of lysine acetylproteome analysis reveals diverse functions of lysine acetylation in common wheat. Sci Rep 6:21069

35. Weinert BT, Iesmantavicius V, Moustafa T et al (2014) Acetylation dynamics and stoichiometry in Saccharomyces cerevisiae. Mol Syst Biol 10:716

36. Lassowskat I, Hartl M, Hosp F et al (2017) Dimethyl-labeling-based quantification of the lysine acetylome and proteome of plants. In: Fernie AR, Bauwe H, Weber APM (eds) Photorespiration. Springer New York, New York, NY, pp 65–81

37. Wisniewski JR, Zougman A, Nagaraj N et al (2009) Universal sample preparation method for proteome analysis. Nat Methods 6:359–362

38. Boersema PJ, Raijmakers R, Lemeer S et al (2009) Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat Protoc 4:484–494

39. Tyanova S, Temu T, Cox J (2016) The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc 11:2301–2319

Chapter 19

A Versatile Workflow for the Identification of Protein–Protein Interactions Using GFP-Trap Beads and Mass Spectrometry-Based Label-Free Quantification

Guillaume Nee, Priyadarshini Tilak, and Iris Finkemeier

Abstract

Protein functions often rely on protein–protein interactions. Hence, knowledge about the protein interaction network is essential for an understanding of protein functions and plant physiology. A major challenge of the postgenomic era is the mapping of protein–protein interaction networks. This chapter describes a mass spectrometry-based label-free quantification approach to identify in vivo protein interaction networks. The procedure starts with the extraction of intact protein complexes from transgenic plants expressing the protein of interest fused to a GFP-Tag (bait-GFP), as well as plants expressing a free GFP as background control. Enrichment of the GFP-tagged protein together with its interaction partners, as well as the free GFP, is performed by immunoaffinity purification. The pull-down quality can be evaluated by simple gel-based techniques. In parallel, the captured proteins are trypsin-digested and relatively quantified by label-free mass spectrometry-based quantification. The relative quantification approach largely relies on the normalization of protein abundances of background-binding proteins, which occur in both bait-GFP and free GFP pull-downs. Therefore, relative quantification of the protein pull-down is superior over methods that solely rely on protein identifications and removal of often copurified high-abundance proteins from the bait-GFP pull-downs, which might remove real interaction partners. A further strength of this method is that it can be applied to any soluble GFP-tagged protein.

Key words Protein–protein interactions, GFP-trap, Label-free quantitative proteomics

1 Introduction

The harmonious regulation of plant development and metabolism is achieved by a complex network of gene products [1]. Considerable effort in the last two decades has led to the sequencing, assembly, and annotation of plant genomes [1–3]. However, the mere identification of genes only provides little information about the molecular functions of the encoded proteins. In the current postgenomic era, mapping of protein–protein interaction networks is a major

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_19, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Guillaume Nee and Priyadarshini Tilak have contributed equally to this work.

257

Bait protein complexes

GFP protein complexes

Non-bound proteins

Direct interaction

Indirect interaction

Contaminant proteins

No proteins

Contaminant proteins

Enriched bait protein complexes

Interacting proteins

GFP bait GFP

Isolation of proteins

Bait enrichment

Washes

Elution

NSNS

NSNS

SN SN

NSNS

P P

PP

PP

LC MS/MS analysis

Elution Time

Rel

ativ

e ab

unda

nce

-4 -3 -2 -1 0 1 2 3 4

0

0.5

1

1.

5

2

2

.5

3

Inte

nsity

m/z

log2 fold enrichment

-log 1

0p-

valu

e

Pull-down procedure analysis

Total staining Western Blot

1

GFP Bait PFGsnietorptnadnubAInteracting proteins Contaminant proteins

2 43 1 2 43 1 2 43 1 2 43

1. Input / 2. Non-bound / 3. Last wash / 4. Elution

ML

ML

ML

ML

EXP CTR EXP CTRTryptic digest and desalting

Trp

Eluted proteins

C18

Sta

ge ti

p

Contaminant proteins

Tryptic peptides

Desalted ready to measure

samples

Statistical analysis

GFP Contaminant proteins

Bait protein Interactingproteins

In silico data processing

Inte

nsity

m/z

MS1

MS2

optional

Fig. 1 Workflow for the mass spectrometry-based label-free quantification of protein interaction partners ofGFP-tagged proteins. Proteins are extracted from tissue under native conditions. The GFP-tagged protein andpossible interaction partners (EXP) are enriched on an agarose-coupled GFP antibody matrix. Next to the

258 Guillaume Nee et al.

challenge in plant biology [4]. This is particularly relevant for study-ing proteins, which require the formation of a protein complex [5–8], proteins that associate with regulatory subunits [9], and proteinlocalization [10, 11]. Protein–protein interaction studies are alsoessential to understand how plantmetabolic fluxes can be controlledthrough the formation of metabolons [12]. Moreover, many pro-tein activities are regulated by posttranslational modifications(PTM); therefore, it is necessary to identify the underlying modify-ing enzymes, which often only transiently interact with these pro-teins [13, 14]. Hence, mapping the physical protein interactionnetwork brings a higher level of understanding than solely theobservation of protein spatial and temporal co-occurrences[15]. Moreover, an unneglectable part of the genes (around 13%in Arabidopsis) is annotated with an unknown function[16, 17]. Knowing physical partners of the encoded proteins canlead to a substantial advancement in their characterization [18, 19].

Several molecular or biochemical techniques are available to study protein–protein interactions, such as (1) two-hybrid systems [20], (2) bimolecular fluorescence complementation [21], (3) blue native-PAGE gel electrophoresis [22], (4) cocrystallization [23], (5) size exclusion chromatography [24], (6) affinity purification [25, 26], and (7) immunoprecipitation [27], among others. While the first two are limited to analyzing the direct interaction of only two proteins, the others need to be complemented with quantitative mass spectrometry (MS), which allows the identification of protein–protein interaction networks at a large scale [28, 29] and the discrimination of background copurifying proteins from real interactors. Since antibodies directed against the protein of interest are not always available, it is often easier to express the protein of interest as a tagged protein (e.g., GFP-tag, Myc-tag, Flag-tag) in the living organism, for which immobilized antibody matrices are available [30, 31].

In this chapter, we describe a workflow (Fig. 1) to identify native protein interaction networks from Arabidopsis expressing GFP-tagged bait proteins.

Fig. 1 (continued) proteins of interest, background-binding proteins (grey and black dots) are copurified. These unspecific binders are important for the correct normalization of the label-free quantification. After washing, the bound proteins are eluted by acidic denaturation. The quality of the pull-down can be assessed by gel electrophoresis and Western blotting. The eluted protein samples can be directly processed by tryptic digestion followed by a C18 desalting step. Peptide samples are analyzed by LC-MS/MS and the acquired spectra are processed by computational analysis using MaxQuant [32, 33]. Label-free quantification (LFQ) values are used to calculate the fold-enrichment and a t-test p-value from the biological replicates. Data can be presented as a volcano plot, in which the interacting proteins (blue dots), which satisfy quality thresholds (e.g., adjusted p-value <0.05 and at least twofold enrichment), appear in the upper right part, while the unspecific GFP-agarose interactors (black dots) are located outside the threshold limits. SN supernatant, P pellet, ML molecular ladder, Trp trypsin

Quantification of Protein-Protein Interactions with GFP-Trap Beads 259

This procedure uses plant tissue expressing either the bait protein fused to a GFP-tag or GFP only as starting material. After protein extraction under native conditions, free GFP or the bait-GFP protein is enriched by immunoaffinity purification together with its putative interaction partners and background-binding proteins. This setup enables a comparative mass spectrometry-based label-free quantification, which can discriminate unspecific protein interactions from real interactors [18, 19], and which should be preferred over using blacklists of so-called nonspecific proteins [26]. A commercially available GFP-matrix coupled with a single-domain antibody from camelids, termed VHH, is used for the enrichment of the GFP-tagged proteins [31]. Trapped protein complexes are then eluted from the beads and subjected to trypsin digestion. The resulting peptides are analyzed by LC-MS/MS, and proteins are identified and quantified by label-free quantification using the MaxQuant software [32, 33]. Further processing of the data using the Perseus computational platform allows evaluation of the reproducibility of the data and separation of candidate interacting proteins from unspecific interactions by calculating a fold-enrichment associated with a p-value [34]. In addition, this workflow includes a fast and easy gel-based procedure to evaluate the quality of the pull-down. Although the procedure is described here for leaf tissue, it can be applied to any other plant tissue or organelle preparation.

In Arabidopsis, we previously applied this workflow to successfully identify interacting proteins of a mitochondrial sirtuin-type deacetylase and of a seed protein with an unknown molecular function, leading to substantial advances in the understanding of their respective roles in plant biology [18, 19].

2 Materials

Prepare all the buffers using ultrapure water (double distilled or MilliQ water), unless otherwise specified.

2.1 Plant Tissue

1. Leaf tissue harvested from Arabidopsis thaliana plants expressing the bait-GFP protein and leaf tissue harvested from Arabidopsis thaliana plant lines expressing free GFP (see Note 1).

2.2 Immunoaffinity Pull-Down Procedure

1. Extraction buffer: 50 mM Tris–HCl, pH 7.5, 150 mM NaCl, 10 % (v/v) glycerol, 2 mM EDTA, 5 mM dithiothreitol (DTT) (see Note 2), 1 % (v/v) Triton X-100, protease inhibitor cocktail for plant cell and tissue extracts (Sigma P9599) at a dilution of 1:100 (v/v).

260 Guillaume Nee et al.

2. Equilibration/wash buffer: 50 mM Tris–HCl, pH 7.5, 150 mM NaCl, 10 % (v/v) glycerol, 2 mM EDTA, 5 mM DTT.

3. Elution buffer: 0.2 % trifluoroacetic acid (TFA) (v/v).

4. Neutralization buffer: 0.1 M Tris–HCl, pH 8.0, 1 mM CaCl2.

5. ChromoTek GFP-Trap® A beads (see Note 3).

6. Pierce 660 nm Protein Assay (see Note 4).

2.3 SDS–Polyacrylamide Gel Electrophoresis

1. Resolving gel buffer: 1.5 M Tris–HCl, pH 8.8.

2. Stacking gel buffer: 0.5 M Tris–HCl, pH 6.8.

3. SDS 10 % (w/v).

4. Ammonium persulfate (APS) 10 % (w/v).

5. 30% acrylamide–bisacrylamide solution, 37.5:1 (see Note 5).

6. N,N,N′,N′-Tetramethylethylenediamine (TEMED).

7. 5× protein loading buffer: 0.25 M Tris–HCl, pH 6.8, 25 % glycerol, 10 % SDS, 0.1 % bromophenol blue, 12.5 % (v/v) mercaptoethanol.

8. 10× running buffer: 1 % SDS, 1.92 M glycine, 0.25 M Tris; the pH of the buffer should be 8.3, and no pH adjustment is required.

9. Electrophoresis apparatus such as Mini-PROTEAN® containing spacer plates with 1.0 mm integrated spacers, casting system (Bio-Rad).

10. Gel staining solution: Oriole™ Fluorescent Gel Stain (Bio-Rad).
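The stock recipes above are given as molarities and percentages; the masses to weigh follow from mass = molarity × molar mass × volume. The short sketch below works this out for 1 L of the 10× running buffer. The molar masses (Tris base 121.14 g/mol, glycine 75.07 g/mol), the 1 L batch size, and the function names are assumptions added for illustration and are not part of the protocol.

```python
# Minimal sketch: masses needed for 1 L of 10x running buffer.
# Molar masses and the 1 L batch volume are illustrative assumptions.
MOLAR_MASS_G_PER_MOL = {"Tris": 121.14, "glycine": 75.07}


def grams_for_molarity(molarity_m, molar_mass, volume_l):
    """Mass (g) of a solute needed to reach a molarity in a given volume."""
    return molarity_m * molar_mass * volume_l


def grams_for_percent_wv(percent_wv, volume_l):
    """Mass (g) of a solute for a % (w/v) solution (1 % = 1 g per 100 mL)."""
    return percent_wv * 10.0 * volume_l


if __name__ == "__main__":
    volume_l = 1.0
    tris = grams_for_molarity(0.25, MOLAR_MASS_G_PER_MOL["Tris"], volume_l)
    glycine = grams_for_molarity(1.92, MOLAR_MASS_G_PER_MOL["glycine"], volume_l)
    sds = grams_for_percent_wv(1.0, volume_l)
    print(f"Tris: {tris:.1f} g, glycine: {glycine:.1f} g, SDS: {sds:.1f} g")
```

The same two helpers can be reused for the other stock solutions in this section once the relevant molar mass is supplied.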

2.4 Western Blot

1. Transfer buffer: 25 mM Tris, 190 mM glycine, 20 % methanol, 0.1 % SDS. The pH of the buffer should be 8.3, and no pH adjustment is required.

2. 10× Tris-buffered saline (TBS): 1.5 M NaCl, 0.1 M Tris–HCl, pH 7.4.

3. TBS containing 0.05 % (v/v) Tween-20 (TBS-T).

4. Blocking solution: 5 % (w/v) skim milk powder (SERVA Electrophoresis) prepared in TBS.

5. Nitrocellulose membrane (e.g., Amersham Protran 0.45 μm nitrocellulose Western blotting membranes).

6. Grade 3MM cellulose chromatography paper (GE Healthcare Life Science Whatman™).

7. Clean plastic or glass container.

8. Anti-GFP antibody raised in mouse (Roche #11814460001).

9. Secondary anti-mouse HRP-linked antibody (Sigma A3562).

10. Rocking shaker (VWR® Mini Blot Mixer).


11. Ponceau S staining solution: 0.1 % (w/v) Ponceau S dissolved in a 5 % (v/v) acetic acid solution.

12. ECL detection kit for Western blot: Amersham ECL Prime Western Blotting Detection Reagent kit (GE Healthcare Life Science).

13. Semi-Dry electrophoretic transfer apparatus (Biometra).

14. Chemiluminescence imaging system: ChemiDoc™ MP Imaging system (Bio-Rad).

2.5 Mass Spectrometry

1. Ammonium bicarbonate (ABC) solution: 0.05 M NH4HCO3.

2. Chloroacetamide (CAA) stock solution: 550 mM CAA prepared in ABC solution.

3. Dithiothreitol (DTT) stock solution: 500 mM DTT.

4. Proteomics-grade trypsin stock solution: 1 μg/μL in 0.1 M HCl.

5. 10 % formic acid.

6. 100 % methanol.

7. Buffer A: 0.5 % formic acid (FA) in H2O.

8. Buffer B: 80 % acetonitrile (ACN), 0.5 % FA.

9. Buffer A∗: 2 % ACN, 0.1 % TFA.

10. A high-resolution mass spectrometer for LC-MS/MS analysis (e.g., QExactive HF coupled to a nano-liquid chromatography, Thermo Scientific).

11. Software: MaxQuant [32, 33] and Perseus [34] (see Note 6).

3 Methods

All steps are performed at RT, unless specified otherwise.

3.1 Pull-Down Procedure

3.1.1 Extraction of Intact Protein Complexes

1. Homogenize around 2 g of leaf tissue per sample in liquid nitrogen (see Note 7).

2. Transfer the frozen powder into a prechilled tube and add cold extraction buffer at a ratio of 1:3 [g:mL]. Make sure that all the tissue powder is completely submerged by inverting the tube.

3. Incubate sample for 30–60 min at 4 °C on a test tube rotator (12 rpm) and centrifuge for 20 min at 4 °C and 18,000 × g.

4. Transfer supernatant to a new tube, and if necessary, repeat centrifugation step until the supernatant is clear (see Note 8).

5. Perform protein quantification for each sample using the Pierce 660 nm Protein Assay kit (see Note 9).


6. Adjust all samples to a similar protein concentration by diluting samples with extraction buffer (see Note 10). The adjusted protein samples are named “Input fractions”.
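The adjustment in step 6 is a simple dilution calculation (C1 × V1 = C2 × V2). As a minimal sketch, the helper below computes how much extraction buffer to add to each extract so that all samples reach the concentration of the most dilute one. The sample names, concentrations, and the choice of the lowest concentration as the target are hypothetical assumptions for illustration.

```python
# Minimal sketch: volume of extraction buffer to add so that all extracts
# reach a common protein concentration (C1 * V1 = C2 * V2).
# Sample names and values are hypothetical.


def dilution_plan(conc_mg_per_ml, sample_volume_ml, target_mg_per_ml=None):
    """Return {sample: mL of extraction buffer to add} for each extract."""
    if target_mg_per_ml is None:
        # Assumption: dilute everything down to the most dilute extract.
        target_mg_per_ml = min(conc_mg_per_ml.values())
    plan = {}
    for name, conc in conc_mg_per_ml.items():
        if conc < target_mg_per_ml:
            raise ValueError(f"{name} is already below the target concentration")
        final_volume_ml = conc * sample_volume_ml / target_mg_per_ml
        plan[name] = round(final_volume_ml - sample_volume_ml, 3)
    return plan


if __name__ == "__main__":
    measured = {"bait_rep1": 4.0, "gfp_rep1": 6.0, "bait_rep2": 5.0}
    print(dilution_plan(measured, sample_volume_ml=1.0))
```

Passing an explicit `target_mg_per_ml` within the 1–5 mg/mL range recommended in Note 10 overrides the default target.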

3.1.2 GFP-Trap® Capture of Protein Complexes

1. Add 25 μL of commercially available GFP-Trap® A beads (50 % slurry) to 500 μL of ice-cold equilibration buffer in a 2 mL low protein binding tube. Prepare one tube per sample and invert tubes three times (see Note 11).

2. Centrifuge beads for 1 min at 4 °C, 1000 × g and remove supernatant (see Note 12).

3. Resuspend beads in 500 μL of cold equilibration buffer and repeat step 2.

4. Add a volume corresponding to 1–5 mg total protein (maximum volume 1.5 mL) of adjusted samples (Input fractions from Subheading 3.1.1) to each tube of equilibrated GFP-trap beads.

5. Incubate for 2–4 h at 4 °C on a test tube rotator at 12 rpm (see Note 13).

6. Centrifuge at 4 °C and 1000 × g for 1 min and remove supernatant (named “nonbound” fraction).

7. Wash the beads by slowly adding 500 μL of ice-cold equilibration/wash buffer to the GFP-trap beads and carefully invert the tube to resuspend the beads. Do not vortex.

8. Centrifuge at 4 °C and 1000 × g for 1 min and discard supernatant.

9. Repeat steps 7 and 8 one to four times to eliminate most but not all the background-binding proteins. Keep 50 μL of the last wash for the downstream gel analysis of the pull-down quality (see Note 14).

10. Elute bait-GFP-containing protein complexes by adding 35 μL of elution buffer to the matrix, wait for about 5 min, centrifuge for 1 min at room temperature (RT) and 1000 × g, and transfer the supernatant to a new tube. Immediately neutralize the elution fraction by adding 35 μL of neutralization buffer.

11. Repeat step 10 and combine eluates (see Note 15).
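Two quantities in this subheading are easy to check numerically before starting: the input volume that delivers the desired amount of total protein within the 1.5 mL limit, and the mass of tagged protein that the beads can in principle hold (0.5 nmol per 25 μL of beads, see Note 11). The sketch below is illustrative only; the function names are hypothetical, and the 27 kDa value used for free GFP is an approximate, assumed molecular mass.

```python
# Minimal sketch: input volume and nominal bead capacity for one pull-down.
# The 27 kDa mass of free GFP is an approximate, assumed value.


def input_volume_ml(target_protein_mg, conc_mg_per_ml, max_volume_ml=1.5):
    """Volume of Input fraction (mL) that contains the target protein amount."""
    volume_ml = target_protein_mg / conc_mg_per_ml
    if volume_ml > max_volume_ml:
        raise ValueError("extract too dilute for the maximum loading volume")
    return volume_ml


def bead_capacity_ug(protein_mass_kda, capacity_nmol=0.5):
    """Mass (ug) of tagged protein bound at full capacity (nmol x kDa = ug)."""
    return capacity_nmol * protein_mass_kda


if __name__ == "__main__":
    print(input_volume_ml(3.0, 2.0))   # mL of a 2 mg/mL extract for 3 mg protein
    print(bead_capacity_ug(27.0))      # ug of free GFP at 0.5 nmol capacity
```

For a bait-GFP fusion, pass the mass of the whole fusion protein rather than that of GFP alone.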

3.2 Validation of the Pull-Down Procedure

3.2.1 SDS-PAGE

1. Prepare 10 mL of a 10 % SDS-PAGE resolving gel solution by mixing 2.5 mL of resolving gel buffer, 3.33 mL of a 30% acrylamide–bisacrylamide (37.5:1) solution, 4 mL of distilled water, 100 μL of a 10 % SDS solution, 66 μL of a 10 % APS solution, and 14 μL of TEMED.

2. Cast two gels using clean Mini-PROTEAN® Spacer Plates with 1.0 mm integrated spacers by pouring 4.5 mL resolving gel solution per gel. Make sure that no air bubbles are trapped in the resolving gel solution.


3. Gently overlay with isopropanol (see Note 16).

4. Let the gel polymerize for about 30–40 min.

5. Prepare 2.2 mL of a 4.2% stacking gel solution by mixing 550 μL of stacking gel buffer, 308 μL of acrylamide–bisacrylamide (37.5:1) solution, 1.3 mL of distilled water, 22 μL of a 10 % SDS solution, 11 μL of a 10 % APS solution, and 9 μL of TEMED.

6. Remove isopropanol by inverting the gel cassette. Pour stacking gel solution on top of the polymerized resolving gel and immediately insert a 10-well gel comb without introducing air bubbles. Let it polymerize for about 60 min (see Note 17).

7. Prepare samples in 5× protein loading buffer. Prepare 20 μg of input fraction, 20 μg of nonbound fraction, 20 μL of the last wash, and 20 μL of the elution fraction. Heat samples at 95 °C for 5 min to denature proteins.

8. Place the gels in the electrophoresis tank and fill with 1× SDS running buffer. Load two identical gels with either GFP-bait or control samples.

9. Perform electrophoresis at 15 mA per gel until the sample has entered the gel. Then apply 25 mA per gel until the dye front reaches the bottom of the gel (see Note 18).

10. Following electrophoresis, open the gel plates using a spatula and recover the resolving gels.

11. Stain one gel with Oriole™ Fluorescent Gel Stain (Bio-Rad) for 90 min under gentle agitation (see Note 19). Keep the second gel for the Western blot analysis using an anti-GFP antibody.
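The gel recipes in steps 1 and 5 can be checked, or rescaled to another percentage, with the dilution relation final % = stock % × stock volume / total volume. The sketch below recomputes the final acrylamide percentage of both recipes from the listed component volumes; the function name is a hypothetical helper introduced for illustration.

```python
# Minimal sketch: final acrylamide percentage of a gel mix, computed from
# the volumes listed in steps 1 and 5 (all volumes in mL).


def final_acrylamide_percent(stock_percent, stock_volume_ml, all_volumes_ml):
    """Final acrylamide % after mixing the stock with all other components."""
    total_volume_ml = sum(all_volumes_ml)
    return stock_percent * stock_volume_ml / total_volume_ml


if __name__ == "__main__":
    # buffer, acrylamide stock, water, SDS, APS, TEMED
    resolving = [2.5, 3.33, 4.0, 0.1, 0.066, 0.014]
    stacking = [0.55, 0.308, 1.3, 0.022, 0.011, 0.009]
    print(round(final_acrylamide_percent(30.0, 3.33, resolving), 1))
    print(round(final_acrylamide_percent(30.0, 0.308, stacking), 1))
```

To prepare a different gel percentage, the acrylamide stock volume is changed and the water volume adjusted so that the total volume stays constant.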

3.2.2 Western Blot Analysis

1. Rinse the gel with water and transfer it to a container filled with Western blot transfer buffer.

2. Cut a nitrocellulose membrane to the size of the gel and immerse it in Western blot transfer buffer (see Note 20).

3. Soak four to eight pieces of Whatman filter paper (same size as the nitrocellulose membrane) in the transfer buffer and place them onto the anode side of the transfer apparatus. Position the nitrocellulose membrane onto the Whatman paper. Place the gel on top of the membrane and place four to eight pieces of Whatman filter paper soaked in the transfer buffer on top (see Note 21).

4. Assemble the blotting system and transfer proteins at 1.2 mA per cm² of membrane for 1 h.

5. After transfer, carefully recover the membrane and place it, protein face up, into a clean plastic container.


6. Assess the quality of the transfer by immersing the membrane in a Ponceau S staining solution for 2–5 min.

7. Remove Ponceau S staining by washing the membrane with TBS buffer.

8. Cover membrane with blocking solution. Incubate for 1 h with gentle agitation.

9. Remove the blocking solution and wash the membrane with TBS-T for 10 min.

10. Dilute the primary anti-GFP antibody 1:2500 in TBS-T.

11. Incubate the membrane with the primary antibody for 1 h at RT with gentle agitation.

12. Remove primary antibody solution and wash the membrane three times with TBS-T for 10 min.

13. Dilute the secondary antibody cross-linked to horseradish peroxidase (1:20,000) in TBS-T.

14. Incubate the membrane with the secondary antibody for 1 h at RT with gentle agitation.

15. Remove secondary antibody solution and wash the membrane three times with TBS-T for 10 min.

16. Incubate the membrane with the Amersham ECL Prime Western Blotting Detection Reagent or similar (see Note 22).

17. Visualize the GFP-tagged protein of interest and the free GFP with the ChemiDoc™ MP system using the accumulation mode from 30 s to 5 min.
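The transfer current in step 4 scales with the membrane area. As a minimal sketch, the helper below converts membrane dimensions into the current to set on the power supply; the 8 × 6 cm membrane in the example is a hypothetical size, not a value from the protocol.

```python
# Minimal sketch: semi-dry transfer current from membrane dimensions.
# The 8 x 6 cm membrane is a hypothetical example size.


def transfer_current_ma(width_cm, height_cm, ma_per_cm2=1.2):
    """Current (mA) for a semi-dry transfer at a given current density."""
    return width_cm * height_cm * ma_per_cm2


if __name__ == "__main__":
    print(f"{transfer_current_ma(8.0, 6.0):.1f} mA")
```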

3.3 Sample Preparation for Mass Spectrometry Analysis

1. Add 1 μL of a 500 mM DTT stock solution to 100 μL of each elution sample to reach a final concentration of 5 mM DTT. Incubate samples for 30 min in the dark.

2. Add 3 μL of a 550 mM CAA stock solution to each sample to reach a final concentration of 15 mM CAA. Incubate sample for 1 h in the dark (see Note 23).

3. Quench excess CAA by addition of 2.5 μL of a 500 mM DTT stock solution to reach a final concentration of 12 mM DTT followed by incubation for 10 min in the dark.

4. Add 440 μL of ABC buffer (0.05 M NH4HCO3) for 5× dilution.

5. Add trypsin at a trypsin-to-protein ratio of 1:100 (min. conc. 5 ng/μL) and incubate at 37 °C for 16 h.

6. Stop the digestion by adding 55 μL of a 10 % formic acid solution to reach a final concentration of 1 %.

7. Load a 200 μL pipette tip with a C18 matrix to prepare a STop and GO Extraction Tip (STAGE-Tip) (see Note 24).


8. Stepwise equilibrate matrix with 60 μL methanol, 60 μL of buffer B, and 60 μL of buffer A.

9. Slowly load peptide sample onto the C18 matrix.

10. Wash twice with 30 μL of buffer A (see Note 25).

11. Elute peptides with 20 μL of buffer B twice.

12. Evaporate the peptide-containing eluates using a vacuum concentrator.

13. Resuspend sample in 10–12 μL of buffer A∗ for LC-MS/MS analysis.
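The nominal concentrations quoted in steps 1–6 are rounded; the volume grows with every addition, so the exact values differ slightly. The sketch below follows the running volume through the reduction, alkylation, quenching, and acidification steps and also estimates the trypsin amount from the 1:100 ratio and the 5 ng/μL minimum in step 5. The function names and the 50 μg protein amount are hypothetical assumptions for illustration.

```python
# Minimal sketch: exact concentrations after each addition in steps 1-6.
# Function names and the 50 ug protein amount are illustrative assumptions.


def conc_after_addition(stock_conc, added_ul, volume_before_ul):
    """Concentration of the added reagent once mixed into the sample."""
    return stock_conc * added_ul / (volume_before_ul + added_ul)


def trypsin_ug(protein_ug, volume_ul, ratio=100.0, min_ng_per_ul=5.0):
    """Trypsin (ug): 1:ratio of the protein, but at least min_ng_per_ul."""
    return max(protein_ug / ratio, min_ng_per_ul * volume_ul / 1000.0)


if __name__ == "__main__":
    volume = 100.0                                  # uL of eluate
    print(round(conc_after_addition(500.0, 1.0, volume), 2), "mM DTT")
    volume += 1.0
    print(round(conc_after_addition(550.0, 3.0, volume), 1), "mM CAA")
    volume += 3.0
    print(round(conc_after_addition(500.0, 2.5, volume), 1), "mM DTT added")
    volume += 2.5 + 440.0                           # quench, then ABC buffer
    print(round(trypsin_ug(50.0, volume), 2), "ug trypsin")
    print(round(conc_after_addition(10.0, 55.0, volume), 2), "% formic acid")
```

With little protein in the eluate, the 5 ng/μL minimum rather than the 1:100 ratio usually determines the trypsin amount, as in this example.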

3.4 LC-MS/MS Data Acquisition, Data Processing, and Statistical Analysis

1. Separate peptides in a stepped gradient of 0–55 % solvent B (80 % ACN, 0.1 % FA) at a flow rate of 300 nL/min for 60 min followed by wash steps (see Note 26).

2. Acquire mass spectra in the Orbitrap analyzer with a Top15 method and a resolution of 120,000 in MS1 and 15,000 in MS2. Use a scanning mass range from 300 to 1750 m/z. Set collision energy to 25 (see Note 27). Exclude peptides with a charge of +1 or >+6, and peptides that are not assigned to any charge state, from fragmentation. Accumulate ions to a target value of 3 × 10⁶ (MS1) and 5 × 10⁴ (MS2).

3. Process the raw spectrum files with label-free quantification enabled in MaxQuant and search against the Arabidopsis TAIR10 database (http://www.arabidopsis.org), including the sequence of the GFP-tagged protein. Use the preset standard search settings of MaxQuant. Activate the “match between runs” option (see Note 28).

4. Perform quantitative statistical analysis of the protein groups table using Perseus or similar software.

5. Define experiment groups and filter proteins for “two in at least one group” (at least two valid values in at least one group) to increase the stringency of the dataset.

6. Use log2-transformed LFQ intensity values to calculate the relative protein abundance between samples and perform a statistical analysis (e.g., two-sample t-test).
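To illustrate what step 6 computes for a single protein group, the sketch below log2-transforms the LFQ intensities of the bait and control replicates, takes the difference of the means as the log2 fold-enrichment, and derives a two-sample Student t statistic with pooled variance. This is a simplified stand-in and not the Perseus implementation: the function name and intensities are hypothetical, missing (zero) values are assumed to have been filtered out beforehand, and the p-value and its multiple-testing adjustment are left to a statistics package.

```python
# Minimal sketch: log2 fold-enrichment and pooled-variance t statistic for
# one protein group. Not the Perseus implementation; values are hypothetical.
import math
import statistics


def log2_enrichment(bait_lfq, control_lfq):
    """Return (log2 fold-enrichment, Student t statistic) for one protein."""
    # Assumes no missing (zero) intensities remain after filtering.
    bait = [math.log2(v) for v in bait_lfq]
    control = [math.log2(v) for v in control_lfq]
    diff = statistics.mean(bait) - statistics.mean(control)
    n1, n2 = len(bait), len(control)
    pooled_var = (
        (n1 - 1) * statistics.variance(bait)
        + (n2 - 1) * statistics.variance(control)
    ) / (n1 + n2 - 2)
    t_stat = diff / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return diff, t_stat


if __name__ == "__main__":
    bait = [2**24, 2**25, 2**26]        # three biological replicates
    control = [2**20, 2**21, 2**22]
    log2_fc, t_stat = log2_enrichment(bait, control)
    print(f"log2 fold-enrichment = {log2_fc:.2f}, t = {t_stat:.2f}")
```

A log2 difference of at least 1 corresponds to the twofold enrichment threshold mentioned in the legend of Fig. 1.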

4 Notes

1. The quality of the starting transgenic plant material is a key factor to successfully identify protein interactions from plant tissues. Hence, several aspects must be considered: (1) We recommend evaluating whether the fusion protein construct is correctly expressed in the tissue of interest. (2) The levels and the spatiotemporal expression pattern of a protein are important features of its function. Expressing the fusion construct under its native promoter is preferable over the use of a constitutive promoter. (3) In some cases, the presence of a tag fused to the protein of interest might alter its activity or interactions. Therefore, it is recommended to validate the functionality of the fusion protein by genetic complementation assays whenever possible. Consequently, a functionally complemented line expressing the protein of interest under its native promoter can be considered the most suitable starting plant material. (4) The method described here aims to discriminate interacting proteins from most of the proteins that are nonspecifically copurified with the bead matrix. It relies on the label-free quantitative mass spectrometry analysis of the protein abundances in the control pull-down versus the pull-down from the bait sample (Fig. 1). Thus, for a reliable statistical analysis, the GFP stoichiometry matters. Therefore, care must be taken when selecting a suitable line expressing free GFP as background control. This plant line should ideally accumulate amounts of the GFP protein similar to those in the bait-protein plant line. A Western blot analysis for the detection of the GFP protein will help in selecting a suitable control line. (5) Protein subcellular localization is a determinant of the protein interaction network. Consequently, if the protein of interest is targeted to organelles, it is recommended to fuse the same or a similar targeting amino acid sequence to the free GFP. (6) In a minimal experimental setup, tissue from nontransgenic wild-type plants can also be included as background control. However, this might increase the number of false-positive enriched proteins. In any case, the biological data must be interpreted carefully to draw meaningful conclusions, and the protein–protein interactions will need to be confirmed by other methods, such as those listed in the Introduction.

2. DTT is used to reduce disulfide bonds of proteins. It is recommended to prepare DTT solutions freshly.

3. GFP-Trap® A beads (ChromoTek) are recommended for this procedure. This commercial matrix has the advantage of limiting contamination of the sample with antibody chains at the elution step and therefore improves MS identification of interacting proteins.

4. This protein quantification kit is fully compatible with the extraction procedure presented herein. If you intend to modify the extraction buffer or use another protein quantification method, ensure that all the components of the extraction buffer are compatible with the quantification assay.

5. Buy a commercially available bottle of acrylamide–bisacrylamide solution, 37.5:1, and store at 4 °C.

6. Download MaxQuant and Perseus from http://www.maxquant.org/.


7. Freshly harvested tissues are recommended, but tissues stored at −80 °C might also be suitable, depending on the stability of the protein–protein interactions. To allow for a statistical analysis of the data, the bait-protein and the control pull-downs each need to be carried out in at least three biological replicates. Cool down mortars and pestles before homogenization to avoid protein degradation.

8. Be careful to avoid the lipid layer that could be present on top of the aqueous phase.

9. At this point, it is possible to check if the protein of interest is present in the native extract by Western blot analysis. If the protein is not detected, you may need to optimize the composition of the extraction buffer (varying the glycerol, salt, and reductant concentration might help).

10. The sample protein concentrations should be adjusted to between 1 and 5 mg mL⁻¹ depending on the abundance of the bait-GFP protein in the extract. Protein concentrations have to be kept similar between all samples to avoid loading bias during the pull-down.

11. A volume of 25 μL of GFP-Trap® A beads is, in principle, sufficient to immobilize 0.5 nmol of GFP-tagged protein.

12. To avoid loss of beads, first carefully remove around 75% of the supernatant with a 1 mL pipette. Carefully remove the remaining supernatant using a gel-loader tip.

13. Some pull-down experiments may require overnight incubation, especially in the case of very low in vivo abundance of the bait-GFP protein.

14. For label-free quantification analysis, it is important not to wash away all the background-binding proteins. Absence of background proteins makes normalization and statistical analysis impossible. You might have to optimize the number of washing steps in case you see too much or too little background.

15. Eluted protein samples can be stored at −20 °C until further processing.

16. Adding isopropanol to cover the gel helps to get an even gel surface and helps to remove possible air bubbles trapped in between the glass plates.

17. It is important to let the gel polymerize well for the best resolution. However, do not let the gel polymerize for more than 2 h to avoid unwanted drying. Covering the gel-casting chamber with a wet paper towel can preserve humidity if the gels will not be used immediately.


18. Gel electrophoresis is carried out in a Mini-PROTEAN® (Bio-Rad) apparatus, for example.

19. No fixation or destaining steps are required. The Oriole™ Fluorescent Gel Stain (Bio-Rad) can detect nanograms of protein in the elution fraction. Proteins can be visualized using the standard setting for ethidium bromide of the ChemiDoc™ MP (Bio-Rad) imager, for example.

20. Nitrocellulose membrane should be handled only at the corners with adapted flat-ended forceps.

21. Gently roll over the sandwich with a clean glass tube to remove any air bubbles that may have formed within the blotting sandwich.

22. Mix an equal volume of solution A and B from the Amersham ECL Prime Western Blotting Detection Reagent kit and use immediately. Make sure that all the membrane is evenly covered with the reagent. For an even distribution of the reagent, you can place a Parafilm® sheet on top of the membrane.

23. Complete cysteine alkylation is essential to optimize the identification rate. The pH of the reaction mixture must be around 8 to ensure that all cysteine residues are deprotonated and to limit unspecific alkylation by CAA. You can check the pH of the reaction mixtures by applying 5 μL from each sample onto colorimetric pH paper.

24. Sample processing via STAGE tips can be achieved by applying pressure onto the tip either using a syringe or by centrifugation at 1000 × g (this requires a STAGE tip centrifuge (Sonation GmbH, Biberach) or an adaptor to fit the tip onto a 1.5 mL reaction tube).

25. If your sample will not be analyzed immediately, do not elute the peptides and store the STAGE tips at −20 °C. Peptides are more stable when bound to the C18 matrix during storage in comparison to storage in solution.

26. Peptides are separated using in-house packed C18 fused silica emitters (75 μm inner diameter, SilicaTip™ PicoTip™ Emitter, New Objective), which are cut to about 17 cm in length and heated to 50 °C in a column oven.

27. Any high-resolution nano-UHPLC-MS/MS setup can be used for sample analyses. We analyze samples using an EASY-nLC™ 1200 coupled to a Q Exactive™ HF Hybrid Quadrupole-Orbitrap Mass Spectrometer (Thermo Fisher Scientific).

28. The “match between runs” option allows for using the MS1 precursor intensity for quantification purposes in samples where the MS2 identification is missing. In such cases, the precursor identity can be matched to another sample where an MS2 spectrum has been acquired and where the MS1 mass is present at the same elution time window.


Acknowledgments

We gratefully acknowledge the Deutsche Forschungsgemeinschaft (DFG) for financial support through the project grants NE2296/1-1 and FI1655/6-1, and the infrastructure grant INST211/744-1.

References

1. Bolger M, Schwacke R, Gundlach H et al (2017) From plant genomes to phenotypes. J Biotechnol 261:46–52

2. Schatz MC, Witkowski J, McCombie WR (2012) Current challenges in de novo plant genome assembly. Genome Biol 13:2–7

3. Humann U, Woetzel S, Madrid-Herrero E et al (2017) Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data. Genome Res 27:778–786

4. Bontinck M, Van Leene J, Gadeyne A et al (2018) Recent trends in plant protein complex analysis in a developmental context. Front Plant Sci 9:1–14

5. Sako K, Yanagawa Y, Kanai T et al (2014) Proteomic analysis of the 26S proteasome reveals its direct interaction with transit peptides of plastid protein precursors for their degradation. J Proteome Res 13:3223–3230

6. Dose A, Sindlinger J, Bierlmeier J et al (2016) Interrogating substrate selectivity and composition of endogenous histone deacetylase complexes with chemical probes. Angew Chem Int Ed 55:1192–1195

7. Inoshima MM, Ikuchi KK (2015) Chemical tools for probing histone deacetylase (HDAC) activity. Anal Sci 31:287–292

8. Kramer K, Finkemeier I, Humpf H et al (2016) The SAGA complex in the rice pathogen Fusarium fujikuroi: structure and functional characterization. Mol Microbiol 102:951–974

9. Gao X, Liu CZ, Li DD et al (2016) The Arabidopsis KIN βγ subunit of the SnRK1 complex regulates pollen hydration on the stigma by mediating the level of reactive oxygen species in pollen. PLoS Genet 12(7):e1006228

10. Editor D (2014) Dual-targeting of Arabidopsis chloroplasts and peroxisomes involves interaction with Trx m2 in the cytosol. Mol Plant 7:252–255

11. Garagounis C, Kostaki K, Hawkins TJ et al (2017) Microcompartmentation of cytosolic aldolase by interaction with the actin cytoskeleton in Arabidopsis. J Exp Bot 68:885–898

12. Zhang Y, Beard KF, Swart C (2017) Protein-protein interactions and metabolite channelling in the plant tricarboxylic acid cycle. Nat Commun 8:15212

13. Liebert MA, Rouhier N, Villarejo A et al (2005) Identification of plant glutaredoxin targets. Antioxid Redox Signal 7:919–929

14. Hao Y, Wang H, Qiao S et al (2016) Histone deacetylase HDA6 enhances brassinosteroid signaling by inhibiting the BIN2 kinase. Proc Natl Acad Sci U S A 113:2–7

15. Jones AM, Xuan Y, Xu M et al (2014) Border control — a membrane-linked interactome of Arabidopsis. Science 1:711–717

16. Krishnakumar V, Hanlon MR, Contrino S et al (2015) Araport: the Arabidopsis information portal. Nucleic Acids Res 43:D1003–D1009

17. Luhua S, Hegie A, Suzuki N et al (2013) Linking genes of unknown function with abiotic stress responses by high-throughput phenotype screening. Physiol Plant 148:322–333

18. Nee G, Kramer K, Nakabayashi K et al (2017) DELAY of GERMINATION1 requires PP2C phosphatases of the ABA signalling pathway to control seed dormancy. Nat Commun 8:1–8

19. Konig A, Hartl M, Pham PA et al (2014) The Arabidopsis class II sirtuin is a lysine deacetylase and interacts with mitochondrial energy metabolism. Plant Physiol 164:1401–1414

20. Trigg SA, Garza RM, MacWilliams A et al (2017) CrY2H-seq: a massively multiplexed assay for deep-coverage interactome mapping. Nat Methods 14:819–825

21. Bock R (2016) Lighting the way to protein-protein interactions: recommendations on best practices for bimolecular fluorescence complementation analyses. Plant Cell 28:1002–1008

22. Senkler J, Senkler M, Eubel H et al (2017) The mitochondrial complexome of Arabidopsis thaliana. Plant J 89:1079–1092

23. Cui B, Fang S, Xing Y et al (2015) Crystallographic analysis of the Arabidopsis thaliana BAG5 – calmodulin protein complex research communications. Acta Crystallogr F Struct Biol Cryst Commun 71:870–875

24. Bai Y (2015) Detecting protein-protein interactions by gel filtration chromatography. In: Protein-protein interactions: methods and applications, 2nd edn. Humana Press, New York, NY, pp 223–232

25. Stroher E, Dietz K (2006) Concepts and approaches towards understanding the cellular redox proteome. Plant Biol 8:407–418

26. Van Leene J, Eeckhout D, Cannoot B et al (2015) An improved toolbox to unravel the plant cellular machinery by tandem affinity purification of Arabidopsis protein complexes. Nat Protoc 10:169–187

27. Birkenbihl RP, Kracher B, Ross A et al (2018) Principles and characteristics of the Arabidopsis WRKY regulatory network during early MAMP-triggered immunity. Plant J 96:487–502

28. Hein MY, Hubner NC, Poser I et al (2015) A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163:712–723

29. Keilhauer EC, Hein MY, Mann M (2015) Accurate protein complex retrieval by affinity enrichment mass spectrometry (AE-MS) rather than affinity purification mass spectrometry (AP-MS). Mol Cell Proteomics 14:120–135

30. Moree WJ, Mitchell M, Widger W et al (2016) Observations on different resin strategies for affinity purification mass spectrometry of a tagged protein. Anal Biochem 515:26–32

31. Rothbauer U, Zolghadr K, Tillib S et al (2006) Targeting and tracing antigens in live cells with fluorescent nanobodies. Nat Methods 3:887–889

32. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26:1367–1372

33. Tyanova S, Temu T, Cox J (2016) The MaxQuant computational platform for mass spectrometry-based shotgun proteomics. Nat Protoc 11:2301–2319

34. Tyanova S, Temu T, Sinitcyn P et al (2016) The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat Methods 13:731


Chapter 20

In Vivo Cross-Linking to Analyze Transient Protein–Protein Interactions

Heidi Pertl-Obermeyer and Gerhard Obermeyer

Abstract

Cross-linking converts noncovalent interactions between proteins into covalent bonds. The now artificially fused molecules are stable during purification steps (e.g., immunoprecipitation). In combination with a variety of techniques, including Western blotting, mass spectrometry (MS), and bioinformatics, this technology provides improved opportunities for modelling structural details of functional complexes in living cells and protein–protein interaction networks. The presented strategy of immunoaffinity purification and mass spectrometry (AP-MS) coupled with in vivo cross-linking can easily be adapted as a robust workflow in interactome analyses of various species, including nonmodel organisms.

Key words In vivo cross-linking, Protein–protein interaction, Immunoaffinity purification, In-solution digestion, Mass spectrometry, Pollen

1 Introduction

Protein–protein interactions are essential for living cells. Severalmethods have been developed to study these interactions at thelevel of individual molecules and at the global scale. Among these,coimmunoprecipitation (co-IP) is a very common key technique forthe analysis of protein–protein interactions, including interactionsof subunits within a protein complex [1]. By the use of an antibody,which specifically recognizes one of the known components of themultiprotein complex, the entire complex can be isolated using thespecific antibody covalently attached to magnetic beads or proteinA/G agarose beads. The immunoprecipitated samples can then beanalyzed by SDS-PAGE or subjected to in-solution trypsin diges-tion [2] followed by mass spectrometry analysis to identify theproteins. Unfortunately, this method is limited (1) by the require-ment of a specific antibody for each protein of interest, (2) by thecross-reactivity of the antibody that very often leads to the identifi-cation of some false positives and, most important, (3) by the loss oftransient or very weak interactions during the purification

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139,https://doi.org/10.1007/978-1-0716-0528-8_20, © Springer Science+Business Media, LLC, part of Springer Nature 2020


procedure. To overcome some of these limitations, a chemical cross-linker can be added to cell cultures to prevent the loss of certain components of the protein complex; in addition, stringent washing conditions can then be used to remove nonspecifically bound proteins and contaminants. Many protein cross-linking reagents are commercially available. They are defined by a minimum of two reactive groups and classified according to their reactivity (primary amines, e.g., lysine, or sulfhydryls, e.g., cysteine), membrane permeability, water solubility, cleavability, and spacer length between the two reactive groups [1]. Homobifunctional (identical reactive groups at either end of a spacer arm), amine-reactive NHS (N-hydroxysuccinimide) esters or imidates, and heterobifunctional (different reactive groups at either end), amine-reactive, photoactivatable phenyl azides are the most commonly used cross-linkers. An extensive list of cross-linking reagents including a cross-linker selection tool is available from Thermo Fisher Scientific (https://www.thermofisher.com). A very useful reagent for stabilizing protein–protein interactions in vivo is formaldehyde [3]. It is an inexpensive and commonly used protein cross-linker that reacts primarily with lysine residues. It is also membrane permeable, which allows rapid fixation of transient processes in living cells and, consequently, inactivation of many cellular proteases. Formaldehyde is also a short cross-linker (spacer arm length ~2 Å) that captures protein associations in close proximity and thereby minimizes the risk of identifying false positives. A major advantage of this technique is the reversibility of the induced covalent bonds, especially when subsequent analysis of the sample by mass spectrometry is planned.

To maximize the yield of chemically cross-linked peptides, an optimization of the cross-linking reaction (i.e., concentration of formaldehyde, incubation time) is mandatory (see Fig. 1). In this chapter we present a simple, inexpensive and robust procedure for cross-linking proteins in living organisms for mass spectrometry-based analysis of protein–protein interactions.

2 Materials

Prepare all buffers or solutions using double distilled or ultrapure water and analytical grade reagents. Perform all centrifugation steps at 4 °C (unless specified otherwise).

2.1 Plant Material

All plant tissues or adequate plant cell cultures can be used for in vivo cross-linking experiments, for example, pollen cultures from lily (Lilium longiflorum Thunb.), thale cress (Arabidopsis thaliana), tobacco (Nicotiana tabacum), and tomato (Solanum lycopersicum), seedlings, liquid cultures from specific plant tissues or organs (explants from roots, stems, leaves, flowers), as well as yeast cell cultures.


[Fig. 1, panels a–c: Coomassie-stained gels and Western blots (14-3-3s, PM H+ ATPase) of cell lysates; lanes show % PFA (20 min), incubation time in 0.5% PFA (0, 10, 20, 30, 45, 60 min), and cross-link reversal (15 min at RT; 5, 10, 15, 20, 30 min at 95 °C); molecular mass markers (M) in kDa.]

Fig. 1 Optimization of in vivo cross-linking of lily pollen grains. (a) To determine the optimal PFA concentration, pollen grains from 5 flowers (~0.5 g fresh weight) were incubated in 0.0625–1% (w/v) PFA for 20 min at room temperature. Cell lysate proteins were then separated by SDS-PAGE and visualized by Coomassie stain (left) and analyzed by immunodetection with monoclonal anti-14-3-3 antibody (middle) and monoclonal anti-PM H+ ATPase antibody (right). (b) To define the optimal incubation time, pollen grains were incubated for 10–60 min in 0.5% (w/v) PFA at RT. Incubation times of 20 min and more resulted in band smearing, which indicates excessive cross-linking. 0 = untreated pollen grains; 5 μl cell lysate were loaded per gel lane; (a) and (b) 15 min denaturing of samples at RT prior to SDS-PAGE. (c) To reverse PFA cross-links, cell lysates (pollen grains in 0.5% (w/v) PFA for 20 min) were mixed with 6× sample loading buffer, incubated for 5–30 min at 95 °C and analyzed by CBB (Coomassie Brilliant Blue R250) or Western blot. A shift from 14-3-3 proteins cross-linked in complexes (∗) to 14-3-3 monomers and a loss of high molecular weight PM H+ ATPase complexes (arrow) are clearly visible. 5 μl cell lysate were loaded per gel lane [4]


2.2 Formaldehyde Cross-Linking of Cells

1. Cross-link solution: 10% (w/v) paraformaldehyde (PFA); dissolve 0.5 g PFA in 5 ml culture medium (see Note 1). Always prepare fresh!

2. Stop buffer: 1.25 M glycine stock solution; weigh 0.938 g glycine and dissolve in 10 ml double distilled water. Prepare fresh solution each time.
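The weighed amounts above follow from mass = molarity × volume × molecular weight, and from the definition of % (w/v) as grams per 100 ml. A minimal Python sketch of this arithmetic is given here; the molecular weight of glycine (75.07 g/mol) and the helper name grams_needed are assumptions for illustration and are not part of the original protocol.

```python
def grams_needed(molarity_mol_per_l: float, volume_ml: float, mw_g_per_mol: float) -> float:
    """Mass of solute (g) for a solution of the given molarity and volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * mw_g_per_mol


GLYCINE_MW = 75.07  # g/mol, assumed value for free glycine

# Stop buffer: 1.25 M glycine in 10 ml
stop_buffer_g = grams_needed(1.25, 10.0, GLYCINE_MW)

# Cross-link solution: 10% (w/v) PFA means 10 g per 100 ml, here scaled to 5 ml
pfa_g = 10.0 / 100.0 * 5.0

print(f"glycine: {stop_buffer_g:.3f} g in 10 ml")
print(f"PFA: {pfa_g:.1f} g in 5 ml")
```

The same helper can be reused for any other buffer in this section once the molecular weight of the salt form actually weighed is known.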

2.3 Preparation of Cell Lysates

Prepare fresh solutions for each extraction.

1. Germination medium (Med B): 292 mM sucrose, 1.6 mM H3BO3, 1 mM KCl, 0.1 mM CaCl2, pH 5.6 (see [5]). Dissolve 10 g sucrose in 90 ml double distilled water, add 100 μl of 1 M KCl stock solution, add 100 μl of 100 mM CaCl2 stock solution, add 1.6 ml of 100 mM H3BO3 stock solution, and fill up to 100 ml. The pH value should be 5.6; adjust with Tris or MES if necessary.

2. 8 μm filter.

3. Vacuum filtration unit.

4. Liquid nitrogen.

5. Precooled mortar and pestle (see Note 2).

6. Phosphate-buffered saline (PBS, pH 7.4): 1 mM NaH2PO4, 5 mM Na2HPO4, 140 mM NaCl, 5 mM ethylenediaminetetraacetic acid (EDTA), 1% (v/v) Triton X-100. Weigh 0.16 g NaH2PO4 · H2O, 0.98 g Na2HPO4 · 2 H2O, 8.10 g NaCl and fill up with double distilled water to 1 l.

7. Lysis buffer: Weigh 0.093 g EDTA, add 40 ml PBS, add 50 μl leupeptin of 10 mM stock solution (see Note 3), add 5 μl pepstatin A of 10 mM stock solution (see Note 4), add 50 μl PMSF of 1 M stock solution (see Note 5), add 100 μl E-64 of 1 mM stock solution (see Note 6), and add 500 μl Triton X-100. Make up to 50 ml with PBS. Always prepare fresh and keep on ice! Protease inhibitors like PMSF degrade in aqueous solutions with a half-life of ca. 30 min at pH 8.

8. Refrigerated centrifuge.
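The final inhibitor concentrations in the lysis buffer follow from the dilution rule C1 × V1 = C2 × V2. As a cross-check, the following Python sketch computes them from the stock concentrations and volumes listed in item 7; the function and variable names are hypothetical and only serve this illustration.

```python
def final_conc_molar(stock_molar: float, stock_ul: float, final_ml: float) -> float:
    """Final concentration (M) after diluting stock_ul of stock to final_ml."""
    return stock_molar * stock_ul / (final_ml * 1000.0)


FINAL_VOLUME_ML = 50.0

# name: (stock concentration in M, volume added in µl), as listed for the lysis buffer
additions = {
    "leupeptin": (10e-3, 50.0),
    "pepstatin A": (10e-3, 5.0),
    "PMSF": (1.0, 50.0),
    "E-64": (1e-3, 100.0),
}

# Final concentrations in µM
final_um = {
    name: final_conc_molar(conc, vol, FINAL_VOLUME_ML) * 1e6
    for name, (conc, vol) in additions.items()
}

# Triton X-100: 500 µl in 50 ml, expressed as % (v/v)
triton_percent = 500.0 / (FINAL_VOLUME_ML * 1000.0) * 100.0

for name, conc in final_um.items():
    print(f"{name}: {conc:g} µM")
print(f"Triton X-100: {triton_percent:g}% (v/v)")
```

The resulting values (10 μM leupeptin, 1 μM pepstatin A, 1 mM PMSF, 2 μM E-64) agree with the inhibitor concentrations that are added again to the thawed lysate in Subheading 3.3.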

2.4 Immunoaffinity Purification

Prepare fresh solutions for each purification. Solutions can be kept at room temperature, if not otherwise indicated.

1. Magnetic beads (see Note 7).

2. Magnetic particle concentrator.

3. PBS, pH 7.4.

4. Wash Buffer: PBS + 0.1% (w/v) BSA (bovine serum albumin); weigh 0.01 g BSA and fill up to 10 ml with PBS. Store at 4 °C.

5. Storage Buffer: PBS + 0.1% (w/v) BSA + 0.02% (v/v) NaN3; weigh 0.01 g BSA, add 20 μl of 1 M NaN3 stock solution and fill up to 10 ml with PBS. Store at 4 °C.


6. 0.2 M triethanolamine, pH 8.2: Weigh 0.371 g of triethanolamine and add 8 ml PBS. Adjust pH with NaOH. Fill up to 10 ml with PBS.

7. Cross-link solution II: 20 mM dimethyl pimelimidate dihydrochloride (DMP) in 0.2 M triethanolamine, pH 8.2 (see Note 8); weigh 5.1 mg DMP and add 1 ml of 0.2 M triethanolamine solution. Prepare just before use! This solution is used for covalently binding a primary mouse antibody to the beads.

8. 50 mM Tris, pH 7.5: Weigh 0.303 g Tris and add 40 ml double distilled water. Adjust pH value with 1 N HCl. Fill up to 50 ml with water.

9. Elution Buffer: 0.1 M glycine hydrochloride, pH 2.5, 1% (v/v) Triton X-100. Weigh 0.375 g glycine and add 30 ml double distilled water. Add 500 μl Triton X-100. Adjust pH value with 1 N HCl and fill up to 50 ml with water.

10. Saturated 1 M Tris solution: Weigh 6.057 g Tris and fill up to 50 ml with water.

2.5 In-Solution Trypsin Digestion

Always use the highest grade reagents available. To avoid keratin contamination of the samples, wear powder-free gloves and work clean. To avoid contamination with released softening agents, do not use autoclaved tips and tubes.

1. 10 mM Tris–HCl, pH 8.0: Weigh 0.121 g Tris and make up to 100 ml with double distilled water. Adjust pH value with 1 N HCl. Store at room temperature.

2. UTU Buffer: 6 M urea, 2 M thiourea, in 10 mM Tris–HCl, pH 8.0; weigh 9.009 g urea, 3.806 g thiourea and fill up to 25 ml with 10 mM Tris–HCl, pH 8.0. Store as 1 ml aliquots at −20 °C.

3. Reduction buffer: 6.5 mM DTT (DL-dithiothreitol); weigh 5 mg DTT and add 5 ml of 10 mM Tris–HCl, pH 8.0. Store 500 μl aliquots at −20 °C.

4. Alkylation buffer: 27 mM iodoacetamide (IAA); weigh 25 mg iodoacetamide and add 5 ml of 10 mM Tris–HCl, pH 8.0. Keep the 500 μl aliquots protected from light at −20 °C.

5. Endoproteinase Lys-C: 0.5 μg/μl stock solution; dissolve 15 μg Lys-C in 30 μl resuspension buffer (provided by manufacturer). Keep at −20 °C for short-term storage or at −80 °C for long-term storage.

6. Sequencing grade modified trypsin: 0.5 μg/μl stock solution; dissolve 20 μg trypsin in 40 μl resuspension buffer (provided by manufacturer). Keep at −20 °C for short-term storage or at −80 °C for long-term storage.


7. 10% (v/v) trifluoroacetic acid (TFA): Add 1 ml TFA to 9 ml double distilled water in a glass bottle. Store at room temperature.

2.6 Peptide Desalting with C18-StageTips

Prepare fresh solutions and store them at room temperature. Use analytical grade or hypergrade reagents.

1. Acetonitrile (ACN).

2. Buffer A: 5% (v/v) ACN, 0.1% (v/v) TFA; to 9.4 ml ultrapure water add 500 μl ACN and 100 μl of 10% (v/v) TFA in a glass bottle.

3. Buffer B: 80% (v/v) ACN, 0.1% (v/v) TFA; to 1.9 ml ultrapure water add 8 ml ACN and 100 μl of 10% (v/v) TFA in a glass bottle.

4. C18 StageTips (see Note 9).

5. Centrifuge.

6. Spin adaptors (Fig. 2).

7. Vacuum concentrator.
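The compositions of Buffer A and Buffer B can be recalculated for any batch size from the target % (v/v) values. The short Python sketch below does this for a 10 ml batch; the function name stagetip_buffer and its parameters are hypothetical, and ideal volume additivity is assumed.

```python
def stagetip_buffer(acn_percent: float, total_ml: float = 10.0,
                    tfa_final_percent: float = 0.1,
                    tfa_stock_percent: float = 10.0) -> dict:
    """Volumes (ml) of water, ACN and TFA stock for a buffer of total_ml."""
    acn_ml = acn_percent / 100.0 * total_ml
    tfa_stock_ml = tfa_final_percent / tfa_stock_percent * total_ml
    water_ml = total_ml - acn_ml - tfa_stock_ml
    # Round to µl precision to suppress floating-point noise
    return {
        "water_ml": round(water_ml, 3),
        "acn_ml": round(acn_ml, 3),
        "tfa_stock_ml": round(tfa_stock_ml, 3),
    }


buffer_a = stagetip_buffer(5.0)   # 5% ACN, 0.1% TFA
buffer_b = stagetip_buffer(80.0)  # 80% ACN, 0.1% TFA

print("Buffer A:", buffer_a)
print("Buffer B:", buffer_b)
```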

Fig. 2 Production of stop-and-go-extraction tips (C18-StageTips) according to [6]. A double-disk StageTip is inserted into a spin adapter (e.g., screw insulator, www.skiffy.com) and placed into a 2 ml microcentrifuge tube where the lid has been removed. These 2 ml tubes (W) are used to collect solutions for equilibration and wash steps. For elution of the bound tryptic peptides the spin adapter with the C18-StageTip is placed into a fresh 1.5 ml tube (E)


3 Methods

3.1 Formaldehyde Cross-Linking and Preparation of Cell Lysates

This workflow was developed to study putative interaction partners for the PM H+ ATPase in lily pollen [4]. Although the following protocol uses pollen cultures as starting material, it can be easily adapted to and optimized for other plant tissues or cell cultures.

1. Incubate pollen grains of 5 flowers (~0.5 g fresh weight) in 10 ml germination medium (Med B) for 10 min at room temperature in petri dishes.

2. For optimization of the cross-linking reaction, incubate pollen grains in germination medium with different formaldehyde concentrations (0.0625–1% (w/v) PFA; see Fig. 1a and Note 10) for 20 min at room temperature. In addition, it is very important to define the optimal incubation time for the cross-linking reaction (see Fig. 1b). Therefore, incubate pollen grains in the previously determined optimal PFA concentration (e.g., 0.5% (w/v) PFA) for 0, 10, 20, 30, 45, and 60 min at room temperature. Keep controls (= untreated, not cross-linked pollen grains, 0% PFA, 0 min) on ice.

3. Stop the cross-linking reactions by adding glycine to a final concentration of 125 mM (see Note 11) and incubate for 10 min at room temperature.

4. Filter the pollen grain culture through an 8 μm filter using a vacuum filtration unit and immediately transfer the residue on the filter (= pollen grains) with a spatula into a prechilled glass beaker filled with liquid nitrogen. Avoid thawing of the pollen grains!

5. Homogenize the pollen grains in liquid N2 using a precooled mortar and pestle to a very fine powder. Avoid thawing of the powder!

6. Transfer the fine powder with a spatula into a precooled (liquid N2) 1.5 ml microcentrifuge tube. Let the liquid nitrogen evaporate and cautiously add 1 ml ice-cold Lysis buffer (see Note 12).

7. Incubate on a rotational shaker at 4 °C for 10 min.

8. Centrifuge at 14,000 × g for 10 min at 4 °C.

9. After centrifugation, collect the supernatant (= cell lysate) using a pipette or a syringe with a fine needle. Do not disturb the pellet.

10. Analyze the cross-linking reactions by SDS-PAGE and/or immunodetection (see Fig. 1a, b). In order to analyze the sample by mass spectrometry, reverse the PFA cross-links by incubating the cell lysates in 6× sample loading buffer for 5–30 min at 95 °C and analyze by SDS-PAGE and/or Western blot (see Fig. 1c).

11. Store cell lysates at −80 °C.
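The optimization in steps 2 and 3 amounts to a dilution series from the 10% (w/v) PFA stock, followed by quenching with the 1.25 M glycine stock (see Notes 10 and 11). The Python sketch below reproduces the stock volumes for a 10 ml pollen culture. It is an illustration under stated assumptions: the function names are hypothetical, and the nominal volumes neglect the volume of the added stock, as the Notes do. The second function shows the concentration obtained when the added volume is taken into account.

```python
PFA_STOCK_PERCENT = 10.0  # % (w/v) PFA stock
CULTURE_ML = 10.0         # volume of pollen culture


def nominal_stock_volume_ul(target_percent: float,
                            culture_ml: float = CULTURE_ML,
                            stock_percent: float = PFA_STOCK_PERCENT) -> float:
    """Stock volume (µl) for a target concentration, neglecting the added volume."""
    return target_percent * culture_ml * 1000.0 / stock_percent


def actual_final_conc(stock_conc: float, stock_ml: float, culture_ml: float) -> float:
    """Final concentration when the added stock volume is included in the total."""
    return stock_conc * stock_ml / (culture_ml + stock_ml)


# PFA concentration series used for optimization (% w/v -> µl of stock)
pfa_series_ul = {pct: nominal_stock_volume_ul(pct)
                 for pct in (0.0625, 0.125, 0.25, 0.5, 1.0)}

# Glycine quench: 1 ml of 1.25 M (1250 mM) stock added to 10 ml culture
glycine_nominal_mM = 1250.0 * 1.0 / CULTURE_ML
glycine_actual_mM = actual_final_conc(1250.0, 1.0, CULTURE_ML)

for pct, vol in pfa_series_ul.items():
    print(f"{pct}% PFA: add {vol:g} µl stock")
print(f"glycine: nominal {glycine_nominal_mM:g} mM, "
      f"with added volume {glycine_actual_mM:.1f} mM")
```

The nominal 125 mM glycine stated in step 3 corresponds to neglecting the added 1 ml; including it gives a slightly lower value.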

3.2 Covalent Coupling of Antibodies to Magnetic Beads

This protocol uses a monoclonal primary antibody for immunoprecipitation, but can be used for polyclonal antibodies, too.

1. Resuspend the magnetic beads in the original vial by briefly vortexing and transfer the required amount of beads (100–500 μl beads) into a 1.5 ml microcentrifuge tube (see Note 13).

2. Place the tube in the particle concentrator (= magnet) for 2 min at room temperature.

3. Aspirate the supernatant and discard. Do not touch the collected magnetic beads inside the tube with the pipette tip.

4. Remove the tube from the magnet and add 1 ml PBS, pH 7.4 to the beads. Mix well by pipetting up and down.

5. Repeat steps 2–4 three times in total. Take a 10 μl aliquot (= “untreated beads”; washed beads before antibody capture, see Fig. 3) for SDS-PAGE analysis.

6. Place the tube on the magnet for 2 min at room temperature. The now washed magnetic beads are ready for capture of target Ig (= primary antibody).

7. Discard the supernatant and add 100–1000 μl primary monoclonal antibody to the corresponding amount of magnetic beads in the tube; for example, 1 ml monoclonal antibody against a PM H+ ATPase (hybridoma clone 46E5B11F6) [7] is added to 500 μl washed magnetic beads (see Note 14).

8. Incubate tube with slow rotation mixing for 2 h at 4 °C.

9. Place the tube on the magnet for 2 min at room temperature.

10. Take out the supernatant (= unbound primary antibody) and store at 4 °C (see Note 15).

11. Wash magnetic beads with 1 ml PBS, pH 7.4 by slowly pipetting up and down.

12. Place the tube on the magnet for 2 min at room temperature.

13. Remove from the magnet and pipette off the supernatant and discard.

14. Repeat steps 11–13 three times in total. Take a 5 μl aliquot from the third washing step (= “beads – DMP”; magnetic beads before DMP treatment, see Fig. 3) for SDS-PAGE analysis.

15. Place the tube on the magnet for 2 min, pipette off the supernatant and discard. Remove the tube from the magnet and add 1 ml of 0.2 M triethanolamine, pH 8.2 to the beads. Mix well by cautiously pipetting up and down for 2 min.

16. Repeat step 15 three times in total.

17. Place the tube on the magnet for 2 min, pipette off the supernatant and resuspend beads in 1 ml of cross-link solution II. Prepare the solution immediately before adding to the beads and check pH value.

18. Incubate the tube with rotational mixing for 30 min at 20 °C.

19. Place the tube on the magnet for 2 min and discard supernatant.

20. Remove from the magnet and stop the cross-linking reaction by resuspending the beads in 1 ml of 50 mM Tris, pH 7.5 and incubate for 15 min at RT by rotational mixing.

[Fig. 3: Western blot; lanes Ab, untreated beads, beads −DMP, beads +DMP (Xlink Ig to beads) and W1, W3, E1, E2 (mock elution); molecular mass markers in kDa.]

Fig. 3 Coupling antibody to magnetic beads for immunoaffinity purification. Anti-PM H+ ATPase beads were prepared by covalently binding a monoclonal Ab (antibody) against the PM H+ ATPase to magnetic IgG beads with dimethyl pimelimidate (DMP). A mock elution of the beads was performed to check the coupling efficiency and to remove uncoupled material. Ab = monoclonal antibody against the PM H+ ATPase (heavy and light chains), 0.5 μl; −DMP = beads incubated with anti-PM H+ ATPase Ab for 2 h at 4 °C but not covalently coupled with DMP, 5 μl; +DMP = Ab covalently coupled to beads with DMP, 5 μl; 10 min denaturing at 95 °C; W1–W3 = wash in lysis buffer, 20 μl; E1/E2 = eluates 1 and 2, 20 μl; 20 min denaturing at RT. Membrane was incubated with secondary antibody only [4]


21. Place the tube on the magnet for 2 min and discard supernatant.

22. Wash beads with 1 ml Wash Buffer by cautiously pipetting up-and-down.

23. Repeat steps 21–22 three times in total. Take a 5 μl aliquot from the third washing step (= “beads + DMP”; magnetic beads after DMP treatment, see Fig. 3) for SDS-PAGE analysis.

24. Resuspend the beads in 1 ml Storage Buffer and store the now activated beads at 4 °C.

3.3 Immunoprecipitation (Antigen Binding to Ig-Coated Beads)

Prior to elution of the captured target antigen or cross-linked antigen complexes, a mock elution using Lysis Buffer without any additional proteins has to be performed to eliminate coelution of impurities (see Fig. 3).

1. Place the tube with activated magnetic beads on the magnet for2 min, and pipette off and discard supernatant.

2. Wash beads with 1 ml PBS, pH 7.4 for 2 min by slowly pipetting up-and-down.

3. Place the tube with activated magnetic beads on the magnet for2 min, and pipette off and discard supernatant.

4. Add 1 ml Lysis Buffer and cautiously vortex beads for 2 min.

5. Repeat steps 3–4 three times in total. Take a 20 μl aliquot from each wash step (= W1–W3) for SDS-PAGE analysis (see Fig. 3).

6. Add 200 μl Elution Buffer to the beads, vortex slowly for 2 min and place the tube on the magnet for 2 min.

7. Pipette off supernatant, transfer into a fresh 1.5 ml microcentrifuge tube and immediately add 20 μl of saturated 1 M Tris solution. Take a 20 μl aliquot (E1) for SDS-PAGE.

8. Repeat steps 6–7 for second eluate (E2).

9. Wash beads with 1 ml PBS, pH 7.4 and add 1 ml Storage Buffer. Store beads at 4 °C or continue with binding of the target antigen.

10. To continue, place the tube again on the magnet for 2 min, pipette off the supernatant and discard.

11. Remove from the magnet, add 1 ml PBS, pH 7.4, and slowly pipette up-and-down to wash the beads.

12. Repeat steps 10–11 three times.

13. Thaw 1 ml of cross-linked cell lysate (see Subheading 3.1) on ice and add fresh protease inhibitors again (10 μM leupeptin, 1 μM pepstatin A, 1 mM phenylmethanesulfonyl fluoride (PMSF), 2 μM E-64). Mix well.

14. Incubate with slow rotation on a rotational shaker for 2 h at 4 °C.


15. Place the tube on the magnet for 2 min, pipette off the supernatant (= unbound protein) and transfer into a 1.5 ml microcentrifuge tube. Keep the tube on ice.

16. Remove the tube from the magnet, add 1 ml Lysis Buffer and carefully wash beads with captured target proteins for 2 min by cautiously vortexing.

17. Place the tube on the magnet for 2 min, and transfer supernatant into fresh 1.5 ml tubes (wash fraction, W1).

18. Repeat steps 16–17 three times in total. Keep the wash fractions on ice (W1–W3).

19. Remove tube from the magnet, and add 200 μl Elution Buffer to the beads. Vortex gently for 2 min.

20. Place the tube on the magnet for 2 min.

21. Transfer the supernatant into a fresh 1.5 ml microcentrifuge tube and immediately add 20 μl of saturated 1 M Tris solution to the elution fraction. Keep the tube on ice (= E1, eluate 1).

22. Repeat steps 19–21 for second eluate (= E2, eluate 2).

23. Store all samples at −20 °C until used for analysis via immunodetection and/or mass spectrometry.

24. Remove the tube from the magnet, add 1 ml PBS, pH 7.4, and wash beads by slowly pipetting up-and-down.

25. Place the tube on the magnet for 2 min, and pipette off the supernatant and discard.

26. Remove the tube from the magnet, and add 1 ml Storage Buffer and store beads at 4 °C.

3.4 In-Solution Digestion

In order to analyze the cross-linked samples by mass spectrometry, the covalent bonds of the protein complexes (= PFA cross-links) in the elution fractions have to be reversed.

1. Incubate the cross-linked samples (eluates E1 and/or E2, 200 μl) for 20 min at 95 °C.

2. Dry down the sample in a vacuum concentrator to complete dryness.

3. Dissolve the pellet in 50 μl UTU Buffer, and ultrasonicate for 10 min at room temperature (RT) to solubilize proteins.

4. Add 1 μl of reduction buffer (per 50 μg of total protein) for reduction of cysteine residues, and incubate for 30 min at RT.

5. Add 1 μl of alkylation buffer (per 50 μg of total protein) and incubate for 20 min at RT protected from light.

6. Add 0.5 μl endoproteinase Lys-C (per 50 μg of total protein, 1:100 ratio), and incubate for 3 h at RT.


7. Dilute with 4 volumes of 10 mM Tris–HCl, pH 8.0 (see Note 16).

8. Add 1 μl of trypsin solution per 50 μg of total protein (1:100 ratio) and incubate at 37 °C overnight.

9. After digestion, stop the reaction by acidifying the digest to 0.2% (v/v) TFA final concentration. Use a 10% (v/v) TFA stock solution.

10. Centrifuge samples at 10,600 × g for 5 min at RT to get rid of insoluble material.
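The reagent volumes in steps 4 to 9 scale linearly with the amount of protein in the sample. The following Python sketch tabulates them for a given protein amount. It is only an illustration: the function name digestion_plan is hypothetical, and the calculation assumes a 50 μl sample in UTU Buffer and neglects the small volumes of the added reagents when computing the dilution and the TFA volume.

```python
def digestion_plan(protein_ug: float, sample_ul: float = 50.0) -> dict:
    """Reagent volumes for in-solution digestion, scaled per 50 µg protein."""
    units = protein_ug / 50.0           # protocol volumes are given per 50 µg
    tris_ul = 4.0 * sample_ul           # dilution with 4 volumes of Tris-HCl
    diluted_ul = sample_ul + tris_ul    # reagent additions are neglected here
    return {
        "reduction_buffer_ul": 1.0 * units,
        "alkylation_buffer_ul": 1.0 * units,
        "lys_c_ul": 0.5 * units,
        "trypsin_ul": 1.0 * units,
        "tris_dilution_ul": tris_ul,
        # 6 M urea in UTU Buffer diluted fivefold
        "urea_after_dilution_M": 6.0 * sample_ul / diluted_ul,
        # volume of 10% TFA stock giving 0.2% final: 0.2 * V / (10 - 0.2)
        "tfa_10_percent_ul": 0.2 * diluted_ul / (10.0 - 0.2),
    }


plan = digestion_plan(protein_ug=100.0)
for key, value in plan.items():
    print(f"{key}: {value:.2f}")
```

For a 50 μl sample, the fivefold dilution brings the urea concentration from 6 M down to 1.2 M before trypsin is added.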

3.5 Peptide Desalting and Purification

1. Insert the 200 μl C18-StageTip into a spin adapter (see Fig. 2) and place it into a fresh 2 ml microcentrifuge tube where the lid has been removed.

2. Place into a centrifuge and load 50 μl of Buffer B into the C18 tip for equilibration. Centrifuge at 1000 × g for 2 min at RT.

3. Load 100 μl of Buffer A into the C18 tip, and centrifuge at 1000 × g for 3 min at RT.

4. Repeat step 3.

5. Remove the tube from the centrifuge, discard the waste from the microcentrifuge tube, and insert the tube with the C18 tip back into the centrifuge.

6. Load the tryptic peptide mixture into the C18 tip. Centrifuge at 2650 × g for 5 min at RT. If necessary, repeat this step to force the sample solution through.

7. Load 100 μl of Buffer A into the C18 tip for washing, and centrifuge at 1000 × g for 3 min at RT.

8. Repeat step 7.

9. Remove the tube from the centrifuge and discard the waste from the microcentrifuge tube. Place the C18 tip with the spin adapter into a fresh 1.5 ml microcentrifuge tube, and insert the tube into the centrifuge.

10. Load 20 μl Buffer B into the C18 tip to elute the bound tryptic peptides from the C18 matrix. Centrifuge at 1000 × g for 2 min at RT.

11. Repeat step 10.

12. Spin down the final volume of the eluate (= 40 μl) to dryness in a centrifugal vacuum concentrator (~45 min).

13. Store samples at −80 °C until used for LC-MS/MS.


4 Notes

1. The PFA stock solution is prepared by heating the PFA in culture medium (e.g., lily pollen germination medium Med B) to ~60 °C for 30 min and by adding 1–2 NaOH pellets. The solution is then cooled to room temperature and filtered through a 0.22 μm filter. Wear gloves and work in a chemical fume hood. We noticed precipitates in mannitol-containing media!

2. As an alternative to breaking and homogenizing frozen cells with mortar and pestle, cell lysis can be performed by other methods (e.g., homogenization with a Teflon Potter-Elvehjem–type homogenizer). It is very important to keep the homogenate cold; therefore, working on ice and as fast as possible is crucial. To avoid heating of protein solutions, ultrasonication is not recommended.

3. Leupeptin is an inhibitor of serine and cysteine proteases. To prepare a 10 mM stock solution add 1050 μl ultrapure water to 5 mg leupeptin. Prepare 100–250 μl aliquots and store at −20 °C.

4. Pepstatin A is a highly selective inhibitor of acid proteases (aspartyl peptidases). To prepare a 10 mM stock solution add 729 μl DMSO (dimethyl sulfoxide) to 5 mg pepstatin A. Aliquot and store at −20 °C.

5. PMSF inhibits serine and cysteine proteases. Weigh 1.742 g and make up to 10 ml with DMSO. Prepare 1 ml aliquots and store at −20 °C. Work cautiously due to the high toxicity of PMSF.

6. E-64 is a highly selective cysteine protease inhibitor which, unlike other cysteine protease inhibitors, does not inhibit serine proteases. To prepare a 1 mM stock solution add 2798 μl ultrapure water to 1 mg E-64. Store at −20 °C.

7. Dynabeads (Dynabeads™ M-280 Sheep anti-mouse IgG, Thermo Fisher Scientific, Waltham, MA, USA) are uniform, superparamagnetic, polystyrene beads with affinity purified sheep anti-mouse IgG covalently attached onto the bead surface. The beads will bind specific antigens via a mouse primary antibody. If only a polyclonal primary antibody is available to catch your protein of interest, the use of sheep anti-rabbit IgG beads is necessary. These beads efficiently bind rabbit IgGs of all subclasses.

8. Always prepare the cross-link solution freshly. DMP is stored at −20 °C; therefore, allow it to equilibrate to room temperature before use. Check the pH value of the cross-link solution, as the pH should not be less than 8.0! If necessary, adjust pH with 3 M NaOH.


9. The 200 μl C18 StageTips are commercially available (e.g., catalog number: 87784, Pierce C18 Tips, Thermo Fisher Scientific, Waltham, MA, USA) or can be manufactured in-house according to [6].

10. For 0.0625% (w/v) PFA add 62.5 μl of a 10% (w/v) PFA stock solution, for 0.125% (w/v) add 125 μl of a 10% (w/v) PFA stock solution, for 0.25% (w/v) add 250 μl of a 10% (w/v) PFA stock solution, for 0.5% (w/v) add 500 μl of a 10% (w/v) PFA stock solution, and for 1% (w/v) add 1 ml of a 10% (w/v) PFA stock solution to 10 ml pollen cultures.

11. Add 1 ml of a 1.25 M glycine stock solution to 10 ml pollen cultures.

12. Add the precooled lysis buffer drop-by-drop. If the buffer is added too fast, residual liquid nitrogen starts to boil and the frozen fine cell powder is easily ejected from the tube.

13. The required amount of beads depends on the concentration of the used primary antibody solution.

14. Calculate the amount of antibody and beads according to the manufacturer’s recommendations. The required amount of antibody depends on the concentration, specificity and affinity of the used primary antibody and the amount and specificity of your target antigen in the sample. A highly specific (monoclonal or polyclonal) antibody against the desired target protein, for which putative interaction partners should be identified, in an adequate amount is a prerequisite for immunoaffinity purification.

15. Because the titer (i.e., the concentration) of most antibody solutions is quite high and the used amount of primary antibody is mostly in excess, the recovered antibody solution can still be used for some further immunodetection experiments.

16. This dilution step is necessary in order to reduce the high salt concentration before the following trypsin digestion, which favors optimal trypsin activity.
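The solvent volumes given in Notes 3, 4, and 6 follow from volume = mass / (molecular weight × concentration). The Python sketch below repeats this calculation. The molecular weights used (leupeptin hemisulfate 475.6 g/mol, pepstatin A 685.9 g/mol, E-64 357.41 g/mol, PMSF 174.2 g/mol) are assumptions that are consistent with the volumes stated above; check them against the product data sheet of the salt form actually used.

```python
def solvent_volume_ul(mass_mg: float, mw_g_per_mol: float, conc_mM: float) -> float:
    """Solvent volume (µl) needed to dissolve mass_mg at conc_mM."""
    millimoles = mass_mg / mw_g_per_mol  # mg / (g/mol) = mmol
    litres = millimoles / conc_mM        # mmol / (mmol/l) = l
    return litres * 1e6


# Assumed molecular weights (g/mol); not stated in the protocol itself
MW = {"leupeptin": 475.6, "pepstatin A": 685.9, "E-64": 357.41, "PMSF": 174.2}

leupeptin_ul = solvent_volume_ul(5.0, MW["leupeptin"], 10.0)    # 10 mM from 5 mg
pepstatin_ul = solvent_volume_ul(5.0, MW["pepstatin A"], 10.0)  # 10 mM from 5 mg
e64_ul = solvent_volume_ul(1.0, MW["E-64"], 1.0)                # 1 mM from 1 mg

# PMSF is weighed for a fixed volume instead: 1 M in 10 ml
pmsf_g = 1.0 * 0.010 * MW["PMSF"]

print(f"leupeptin: {leupeptin_ul:.0f} µl water")
print(f"pepstatin A: {pepstatin_ul:.0f} µl DMSO")
print(f"E-64: {e64_ul:.0f} µl water")
print(f"PMSF: {pmsf_g:.3f} g in 10 ml DMSO")
```

With these assumed molecular weights the leupeptin volume comes out about 1 μl above the rounded 1050 μl stated in Note 3, which is within normal pipetting tolerance.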

Acknowledgments

The research work is partially financed by the Austrian Research Fund (FWF, P29626).

References

1. Miernyk JA, Thelen JJ (2008) Biochemical approaches for discovering protein-protein interactions. Plant J 53:597–609

2. Engelsberger WR, Erban A, Kopka J et al (2006) Metabolic labeling of plant cell cultures with K15NO3 as a tool for quantitative analysis of proteins and metabolites. Plant Methods 2:14

3. Vasilescu J, Guo X, Kast J (2004) Identification of the protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics 4:3845–3854

4. Pertl-Obermeyer H, Schulze WX, Obermeyer G (2014) In vivo cross-linking combined with mass spectrometry analysis reveals receptor-like kinases and Ca2+ signalling proteins as putative interaction partners of pollen plasma membrane H+ ATPases. J Proteomics 108:17–29

5. Pertl-Obermeyer H, Obermeyer G (2013) Pollen-cultivation and preparation for proteome studies. Methods Mol Biol 1072:435–449

6. Rappsilber J, Mann M, Ishihama Y (2007) Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat Protoc 2:1896–1906

7. Villalba JM, Lutzelschwab M, Serrano R (1991) Immunolocalisation of the plasma membrane H+ ATPase in maize coleoptiles and enclosed leaves. Planta 185:458–461


Chapter 21

Proteome Analysis of 14-3-3 Targets in Tomato Fruit Tissues

Yongming Luo, Yu Lu, Junji Yamaguchi, and Takeo Sato

Abstract

Tomato is a major crop plant and an important constituent of the human diet. Exclusive features such as bearing fleshy fruits and undergoing a phase transition from partially photosynthetic to fully heterotrophic metabolism make tomato fruit a model system for fruit development studies. Although the tomato genome has been completely sequenced, functional proteomics studies are still at their starting stage. Proteomics technologies, especially the combination of multiple approaches, provide a very powerful tool to accurately identify functional proteins and investigate certain sets of proteins in more detail. The direct binding of plant 14-3-3 proteins to their multiple target proteins modulates the functions of the latter, suggesting that these 14-3-3 proteins are directly involved in various physiological pathways. This chapter outlines methods for the identification of 14-3-3 protein complexes in tomato fruit tissues. These methods include detailed protocols for protein extraction, coimmunoprecipitation, SDS-PAGE, SYPRO Ruby staining, in-gel trypsin digestion, and LC-MS/MS analysis for 14-3-3 interactomics.

Key words Tomato fruit, 14-3-3 protein, Interactome

1 Introduction

The tomato (Solanum lycopersicum L.) is the most economically important crop in commercial production worldwide. In contrast to model plants that bear dry fruits, tomato plants constitute the ideal model system for studying the development of fleshy fruit [1, 2]. Although the full genome of this species was sequenced in 2012 [3], few studies have investigated protein–protein interactions in tomato plants.

The 14-3-3 proteins are highly conserved, versatile regulatory proteins that participate in a diverse range of cellular processes through direct binding to their target proteins. In plants, 14-3-3 target proteins are distributed across a large number of physiological pathways, including those involved in primary metabolism, hormone signaling, cell growth and division, and response to multiple environmental stresses (Fig. 1). In Arabidopsis, proteomic analyses have identified over 300 14-3-3 target proteins, reflecting the complicated regulatory network of 14-3-3 proteins in plant

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139,https://doi.org/10.1007/978-1-0716-0528-8_21, © Springer Science+Business Media, LLC, part of Springer Nature 2020


physiology [4, 5]. Arabidopsis 14-3-3 proteins serve as carbon–nitrogen (C/N) nutrient balance regulators, with the abundance of these 14-3-3 proteins being essential for the plant response to C/N nutrient status [6–9]. Because carbon and nitrogen nutrients also profoundly affect fruit development and quality, the identification of 14-3-3 target proteins in tomato fruit tissue is critical for understanding the ability of 14-3-3 proteins to modulate primary metabolism in tomato fruit tissue.

This chapter describes a protocol for the identification of 14-3-3 target proteins. It includes coimmunoprecipitation of these target proteins from transgenic tomato fruit expressing FLAG tag-fused 14-3-3 protein, followed by LC-MS/MS analysis of the precipitated proteins. With this method, 106 proteins were identified, including key enzymes involved in carbon metabolism and photosynthesis [10], suggesting the need for further research on the functions of 14-3-3 proteins in tomato fruit.

2 Materials

2.1 Plant Material

The tomato (Solanum lycopersicum L.) cultivar Micro-Tom expressing FLAG tag-fused 14-3-3λ driven by a CaMV 35S promoter (FLAG-14-3-3λ) and wild-type Solanum lycopersicum L. were employed [10] (see Note 1). Tomato plants are grown in 9 cm pots containing peat moss-based Jiffy-Mix soil (Sakata Seed, Japan), supplemented with HYPONeX nutrient mixture (N:P:K = 6:10:5) (HYPONeX JAPAN, Japan), at 25 °C under a 16:8 h light–dark cycle.

2.2 Immunoprecipitation of 14-3-3 Complexes from Tomato Fruit Tissue

1. Protein extraction buffer: 100 mM Tris–HCl, pH 7.5, 10% glycerol (v/v), 150 mM NaCl, 5 mM MgCl2, 1 mM EDTA, 0.5% Triton X-100 (v/v), 10 μM MG132, 1× complete protease inhibitor mixture (Roche, Switzerland), 1× PhosSTOP phosphatase inhibitor cocktail (Roche, Switzerland).

Fig. 1 Illustration of plant 14-3-3 protein functions in various physiological pathways. The 14-3-3 proteins preferentially form saddle-shaped homo- or heterodimers, in which a broad central groove is able to bind to target proteins. These 14-3-3 proteins bind to the target protein mainly through the phosphorylated 14-3-3 binding motifs in the latter

290 Yongming Luo et al.

2. Wash buffer: 100 mM Tris–HCl, pH 7.5, 10% glycerol (v/v), 150 mM NaCl, 5 mM MgCl2, 1 mM EDTA, 0.5% Triton X-100 (v/v).

3. Bradford reagent (Bio-Rad, USA).

4. Immunoprecipitation beads: anti-FLAG M2 antibody conjugated to magnetic beads (Sigma-Aldrich, USA).

5. Elution buffer: 3× FLAG peptides (5 mg/mL) (Sigma-Aldrich, USA).

2.3 SDS-PAGE and SYPRO Ruby Stain

1. 2× SDS sample buffer: 4% SDS, 100 mM dithiothreitol, 125 mM Tris–HCl, pH 6.8, 20% glycerol, 0.02% bromophenol blue (BPB).

2. Unstained protein molecular weight markers (Bio-Rad, USA).

3. Ready-made 10% polyacrylamide (w/v) gels (Perfect NT Gel; DRC, Japan).

4. SDS-PAGE running buffer: 25 mM Tris, 192 mM glycine, 0.1% SDS (w/v).

5. SYPRO Ruby Protein Gel Stain (Invitrogen, USA).

6. Wash solution: 10% methanol (v/v), 7% acetic acid (v/v).

2.4 In-Gel Trypsin Digestion

1. Dehydration solution: 100% acetonitrile.

2. Wash buffer: 50 mM ammonium bicarbonate.

3. Reduction solution: 10 mM dithiothreitol, 50 mM ammonium bicarbonate.

4. Alkylation solution: 55 mM iodoacetamide, 50 mM ammonium bicarbonate.

5. Trypsin solution: Sequence grade modified trypsin (Promega, USA), 10 μg/mL in 50 mM ammonium bicarbonate.

6. Extraction solution I: 50% acetonitrile (v/v), 5% formic acid (v/v).

7. Extraction solution II: 70% acetonitrile (v/v), 5% formic acid (v/v).

8. Dissolving solution: 5% acetonitrile (v/v), 0.1% formic acid (v/v).

9. Hitech tube crystal (M-50001; HITECH, Japan).

10. Low peptide-adsorption micropipette tips (BM2051; BM Bio, Japan).

11. Ultrafree-MC centrifugal filters (pore size 0.45 μm; Millipore, USA).

2.5 LC-MS/MS Analysis

1. Reverse-phase (RP) chromatography buffer A (RPB-A): 0.1% formic acid (v/v).

14-3-3 Interactome in Tomato Fruit 291

2. Reverse-phase (RP) chromatography buffer B (RPB-B): 0.1% formic acid (v/v) in acetonitrile.

3. Reverse-phase (RP) chromatography column: nano-HPLC capillary column (NTCC-360/75-3-125; Nikkyo Technos, Japan).

4. EASY-nLC 1000 liquid chromatograph (ThermoFisher Scientific, USA).

5. Orbitrap Elite mass spectrometer (ThermoFisher Scientific, USA).

3 Methods

3.1 Immunoprecipitation of 14-3-3 Complex from Tomato Fruit Tissue

1. Harvest the expanding green tomato fruit from WT and FLAG-14-3-3λ plants and grind the fruits in liquid nitrogen using a mortar and pestle [10] (see Note 2).

2. Equilibrate 20 μL of anti-FLAG M2 magnetic beads for each sample by washing at least twice with extraction buffer to remove the stock buffer.

3. Add 3 μL of protein extraction buffer supplemented with inhibitors per mg fresh weight of tomato fruit; transfer the suspension to a prechilled 1.5 mL tube and place on ice.

4. Centrifuge twice at 20,000 × g for 5 min at 4 °C and remove the insoluble residues.

5. Determine the protein concentration in the supernatant by the Bradford method and transfer lysates containing 3 mg of protein to new 1.5 mL tubes. Add sufficient extraction buffer to equalize the lysate volume between the negative control (WT) and FLAG-14-3-3λ.

6. Add 20 μL of anti-FLAG M2 magnetic beads to the protein extracts.

7. Incubate samples at 4 °C for 1 h on a rotary shaker.

8. Spin down the magnetic beads at 8000 × g for 30 s at 4 °C, place the tube in the appropriate magnetic separator, and discard the supernatant. Then wash the beads three times with wash buffer and add 140 μL of wash buffer.

9. Add 60 μL of 3× FLAG peptide solution and incubate at 4 °C for 1 h on a rotary shaker to elute the FLAG-14-3-3λ proteins from the beads.

10. Centrifuge at 8000 × g for 30 s at 4 °C, place the tube in the appropriate magnetic separator, and transfer the supernatant into a new 1.5 mL tube.


11. Centrifuge at 20,000 × g for 1 min to remove any remaining beads completely and transfer 180 μL of the supernatant into a new 1.5 mL tube.

12. Add 720 μL of prechilled 100% acetone, vortex for 10 s, and incubate at −30 °C for over 2 h.

13. Centrifuge at 13,000 × g for 10 min at 4 °C and discard the supernatant.

14. Dry the precipitate in a vacuum centrifuge for 15 min at 37 °C.

15. Add 36 μL of 1× SDS sample buffer and incubate the samples at 37 °C for 1 h.

16. Centrifuge at 20,000 × g for 5 min to spin down any remaining particles before loading onto SDS-PAGE.
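The volumes used in steps 3, 5, and 12 follow from simple proportions, which can be sketched as below. This is only an illustrative helper, not part of the published protocol: the function names and the example fresh weight and Bradford concentrations are hypothetical, while the 3 μL/mg buffer ratio, the 3 mg protein input, and the 4:1 acetone-to-sample ratio (720 μL to 180 μL) are taken from the steps above.

```python
def extraction_buffer_ul(fresh_weight_mg, ul_per_mg=3.0):
    """Step 3: volume of extraction buffer for a given fresh weight."""
    return fresh_weight_mg * ul_per_mg


def lysate_volume_ul(conc_mg_per_ml, target_mg=3.0):
    """Step 5: lysate volume (in microliters) that contains target_mg of protein."""
    if conc_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return target_mg * 1000.0 / conc_mg_per_ml


def buffer_to_equalize_ul(lysate_volumes_ul):
    """Step 5: buffer to add per sample so all samples reach the largest volume."""
    final_volume = max(lysate_volumes_ul.values())
    return {name: final_volume - vol for name, vol in lysate_volumes_ul.items()}


def acetone_volume_ul(sample_ul, ratio=4.0):
    """Step 12: four volumes of acetone per volume of eluate."""
    return sample_ul * ratio


if __name__ == "__main__":
    # Hypothetical Bradford results (mg/mL) for the two genotypes.
    concentrations = {"WT": 5.0, "FLAG-14-3-3": 6.0}
    volumes = {name: lysate_volume_ul(c) for name, c in concentrations.items()}
    print(extraction_buffer_ul(500))       # buffer for 500 mg of fruit tissue
    print(volumes)                         # lysate volume holding 3 mg protein
    print(buffer_to_equalize_ul(volumes))  # buffer to top up each sample
    print(acetone_volume_ul(180))          # acetone for 180 uL of eluate
```

Equalizing to the largest lysate volume keeps the protein input fixed at 3 mg per sample while matching the bead-to-volume ratio between the negative control and the FLAG-14-3-3λ sample.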

3.2 SDS-PAGE and SYPRO Ruby Staining

1. Load immunoprecipitated samples and molecular weight markers onto the lanes of a 10% SDS–polyacrylamide gel and start the electrophoresis.

2. Stop the electrophoresis when the BPB dye front has migrated one third of the way down the gel.

3. Transfer the gel carefully into a container containing SYPRO Ruby staining solution. Cover the container with aluminum foil and gently shake the gel for 3 h at room temperature (see Note 3).

4. Discard the SYPRO Ruby staining solution and wash the gel twice, each time for 30 min, with wash solution.

5. Discard the wash solution and wash the gel with milli-Q water for 10 min.

6. Observe the protein bands with a Safe Imager™ 2.0 Blue-Light Transilluminator (see Note 4).

3.3 In-Gel Trypsin Digestion

1. Excise the entire gel of each lane with a clean scalpel and chop the excised gels into pieces of approximately 1 × 1 mm; transfer these pieces into a 1.5 mL microcentrifuge tube (see Notes 5–7).

2. Add 400 μL of 100% acetonitrile to each tube and shake for 15 min at room temperature (see Note 6).

3. Remove the acetonitrile and dry the gel in a vacuum centrifuge (approx. 15 min at 37 °C).

4. Add 400 μL of reduction solution to each tube and allow the dry gel pieces to soak by shaking at 56 °C for 45 min.

5. Cool the tube to room temperature, remove the reduction solution, and add 400 μL of alkylation solution; shake for 30 min in the dark.


6. Discard the alkylation solution and wash the gel samples with 400 μL of wash buffer for 10 min.

7. Discard the wash buffer and dehydrate again with 400 μL of dehydration solution for 10 min. Repeat the dehydration step.

8. Dry the gel in a vacuum centrifuge for 15 min at 37 °C.

9. Add a sufficient amount of trypsin solution to each dried gel sample, making sure the trypsin solution covers all the gel pieces, and incubate at 37 °C for 16 h (see Note 8).

10. Add 100 μL of extraction solution I and shake for 30 min at room temperature. Transfer the supernatant to a new tube (see Note 9). Repeat this procedure with extraction solution II.

11. Dry the solution in a vacuum centrifuge (approx. 1.5 h at 37 °C).

12. Add 20 μL of dissolving solution to dissolve the dried peptides, and filter each sample with an Ultrafree-MC centrifugal filter to avoid contamination with gel pieces.

3.4 LC-MS/MS Analysis and Protein Identification

1. Transfer the trypsin-digested peptide solution into an HPLC vial (11-19-102) (AMR, Japan) suitable for the autosampler (Accela AS) (ThermoFisher Scientific, USA).

2. Inject the samples into the HPLC apparatus; peptides are separated on an analytical nano-capillary column with an EASY-nLC 1000 liquid chromatograph system (ThermoFisher Scientific, USA).

3. Elute the peptides at a column flow rate of 300 nL/min by applying a three-step linear gradient: 0–55 min, 0–35% RPB-B; 55–60 min, 35–100% RPB-B; 60–68 min, 100% RPB-B.

4. Survey the full-scan spectra obtained with the Orbitrap mass analyzer (ThermoFisher Scientific, USA).

The ten most intense precursor ions, ranging from 300 to 1500 m/z, are scanned and measured in the mass spectrometer at 120,000 resolution at m/z 400. These ions are sequentially isolated and fragmented (collision-induced dissociation at 35 eV), with the corresponding fragment ions measured in the linear ion trap.

5. Search the TAIR10 (http://www.arabidopsis.org/index.jsp) and ITAG2.4 (http://solgenomics.net/) databases using the SEQUEST algorithm embedded in Proteome Discoverer 1.4 software (ThermoFisher Scientific, USA).

6. Use the following parameters for the searches (see Note 10):

(a) Precursor ion tolerance of 10 ppm,

(b) Product ion mass tolerance of 0.8 Da,


(c) Trypsin as the proteolytic enzyme, allowing up to two missed cleavages,

(d) Carbamidomethylation on cysteine as a fixed modification, and

(e) Oxidation of methionine as a variable modification.

7. Employ an automatic decoy database strategy to estimate the false discovery rate (FDR), and filter the results to retain only proteins identified at <1% FDR. Accept only those matched peptides with XCorr values for singly (z = 1), doubly (z = 2), and triply (z = 3) charged ions of ≥1.5, ≥2.0, and ≥2.5, respectively. Consider positively identified proteins to be putative interactors with tomato 14-3-3λ only when at least two of three replicates have SEQUEST scores >10 when compared with negative control WT plants.
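The acceptance criteria in steps 6 and 7 can be expressed as a few small predicates. The sketch below is a hypothetical illustration rather than the Proteome Discoverer implementation: the function names are invented, the XCorr values are treated as lower bounds, and the comparison with WT is interpreted, as an assumption, to mean that no WT replicate exceeds the same score cutoff.

```python
# Minimum XCorr per precursor charge state (assumed to be lower bounds).
MIN_XCORR = {1: 1.5, 2: 2.0, 3: 2.5}


def within_ppm(observed_mz, theoretical_mz, tol_ppm=10.0):
    """Precursor mass check: relative error in parts per million."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def passes_xcorr(charge, xcorr):
    """Accept a peptide match only for z = 1-3 with sufficient XCorr."""
    threshold = MIN_XCORR.get(charge)
    return threshold is not None and xcorr >= threshold


def is_putative_interactor(flag_scores, wt_scores, min_score=10.0,
                           min_replicates=2):
    """Replicate rule: enough FLAG replicates above the score cutoff.

    Scores are per replicate; None means the protein was not identified.
    Assumption: the protein must not pass the cutoff in any WT replicate.
    """
    def n_passing(scores):
        return sum(1 for s in scores if s is not None and s > min_score)

    return n_passing(flag_scores) >= min_replicates and n_passing(wt_scores) == 0


if __name__ == "__main__":
    print(within_ppm(500.004, 500.0))                            # about 8 ppm
    print(passes_xcorr(2, 2.1), passes_xcorr(3, 2.4))
    print(is_putative_interactor([12.0, 15.5, None], [None, None, None]))
```

Keeping the thresholds in one dictionary makes it easy to adjust them, which Note 10 explicitly allows depending on the purpose of the experiment.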

4 Notes

1. FLAG-tag is a polypeptide epitope tag, having the amino acid sequence DYKDDDDK, that can be added to a protein for immunoprecipitation.

2. Expanding green fruits were harvested in this experiment. The culture period and fruit condition depend on the purpose of the experiment [1].

3. Staining incubation time can be shortened with a microwave oven, as described in the manufacturer’s protocol (Invitrogen, USA).

4. The successful immunoprecipitation of FLAG-14-3-3λ can be confirmed by western blotting with anti-FLAG antibody as well as by SYPRO Ruby staining.

5. Gels stained with CBB or silver stain should be destained.

6. Our protocol uses approximately 100 μL of gel pieces per tube, along with 400 μL of solution.

7. Tubes should be tolerant to organic solvents such as acetonitrile, with low adsorption of proteins and peptides.

8. The solution should cover all gel pieces. After 30 min, the samples should be checked, and more trypsin solution added if the liquid has been absorbed by the gel pieces.

9. Beginning with this step, pipette tips with low peptide adsorption should be used.

10. These parameters can be adjusted, depending on the purpose of the experiment.


Acknowledgments

This work was supported by a Grant-in-Aid for Scientific Research to T.S. [Nos. 15K18819 and 17K08190], by Grants-in-Aid for Scientific Research to J.Y. [Nos. 15H0116705, 262921888, and 18H02162] from the Japan Society for the Promotion of Science (JSPS), and by a grant from The NOASTEC foundation, Hokkaido University Young Scientist Support Program to T.S. Lu was supported by a JSPS research fellowship (2016-2018) and a JSPS Postdoctoral Fellowship for Research in Japan (2018-2020). Luo was supported by a Support Grant for Self-Supported International Graduate Student (Hokkaido University Faculty of Science: 2017-2018). This work was also supported by a Cooperative Research Grant of the Plant Transgenic Design Initiative, Gene Research Center, University of Tsukuba.

References

1. Tohge T, Alseekh S, Fernie AR (2014) On the regulation and function of secondary metabolism during fruit development and ripening. J Exp Bot 65:4599–4611

2. Shikata M, Hoshikawa K, Ariizumi T et al (2016) TOMATOMA update: phenotypic and metabolite information in the Micro-Tom mutant resource. Plant Cell Physiol 57:e11

3. The Tomato Genome Consortium (2012) The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485:635–641

4. Oecking C, Jaspert N (2009) Plant 14-3-3 proteins catch up with their mammalian orthologs. Curr Opin Plant Biol 12:760–765

5. Chang IF, Curran A, Woolsey R et al (2009) Proteomic profiling of tandem affinity purified 14-3-3 protein complexes in Arabidopsis thaliana. Proteomics 9:2967–2985

6. Sato T, Maekawa S, Yasuda S et al (2009) CNI1/ATL31, a RING-type ubiquitin ligase that functions in the carbon/nitrogen response for growth phase transition in Arabidopsis seedlings. Plant J 60:852–864

7. Sato T, Maekawa S, Yasuda S et al (2011) Identification of 14-3-3 proteins as a target of ATL31 ubiquitin ligase, a regulator of the C/N response in Arabidopsis. Plant J 68:137–146

8. Yasuda S, Sato T, Maekawa S et al (2014) Phosphorylation of Arabidopsis ubiquitin ligase ATL31 is critical for plant carbon/nitrogen nutrient balance response and controls the stability of 14-3-3 proteins. J Biol Chem 289:15179–15193

9. Yasuda S, Aoyama S, Hasegawa Y et al (2017) Arabidopsis CBL-interacting protein kinases regulate carbon/nitrogen-nutrient response by phosphorylating ubiquitin ligase ATL31. Mol Plant 10:605–618

10. Lu Y, Yasuda S, Li X et al (2016) Characterization of ubiquitin ligase SlATL31 and proteomic analysis of 14-3-3 targets in tomato fruit tissue (Solanum lycopersicum L.). J Proteomics 143:254–264


Chapter 22

The Use of Proteomics in Search of Allele-Specific Proteins in (Allo)polyploid Crops

Sebastien Christian Carpentier

Abstract

Most organisms are diploid, meaning they have only two copies of each chromosome (one set inherited from each parent). Polyploid organisms have more than two paired (homologous) sets of chromosomes. Many plant species are polyploid. Polyploid species cope better with stresses thanks to the redundancy in chromosome copy number, which gives them greater flexibility in gene expression. Allopolyploid species are polyploids that contain an alternative set of chromosomes originating from a cross between two (or more) species. Gene variants unique to a preferential phenotype are the most probable candidate markers controlling the observed phenotype. Organ- or tissue-specific silencing or overexpression of one parental homeolog is quite common. It is very challenging to find those tissue-specific gene variants. High-throughput proteomics is a successful method to discover them. This chapter proposes two possible workflows depending on the available resources and the knowledge of the species. An example is given for an AAB hybrid and an ABB hybrid. Allele-specific gene responses are picked up in this workflow as gene loci displaying genotype-specific differential expression that often have single amino acid polymorphisms. If the resources are sufficient, a genotype-specific mRNAseq database is recommended, in which a link is made to the allele-specific transcription levels. If the resources are limited, allele-specific proteins can be detected through genotype-specific peptides that are identified against existing genomic libraries of the parents.

Key words Polyploidy, Homeolog, LC MSMS, Allelic variance

1 Introduction

Agricultural productivity results from the plant genotype–environment interaction and farm management (G × E × M). For a sustainable agriculture, the right genotype is grown in the right environment. The flexibility of cultivars toward the environment is determined by genetic diversity (G), and exploiting it requires a deeper understanding of how that diversity shapes the phenotype. An absolute priority in the current breeding programs is the identification of sources of natural variation with the potential to raise the tolerance toward unfavorable (a)biotic constraints while minimizing the yield penalty. Across the vascular plant genera, a considerable proportion is polyploid.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139,https://doi.org/10.1007/978-1-0716-0528-8_22, © Springer Science+Business Media, LLC, part of Springer Nature 2020


Many of our crops in particular are polyploid with complex heterologous genomes. Barker and coworkers performed a study to assess the polyploidy among the plant species and found that 24% of the investigated species were polyploid [1]. Polyploids and especially allopolyploids likely have an evolutionary advantage [1]. Genome reorganization and the associated greater flexibility in gene expression allow coping with immediate stresses which may be beyond the tolerances of their progenitors/parents. Gene loss or silencing, neo- and/or subfunctionalization, intergenomic transfer, allele dominance/codominance, differences in transcription/translation efficiency, and posttranslational modifications exemplify how the genome, transcriptome, and proteome are regulated following polyploidization events. A(n) (allo)polyploid genome is thus a patchwork of gene variants enabling many genotype × environment interactions (Fig. 1). Gene variants unique to a preferential phenotype are the most probable candidate markers controlling the observed phenotype. GWAS (genome-wide association studies) have been successfully applied in many crops, but this is challenging for a complex multigene trait. Moreover, organ- or tissue-specific silencing or overexpression of one parental homeolog is quite common [2–5]. The link between gene or protein abundance, SNPs (single nucleotide polymorphisms) or single amino acid polymorphisms (SAAPs), and a preferential phenotype is more stringent with knowledge of the tissue, time point, and environmental condition of the gene/protein activity [6]. RNA sequencing effectively combines gene expression quantification with gene sequencing and

Fig. 1 Example of an allele-specific protein in an AAB hybrid. The A genome is depicted in blue, the B genome in red. The sequence of this protein, which is encoded on chromosome 11, is depicted and shows two SAAPs indicated in yellow

298 Sebastien Christian Carpentier

allows for SNP calling [5]. Yet most current read mapping software has difficulty processing complex (polyploid) genomes. The read mapping efficiency to the reference genome might be biased, and the degree of heterozygosity greatly increases computational effort, hampering quantitative results per genome [7]. Consequently, the RNA reads are not separated and traced back to their (sub)genome. Algorithms like PolyCat and HANDS2 classify reads according to their genome of origin but depend heavily on the presence and the quality of reference genomes [7, 8]. Since not all cultivars of interest carry the reference sequence, mapping efficiency biases can still occur when one reference genome is more closely related to a constituent genome than the other [7]. Proteomics allows for picking up and quantifying the actual differential products without previous genome knowledge [3, 9]. 2DE-based proteomics is very useful to identify allele-specific protein isoforms in complex protein families [10]. This is exemplified in banana for the HSP70 protein family [4]. However, 2DE is no longer the tool of choice in high-throughput differential proteomics because of the labor and time involved in producing 2DE gels. Via LC-MSMS, multiple allele-specific products (specific tryptic peptides) can be quantified in a high-throughput manner without prior knowledge [11]. Proteomes clearly provide insights into the consequences of genomic merging and reorganization [3, 5, 12–15].

2 Methods

2.1 Workflow 1: No Resources Available for mRNA Seq (Fig. 2)

When no resources are available to construct a proper mRNA seq library, genotype-specific proteins can be identified by analyzing a sufficient number of biological replicates and confirming the allele specificity by identifying the genotypic peptides against a relevant library. An example is worked out for an AAA and an AAB genotype of banana/plantain. The former is the well-known dessert banana; the latter gives a fruit that is higher in starch content and is consumed cooked as a starch source. How can plantain (AAB)-specific proteins be identified? The plantain genome is not known. What is available is a reference genome of a double haploid AA variety [16] that has been resequenced [17] and a reference B genome for which the genomic reads were aligned to the AA reference genome to determine the chromosome locus [18]. The first plantain proteome was published in 2018 [13].

1. Download the publicly available FASTA files of both parent genotypes AA and BB (e.g., in the case of banana https://banana-genome-hub.southgreen.fr/).

2. Download the common proteomic contaminants (keratins and trypsin) (https://www.uniprot.org/).

Identification and Quantification of Homeologs via LC MSMS 299

3. Merge the AA and BB libraries and the contaminants, and preferably remove identical proteins via CD-HIT [19] (see Note 1).

4. Characterize the peptides via label-free shotgun proteomics (see Note 2) and search all runs against your constructed library and the decoy database.

5. Quantify the peptides based on their abundance (Fig. 3) (see Note 3).

Fig. 2 Workflow for the identification of allele-specific proteins when no resources are available to generate mRNA seq libraries. Example of an AAB hybrid


6. Construct a volcano plot (see Note 4).

7. Select candidate peptides (see Tables 1 and 2 “Volcano plot selected”).

8. Confirm the peptide identification at a minimum confidence level of 95% (see Tables 1 and 2 “Max probability”) (see Note 5).

9. BLAST each peptide against your library (see Tables 1 and 2 “Match in chromosome”). For stand-alone and API BLAST, see https://blast.ncbi.nlm.nih.gov/Blast.cgi.

10. Eliminate all peptides with a 100% match to more than one gene locus (see Tables 1 and 2 “Locus specific”).

11. Confirm by spectral counting that the spectrum is unique for the genotype (see Tables 1 and 2 “Allele counting”).

12. Send all the amino acid substitutions in the protein to Panther [20] (http://pantherdb.org/tools/csnpScoreForm.jsp) to judge the impact.

13. Send all the differential proteins to Panther [20] for a GOenrichment.
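Steps 3, 9, and 10 of this workflow can be illustrated with a toy library. The sketch below makes several simplifying assumptions: removal of exactly identical sequences stands in for CD-HIT clustering, exact substring matching stands in for a BLAST search, and the protein names and flanking residues are hypothetical (only the peptide sequences are taken from Table 1).

```python
def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def merge_libraries(*libraries):
    """Merge libraries, keeping one entry per distinct sequence (step 3)."""
    merged, seen = {}, set()
    for library in libraries:
        for name, seq in library.items():
            if seq in seen:
                continue  # identical protein already present
            seen.add(seq)
            merged[name] = seq
    return merged


def matching_loci(peptide, library):
    """Entries containing the peptide with 100% identity (step 9)."""
    return sorted(name for name, seq in library.items() if peptide in seq)


def is_locus_specific(peptide, library):
    """Keep only peptides that match exactly one locus (step 10)."""
    return len(matching_loci(peptide, library)) == 1


# Hypothetical toy entries for the AA and BB parent libraries.
AA_FASTA = """>A_chr01_toy
MKSCLDACRGDPNVDFTFCSQSLRK
>A_chr01_toy_duplicate
MKSCLDACRGDPNVDFTFCSQSLRK
"""
BB_FASTA = """>B_chr01_toy
MKSCLDACRADPNVDFAFCSQSLRK
"""

if __name__ == "__main__":
    library = merge_libraries(parse_fasta(AA_FASTA), parse_fasta(BB_FASTA))
    for peptide in ("ADPNVDFAFCSQSLR", "GDPNVDFTFCSQSLR", "SCLDACR"):
        print(peptide, matching_loci(peptide, library),
              is_locus_specific(peptide, library))
```

In this toy case the two polymorphic peptides each map to a single entry, whereas SCLDACR is shared between the A and B entries and would be eliminated in step 10.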

2.2 Workflow 2: Resources Are Available to Generate mRNA Seq Libraries (Fig. 4)

When sufficient resources are available to construct a proper mRNA seq library, genotype-specific proteins can be identified by analyzing a sufficient number of biological replicates and confirming the allele specificity by identifying the genotypic peptides against the genotype's own mRNA library. An example is worked out for an AAA and an ABB genotype of banana. How can B genome–derived proteins in the ABB genotype be identified?

Fig. 3 Label-free quantification of the peptides. The visualization of the peptide abundance in 3D (retention time, intensity, m/z) helps to confirm the absence of the B allele–specific peptide in the AAA genotype (left picture). Even if the peptide was not selected for MSMS in a run, the abundance pattern confirms whether the peptide is present or absent


1. Check the quality of your reads via FastQC.

2. Map the reads against the reference genome [21] (or assemble them de novo if no reference genome is available) and generate Binary Alignment Map (BAM) files.

3. Concatenate the BAM files and apply variant calling with VCFtools and SAMtools [22, 23].

4. Generate a FASTA file from each library.

5. Download the common proteomic contaminants and merge them with the mRNA-generated FASTA.

6. Characterize the peptides via label-free shotgun proteomics (see Note 2) and search all runs against your constructed library and the decoy database.

7. Quantify the peptides based on their abundance (see Note 3).

Table 1 Example of protein inference between allelic isoforms

| Identified peptide (a) | Max probability (%) | Match in chromosome 1B | Match in chromosome 1A | Locus unique | Volcano plot selected | Spectral counts (b), AAA | Spectral counts (b), AAB | Allele specific |
|---|---|---|---|---|---|---|---|---|
| ADPNVDFAFCSQSLR | 100 | 1 | | 1 | 1 | 0 | 1 | B |
| GDPNVDFTFCSQSLR | 100 | | 1 | 1 | 0 | 10 | 1 | A |
| GLAIISLK | 99 | 1 | 1 | 1 (c) | 1 | 0 | 33 | B |
| SAAGSDGADLHGLAIISLK | 100 | | 1 | 1 | 0 | 10 | 1 | A |
| SAAGSDGADLR | 100 | 1 | | 1 | 1 | 0 | 2 | B |
| SCLDACR | 100 | 1 | 1 | 0 | 0 | 46 | 80 | 0 |
| RSIVGETCNQIAR | 100 | | 1 | 1 | 0 | 27 | 4 | A |
| SIVGETCNQIAR | 100 | | 1 | 1 | 0 | 40 | 25 | A |
| SIVSETCNQIAR | 100 | 1 | | 1 | 1 | 0 | 30 | B |
| VVYADAASELR | 100 | 1 | | 1 | 0 | 0 | 3 | 0 (d) |

(a) Single amino acid polymorphism is indicated in bold. The SAAP G48A in GDPNVDFTFCSQSLR has been evaluated by PANTHER as probably damaging
(b) Total number of spectral counts for 21 AAA samples and 30 AAB
(c) The peptide GLAIISLK should not be present in the AAA sample since in the AA genome the sequence is (H)GLAIISLK(L) while in the BB genome the sequence is (R)GLAIISLK(L)
(d) A BLAST suggests that VVYADAASELR is also allele specific since the AA genome sequence would code for EVYADAASDLR, but the low spectral counts and the high variability in abundance prevent us from confirming this peptide
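The reasoning behind the GLAIISLK footnote of Table 1 can be checked in a few lines: a peptide is only released by trypsin if the residue preceding it is K or R (or it sits at the protein N-terminus). The sketch below is a simplified illustration that ignores the proline exception and missed cleavages; the sequence fragments are assembled from the peptides in Table 1, and the leading R as well as the function name are hypothetical.

```python
def is_fully_tryptic(peptide, protein):
    """True if an occurrence of peptide in protein has tryptic boundaries.

    Simplified rule: trypsin cleaves C-terminal to K or R.
    """
    start = protein.find(peptide)
    while start != -1:
        end = start + len(peptide)
        n_term_ok = start == 0 or protein[start - 1] in "KR"
        c_term_ok = end == len(protein) or peptide[-1] in "KR"
        if n_term_ok and c_term_ok:
            return True
        start = protein.find(peptide, start + 1)
    return False


# Fragments built from Table 1 peptides; the leading R is an assumed
# upstream cleavage site.
A_FRAGMENT = "RSAAGSDGADLHGLAIISLKL"  # AA genome: (H)GLAIISLK(L)
B_FRAGMENT = "RSAAGSDGADLRGLAIISLKL"  # BB genome: (R)GLAIISLK(L)

if __name__ == "__main__":
    for peptide in ("GLAIISLK", "SAAGSDGADLR", "SAAGSDGADLHGLAIISLK"):
        print(peptide,
              "A:", is_fully_tryptic(peptide, A_FRAGMENT),
              "B:", is_fully_tryptic(peptide, B_FRAGMENT))
```

Although GLAIISLK occurs as a substring in both fragments, it is only a tryptic product of the B sequence, which is why it behaves as a B allele–specific peptide despite matching both chromosomes.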


Table 2 Example of protein inference and distinction between paralogous and allelic isoforms

| Peptide (a) | Maximum probability (%) | Allele specific |
|---|---|---|
| APGGCNNPCTVFK | 100 | 0 |
| CAADINGQCPAALK | 100 | B |
| CAADINGQCPAALKAPGGCNNPCTVFK | 100 | B |
| CSYTVWAAAVPGGGR | 100 | 0 |
| DDQTSTFTCPGGANYR | 100 | B |
| DDQTSTFTCPGGTNYR | 100 | A |
| NCPDAYSYPK | 100 | 0 |
| NCPDAYSYPKDDQTSTFTCPGGANYR | 100 | B |
| NNCPDAYSYPKDDATSTFTCPGGTNYR | 100 | 0 |
| QLNQGQSWTINVNAGTTGGR | 100 | 0 |
| RNCPDAYSYPK | 100 | 0 |
| RNCPDAYSYPKDDQTSTFTCPGGANYR | 100 | B |
| TDQYCCNSGSCGPTDYSR | 100 | B |
| TGCSFDGSGR | 100 | 0 |
| TGCSFDGSGRGR | 100 | 0 |

The printed table also reports, for each peptide, the matches in chromosome (column labels 6B, 6A, 7B, 6A, 9A, 4B), the “Locus unique” and “Volcano plot selected” flags, and the spectral counts (b) for AAA and AAB.

(a) Single amino acid polymorphism is indicated in bold. The SAAP T218A in DDQTSTFTCPGGTNYR has been evaluated by PANTHER as probably benign. A BLAST additionally confirms that CAADINGQCPAALK is also allele specific since the AA genome sequence would code for CAADINGQCPGALK. A BLAST confirms that TDQYCCNSGSCGPTDYSR is also allele specific, since the AA genome sequence would code for TDQYCCNSGSCSPTDYSR
(b) Total number of spectral counts for 21 AAA samples and 30 AAB


8. Construct a volcano plot (see Note 4).

9. Select candidate peptides.

10. Confirm the peptide identification at a minimum confidence level of 95% (see Note 5).

11. BLAST the peptide against all your libraries.

Fig. 4 Workflow for the identification of allele-specific proteins when resources are available to generate mRNA seq libraries. Example of an ABB hybrid


12. Eliminate all peptides that have a 100% match with more than one gene locus and/or library (see Note 6).

13. Upload your BAM files and the reference genome to Integrative Genomics Viewer (IGV) to visualize the read count per allele [24].

14. Confirm in IGV, based on read counting, that the peptide is unique for the genotype, confirm the different alleles, and quantify the allele expression (see Note 7).

15. Send all the amino acid substitutions in the protein to Panther [20] (http://pantherdb.org/tools/csnpScoreForm.jsp) to judge the impact.

16. Send all the differential proteins to Panther [20] for a GOenrichment.
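The allele quantification from read counts in step 14 amounts to a proportion per allele. The sketch below is a minimal illustration with hypothetical read counts and function names; the comparison against gene dosage (two B copies out of three in an ABB genotype) is an added assumption for context and is not part of the workflow as described.

```python
def allele_fractions(read_counts):
    """Fraction of reads supporting each allele at one variant position."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    return {allele: count / total for allele, count in read_counts.items()}


def dosage_fractions(genome_formula):
    """Expected fractions if expression simply followed gene dosage.

    Assumption for illustration only: 'ABB' gives A = 1/3 and B = 2/3.
    """
    return {g: genome_formula.count(g) / len(genome_formula)
            for g in sorted(set(genome_formula))}


if __name__ == "__main__":
    # Hypothetical read counts at one SNP read off an alignment viewer.
    observed = allele_fractions({"A": 30, "B": 90})
    expected = dosage_fractions("ABB")
    for allele in sorted(observed):
        print(allele, round(observed[allele], 3), round(expected[allele], 3))
```

A marked deviation of the observed fraction from the dosage-based expectation would point to homeolog-biased expression at the transcript level, which can then be compared with the peptide-level result.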

3 Notes

1. It is preferred to eliminate duplicate proteins because the number of proteins has an influence on the search statistics and the decoy database.

2. It is preferred to quantify the peptides via label-free MSMS because it is cheaper and because in this way the peptides are quantified at the MS level. Consequently, interesting allele-specific peptides can be flagged even if they were of lower abundance and not selected for MSMS. Alternatively, a faster hybrid MS can be applied that can perform MS and MSMS in parallel.

3. Alternatively, spectral counting can be applied, though it is less powerful (see Tables 1 and 2, Fig. 3).

4. Volcano plots are commonly used to display the results of omics experiments. A volcano plot is a type of scatterplot that shows statistical significance (P value) versus magnitude of change (fold change). Volcano plots can be made in many software packages such as Microsoft Excel or RStudio. Since we are looking here to discover proteins that are unique for a certain allele, the peptide should not be detected in the other genotype not containing the allele. Theoretically the fold difference is then infinite. However, in practice a false positive (low) abundance value can occur due to mismatching between LC runs. To avoid too many false negative results, the cutoff value for the fold change can be set lower than infinite. The minimum recommended cutoff value for the ANOVA P value corrected for multiple testing (e.g., Benjamini–Hochberg correction [25]) is 0.01.

5. Some of the statistically interesting peptides (unique for a genotype class) might not result in a hit if the exact peptide sequence is not present in the database. If the spectrum is of good quality, an error-tolerant or cross-species search, de novo sequencing, or spectral clustering [26] can be used to identify the peptide.

6. In this BLAST search you can find paralogous genes within the same library/genome and homologous genes in the other mRNA libraries.

7. Realize that the quantification of the differential peptides in proteomics is only possible via absolute quantification with a peptide standard.
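The volcano plot selection described in Note 4 combines a fold-change cutoff with a P value corrected for multiple testing. The sketch below shows one way to compute both in plain Python. It is an illustration under stated assumptions: the peptide names, abundances, and raw P values are hypothetical, the pseudocount of 1 and the log2 fold-change cutoff of 2 are arbitrary choices made to keep the fold change finite, and only the 0.01 significance cutoff and the Benjamini–Hochberg correction come from the note.

```python
import math


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2(B/A) with a pseudocount so absent peptides stay finite."""
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def select_candidates(peptides, alpha=0.01, min_abs_log2fc=2.0):
    """Names of peptides passing both volcano plot cutoffs.

    peptides: list of (name, mean_abundance_a, mean_abundance_b, raw_p).
    """
    adjusted = benjamini_hochberg([p for _, _, _, p in peptides])
    selected = []
    for (name, mean_a, mean_b, _), q in zip(peptides, adjusted):
        if q <= alpha and abs(log2_fold_change(mean_a, mean_b)) >= min_abs_log2fc:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical mean abundances in genotype A and genotype B.
    toy = [
        ("pep_B_specific", 0.0, 1000.0, 0.0001),
        ("pep_shared", 500.0, 550.0, 0.4),
        ("pep_small_shift", 100.0, 130.0, 0.001),
    ]
    print(select_candidates(toy))
```

With these toy values only the peptide absent from genotype A passes both cutoffs; the small but significant shift is rejected by the fold-change filter, mirroring the aim of finding peptides unique to one allele.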

Acknowledgments

Jelle van Wezemael, Nadia Campos, Farhana Bhuiyan, and Kusay Arat are gratefully acknowledged for technical assistance.

References

1. Barker MS, Arrigo N, Baniaga AE et al (2016) On the relative abundance of autopolyploids and allopolyploids. New Phytol 210:391–398

2. Adams KL, Cronn R, Percifield R, Wendel JF et al (2003) Genes duplicated by polyploidy show unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc Natl Acad Sci U S A 100:4649–4654

3. Carpentier SC, Panis B, Renaut J et al (2011) The use of 2D-electrophoresis and de novo sequencing to characterize inter- and intra-cultivar protein polymorphisms in an allopolyploid crop. Phytochemistry 72:1243–1250

4. Vanhove A-C, Vermaelen W, Swennen R et al (2015) A look behind the screens: characterization of the HSP70 family during osmotic stress in a non-model crop. J Proteomics 119:10–20

5. Wesemael J, Hueber Y, Kissel E et al (2018) Homeolog expression analysis in an allotriploid non-model crop via integration of transcriptomics and proteomics. Sci Rep 8:1353

6. Zivy M, Wienkoop S, Renaut J et al (2015) The quest for tolerant varieties: the importance of integrating “omics” techniques to phenotyping. Front Plant Sci 6:448

7. Page JT, Gingle AR, Udall JA (2013) PolyCat: a resource for genome categorization of sequencing reads from allopolyploid organisms. G3 (Bethesda) 3:517–525

8. Khan A, Belfield EJ, Harberd NP (2016) HANDS2: accurate assignment of homoeallelic base-identity in allopolyploids despite missing data. Sci Rep 6:29234

9. Samyn B, Sergeant K, Carpentier S et al (2007) Functional proteome analysis of the banana plant (Musa spp.) using de novo sequence analysis of derivatized peptides. J Proteome Res 6:70–80

10. Carpentier SC (2016) 2-D PAGE map analysis. Springer, Berlin, pp 215–235

11. Carpentier SC, America T (2014) Plant proteomics. Springer, Cham, pp 333–346

12. Soltis DE, Misra BB, Shan S et al (2016) Polyploidy and the proteome. Biochim Biophys Acta 1864:896–907

13. Campos NA, Swennen R, Carpentier SC (2018) The plantain proteome, a focus on allele specific proteins obtained from plantain fruits. Proteomics 18:1700227

14. Koh J, Chen S, Zhu N et al (2012) Comparative proteomics of the recently and recurrently formed natural allopolyploid Tragopogon mirus (Asteraceae) and its parents. New Phytol 196:292–305

15. Hu G, Koh J, Yoo M-J et al (2014) Proteomics profiling of fiber development and domestication in upland cotton (Gossypium hirsutum L.). Planta 240:1237–1251

16. D’hont A, Denoeud F, Aury J-M et al (2012) The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488:213

17. Martin G, Baurens F-C, Droc G et al (2016) Improvement of the banana “Musa acuminata” reference sequence using NGS data and semi-automated bioinformatics methods. BMC Genomics 17:243


18. Davey MW, Gudimella R, Harikrishna JA et al (2013) A draft Musa balbisiana genome sequence for molecular genetics in polyploid, inter- and intra-specific Musa hybrids. BMC Genomics 14:683

19. Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22:1658–1659

20. Thomas PD, Campbell MJ, Kejariwal A et al (2003) PANTHER: a library of protein families and subfamilies indexed by function. Genome Res 13:2129–2141

21. Dobin A, Davis CA, Schlesinger F et al (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21

22. Li H, Handsaker B, Wysoker A et al (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079

23. Li H (2011) A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics 27:2987–2993

24. Thorvaldsdottir H, Robinson JT, Mesirov JP (2012) Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14:178–192

25. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc B 57:289–300

26. Johansson P, Alm R, Emanuelsson C et al (2006) SPECLUST: a web tool for clustering of mass spectra. J Proteome Res 5(4):785–792


Chapter 23

Methods for Optimization of Protein Extraction and Proteogenomic Mapping in Sweet Potato

Thualfeqar Al-Mohanna, Norbert T. Bokros, Nagib Ahsan, George V. Popescu, and Sorina C. Popescu

Abstract

The chemical and genomic complexity of crop plants poses significant challenges for the characterization of their proteomes. This chapter provides specific methods for the extraction and identification of proteins from sweet potato, and a proteogenomic method for the subsequent mapping of peptides onto the haplotype-derived sweet potato genome assembly. We outline two basic methods for extracting proteins expressed in root and leaf tissues for label-free quantitative proteomics (a phenol-based procedure and a polyethylene glycol (PEG) 4000-based fractionation method) and discuss strategies for organ-specific protein extraction and increased recovery of low-abundance proteins. Next, we describe computational methods for improved proteome annotation of sweet potato based on aggregated genomics and transcriptomics resources available in our own and public databases. Lastly, we describe an easily customizable proteogenomics approach for mapping sweet potato peptides back to their genome location and exemplify its use in improving genome annotations using a mass spectrometry data set.

Key words Sweet potato proteomics, Phenol protein extraction, Polyethylene glycol protein extraction, Proteogenomics analysis

1 Introduction

Sweet potato (Ipomoea batatas (L.) Lam.) is an important global commercial crop from the morning glory family (Convolvulaceae) [1]. Worldwide, sweet potato is the sixth most important food crop, following rice, wheat, potato, maize, and cassava. The sweet potato has a complex genome, being hexaploid (2n = 6x = 90) and highly polymorphic. As a consequence, sweet potato genome sequencing, assembly, and annotation have progressed slowly and with difficulty over the past 10 years. At present, only a sparse collection of genomic resources and annotations exists for sweet potato; this includes a haplotype-resolved assembly of the I. batatas genome [2], assembly and gene annotations of progenitor genomes (I. trifida

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139,https://doi.org/10.1007/978-1-0716-0528-8_23, © Springer Science+Business Media, LLC, part of Springer Nature 2020


and I. triloba) [3, 4], the proteome annotations of Ipomoea nil [5], and transcriptomics datasets available in public databases [6, 7]. Using a new technique of haplotype-resolved genome assembly, Yang et al. [2] sequenced and assembled the hexaploid I. batatas genome de novo, without the guidance of wild related diploid genomes. The study also examined the probable evolutionary history of the cultivated sweet potato: the genome was found to contain two B1 and four B2 component genomes (B1B1B2B2B2B2) and was proposed to have resulted from an initial cross between a tetraploid ancestor and a diploid progenitor, followed by a whole-genome duplication event [2].

Current extraction methodologies and technologies enable the analysis of almost complete proteomes in humans and animals [8]. In plant systems, although discovery and comparative proteomics approaches have accelerated the pace of breakthroughs in experimental and crop plants, significant challenges remain. The depth to which proteomes are explored and analyzed, and the means of enhancing the confidence of protein identification, continue to be important issues in this field [8, 9]. Consequently, approaches for protein extraction and separation from tissues are constantly evaluated for performance on parameters such as protein extraction and detection, accurate quantification, post-extraction artifacts [10], and amenability to combinatorial use or multiplexing to improve the extraction of low-abundance proteins and increase throughput [11]. Beyond these generic challenges, further hurdles in obtaining protein preparations of sufficient quality and quantity for mass spectrometry have been described for sweet potato, such as the low protein content of storage roots and the high abundance of secondary metabolites in leaves.

Given the rapid advances in genomics and proteomics technologies, proteogenomic methods are being developed to take advantage of available genomics resources for resolving proteomes as well as to improve annotations of complex genomes [12, 13]. A proteogenomic analysis of Arabidopsis thaliana uncovered evidence that 13% of the Arabidopsis proteome was incomplete due to missing and incorrect gene models, including 778 new protein-coding genes and refined annotation of 695 gene models [14]. For complex plant genomes with high ploidy, such as sweet potato, proteogenomics tools can significantly improve genome annotations while providing a basis for improved peptide identification.

Recently, we performed a root and leaf LC-MS/MS analysis and identified 4321 nonredundant proteins from sweet potato [15]. In this chapter, we describe two methods for protein extraction and solubilization from sweet potato leaves and tuberous roots, and a proteogenomic method to identify the composition of leaf and root proteomes using a high-throughput label-free methodology.

310 Thualfeqar Al-Mohanna et al.

2 Materials

2.1 Materials for Phenol Procedure, Method 1 (M1)

1. Acetone.

2. Ammonium acetate.

3. β-Mercaptoethanol.

4. Disodium EDTA.

5. Ethanol.

6. 8-Hydroxyquinoline.

7. Hydrochloric acid.

8. Methanol.

9. Small ball bearing.

10. Sodium hydroxide (NaOH).

11. Sucrose.

12. Tris base and Tris–HCl.

2.1.1 Stock Solutions

1. Make 50 mL of 6 M sodium hydroxide:

(a) Add 12 g of NaOH to 40 mL of Milli-Q H2O and stir until completely dissolved.

(b) Bring the volume to 50 mL with Milli-Q H2O and store at RT.

2. Make 50 mL of 0.2 M EDTA, pH 8.0:

(a) Add 1.86 g of disodium EDTA to 30 mL of Milli-Q H2O and stir to dissolve.

(b) Add 1 mL of 6 M NaOH and monitor the pH until it reaches 8.0.

(c) Bring the volume to 50 mL with Milli-Q H2O and store at RT.

3. Make 500 mL of Extraction buffer, pH 8.0:

(a) Add 154 g of sucrose (final concentration 0.9 M) and 6.0 g of Tris base (final concentration 100 mM) to 350 mL of Milli-Q H2O.

(b) Add 25 mL of 0.2 M EDTA stock (final concentration 10 mM) and stir vigorously until completely dissolved.

(c) Adjust the pH to 8.0 with 6 M HCl.

(d) Bring the volume to 500 mL with Milli-Q H2O.

(e) Filter-sterilize and store at 4 °C.

4. Make 500 mL of Precipitation Solution:

(a) Add 3.85 g of ammonium acetate to 400 mL of methanol (purity >99%) and stir vigorously until completely dissolved.

Optimization of Protein Identification for Label-Free Quantitative. . . 311

(b) Bring the volume to 500 mL with high-purity methanol.

(c) Store at −20 °C.

5. Make 500 mL of buffer-saturated phenol, pH 8.0:

(a) Mix 500 mL of phenol and 500 mL of Tris–HCl, pH 8.0, in a dark glass bottle.

(b) Stir the mixture vigorously for 20 min and incubate it at 4 °C for 2 h; then remove the Tris buffer in the upper phase using a vacuum aspirator in the fume hood.

(c) Add 500 mL of fresh Tris–HCl buffer, pH 8.0.

(d) Repeat step b.

(e) Add 0.5 g of 8-hydroxyquinoline (final concentration 0.1%) and stir the solution vigorously for 20 min.

(f) Repeat steps c and d.

(g) Check that the pH is 8.0; if it is not, repeat step f.

(h) Store at 4 °C.

6. Ball bearings (BBs):

(a) Rinse the BBs with 3 volumes of buffer-saturated phenol, pH 8.0, for 10 min at RT with inversion, and decant the solution.

(b) Add 3 volumes of precipitation solution, mix by inversion for 5 min at RT, and decant the solution.

(c) Wash 3 times with 100% methanol and dry the BBs in an oven at 65 °C in a single layer on a small tray.

(d) Store them in sterile Falcon tubes.

(e) These ball bearings can be reused if they are not rusty.
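The reagent masses in the stock solutions above follow from mass = concentration × volume × molecular weight. The short Python sketch below checks that arithmetic. The molecular weights are assumptions taken from standard reference tables (the chapter does not state them), the EDTA salt is assumed to be the dihydrate, and the function names are purely illustrative.

```python
# Molecular weights in g/mol. These are assumptions taken from standard
# reference tables; the chapter does not state them.
MOLECULAR_WEIGHT = {
    "NaOH": 40.00,
    "sucrose": 342.30,
    "Tris base": 121.14,
    "ammonium acetate": 77.08,
    "disodium EDTA dihydrate": 372.24,
}


def grams_needed(reagent, molarity, volume_ml):
    """Mass in grams needed for `volume_ml` of solution at `molarity` (mol/L)."""
    return molarity * (volume_ml / 1000.0) * MOLECULAR_WEIGHT[reagent]


def molarity_from_mass(reagent, grams, volume_ml):
    """Molarity (mol/L) reached when `grams` are made up to `volume_ml`."""
    return grams / MOLECULAR_WEIGHT[reagent] / (volume_ml / 1000.0)


# Back-calculate the concentration implied by each listed mass and volume.
for reagent, grams, volume_ml in [
    ("NaOH", 12.0, 50),
    ("sucrose", 154.0, 500),
    ("Tris base", 6.0, 500),
    ("ammonium acetate", 3.85, 500),
    ("disodium EDTA dihydrate", 1.86, 50),
]:
    print(f"{reagent}: {molarity_from_mass(reagent, grams, volume_ml):.3f} M")
```

Under these assumed molecular weights, the NaOH, sucrose, and Tris base masses reproduce the stated concentrations, and the precipitation solution works out to about 0.1 M ammonium acetate. With the dihydrate assumed, 1.86 g of EDTA in 50 mL computes to roughly 0.1 M rather than 0.2 M, so that mass may be worth confirming against the salt form actually in use.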

2.2 Materials for Polyethylene Glycol (PEG) 4000 Procedure, Method 2 (M2)

1. β-Mercaptoethanol.

2. Ethanol.

3. Magnesium chloride.

4. NP-40 detergent.

5. Phenylmethylsulfonyl fluoride (PMSF).

6. Polyethylene glycol (PEG) 4000.

7. Sodium hydroxide (NaOH).

8. Sucrose.

9. Tris–HCl.


2.2.1 Stock Solutions

1. Make 50 mL of 6 M NaOH as described above.

2. Make 500 mL of extraction buffer I:

(a) Add 15.35 g of Tris–HCl and 18.50 g of Tris base (0.5 M Tris buffer) to 350 mL of Milli-Q H2O and stir until completely dissolved.

(b) Add 10 mL of NP-40 (2% v/v) to the Tris buffer.

(c) Dissolve 87.10 mg of PMSF (1 mM) in acetonitrile; then transfer it to the Tris buffer.

(d) Add 0.95 g of MgCl2 (20 mM) to the buffer and stir until completely dissolved.

(e) Add 10 mL of β-mercaptoethanol (2% v/v) to the buffer.

(f) Adjust the pH to 8.3 if necessary (the pH of the Tris buffer is expected to be 8.3).

(g) Bring the volume to 500 mL with Milli-Q H2O.

3. Make 250 mL of extraction buffer II:

(a) Add 7.68 g of Tris–HCl and 9.25 g of Tris base (0.5 M Tris buffer) to 150 mL of Milli-Q H2O and stir until completely dissolved.

(b) Add 5 mL of NP-40 (2% v/v) to the Tris buffer.

(c) Dissolve 43.55 mg of PMSF (1 mM) in acetonitrile; then transfer it to the Tris buffer.

(d) Add 0.48 g of MgCl2 (20 mM) to the buffer and stir until completely dissolved.

(e) Add 59.90 g of sucrose (0.7 M) to the buffer.

(f) Add 5 mL of β-mercaptoethanol (2% v/v) to the buffer.

(g) Adjust the pH to 8.3 if necessary.

(h) Bring the volume to 250 mL with Milli-Q H2O.

4. Make 200 mL of 50% PEG 4000:

(a) Add 100 g of PEG 4000 to 70 mL of Milli-Q H2O.

(b) Stir the solution vigorously until completely mixed.

(c) Bring the volume to 200 mL with Milli-Q H2O.
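Extraction buffers I and II are prepared by mixing the acid (Tris–HCl) and base forms of Tris, so both the total Tris concentration and the approximate pH follow from the two masses. The Python sketch below illustrates this with the Henderson–Hasselbalch relation. The molecular weights and the pKa of 8.07 (near 25 °C) are assumed reference values that are not given in the chapter, and the estimate ignores the other buffer components as well as the temperature dependence of the Tris pKa.

```python
import math

MW_TRIS_HCL = 157.60   # g/mol, assumed reference value
MW_TRIS_BASE = 121.14  # g/mol, assumed reference value
PKA_TRIS = 8.07        # assumed pKa near 25 °C; it shifts with temperature


def tris_buffer(g_tris_hcl, g_tris_base, volume_ml):
    """Return (total Tris molarity, estimated pH) for a Tris-HCl/Tris base mix."""
    acid_mol = g_tris_hcl / MW_TRIS_HCL
    base_mol = g_tris_base / MW_TRIS_BASE
    total_molarity = (acid_mol + base_mol) / (volume_ml / 1000.0)
    # Henderson-Hasselbalch: pH = pKa + log10([base] / [acid])
    ph = PKA_TRIS + math.log10(base_mol / acid_mol)
    return total_molarity, ph


for name, acid_g, base_g, vol in [
    ("extraction buffer I", 15.35, 18.50, 500),
    ("extraction buffer II", 7.68, 9.25, 250),
]:
    molarity, ph = tris_buffer(acid_g, base_g, vol)
    print(f"{name}: {molarity:.2f} M Tris, estimated pH {ph:.2f}")
```

With these assumptions both recipes come out at about 0.5 M Tris with an estimated pH close to 8.3, which is consistent with the instruction to adjust the pH only if necessary.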

3 Methods

3.1 Overview of the Protein Extraction and Optimization Methodology

In this section, we describe two protein extraction methods and the optimized protocols for sweet potato leaf and root tissue processing.

The pipeline for sample processing and protein extraction and solubilization is presented in Fig. 1a. Two distinct protocols were optimized for the extraction of proteins from lyophilized sweet potato tissue: a phenol-based method (M1) and a polyethylene glycol (PEG) 4000 fractionation-based method (M2) [16]. Briefly, in M1, proteins are extracted with a phenol extraction buffer, followed by precipitation with ammonium acetate in methanol, an acetone wash of the pellet, and resuspension in ethanol for storage. In M2, proteins are extracted with NP-40 extraction buffer, mixed with PEG 4000 to a final concentration of 15%, and precipitated with acetone (4 volumes of acetone for leaf proteins and 2 volumes for root proteins), followed by resuspension in ethanol for storage. Compared to organic solvents such as phenol, polyethylene glycols do not generally denature proteins [17, 18]. Examples of total protein preparations obtained from sweet potato roots using phenol and PEG 4000 are shown in representative 2D gels (Fig. 1b and c). In our hands, the phenol procedure extracted more protein from both storage roots and leaves than the PEG-based method. Cumulatively, however, the two methods extracted a larger diversity of protein classes than either method alone.

Fig. 1 Methodology for sweet potato protein extraction and identification. (a) Workflow describing the methodology for tissue processing, the phenol-based procedure (M1) and PEG fractionation (M2). (b and c) Representative two-dimensional SDS-PAGE of total protein preparations obtained from sweet potato roots. Total protein extracted using the phenol-based procedure is shown in (b), and total protein extracted using the PEG 4000 procedure is shown in (c). Molecular weights (MW, kDa) of proteins in the marker lanes are listed on the left. (d) Phenol-based procedure for the extraction of proteins from root and leaves showing the upper phase following phenol extraction

3.1.1 Tissue Collection

Sweet potato leaves and storage roots are used for protein extraction. Leaf samples harvested from plants in the vegetative stage yield more protein than leaves from mature plants. Storage roots are collected from fully developed, mature plants. Following collection, samples are immediately frozen in liquid nitrogen and stored at −80 °C until further processing. Frozen tissues can be lyophilized or ground to a powder in liquid nitrogen using a mortar and pestle.

3.1.2 Protein Extraction Using Phenol Procedure (M1) (See Notes 1–4)

In this section, we describe a phenol-based method for extraction and solubilization of proteins from sweet potato root and leaf tissue.

1. Add 4 μL of β-mercaptoethanol per mL of extraction buffer (final concentration 0.4% v/v).

2. Weigh 200 mg of powdered sample into a 2 mL tube and add 3 BBs.

3. Add 750 μL of buffer-saturated phenol, pH 8.0, and 750 μL of extraction buffer, and vortex vigorously for 1 h at RT until homogeneous.

4. Centrifuge the sample at 15,550 × g for 10 min at 4 °C. The results for root and leaf are shown in Fig. 1d.

5. Transfer approximately 330 μL of the upper phase to a 2 mL screw-top tube.

6. Add 5 volumes (1650 μL) of ice-cold precipitation solution (stored at −20 °C) and 3 BBs, and vortex the sample briefly to mix [5].

7. Store the sample at −20 °C overnight (16 h) to precipitate the proteins.

8. Invert the tube containing the precipitated proteins to mix.

9. Centrifuge at 3,312 × g for 10 min at RT.

10. Decant the supernatant and collect the pellets using a magnetic stand.

11. Add 1 mL of ice-cold precipitation solution and vortex vigorously.

12. Centrifuge the sample at 15,550 × g for 5 min at RT.

13. Repeat step 10.

14. Wash the pellets with 1 mL of ice-cold 80% acetone (stored at −20 °C) and vortex vigorously.


15. Repeat steps 12 and 13.

16. Wash the pellets with 1 mL of ice-cold 70% ethanol (stored at −20 °C) and vortex vigorously.

17. Repeat step 15.

18. Resuspend the pellets in 300 μL of 70% ethanol and store them at −20 °C.
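When several samples are processed in parallel, the per-sample volumes in the M1 steps above can be totalled to plan reagent preparation. The following Python sketch does this; the per-sample figures are read from the steps, while the function name and the optional `overage` margin are hypothetical additions for illustration.

```python
# Per-sample volumes (µL) read from the M1 steps. The precipitation solution
# is used in steps 6 and 11, and 70% ethanol in steps 16 and 18.
PER_SAMPLE_UL = {
    "buffer-saturated phenol": 750,
    "extraction buffer": 750,
    "precipitation solution": 1650 + 1000,
    "80% acetone": 1000,
    "70% ethanol": 1000 + 300,
}
BBS_PER_SAMPLE = 3 + 3  # ball bearings added in steps 2 and 6


def m1_requirements(n_samples, overage=0.0):
    """Total reagent volumes (mL) and ball-bearing count for `n_samples`.

    `overage` is a hypothetical safety margin, e.g. 0.1 adds 10% to volumes.
    """
    factor = n_samples * (1.0 + overage) / 1000.0  # µL per sample -> total mL
    volumes_ml = {name: ul * factor for name, ul in PER_SAMPLE_UL.items()}
    return volumes_ml, BBS_PER_SAMPLE * n_samples


volumes, bbs = m1_requirements(24)
for name, ml in volumes.items():
    print(f"{name}: {ml:.1f} mL")
print(f"ball bearings: {bbs}")
```

For 24 samples without any margin, the totals sit just under the quantities budgeted in Note 2 (20 mL phenol, 70 mL precipitation solution, 20 mL extraction buffer, and 150 ball bearings).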

3.1.3 Protein Extraction Using Polyethylene Glycol 4000 Procedure (M2)

In this section, we describe a PEG 4000-based method for extraction and solubilization of proteins from sweet potato root and leaf tissue.

1. Weigh 500 mg of powdered sample into a 15 mL tube.

2. Add 5 mL of extraction buffer I and vortex vigorously for 15 min at RT until homogeneous.

3. Centrifuge the sample at 828 × g for 15 min at 4 °C.

(a) Supernatant: keep it for step 4.

(b) Pellets: add 5 mL of extraction buffer II and repeat steps 2 and 3.

4. Filter the supernatant through a 2.0 μm filter to remove any impurities or insoluble residues.

5. Add 50% PEG 4000 to the supernatant to a final concentration of 15% and mix until homogeneous.

6. Incubate the sample on ice for 30 min.

7. Centrifuge the sample at 13,250 × g for 15 min at 4 °C and keep the supernatant for the next step [6, 7].

8. Add cold acetone (stored at −20 °C), 2 volumes for sweet potato root and 4 volumes for sweet potato leaves, to precipitate the proteins.

9. Incubate the sample at −20 °C for 30 min.

10. Centrifuge at 15,550 × g for 5 min [8].

11. Keep the pellets for the next step.

12. Wash the pellets with 1 mL of ice-cold 80% acetone (stored at −20 °C) and vortex vigorously.

13. Centrifuge the sample at 15,550 × g for 5 min at RT.

14. Decant the supernatant and collect the pellets.

15. Wash the pellets with 1 mL of ice-cold 70% ethanol (stored at −20 °C) and vortex vigorously.

16. Repeat steps 13 and 14.

17. Resuspend the pellets in 500 μL of 70% ethanol and store them at −20 °C.
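Steps 5 and 8 above both require a volume to be worked out from the supernatant recovered. A minimal Python sketch of that arithmetic is given below. It assumes that volumes are simply additive on mixing, which is an approximation for a viscous PEG stock, and the function names are illustrative rather than part of the protocol.

```python
def peg_stock_volume(sample_ml, stock_pct=50.0, final_pct=15.0):
    """Volume (mL) of PEG stock to add so that the mixture reaches `final_pct`.

    Solves stock_pct * v = final_pct * (sample_ml + v) for v, assuming that
    volumes are additive.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final_pct must be between 0 and stock_pct")
    return sample_ml * final_pct / (stock_pct - final_pct)


def acetone_volume(supernatant_ml, tissue):
    """Cold acetone volume (mL): 2 volumes for root, 4 volumes for leaf."""
    volumes = {"root": 2, "leaf": 4}
    return supernatant_ml * volumes[tissue]


supernatant_ml = 5.0
peg_ml = peg_stock_volume(supernatant_ml)
print(f"Add {peg_ml:.2f} mL of 50% PEG 4000 to {supernatant_ml:.1f} mL of supernatant")
print(f"Acetone for a leaf sample: {acetone_volume(supernatant_ml + peg_ml, 'leaf'):.1f} mL")
```

Because the acetone volume scales with the supernatant kept after the PEG step, it is worth measuring that volume rather than assuming full recovery.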


3.2 LC-MS/MS and Peptide Identification

Proteins in the purified sweet potato preparations were identified using LC-MS/MS. The LC-MS/MS analysis was performed as described in [19] and is outlined in Fig. 2. Briefly, the peptides were separated by a linear reversed-phase gradient on a C18 column. Survey full-scan MS spectra (m/z 400–1800) were acquired at a resolution of 17,500. MS/MS spectra were searched against the NCBI Ipomoea taxon (txid4119) protein dataset containing 58,282 proteins (NCBI; downloaded 2/12/2018) using MASCOT v. 2.4 (Matrix Science, Ltd., London, UK). The resulting peptide-spectrum matches (PSMs) were reduced to sets of unique PSMs by eliminating lower-scoring duplicates. To provide high-confidence data, the MASCOT results were filtered for a Mowse score >20. Peptide assignments from the database search were filtered down to a 1% FDR as previously described [19, 20].
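The two filtering ideas in this paragraph, keeping only the best-scoring PSM per peptide and cutting the list at a 1% target-decoy FDR, can be sketched in a few lines of Python. This is a simplified illustration and not the pipeline used in the chapter: it ranks PSMs by a single raw score instead of the logistic spectral score cited in the text, it estimates FDR as decoys divided by targets, and the dictionary keys (`peptide`, `score`, `is_decoy`) are hypothetical.

```python
def unique_best_psms(psms):
    """Keep only the highest-scoring PSM for each peptide sequence."""
    best = {}
    for psm in psms:
        kept = best.get(psm["peptide"])
        if kept is None or psm["score"] > kept["score"]:
            best[psm["peptide"]] = psm
    return list(best.values())


def filter_to_fdr(psms, max_fdr=0.01):
    """Return target PSMs from the longest score-ranked prefix within `max_fdr`.

    FDR for a prefix is estimated as (#decoys) / (#targets) in that prefix.
    """
    ranked = sorted(psms, key=lambda p: p["score"], reverse=True)
    targets = decoys = n_accept = 0
    for i, psm in enumerate(ranked, start=1):
        if psm["is_decoy"]:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            n_accept = i  # remember the deepest rank that still passes
    return [p for p in ranked[:n_accept] if not p["is_decoy"]]


# Toy data: 150 confident targets, then decoys interleaved with weaker targets.
psms = [{"peptide": f"PEP{i}", "score": 100 + i, "is_decoy": False} for i in range(150)]
psms += [
    {"peptide": "DECOY1", "score": 90, "is_decoy": True},
    {"peptide": "PEPX", "score": 80, "is_decoy": False},
    {"peptide": "DECOY2", "score": 70, "is_decoy": True},
    {"peptide": "PEPY", "score": 60, "is_decoy": False},
]
accepted = filter_to_fdr(unique_best_psms(psms))
print(len(accepted), "target PSMs accepted at 1% FDR")
```

A fixed score threshold such as the Mowse score >20 filter could be applied to the list before the FDR step with an ordinary list comprehension.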

3.3 Proteogenomic Analysis Workflow

Unique peptides from the LC-MS/MS analysis were queried using tBLASTn against either the haplotype-resolved sweet potato genome or the transcriptome available at http://public-genomes-ngs.molgen.mpg.de/SweetPotato/DOWNLOADS/. To implement a new classification of BLAST search results that allows both the mapping of peptides to existing genome annotations and the discovery of novel peptides, composition-based filtering was turned off, the word size was decreased to 2, and the e-value cutoff was raised to 1000 to increase the number of potential hits returned for each query peptide [22]; only the top 50 hits per query were generated to limit the size of the output files. tBLASTn results were then parsed to identify perfect and imperfect matches along the sweet potato genome and transcriptome. Perfect matches were defined as peptide hits that matched along the entire length of the query peptide with no mismatches or gaps. Imperfect matches were defined as peptide hits that mapped along >90% of the length of the query peptide with a sequence identity >80% (this category excluded peptides already classified as perfect matches). Multiple hits were allowed for individual peptides, as they were expected given the large number of estimated genes, the numerous predicted genome duplication events, and the high ploidy of sweet potato. The proteogenomics analysis workflow is outlined in Fig. 3.

Fig. 2 Workflow for peptide search and identification by LC-MS/MS. A concatenated database containing “target” and “decoy” sequences was employed to estimate the false discovery rate (FDR) [20]. Peptide assignments from the database search were filtered down to a 1% FDR by a logistic spectral score as previously described [20, 21]
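The perfect and imperfect match definitions given above translate directly into a small decision function. The Python sketch below is one possible reading of them; it assumes that coverage is computed from the aligned query positions reported by BLAST (qstart and qend), and the function name and the label for hits meeting neither definition are illustrative.

```python
def classify_hit(peptide_len, qstart, qend, pident, mismatches, gaps):
    """Classify one tBLASTn hit as 'perfect', 'imperfect', or 'unclassified'.

    Coverage is taken as the aligned part of the query divided by the
    full peptide length (an assumption about how coverage is measured).
    """
    aligned = qend - qstart + 1
    coverage = aligned / peptide_len
    if aligned == peptide_len and mismatches == 0 and gaps == 0:
        return "perfect"
    if coverage > 0.90 and pident > 80.0:
        return "imperfect"
    return "unclassified"


print(classify_hit(14, 1, 14, 100.0, 0, 0))  # full length, no mismatches or gaps
print(classify_hit(20, 1, 19, 94.7, 1, 0))   # 95% coverage with one mismatch
print(classify_hit(20, 1, 10, 100.0, 0, 0))  # only half of the peptide aligned
```

Checking the perfect case first means the imperfect category automatically excludes hits that are already perfect, as the definition requires.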

3.4 Proteogenomic Analysis Method (See Note 5)

Here we describe the computational methods needed to analyze the peptides identified in the proteomics screen. The method assumes that genome and transcriptome annotations are available for matching and validating identified peptides. The method allows for the discovery and improvement of genomics annotations in the target genome.

3.4.1 Data Input Processing for Proteogenomic Analysis

1. Obtain the MASCOT output files for the individual tissues, extraction methods, and replicates.

2. Parse the uniquely identified peptides from all tissues, extraction methods, and replicates, strip feature annotations from them, and pool them into a FASTA-format query file:

(a) Read MASCOT output files into R using the “read.xlsx” function in the “openxlsx” package.

(b) Parse unique peptides from the MASCOT files that were read in, using the R code: “all_peptides <- unique(c(file1$peptide_column, file2$peptide_column, filex$peptide_column))”.

Fig. 3 Workflow for the proteogenomic analysis of sweet potato


(c) Use the “stringr” package to remove nonalphanumeric characters using the “str_replace_all” function.

(d) Format the unique peptides from target datasets into FASTA format using the “write.fasta” function in the “seqinr” package.

Example FASTA formatted file for use in successive steps:

>SWPT_1

MGKGPGLYTDIGKK

>SWPT_2

KKKPVTVSYNGEDKPGFLKK

>SWPT_3

MTLGAGGSSVVVPRN

>SWPT_4

MASLLLPGGRT

. . .

Alternatively, MASCOT output files can be parsed using Excel.

3. Download a FASTA-formatted genome for a target species of interest:

(a) Obtain the sweet potato genome from http://public-genomes-ngs.molgen.mpg.de/SweetPotato/DOWNLOADS/.

(b) Download the sweet potato transcriptome datasets fromthe same site.

Other transcriptome annotations may be available at thefollowing sites:

GT4SP: http://sweetpotato.plantbiology.msu.edu/index.shtml

SRA: https://www.ncbi.nlm.nih.gov/sra.
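Step 2 above (pooling, cleaning, and de-duplicating peptides, then writing a FASTA query file) is described for R; the same logic is sketched here in Python for readers who prefer it. The sketch keeps letters only, which is slightly stricter than the nonalphanumeric filter named in the step, and the function names and the example input strings are illustrative. The `SWPT_` header prefix follows the example file shown in the step.

```python
import re


def clean_peptide(raw):
    """Strip everything except letters (e.g. modification marks) and upper-case."""
    return re.sub(r"[^A-Za-z]", "", raw).upper()


def unique_peptides(*peptide_lists):
    """Pool several peptide lists and keep each cleaned sequence once, in order."""
    seen = {}
    for peptides in peptide_lists:
        for raw in peptides:
            peptide = clean_peptide(raw)
            if peptide:
                seen.setdefault(peptide, None)  # dicts preserve insertion order
    return list(seen)


def to_fasta(peptides, prefix="SWPT_"):
    """Format peptides as FASTA text with sequential headers (SWPT_1, SWPT_2, ...)."""
    lines = []
    for i, peptide in enumerate(peptides, start=1):
        lines.append(f">{prefix}{i}")
        lines.append(peptide)
    return "\n".join(lines) + "\n"


# Two hypothetical peptide columns, e.g. from root and leaf result tables.
root = ["MGKGPGLYTDIGKK", "MASLLLPGGR*T"]
leaf = ["mgkgpglytdigkk", "MTLGAGGSSVVVPRN"]
fasta_text = to_fasta(unique_peptides(root, leaf))
print(fasta_text, end="")
```

The resulting text can be written to disk with an ordinary file write and used as the tBLASTn query file.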

3.4.2 BLAST Peptides Against Genome and Transcriptome Annotations

1. Build a nucleotide database from the genome using the “makeblastdb” function of BLAST 2.2.31.

2. Run tBLASTn to map peptides against the genome using the following parameters:

(a) db: nucleotide database created in step 1,

(b) query: peptide query file created in step 2 of Subheading 3.4.1,

(c) word_size: 2,

(d) outfmt: 6,

(e) comp_based_stats: F,

(f) evalue: 1000,

(g) max_target_seqs: 50.


3. Repeat steps 1 and 2 for the transcriptome of the target species.

4. Read the BLAST output files into R to generate the peptide list mapping:

(a) Append the original query peptide sequences in a new column for each hit using the base R “merge” function (or a join function such as “left_join” from the “dplyr” package).

(b) Calculate the length of each peptide for each hit using the “nchar” function (in R, “length” returns the number of elements, not the number of characters).

(c) Calculate the coverage of each query peptide along a hit within the genome as the length of the query hit divided by the length of the full peptide.
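The steps above can be sketched in Python as follows: the BLAST+ command lines are assembled as argument lists using the parameters listed in step 2, and the tabular output is parsed into records with the peptide sequence and coverage attached. The file and database names are placeholders, coverage is computed from the aligned query positions (one reading of step 4c), and the column order assumes the default fields of BLAST tabular output format 6.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
OUTFMT6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                  "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def makeblastdb_cmd(fasta, db_name):
    """Argument list for building a nucleotide BLAST database."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", db_name]


def tblastn_cmd(db_name, query_fasta, out_tsv):
    """Argument list for tBLASTn with the parameters listed in step 2."""
    return ["tblastn", "-db", db_name, "-query", query_fasta,
            "-word_size", "2", "-outfmt", "6", "-comp_based_stats", "F",
            "-evalue", "1000", "-max_target_seqs", "50", "-out", out_tsv]


def parse_hits(tsv_text, peptides):
    """Parse outfmt 6 text; `peptides` maps query id -> peptide sequence."""
    hits = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue
        hit = dict(zip(OUTFMT6_FIELDS, row))
        for key in ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        hit["peptide"] = peptides[hit["qseqid"]]
        hit["coverage"] = (hit["qend"] - hit["qstart"] + 1) / len(hit["peptide"])
        hits.append(hit)
    return hits


# Placeholder names; a made-up result line stands in for real tBLASTn output.
print(" ".join(makeblastdb_cmd("genome.fasta", "sweetpotato_genome")))
print(" ".join(tblastn_cmd("sweetpotato_genome", "peptides.fasta", "genome_hits.tsv")))
example = "SWPT_1\tchr1\t100.000\t14\t0\t0\t1\t14\t3042\t3001\t0.002\t30.1\n"
hits = parse_hits(example, {"SWPT_1": "MGKGPGLYTDIGKK"})
print(hits[0]["sseqid"], hits[0]["coverage"])
```

Where BLAST+ is installed, each argument list could be passed to `subprocess.run(cmd, check=True)`; the same two commands are then repeated with the transcriptome FASTA file.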

3.4.3 Classify Peptides and Generate New Annotations (See Note 6)

1. Design a peptide hit classification scheme based on perfect or imperfect matches within the genome or transcriptome and generate the corresponding category lists. Use the following definitions:

(a) Perfect match:

• Coverage: 100%.

• Mismatches: 0.

• Gaps: 0.

(b) Imperfect match:

• Peptide hit was not previously classified as a perfect match.

• Coverage: >90%.

• Sequence identity: >80%.

2. Generate lists of classified peptide hits as follows:

(a) Perfect match on the genome and perfect match on the transcriptome: these are the peptides that support the current genome sequence and gene model annotations.

(b) Perfect match on the genome and imperfect match on the transcriptome: this category includes short exon extensions.

(c) Perfect match on the genome and no match on the transcriptome: this category includes putative coding sequences in intergenic regions, large exon extensions, intron retentions, exon alternative ORFs, and 5′ and 3′ UTRs.

(d) Imperfect match on the genome and no match on the transcriptome: this category represents a combination of events, namely putative coding sequences in intergenic regions, exon extensions, intron retentions, exon alternative ORFs, and 5′ and 3′ UTRs, combined with genome assembly errors, single nucleotide polymorphisms (SNPs), and indels.

(e) Perfect match on the transcriptome and imperfect match on the genome: this includes SAAVs, indels, synonymous SNPs, and genome assembly errors.

(f) Perfect match on the transcriptome and no match on the genome: this includes alternative splice junctions, indels, and genome assembly errors.

(g) Imperfect match on the transcriptome and imperfect match on the genome: this includes SAAVs, indels, and combinations of events including synonymous SNPs, alternative splice junctions, genome assembly errors, and short exon extensions.

(h) Imperfect match on the transcriptome and no match on the genome: this includes combinations of events including SNPs, alternative splice junctions, indels, and genome assembly errors.

3. Generate annotations for the peptides classified above: analyze the novel peptides in categories (b) to (h) from step 2 and classify each peptide, using the definitions from [13, 23], into one of the following classes:

(a) Intergenic.

(b) Intron retention.

(c) Exon extension.

(d) 5′ UTR.

(e) 3′ UTR.

(f) Exon alternative ORFs.

(g) Alternative splicing event.

(h) SAAVs.

(i) Putative Indels.

(j) Putative synonymous SNPs.

(k) Genome assembly error.

4. Generate BED-format annotations corresponding to the novel peptides classified in the previous step.

5. Visualize the genome, transcriptome, and peptide annotations using IGV [24]. Overlay the peptide BED-format annotations as tracks on the sweet potato genome and transcriptome annotations.
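Two parts of the procedure above lend themselves to a short illustration: assigning a peptide to one of the eight match categories in step 2 from its best genome and transcriptome match, and writing a genome hit as a BED line for step 4. The Python sketch below assumes that each peptide has already been summarized as "perfect", "imperfect", or "none" against each resource, and that hit coordinates come from tBLASTn tabular output (1-based, with the start greater than the end for reverse-strand hits). The names used are illustrative.

```python
# Keys are (genome match, transcriptome match); values are the step 2 categories.
CATEGORY = {
    ("perfect", "perfect"): "a",
    ("perfect", "imperfect"): "b",
    ("perfect", "none"): "c",
    ("imperfect", "none"): "d",
    ("imperfect", "perfect"): "e",
    ("none", "perfect"): "f",
    ("imperfect", "imperfect"): "g",
    ("none", "imperfect"): "h",
}


def categorize(genome_match, transcriptome_match):
    """Return the category letter, or None if the peptide matched neither resource."""
    return CATEGORY.get((genome_match, transcriptome_match))


def bed_line(hit):
    """Six-column BED line (chrom, start, end, name, score, strand) for one hit."""
    start = min(hit["sstart"], hit["send"]) - 1  # BED starts are 0-based
    end = max(hit["sstart"], hit["send"])        # BED ends are exclusive
    strand = "+" if hit["sstart"] <= hit["send"] else "-"
    return "\t".join([hit["sseqid"], str(start), str(end), hit["qseqid"], "0", strand])


print(categorize("perfect", "none"))
print(bed_line({"sseqid": "chr1", "qseqid": "SWPT_1", "sstart": 3042, "send": 3001}))
```

One BED file per category keeps the tracks easy to toggle in IGV, although a single file with the category in the name column would work as well.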

3.5 Novel Peptide Analysis and Validation

For well-annotated genomes, newly identified peptides in shotgun proteomics are typically assigned to one of the following categories [13, 23]: pseudogenes, lncRNA, intergenic region, exon extension, intron retention, alternative splicing, alternative ORF, 5′ or 3′ UTR, and single amino acid variants (SAAV). However, the sweet potato genome is only partially resolved, and the limited transcriptomic data available do not fully capture gene expression. The peptides mapped in the previous step were first classified into eight categories based on their matches to existing genome and transcriptome annotations. The novel peptides were then assigned to one of the eleven classes and used to improve the genome and transcriptome annotations. The peptides were visualized in the Integrative Genomics Viewer (IGV) [24] together with genomic and transcriptomic annotations (Fig. 4). Selected novel peptides can be further validated using orthogonal validation methods as in [23].

The protocols described herein provide a baseline set of tools that facilitate streamlined extraction of proteins for mass spectrometry applications and mapping of peptides to a target genome for genome annotation through the use of proteogenomics. The proteogenomics analysis methods described here provide additional evidence for currently annotated genes and transcripts and predict novel ORFs and splice variants of annotated genes in complex polyploid plants. While the optimized protein extraction and proteogenomics methods described in this chapter were exemplified for the analysis of sweet potato, they are easily customizable for other plant proteomes and can be used for further improvement of genome and proteome annotations.

4 Notes

1. Handling multiple samples at once can be difficult and conducive to errors. We recommend processing 24 samples or fewer in parallel.

2. 20 mL phenol, 70 mL Precipitation solution, 20 mL Extraction buffer, and 150 ball bearings are assumed to be sufficient for 24 samples.

Fig. 4 Visualization of sweet potato genome, transcriptome, and peptide annotation files in the Integrative Genomics Viewer


3. Before starting the extraction, incubate the buffer-saturated phenol at 4 °C for 30 min to allow the phases to separate, remove the small clear upper phase, and use the lower phase as described in the phenol (M1) method.

4. Bring samples to room temperature (RT) and keep them on ice for 5–10 min before weighing. For samples that will be returned to long-term storage, weighing should take less than 30 min.

5. If additional genomic data are available, such as annotated pseudogenes, SAAVs, SNPs, ncRNAs, alternative ORFs, or transfrags, the peptide classification can be extended with additional categories of annotations (pseudogene, ncRNA, short ORFs, etc.).

6. Multiple hits were retained along different regions of the genome and transcriptome for both perfect and imperfect matches.

Acknowledgments

Funding for this work was provided by the USDA-National Institute of Food and Agriculture Hatch project MIS-145120, and the Mississippi Agricultural & Forestry Experiment Station to SCP, RR, and MS. GVP acknowledges the support from the USDA-Agricultural Research Unit through the Big Data: Biocomputing, Bioinformatics, and Biological Discovery project 6066-21310-004-25-S.

References

1. International Potato Center (2017) Sweetpotato facts and figures. Accessed 15 Jan 2019. http://www.cipotato.org/sweetpotato/

2. Yang J, Moeinzadeh M-H, Kuhl H et al (2017) Haplotype-resolved sweet potato genome traces back its hexaploidization history. Nat Plants 3:696–703

3. Hirakawa H, Okada Y, Tabuchi H et al (2015) Survey of genome sequences in a wild sweet potato, Ipomoea trifida (HBK) G. Don. DNA Res 22:171–179

4. Wu S, Lau KH, Cao Q et al (2018) Genome sequences of two diploid wild relatives of cultivated sweetpotato reveal targets for genetic improvement. Nat Commun 9:4580

5. Hoshino A, Jayakumar V, Nitasaka E et al (2016) Genome sequence and analysis of the Japanese morning glory Ipomoea nil. Nat Commun 7:13295

6. Leinonen R, Sugawara H, Shumway M et al (2010) The sequence read archive. Nucleic Acids Res 39:D19–D21

7. NCBI (2019) The Sequence Read Archive online at: https://www.ncbi.nlm.nih.gov/sra. Projects: SRA PRJNA79717, SRA PRJEB4145, SRA PRJNA72435

8. Aebersold R, Mann M (2016) Mass-spectrometric exploration of proteome structure and function. Nature 537:347

9. Omenn GS, Lane L, Lundberg EK et al (2015) Metrics for the human proteome project 2015: Progress on the human proteome and guidelines for high-confidence protein identification. J Proteome Res 14:3452–3460

10. Rose JKC, Bashir S, Giovannoni JJ et al (2004) Tackling the plant proteome: practical approaches, hurdles and experimental tools. Plant J 39:715–733

11. Erickson BK, Rose CM, Braun CR et al (2017) A strategy to combine sample multiplexing with targeted proteomics assays for high-throughput protein signature characterization. Mol Cell 65:361–370


12. Jaffe JD, Berg HC, Church GM (2004) Proteogenomic mapping as a complementary method to perform genome annotation. Proteomics 4:59–77

13. Nesvizhskii AI (2014) Proteogenomics: concepts, applications and computational strategies. Nat Methods 11:1114–1125

14. Castellana NE, Payne SH, Shen Z et al (2008) Discovery and revision of Arabidopsis genes by proteogenomics. Proc Natl Acad Sci U S A 105:21034–21038

15. Al-Mohanna T, Ahsan N, Bokros NT et al (2019) Tissue-specific proteomic and proteogenomic analysis of sweetpotato (Ipomoea batatas). J Proteome Res 18:2719–2734

16. Lee DG, Ahsan N, Lee SH et al (2007) A proteomic approach in analyzing heat-responsive proteins in rice leaves. Proteomics 7:3369–3383

17. Atha DH, Ingham KC (1981) Mechanism of precipitation of proteins by polyethylene glycols. Analysis in terms of excluded volume. J Biol Chem 256:12108–12117

18. Ingham KC (1990) Precipitation of proteins with polyethylene glycol. Methods Enzymol 182:301–306

19. Ahsan N, Belmont J, Chen Z et al (2017) Highly reproducible improved label-free quantitative analysis of cellular phosphoproteome by optimization of LC-MS/MS gradient and analytical column construction. J Proteomics 165:69–74

20. Elias JE, Gygi SP (2007) Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods 4:207

21. Yu K, Sabelli A, DeKeukelaere L et al (2009) Integrated platform for manual and high-throughput statistical validation of tandem mass spectra. Proteomics 9:3115–3125

22. Zhou K, Panisko EA, Magnuson JK et al (2008) Proteomics for validation of automated gene model predictions. United States. Accessed 15 Jan 2019 https://www.osti.gov/servlets/purl/1241230

23. Zhu Y, Orre LM, Johansson HJ et al (2018) Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow. Nat Commun 9:903

24. Robinson JT, Thorvaldsdottir H, Winckler W et al (2011) Integrative genomics viewer. Nat Biotechnol 29:24

Chapter 24

In Silico Analysis of Class III Peroxidases: Hypothetical Structure, Ligand Binding Sites, Posttranslational Modifications, and Interaction with Substrates

Sabine Luthje and Kalaivani Ramanathan

Abstract

Functional analyses of peroxidases are a major challenge. In silico analysis is a powerful tool to overcome at least some of the problems that arise from (1) the numerous possible functions of peroxidases, (2) their low substrate specificity, and (3) the compensation of knockout mutants by other isoenzymes. Amino acid sequences and crystal structures of peroxidases were used to predict tertiary structures, posttranslational modifications, and ligand and substrate binding sites of uncharacterized peroxidases. This protocol presents tools and their applications for an in silico analysis of soluble and membrane-bound peroxidases, but it may be used for other proteins, too.

Key words AtPrx47, AtPrx64, HRP, Tertiary structure, Topology, Posttranslational modification,Substrate channel analysis

1 Introduction

Plant peroxidases of the secretory pathway (EC 1.11.1.7; class III peroxidases, donor: H2O2 oxidoreductases) are a huge protein family of heme-containing enzymes that bear at least an N-terminal signal peptide directed to the endoplasmic reticulum (ER) and show high structural conservation, N-glycosylation, and other posttranslational modifications [1, 2]. Numerous possible functions, low substrate specificity, and compensation of knockout mutants by other isoenzymes make functional analyses of these enzymes a major challenge [3–5]. In silico analysis appears to be a powerful tool to solve at least a part of these problems [6–10].

Manifold bioinformatic tools are available on the World Wide Web. The Bioinformatics Resource Portal of the Expert Protein Analysis System (ExPASy) provides a list of freely accessible prediction programs (https://www.expasy.org/proteomics). Tools for (1) protein sequences and identification, (2) proteomic experiments, (3) function analysis, (4) sequence sites, features and motifs, (5) protein modifications, (6) protein structure, (7) protein interactions, and (8) similarity search and alignment can be found at this site [11]. Different programs, or different versions of the same program, may give different results for the same prediction because of the algorithms used.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_24, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Proteins can be positively identified by the Basic Local Alignment Search Tool for proteins (BLASTp) in any protein database as long as the detected peptides cover enough of the amino acid sequence. Examples of protein databases are the servers of the Universal Protein database (UniProt) and the National Center for Biotechnology Information (NCBI), which collect amino acid sequences with experimental evidence [12, 13]. The ARAMEMNON database makes amino acid sequences of thale cress (Arabidopsis thaliana (L.) HEYNH) and membrane proteins available [14]. PeroxiBase holds 11,236 peroxidase entries (as of January 2019), 6554 of which are class III peroxidases [15]. At least 220 entries in PeroxiBase are from Arabidopsis. Crystal structures of several soluble peroxidases are available at the Protein Data Bank (PDB, https://www.rcsb.org/) [16]. For example, 958 peroxidase templates are provided at PDB (as of January 2019). Twelve of these templates are isoenzymes of Arabidopsis.

The Fast Approximation of Smith & Waterman Algorithm (FASTA) format of amino acid sequences [17] allows for prediction of physicochemical properties [18], posttranslational modifications [19–22], topology [23–25], signal peptides, and cellular localization of a protein [26, 27]. Tools like SwissModel and Protein Homology/AnalogY Recognition Engine (Phyre2) predict tertiary structures of proteins from templates with high sequence similarity and confidence, which enables analysis of structural components and substrates [28, 29]. Because the structure of the cleavable N-terminal signal peptide is missing in templates of soluble peroxidases, this part is missing in models predicted by SwissModel. In contrast, Phyre2 calculates this part by an ab initio method. However, confidence of this part may be low. For prediction of ligand binding sites, hypothetical models (PDB format) can be submitted to 3DLigandSite [30]. At the Protein Data Bank in Europe, the interactive tool Proteins, Interfaces, Structures and Assemblies (PDBePISA) allows for exploration of macromolecular interfaces [31, 32].

Different tools visualize predicted macromolecular structures. For example, University of California, San Francisco (UCSF) Chimera [33] or the PyMOL Molecular Graphics System (Schrodinger, LLC, New York, USA), a Python-based open-source viewer, are frequently used for protein images. PyMOL allows for the calculation of electrostatics with the Adaptive Poisson-Boltzmann Solver (APBS) plugin as well as ligand docking and binding-site analyses by the Autodock/Vina plugin [34–36]. Autodock (http://autodock.scripps.edu/), FireDock (http://bioinfo3d.cs.tau.ac.il/FireDock/), PatchDock, or SwissDock can be used for prediction of molecular interactions between a target protein and a small molecule [37–41]. For docking analyses, templates of chemical compounds are available at the ZINC database [42].

This protocol uses two plasma membrane-bound class III peroxidases from Arabidopsis (AtPrx47 and AtPrx64) as examples [10, 43] to show the application of some of these tools for the in silico analysis of plant peroxidases. Soluble horseradish peroxidase (HRP, 1hch.pdb) was used for comparison.

2 Materials

2.1 Amino Acid Sequences

FASTA format of amino acid sequences.

2.1.1 PeroxiBase (RedOxiBase)

http://peroxibase.toulouse.inra.fr/

2.2 Physicochemical Properties

2.2.1 ProtParam v. 1.0

https://web.expasy.org/protparam/

2.3 Topology

2.3.1 TMHMM v. 2.0

http://www.cbs.dtu.dk/services/TMHMM/

2.3.2 HMMTOP v. 2.0

http://www.enzim.hu/hmmtop/

2.4 Signal Peptides and Localization

2.4.1 SignalP v. 4.1

http://www.cbs.dtu.dk/services/SignalP/

2.4.2 PSORT v. 1.0

http://psort1.hgc.jp/form.html

2.5 Posttranslational Modifications

2.5.1 Pyrrolidone Carboxylic Acid Modification (PROSITE v. 20.0)

https://prosite.expasy.org/

2.5.2 N-Glycosylation (NetNGlyc v. 1.0)

http://www.cbs.dtu.dk/services/NetNGlyc/

2.5.3 Palmitoylation (CSS-PALM v. 2.0)

http://csspalm.biocuckoo.org/

2.5.4 GPI-Anchor (GPI-SOM)

http://gpi.unibe.ch/

2.6 Tertiary Structure

For modeling of the three-dimensional structure of peroxidases by SwissModel or Phyre2, the most similar crystallized class III peroxidases with sufficient sequence similarity (>30%) to the peroxidase amino acid sequence were used as templates.

2.6.1 Modeling of Protein Structure

SwissModel

https://swissmodel.expasy.org/

Phyre2 v. 2.0

http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index

2.7 Interactive Visualization of Structures

Protein images were prepared either by UCSF Chimera v. 1.13.1 or by PyMOL v. 2.2. The electrostatics of protein surfaces were calculated with the APBS plugin using the PyMOL-generated PQR and existing hydrogens and termini. Docking models were prepared with UCSF Chimera.

2.7.1 PyMOL v. 2.2

https://pymol.org/2/

APBS Plugin

https://raw.githubusercontent.com/Pymol-Scripts/Pymol-script-repo/master/plugins/apbsplugin.py

2.7.2 UCSF Chimera v. 1.13.1

https://www.cgl.ucsf.edu/chimera/

2.8 Docking Analyses

2.8.1 3DLigandSite v. 1.0

http://www.sbg.bio.ic.ac.uk/3dligandsite/

2.8.2 PatchDock Server v. 1.3

https://bioinfo3d.cs.tau.ac.il/PatchDock/

2.8.3 SwissDock Server

http://www.swissdock.ch/docking

Target Templates

1. HRP (1hch).

2. AtPrx47 (modeled by 5twt.1.A using SwissModel).

3. AtPrx64 (modeled by 3hdl.1.A using SwissModel).

Substrate Templates (ZINC v. 12)

http://zinc.docking.org/

1. ZINC00057908: Esculetin (6,7-Dihydroxycoumarin).

2. ZINC00057733: Scopoletin (7-Hydroxy-6-methoxycoumarin, 7-Hydroxy-6-methoxychromen-2-one).

3. ZINC00056615: DAB (Diaminobenzidine).

4. ZINC36470923: Indole-3-acetic acid (IAA).

5. ZINC30320649: Nicotinamide adenine dinucleotide, reduced form (NADH).

6. ZINC00058258: Ferulic acid.

7. ZINC13512224: Guaiacol.

8. ZINC12359045: Coniferyl alcohol (4-(3-hydroxy-1-propenyl)-2-methoxyphenol).

9. ZINC00001554: Salicylic acid.

10. ZINC12418399: Sinapyl alcohol (4-(3-hydroxyprop-1-enyl)-2,6-dimethoxyphenol).

11. ZINC01532486: Caffeyl alcohol (4-(3-hydroxy-1-propen-1-yl)-1,2-benzenediol).

3 Methods

3.1 Amino Acid Sequences

Amino acid sequences (FASTA format) of peroxidases were downloaded from PeroxiBase.

3.1.1 PeroxiBase

1. Open PeroxiBase.

2. Go to multicriteria search.

3. Enter Arabidopsis, choose peptide (PEP), and search (see Note 1).

4. Add class III peroxidase and search again.

5. Choose peroxidase of interest (e.g., AtPrx47, AtPrx64).

6. Export the amino acid sequence(s) in FASTA format.
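
Once a FASTA file has been exported, it can be useful to check its content before pasting sequences into the prediction servers. The following is a minimal sketch in Python, assuming plain multi-record FASTA text; the function name `parse_fasta` and the two example records are hypothetical placeholders, not real AtPrx47 or AtPrx64 sequences.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical two-record example; the sequences are invented placeholders.
example = """>placeholder_1
MKTAYIAK
QRQISFVK
>placeholder_2
MSLLNGT
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```

The joined sequence strings can then be pasted into the text boxes of the tools described in the following subheadings.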

3.2 Physicochemical Properties

1. Open ProtParam v. 1.0.

2. Paste target sequence into the text box (see Note 2).

3. Click the Compute parameters button.

4. Extract predicted data of interest (number of amino acids, molecular weight, theoretical isoelectric point (pI), etc.).
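
As a plausibility check on the server output, the number of residues and an approximate molecular weight can be recomputed locally. The sketch below is an illustration and not the ProtParam implementation: it sums standard average residue masses and adds one water molecule, and it does not estimate the pI. The function name `molecular_weight` is hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified peptide chain."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError("non-standard residue codes: " + ", ".join(unknown))
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


# Invented placeholder peptide, used only to show the call.
peptide = "MKTAYIAK"
print(len(peptide), round(molecular_weight(peptide), 2))
```

Values from such a sketch ignore signal peptide cleavage, glycosylation, and the heme group, so they are expected to differ from the mass of the mature enzyme.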

3.3 Topology

3.3.1 TMHMM

1. Open TMHMM v. 2.0 (see Note 3).

2. Paste target sequence(s) into the text box.

3. Choose output format (e.g., Output format: extensive, with graphics).

4. Click the submit button.

5. Extract data of interest (e.g., number of transmembrane helices, expected number of amino acids in transmembrane helices, and amino acid range of transmembrane helices) (see Note 4).

3.3.2 HMMTOP

1. Open HMMTOP v. 2.0.

2. Go to submit.

3. Paste target sequence into the text box.

4. Click the submit button.

5. Extract predicted data of interest (e.g., number of transmembrane helices, amino acid range of transmembrane helices, location of N- and C-termini, etc.).
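
TMHMM and HMMTOP rely on hidden Markov models, which are not reproduced here. As a rough cross-check of a predicted transmembrane range, a much simpler sliding-window hydropathy profile on the Kyte-Doolittle scale can be computed locally. The sketch below swaps in this simpler technique; the window of 19 residues and the threshold of 1.6 are a commonly quoted rule of thumb, used here as assumptions, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]


def candidate_segments(sequence, window=19, threshold=1.6):
    """1-based (start, end) ranges of windows whose mean hydropathy exceeds threshold."""
    profile = hydropathy_profile(sequence, window)
    return [
        (index + 1, index + window)
        for index, value in enumerate(profile)
        if value > threshold
    ]


# Artificial test sequence: a stretch of 19 leucines flanked by aspartates.
toy = "D" * 10 + "L" * 19 + "D" * 10
print(candidate_segments(toy))
```

Overlapping windows above the threshold indicate one hydrophobic stretch; they do not give the helix boundaries or the orientation that the HMM-based servers report.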

3.4 Signal Peptides and Localization

3.4.1 SignalP

1. Open SignalP v. 4.1.

2. Paste target sequence into the text box (see Note 5).

3. Select Organism group (e.g., Eukaryotes).

4. Use default for other parameters or adapt to the experimental setup.

5. Submit.

6. Extract data of interest.

3.4.2 PSORT

1. Open PSORT v. 1.0.

2. Select source of Input sequence (e.g., plant).

3. Type a sequence ID (e.g., AtPrx47).

4. Paste target sequence into the text box.

5. Submit.

6. Extract final results and data of interest.

3.5 Posttranslational Modifications

3.5.1 Pyrrolidone Carboxylic Acid (PCA)

1. Open PROSITE v. 20.0.

2. Paste target sequence into the text box (Quick Scan mode of ScanProsite) (see Note 6).

3. Use the default settings or disable options as needed.

4. Scan.

5. Extract predicted features as far as useful.

3.5.2 N-Glycosylation

1. Open NetNGlyc v. 1.0.

2. Paste target sequence(s) into the text box.

3. Use the default settings or change them as needed.

4. Submit.

5. Extract data of interest (e.g., position, potential of predicted N-glycosylation sites).
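
N-glycosylation is generally associated with the sequon Asn-X-Ser/Thr, where X is any residue except proline. The positions reported by the server can therefore be compared with a plain sequon scan. The sketch below only lists candidate sequons and does not score them; it is not the NetNGlyc method, and the function name `find_sequons` is hypothetical.

```python
import re

# N followed by any residue except P, then S or T. The lookahead keeps
# overlapping sequons (e.g., "NNSS") from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Invented placeholder peptide: N2 and N9 are in sequons, N6 is followed by Pro.
print(find_sequons("MNGSANPTNKT"))
```

A sequon is necessary but not sufficient for glycosylation, so positions found this way are candidates only.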

3.5.3 Palmitoylation

1. Open CSS-PALM v. 2.0.

2. Select online.

3. Paste target sequence into the text box (see Note 7).

4. Use the default settings or change the threshold as needed.

5. Submit.

6. Extract results or visualize.

7. For FAQs, go to documentation.

3.5.4 GPI-Anchor

1. Open GPI-SOM.

2. Paste target sequence(s) into the text box.

3. Create a job name.

4. Click the GO button.

5. Extract results.

3.6 Tertiary Structure

3.6.1 Modeling of Protein Structure

SwissModel

1. Start modeling.

2. Paste target sequence into the text box (see Note 8).

3. Add a project name.

4. Add your e-mail address to be informed at the end of modeling.

5. Click the Build Model button (see Note 9).

6. Choose the model with the best QMEAN value.

Phyre2

1. Open Phyre2 v. 2.0 (see Note 10).

2. Enter e-mail address to be informed when the job has finished.

3. Paste target sequence into the text box.

4. Set Modeling Mode to intensive.

5. After the job has finished, the model (PDB) and a link to the results will be sent by e-mail.

6. Open results link.

7. Extract data of interest and, in case of multiple-template prediction, the templates as well (see Note 11).

3.7 Interactive Visualization of Structures

Protein images have been prepared by PyMOL (Fig. 1a–f) and UCSF Chimera (Fig. 1g–o) as described below.

3.7.1 PyMOL

PyMOL is free for educational use. For publication of PyMOL images, a license is necessary. Tutorials, Script Library, Plugins and Commands, and so on are given at https://pymolwiki.org/index.php/Main_Page.

Tertiary Structure

1. Open the target PDB (e.g., 1hch) in PyMOL v. 2.2 in the Upper Control Panel, by file and open (see Note 12).

2. Modify the protein structure in the PyMOL Viewer window by the Object Control Panel, in layer target (e.g., 1HCH): H (Hide) everything; S (Show) cartoon.

3. Visualize cofactors (heme and calcium atoms) of the peroxidase by the following steps. PyMOL Viewer window, Selection Tools: S (sequence).

4. PyMOL Viewer window, Selection Tools: Selecting: Atoms.

5. Move the gray bar at the bottom of the sequence to the other end and select the heme (HEM).

6. Modify the molecule in the PyMOL Viewer window. Use the Object Control Panel, choose the layer of the selected HEM (sele): S (Show) sticks, C (Color) grays and gray 60.

7. Deselect the heme by a click on the background in the Display Area.

8. Rotate the molecule according to the Mouse Legend (left button and mouse movement). Select the δ-meso edge and the center atom of the heme in the Display Area.

9. Change the color of both atoms in the PyMOL Viewer window, use the Object Control Panel: layer (sele): C (Color) reds, red.

10. Deselect by a click on the background in the Display Area.

11. Select both calcium (CA) atoms at the end of the sequence.

12. Visualize the atoms in the PyMOL Viewer window. Use the Object Control Panel; choose the layer of the selected CA (sele): S (Show) spheres, C (Color) cyans, cyan.

13. Deselect by a click on the background in the Display Area.

14. Highlight β-sheets in the PyMOL Viewer window. Use Selection Tools: Selecting: Residues.

15. Select the three amino acids of each β-sheet in the structure according to the Mouse Control Legend. Mouse Mode: 3-Button Viewing, find β-sheets by rotation of the molecule (left button and mouse movement). Zoom in if necessary (right button and mouse movement).

Fig. 1 In silico analyses of AtPrx47 and AtPrx64 in comparison to HRP. Electrostatic potential surface of (a) the reference protein HRP is shown in comparison to Phyre2 models of (b) AtPrx47 and (c) AtPrx64. Negative electrostatic potential (red); positive electrostatic potential (blue); neutral electrostatic potential (white). (d) HRP with helices (pink), β-sheets (yellow), heme (gray sticks), and Ca2+ ions (blue spheres) and four disulfide bonds (blue sticks). Alignment of models predicted by SwissModel (orange) and Phyre2 (blue) for (e) AtPrx47 and (f) AtPrx64. Alignment of AtPrx47 models by the two prediction programs clearly reveals the missing β-sheets from the modeling. (g) Ferulic acid binding in HRP, (h) caffeyl alcohol binding in AtPrx47, (i) sinapyl alcohol binding in AtPrx64. Helices (green), substrates (blue sticks) showing different substrate affinity pattern based on the channel electrostatics. (j) Active site of HRP with ferulic acid, (k) active site of AtPrx47 with caffeyl alcohol, (l) active site of AtPrx64 with sinapyl alcohol. Substrate channel of (m) HRP, (n) AtPrx47, and (o) AtPrx64. Hydrophobic residues (yellow), polar residues (green), basic residues (blue) and acidic residues (red) are shown. Figures a–f were created by the PyMOL Molecular Graphics System and a–c with the APBS plugin. Figures g–o were created by UCSF Chimera Molecular Visualization Application. Further results of in silico analysis of AtPrx47 and AtPrx64 may be found in Luthje and Martinez-Cortes [10]

16. Modify the color of β-sheets in the PyMOL Viewer window, use the Object Control Panel, layer (sele): C (Color) yellows, yellow.

17. Deselect by click to background of the Display Area.

18. Rotate the molecule for optimal perspective (left button andmouse movement).

19. Upper Control Panel, Ray.

20. Upper Control Panel, File, Save Image as, for example, png.
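
The menu-driven steps above can also be approximated with PyMOL commands, which makes an image easier to reproduce. The following is a minimal sketch, assuming a local file 1hch.pdb in which the heme has the residue name HEM and the calcium ions the residue name CA, as in the steps above. It colors all β-strand residues assigned by PyMOL instead of hand-picked residues, and it does not recolor individual heme atoms. The Python helper `build_pymol_script` and the output file names are hypothetical; the helper only writes the command text.

```python
def build_pymol_script(pdb_file="1hch.pdb", image_name="1hch_cartoon.png"):
    """Return PyMOL command-script text for a cartoon view with cofactors."""
    commands = [
        f"load {pdb_file}",          # step 1: open the structure
        "hide everything",           # step 2: H (Hide) everything
        "show cartoon",              # step 2: S (Show) cartoon
        "select heme, resn HEM",     # steps 3-5: pick the heme
        "show sticks, heme",         # step 6: heme as sticks
        "color gray60, heme",        # step 6: gray 60
        "select calcium, resn CA",   # step 11: calcium ions (residue name CA)
        "show spheres, calcium",     # step 12: spheres
        "color cyan, calcium",       # step 12: cyan
        "color yellow, ss s",        # steps 14-16: beta-strand residues in yellow
        "deselect",                  # clear the active selection
        "ray",                       # step 19: ray-trace the scene
        f"png {image_name}",         # step 20: save the image
    ]
    return "\n".join(commands) + "\n"


script = build_pymol_script()
print(script)
```

The printed text can be saved, for example as a .pml file, and run from the PyMOL command line; the orientation of the molecule still has to be adjusted by hand before the image is saved.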

Alignment

1. Open the target PDBs in PyMOL, Upper Control Panel, by file and open.

2. PyMOL Viewer window, Object Control Panel, layer all: H (Hide) everything; S (Show) cartoon.

3. PyMOL Viewer window, Object Control Panel, layer target 1: C (Color), for example, blues, blue.

4. PyMOL Viewer window, Object Control Panel, layer target 2: C (Color), for example, oranges, orange.

5. PyMOL Viewer window, Object Control Panel, layer target 1: A (Action) align to molecule, Object: target 2.

6. Deselect by a click on the background in the Display Area.

7. Upper Control Panel, Ray.

8. Upper Control Panel, File, Save Image as, for example, png.

APBS Plugin

1. Open the target in PDB format by PyMOL in the Upper Control Panel, file and open.

2. Upper Control Panel, choose Plugin and APBS Tools.

3. Use “PyMOL generated PQR and PyMOL generated Hydrogens and termini” (see Note 13).

4. Set Grid and Run APBS (see Note 14).

5. Open visualization.

6. Update.

7. In the Molecular Surface Box choose solvent accessible surface and Show.

8. Upper Control Panel, open File and Save Image as, for example, png in the main menu for all images of interest.

9. Upper Control Panel, type “rotate y, 180” in the command line to get the back view.

10. Upper Control Panel, type “rotate x, 90” in the command line to get the top view.

3.7.2 UCSF Chimera

Tertiary Structure

1. Open UCSF Chimera v. 1.13.1.

2. Select the File menu and open the required PDB model.

3. Open the Actions menu and select the Surface submenu to show the surface of the protein.

4. Open the Actions menu and select the transparency as required (e.g., 30%).

5. Open the Select menu; the Residue option helps to highlight amino acids based on their properties.

Active Center

1. Select the File menu and open the required PDB model.

2. Select, Structure, Protein.

3. Action, Ribbon, hide.

Surface by Properties of Residues

1. Select the File menu and open the required PDB model.

2. Open Select menu, select Residue, amino acid category (e.g., Hydrophobic).

3. Open Actions menu, select Color, yellow.

4. Open Select menu, select Residue, amino acid category, polar.

5. Open Actions menu, select Color, green.

6. Open Select menu, select Residue, standard amino acids, basic amino acids: LYS, ARG, HIS.

7. Open Actions menu, select Color, blue.

8. Open Select menu, select Residue, standard amino acids, acidic amino acids: GLU, ASP.

9. Open Actions menu, select Color, red.

10. Open Select menu, select Residue, all nonstandard, HEM.

11. Open Actions menu, select Color, by heteroatom.

3.8 Docking Analyses

3.8.1 3DLigandSite

1. Open 3DLigandSite v. 1.0.

2. Paste target sequence into the text box (see Note 15).

3. Enter your e-mail address.

4. Enter a job description.

5. Start prediction by the 3DLigandSite search button.

6. A link with results will be sent by e-mail.

7. Extract the predicted model (PDB format) and data of interest.

3.8.2 Protein-Heme Docking

1. The Protein Data Bank (PDB) file from SwissModel was used in PatchDock server v. 1.3 to get a heme-bound peroxidase structure.

2. The receptor molecule and the substrate (heme) molecule are given as PDB files.

3. Clustering RMSD (Root-Mean-Square Deviation) value is normally kept as 4.0.

4. An e-mail address is given to get the results.

5. The results obtained are refined by the FireDock server and ranked based on the ACE (Atomic Contact Energy) values.

6. The best heme-bound peroxidase PDB model is chosen for substrate docking.
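
The clustering RMSD in step 3 decides when two docking solutions count as redundant. The sketch below illustrates the quantity itself: the root-mean-square deviation between two equally ordered sets of atom coordinates, without any superposition. It is an illustration of the formula and not the PatchDock clustering code; the function names and the coordinates are hypothetical, and the cutoff of 4.0 is taken from step 3 and assumed to be in Å.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


def same_cluster(coords_a, coords_b, cutoff=4.0):
    """True if two ligand poses lie within the clustering cutoff of each other."""
    return rmsd(coords_a, coords_b) <= cutoff


# Hypothetical two-atom ligand in two poses, shifted by 3 units along z.
pose_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose_2 = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
print(rmsd(pose_1, pose_2), same_cluster(pose_1, pose_2))
```

A smaller cutoff keeps more, closely related solutions; a larger cutoff merges them into fewer clusters.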

3.8.3 SwissDock Analysis

Protein–Substrate Docking

1. The heme-bound protein (peroxidase) (PDB file) is used as the target in the SwissDock server.

2. The substrate (ligand) obtained from ZINC database v. 12.0 is uploaded in MOL2 format.

3. The job name and the e-mail address are given and the docking procedures are initiated by a click on the Start Docking button.

4. A link to a zip file with all docking possibilities is sent to the e-mail address.

3.8.4 Visualization of Docking Results

1. UCSF Chimera v. 1.13.1 was used for docking analysis.

2. After opening the required PDB file (target.pdb) obtained from the SwissDock online tool, the substrate interaction is analyzed.

3. Under the Tools menu, Surface/binding analysis submenu, the ViewDock option is used.

4. An Open Dock results dialog pops up, which helps in selecting the substrate cluster file, “Clusters.dock4.pdb”.

5. Then the type selection is made by clicking the “Dock 4, 5 or 6” option.

6. This gives all the possible docking results of the substrate.

7. Then the substrate docking is checked manually and the most plausible docking result is chosen by its access to the substrate channel and based on “Delta G values.”

8. This procedure has been repeated for each protein with all substrates tested (Table 1).
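
Once ΔG values have been collected for each protein, ranking them is a simple bookkeeping step. The sketch below uses the values reported in Table 1 for ferulic acid and the three monolignol-type alcohols, read here as negative ΔG values (an assumption about the sign, which is garbled in some copies of the table). The dictionary layout and the function name `rank_substrates` are hypothetical.

```python
# Delta G values from Table 1 (subset), assumed to be negative.
DELTA_G = {
    "Caffeyl alcohol":   {"HRP": -5.70, "AtPrx47": -6.56, "AtPrx64": -6.48},
    "Coniferyl alcohol": {"HRP": -5.45, "AtPrx47": -6.21, "AtPrx64": -6.16},
    "Ferulic acid":      {"HRP": -6.29, "AtPrx47": -5.65, "AtPrx64": -6.01},
    "Sinapyl alcohol":   {"HRP": -6.11, "AtPrx47": -5.64, "AtPrx64": -6.54},
}


def rank_substrates(protein, table=DELTA_G):
    """Return (substrate, delta_g) pairs sorted from most to least negative."""
    pairs = [
        (substrate, values[protein])
        for substrate, values in table.items()
        if protein in values
    ]
    return sorted(pairs, key=lambda pair: pair[1])


for protein in ("HRP", "AtPrx47", "AtPrx64"):
    best_substrate, best_value = rank_substrates(protein)[0]
    print(protein, best_substrate, best_value)
```

Within this subset the most negative values are ferulic acid for HRP, caffeyl alcohol for AtPrx47, and sinapyl alcohol for AtPrx64, in line with the pairs shown in Fig. 1g–i. Such a ranking does not replace the manual check of access to the substrate channel in step 7.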

4 Notes

1. Alternatively, enter the short name of the peroxidase of interest (e.g., AtPrx64).

2. Alternatively, enter accession number (ACC) (e.g., Q9SZB9 or Q43872) or a sequence identifier (ID) and select endpoints of sequence in the next step before computing parameters.

3. Older versions may be used alternatively.

4. Plots can be extracted in postscript, as script for gnuplot, or raw data for plotting if necessary.

5. Alternatively, upload a file in FASTA format or use multiple sequences in FASTA format.

6. UniProtKB ACC.No. or identifiers or PDB identifiers are possible.

7. Input can be multiple sequences in FASTA format or an uploaded file.

8. Or upload a target sequence file in FASTA format.

9. Alternatively, search for templates before starting modeling.

Table 1
Docking analyses of natural and artificial peroxidase substrates

Substrates                          HRP –ΔG     AtPrx47 –ΔG     AtPrx64 –ΔG
1-Methoxynaphthalene                −5.86       −5.42           −5.84
Caffeyl alcohol                     −5.70       −6.56           −6.48
Coniferyl alcohol                   −5.45       −6.21           −6.16
Esculetin/6,7-Dihydroxycoumarin     −5.61       −6.11           −6.27
Ferulic acid                        −6.29       −5.65           −6.01
Sinapyl alcohol                     −6.11       −5.64           −6.54
Salicylic acid                      −5.40       No result       −5.53
IAA                                 −5.95       −5.62           −5.49
NADH                                −7.13       −8.34           −9.54
DAB                                 −5.40       No result       −5.23
Guaiacol                            −5.43       −4.96           −5.49

Docking analyses have been done by UCSF Chimera v. 1.13.1 with models predicted by SwissDock server using SwissModels and ZINCdock files as templates [42]. Structures of AtPrx47 and AtPrx64 were predicted by SwissModel using templates from class III peroxidases from switchgrass (5twt.1.A) and highly glycosylated peroxidase from royal palm tree (3hdl.1.A), respectively [32, 44]. Crystal structure of HRP (1hch) was used for comparison [45]. Caffeyl alcohol fits best for AtPrx47, whereas sinapyl alcohol appears to be preferred by AtPrx64. Macromolecular structure of AtPrx47 will need further elucidation, because β-sheets were predicted neither by SwissModel nor by Phyre2 [10] and salicylic acid or 3,3′-diaminobenzidine (DAB) could not be fitted to the model

10. Login to have expert mode with more options.

11. Models predicted by Phyre2 will automatically be submitted to 3DLigandSite and a link for the prediction will be given in the results.

12. Alternatively, Upper Control Panel, Plugin, PDB Loader Service, enter the four-character PDB code (e.g., 1hch).

13. If this does not work, choose one of the other options.

14. In case of unassigned atoms, delete those in the Object Control Panel of the PyMOL Viewer window. Use the newly created layer (unassigned), go to A (Action) and remove atoms. Run APBS again.

15. Submit your own protein structure.

Acknowledgments

This work was supported by a PhD student grant to K. R. from theDr. Elisabeth Appuhn Foundation.

References

1. Welinder KG, Justesen AF, Kjaersgard IV et al (2002) Structural diversity and transcription of class III peroxidases from Arabidopsis thaliana. Eur J Biochem 269:6063–6081

2. Zamocky M, Furtmuller PG, Obinger C (2010) Evolution of structure and function of class I peroxidases. Arch Biochem Biophys 500:45–57

3. Hiraga S, Sasaki K, Ito H et al (2001) A large family of class III plant peroxidases. Plant Cell Physiol 42:462–468

4. Passardi F, Cosio C, Penel C et al (2005) Peroxidases have more functions than a Swiss army knife. Plant Cell Rep 24:255–265

5. Cosio C, Dunand C (2009) Specific functions of individual class III peroxidase genes. J Exp Bot 60:391–409

6. Luthje S, Meisrimler CN, Hopff D et al (2011) Phylogeny, topology, structure and functions of membrane-bound class III peroxidases in vascular plants. Phytochemistry 72:1124–1135

7. Herrero J, Esteban-Carrasco A, Zapata JM (2013a) Looking for Arabidopsis thaliana peroxidases involved in lignin biosynthesis. Plant Physiol Biochem 67:77–86

8. Herrero J, Fernandez-Perez F, Yebra T et al (2013b) Bioinformatic and functional characterization of the basic peroxidase 72 from Arabidopsis thaliana involved in lignin biosynthesis. Planta 237:1599–1612

9. Shigeto J, Nagano M, Fujita K et al (2014) Catalytic profile of Arabidopsis peroxidases, AtPrx-2, 25 and 71, contributing to stem lignification. PLoS One 9:e105332

10. Luthje S, Martinez-Cortes T (2018) Membrane-bound class III peroxidases: unexpected enzymes with exciting functions. Int J Mol Sci 19:E2876

11. Artimo P, Jonnalagedda M, Arnold K et al (2012) ExPASy: SIB bioinformatics resource portal. Nucleic Acids Res 40:W597–W603

12. UniProt Consortium T (2018) UniProt: the universal protein knowledgebase. Nucleic Acids Res 46:2699

13. Sayers EW, Agarwala R, Bolton EE et al (2019) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 47:D23–D28

14. Schwacke R, Schneider A, Van Der Graaff E et al (2003) ARAMEMNON, a novel database for Arabidopsis integral membrane proteins. Plant Physiol 131:16–26

15. Fawal N, Li Q, Savelli B et al (2013) PeroxiBase: a database for large-scale evolutionary analysis of peroxidases. Nucleic Acids Res 41:D441–D444

16. Berman HM, Westbrook J, Feng Z et al (2000) The Protein Data Bank. Nucleic Acids Res 28:235–242

17. Lipman DJ, Pearson WR (1985) Rapid and sensitive protein similarity searches. Science 227:1435–1441

18. Gasteiger E, Hoogland C, Gattiker A et al (2005) Protein identification and analysis tools on the ExPASy server. In: Walker JM (ed) The proteomics protocols handbook. Humana Press, New York, pp 571–607

19. Ren J, Wen L, Gao X et al (2008) CSS-Palm 2.0: an updated software for palmitoylation sites prediction. Protein Eng Des Sel 21:639–644

20. Sigrist CJA, de Castro E, Cerutti L et al (2012) New and continuing developments at PROSITE. Nucleic Acids Res 21:D344–D347

21. Blom N, Sicheritz-Ponten T, Gupta R et al (2004) Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4:1633–1649

22. Fankhauser N, Mäser P (2005) Identification of GPI anchor attachment signals by a Kohonen self-organizing map. Bioinformatics 21:1846–1852

23. Krogh A, Larsson B, von Heijne G et al (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580

24. Tusnady GE, Simon I (2001) The HMMTOP transmembrane topology prediction server. Bioinformatics 17:849–850

25. Moller S, Croning MDR, Apweiler R (2001) Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17:646–653

26. Nielsen H, Krogh A (1998) Prediction of signal peptides and signal anchors by a hidden Markov model. Proc Int Conf Intell Syst Mol Biol 6:122–130

27. Nakai K, Horton P (1999) PSORT: a program for detecting the sorting signals of proteins and predicting their subcellular localization. Trends Biochem Sci 24:34–35

28. Waterhouse A, Bertoni M, Bienert S et al (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res 46:W296–W303

29. Kelley LA, Mezulis S, Yates CM et al (2015) The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc 10:845–858

30. Wass MN, Kelley LA, Sternberg MJ (2010) 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Res 38(Suppl):W469–W473

31. Krissinel E, Henrick K (2007) Inference of macromolecular assemblies from crystalline state. J Mol Biol 372:774–797

32. Moural TW, Lewis KM, Barnaba C et al (2017) Characterization of class III peroxidases from Switchgrass. Plant Physiol 173:417–433

33. Pettersen EF, Goddard TD, Huang CC et al (2004) UCSF chimera a visualization system for exploratory research and analysis. J Comput Chem 25:1605–1612

34. Grosdidier A, Zoete V, Michielin O (2011) SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Res 39:W270–W277

35. Morris GM, Huey R, Lindstrom W et al (2009) Autodock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 16:2785–2791

36. Seeliger D, de Groot BL (2010) Ligand docking and binding site analysis with PyMOL and autodock/Vina. J Comput Aided Mol Des 24:417–422

37. Nanda T, Tripathy K, Ashwin P (2011) Integration of Bioinformatics Tools for Proteomics Research. J Comput Sci Syst Biol S13. https://doi.org/10.4172/jcsb.S13-002

38. Hetenyi C, van der Spoel D (2011) Toward prediction of functional protein pockets using blind docking and pocket search algorithms. Protein Sci 20:880–893

39. Andrusier N, Nussinov R, Wolfson HJ (2007) FireDock: fast interaction refinement in molecular docking. Proteins 69:139–159

40. Mashiach E, Schneidman-Duhovny D, Andrusier N et al (2008) FireDock: a web server for fast interaction refinement in molecular docking. Nucleic Acids Res 36:W229–W232

41. Schneidman-Duhovny D, Inbar Y, Nussinov R et al (2005) PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Res 33:W363–W367

42. Irwin JJ, Sterling T, Mysinger MM et al (2012) ZINC: a free tool to discover chemistry for biology. J Chem Inf Model 52:1757–1768

43. Lee Y, Rubio MC, Alassimone J et al (2013) A mechanism for localized lignin deposition in the endodermis. Cell 153:402–412

44. Watanabe L, de Moura PR, Bleicher L et al (2010) Crystal structure and statistical coupling analysis of highly glycosylated peroxidase from royal palm tree (Roystonea regia). J Struct Biol 169:226–242

45. Berglund GI, Carlsson GH, Smith AT et al (2002) The catalytic pathway of horseradish peroxidase at high resolution. Nature 417:463


Chapter 25

MALDI Mass Spectrometry Imaging of Peptides in Medicago truncatula Root Nodules

Caitlin Keller, Erin Gemperline, and Lingjun Li

Abstract

Mass spectrometry imaging is routinely used to visualize the distributions of biomolecules in tissue sections. In plants, metabolites are the most common imaging target, whereas larger molecules are imaged less frequently despite the importance of proteins and endogenous peptides to the plant. Here, we describe a matrix-assisted laser desorption/ionization mass spectrometry imaging method for the imaging of peptides in Medicago truncatula root nodules. Sample preparation steps, including embedding in gelatin, sectioning, and matrix application, are described. The method is employed to determine the spatial distribution of hundreds of peptide peaks.

Key words Medicago truncatula, Root nodules, Peptides, Mass spectrometry imaging, MSI, MALDI

1 Introduction

Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI) is a powerful tool to visualize the spatial distribution of molecules in a tissue [1]. In MALDI-MSI, a laser is fired at discrete positions across a matrix-covered tissue. At each position, a mass spectrum is collected. Once the instrument has collected mass spectra at all of the positions, software generates a pixel for each discrete position and extracts the ion intensity for a particular m/z across all pixels to create an image, or heat map, for that m/z. In this way, hundreds of images can be generated from a single instrument run. To prepare a sample for analysis, the general sample preparation steps are flash freezing and embedding, sectioning, and applying a suitable matrix. Sample preparation is a critical step to preserve the sample and to achieve good signal of the chosen analytes [2, 3]. For example, the matrix coating, which assists in ionizing analyte molecules in the tissue section, can influence the type of analytes in a sample that will ionize and the spatial resolution of the imaging experiment. MALDI-MSI has been applied to many different analyte types, including metabolites [4, 5], neuropeptides

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_25, © Springer Science+Business Media, LLC, part of Springer Nature 2020


[6], and proteins [7] in many different organisms. However, applications of the technique to plants have focused on small molecules [8], with only a few focusing on larger molecules [9–12].
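The image-generation step described above (one spectrum per laser position, one pixel per spectrum, intensity extracted for a chosen m/z) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ion_image`, the in-memory data layout, and the toy spectra are hypothetical and are not taken from the instrument software.

```python
import numpy as np

def ion_image(pixels, shape, target_mz, tol_ppm=5.0):
    """Build a 2D ion image (heat map) for one m/z value.

    pixels: iterable of (row, col, mz_values, intensities), one entry per laser position.
    shape:  (n_rows, n_cols) of the imaged grid.
    """
    image = np.zeros(shape, dtype=float)
    half_width = target_mz * tol_ppm * 1e-6  # ppm tolerance converted to m/z units
    for row, col, mz_values, intensities in pixels:
        mz_values = np.asarray(mz_values, dtype=float)
        intensities = np.asarray(intensities, dtype=float)
        in_window = np.abs(mz_values - target_mz) <= half_width
        # Pixel value = summed intensity of all peaks inside the tolerance window
        image[row, col] = intensities[in_window].sum()
    return image

# Hypothetical 2 x 2 imaging run; all values are invented for illustration only.
toy_pixels = [
    (0, 0, [500.0000, 1000.0020], [10.0, 40.0]),
    (0, 1, [500.0000, 1000.0500], [12.0, 99.0]),  # 50 ppm away: outside a 5 ppm window
    (1, 0, [1000.0000], [7.0]),
    (1, 1, [500.0000], [3.0]),
]
toy_image = ion_image(toy_pixels, (2, 2), target_mz=1000.0, tol_ppm=5.0)
print(toy_image)
```

Repeating the call for each m/z of interest yields one image per m/z from the same acquisition, which is how a single run can produce hundreds of images.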

Here, we provide a detailed protocol focusing on applying MALDI-MSI to investigate peptides present in the root nodules of Medicago truncatula (Medicago) [9]. Medicago forms specialized organs, called root nodules, on its roots as a result of a symbiotic relationship with rhizobia bacteria for biological nitrogen fixation. Plant peptides are involved in the formation of the nodule on the roots of the plant, as well as in plant growth and development [13, 14]. For example, nodule-specific cysteine-rich peptides are involved in the differentiation of bacteria into bacteroids in the root nodules [15], and CLAVATA3/embryo-surrounding region (CLE) peptides are involved in autoregulation of nodulation [16, 17]. Thus, the protocol here aims to provide a method that can be used to determine the spatial distribution of plant peptides via MALDI-MSI to further our understanding of these important biomolecules.

2 Materials

Prepare all solutions using ultrapure water (Milli-Q) and high-performance liquid chromatography (HPLC) grade organic solvents, unless otherwise noted. Reagents can be stored at room temperature, unless otherwise noted.

2.1 Embedding Nodules

1. Plant material: Medicago truncatula plants inoculated with Sinorhizobium meliloti (Rm1021).

2. Embedding media: 100 mg/mL gelatin (BD Difco™). Dissolve the gelatin in a 37 °C water bath.

3. Plastic embedding containers suitable for storage at −80 °C.

4. Dry ice.

2.2 MALDI-MSI Sample Preparation

1. Optimal cutting temperature (OCT) compound.

2. 25 × 75 mm glass slides.

3. 50% Methanol: methanol, water (50:50 v:v).

4. 50% Methanol 0.1% FA: methanol, water (50:50 v:v), 0.1% formic acid (FA).

5. DHB matrix solution: 40 mg/mL 2,5-dihydroxybenzoic acid (DHB) in 50% methanol 0.1% FA. Sonicate the matrix until completely dissolved.

6. 50% Acetonitrile: acetonitrile, water (50:50 v:v).

7. 50% Acetonitrile 0.1% FA: acetonitrile, water (50:50 v:v), 0.1% formic acid.


8. CHCA matrix solution: 5 mg/mL α-cyano-4-hydroxycinnamic acid (CHCA) in 50% acetonitrile 0.1% FA. Sonicate the matrix until completely dissolved.

9. SA matrix solution: 5 mg/mL sinapic acid (SA) in 50% acetonitrile 0.1% FA. Sonicate the matrix until completely dissolved.
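The amounts needed for the matrix solutions listed above follow from simple arithmetic, sketched below. The batch volume of 10 mL is a hypothetical choice, and the sketch assumes that the 0.1% formic acid is counted as a fraction of the final volume, with the remainder split 50:50 between organic solvent and water; the source does not specify how the FA is accounted for.

```python
def matrix_mass_mg(conc_mg_per_ml, volume_ml):
    """Mass of matrix (mg) to weigh out for a given final volume."""
    return conc_mg_per_ml * volume_ml

def solvent_volumes_ml(total_ml, organic_fraction=0.5, fa_percent=0.1):
    """Volumes of organic solvent, water, and formic acid (assumed v/v of the total)."""
    fa_ml = total_ml * fa_percent / 100.0
    remaining_ml = total_ml - fa_ml
    organic_ml = remaining_ml * organic_fraction
    water_ml = remaining_ml - organic_ml
    return {"organic": organic_ml, "water": water_ml, "formic_acid": fa_ml}

BATCH_ML = 10.0  # hypothetical batch size, not from the protocol

# Concentrations (mg/mL) as given in Subheading 2.2
recipes = {"DHB": 40.0, "CHCA": 5.0, "SA": 5.0}

solvent = solvent_volumes_ml(BATCH_ML)
for name, conc in recipes.items():
    print(f"{name}: weigh {matrix_mass_mg(conc, BATCH_ML):.0f} mg for {BATCH_ML:.0f} mL")
print({k: round(v, 3) for k, v in solvent.items()})
```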

3 Methods

3.1 Embedding Nodules

1. Figure 1 demonstrates the sample workflow for MALDI-MSI of Medicago root nodules.

2. Trim nodules from the plant with about 2–4 mm of surrounding roots (see Note 1).

3. Place the nodule in a plastic cup or similar holding container of appropriate size for your sample (for example, a 5 mm × 5 mm × 5 mm square plastic cup for very small samples) with a drop of 100 mg/mL gelatin (see Note 2).

4. Place on dry ice and wait for the nodule and gelatin to freeze. The gelatin will turn white when frozen.

5. Once the nodule is frozen, fill the embedding container with 100 mg/mL gelatin. Wait for the entire embedding container with gelatin to freeze. Once the gelatin is completely white, the nodule can be stored at −80 °C (see Note 3) prior to MSI analysis.

Fig. 1 MALDI-MSI scheme showing the sample preparation, instrument analysis, and data analysis steps for a typical experiment

3.2 MALDI-MSI Sample Preparation

1. Take the embedded nodule and trim the sample to a rectangle with a couple of millimeters of gelatin surrounding the tissue on all sides. Do this quickly to minimize the time the sample is at room temperature.

2. Attach the sample to a cryostat chuck with a drop of OCT compound (see Note 4).

3. Allow the sample on the chuck to equilibrate in the cryostat at −20 °C for 15 min.

4. Align the sample so that the cryostat is cutting sections evenly across the root and root nodule. This can be done by taking about five sections and adjusting the chuck if part of the sample is being missed (see Note 5).

5. Once the center of the nodule (or other desired depth) is reached, thaw mount sections onto a glass slide by warming the back of the slide against your hand and then placing the front of the slide gently onto the tissue section.

6. Continue until the desired number of sections across the z stack (i.e., the depth) of the root nodule is obtained.

7. Keep the sections in a dry environment (e.g., a dry box) while preparing the TM Sprayer for matrix application (see Note 6).

8. Set the nitrogen gas on the TM Sprayer to 10 psi and the solvent pump to 0.25 mL/min. The solvent for the pump should match the solvent in which the matrix is dissolved (without the FA), so for DHB this would be 50% methanol and for CHCA this would be 50% acetonitrile. Turn on the TM Sprayer and laptop (see Note 7).

9. Set the temperature in the software to the appropriate temperature for the desired solvent and TM Sprayer system (see Note 8). As a starting point, 80 °C is the appropriate temperature for 50% methanol.

10. Load the dissolved matrix (i.e., DHB, CHCA, or SA; see Note 9) into the sample loop with the knob in the load position.

11. Load the TM Sprayer method and manually change the gas pressure and flow rate if the method differs from the initial parameters of 10 psi and 0.25 mL/min. The TM Sprayer has recommended methods for specific matrices and analyte types, although method parameters may need to be optimized for a specific application. For DHB imaging of peptides, method parameters typically used are 1250 velocity, 0.1 mL/min, 12 passes, 30 s dry time, rotate and offset (cc pattern), 10 psi, 80 °C. For imaging of peptides using CHCA and SA as matrices, method parameters to start from are 1100 velocity, 0.2 mL/min, 8 passes, 30 s dry time, rotate and offset (cc pattern), 10 psi, 85 °C (see Note 10).

12. Once the TM Sprayer has reached the appropriate temperature, add slides containing sample to the sample holder. Secure slides in place as necessary to prevent movement during matrix application.

13. Switch the sample loop knob to the spray position. Once matrix is coming out of the nozzle, start the TM Sprayer program.

14. After the matrix application is finished, cool down the system while flushing with the solvent the matrix is dissolved in (for DHB, this would be 50% methanol) at 0.25 mL/min. Rinse the sample loop three times with solvent and toggle the knob. Once the system is below 50 °C, the system can be turned off.

15. Store the sample in a dry box at −20 °C if running on the instrument the following day.
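The sprayer parameters in step 11 can be compared through the amount of matrix deposited per unit area. The sketch below uses a common approximation (density = passes × concentration × flow rate / (velocity × track spacing)). It assumes that the velocity is given in mm/min and uses a hypothetical track spacing of 3 mm, neither of which is stated in the protocol. It also illustrates the adjustment described in Note 10: halving the flow rate while doubling the number of passes leaves the density unchanged.

```python
import math

def matrix_density_mg_per_mm2(passes, conc_mg_per_ml, flow_ml_per_min,
                              velocity_mm_per_min, track_spacing_mm):
    """Approximate matrix deposited per unit area over all passes."""
    return (passes * conc_mg_per_ml * flow_ml_per_min) / (
        velocity_mm_per_min * track_spacing_mm
    )

def drier_variant(passes, flow_ml_per_min):
    """Note 10 adjustment: half the flow rate, twice the passes."""
    return passes * 2, flow_ml_per_min / 2.0

TRACK_SPACING_MM = 3.0  # hypothetical; not given in the protocol

# DHB starting parameters from step 11 (velocity units assumed to be mm/min)
dhb_passes, dhb_flow, dhb_velocity, dhb_conc = 12, 0.1, 1250.0, 40.0

base = matrix_density_mg_per_mm2(dhb_passes, dhb_conc, dhb_flow,
                                 dhb_velocity, TRACK_SPACING_MM)
dry_passes, dry_flow = drier_variant(dhb_passes, dhb_flow)
drier = matrix_density_mg_per_mm2(dry_passes, dhb_conc, dry_flow,
                                  dhb_velocity, TRACK_SPACING_MM)

print(f"standard: {base:.4f} mg/mm^2, drier spray: {drier:.4f} mg/mm^2")
print("same density:", math.isclose(base, drier))
```

The absolute values depend on the assumed track spacing, so only the comparison between parameter sets should be read from this sketch.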

3.3 MSI Data Acquisition on the MALDI LTQ Orbitrap XL

1. Place the glass slide(s) with sample into the slide adapter. If importing the image of the glass slide, scan the slide in the adapter with a scanner. Then add the backing plate and insert the plate into the instrument. Alternatively, the slide can be scanned after inserting the plate into the instrument with the camera in the instrument (see Note 11).

2. Open the plate image in the MALDI source dialog box in the Tune software. Zoom in as necessary to see the sample, depending on sample size. Draw boxes around the areas to be imaged (see Note 12). Save this as a MALDI position file. For MS1 imaging, a rectangular box with raster motion works best. Also set the desired spatial step size (75 μm is the smallest raster size without oversampling).

3. In Xcalibur software (Thermo), set up the sequence by adding the file name, path location, instrument method, and MALDI position file. The instrument method contains parameters controlling the mass resolution, mass range, and centroid/profile data. The instrument method also requires a tune file, which controls the laser energy and the microscans (microscans/step is controlled in the instrument file). The microscans and microscans/step settings should match to ensure that one pixel corresponds to one mass spectrum in the data file.

4. Check the laser energy by firing the laser on a matrix-only area that is not being imaged and checking the signal level. Adjust the laser energy in the tune file as necessary to get the optimal signal.

5. Start the sequence.
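Because one pixel corresponds to one mass spectrum, the number of spectra in a run follows directly from the size of the drawn rectangle and the raster step size. The sketch below enumerates raster positions for a hypothetical 3 mm × 2 mm region; the region size and the function name are illustrative assumptions, and the position file written by the Tune software is not reproduced here.

```python
def raster_positions(width_um, height_um, step_um=75):
    """Return (x, y) laser positions in micrometers, row by row, for a rectangular region."""
    n_cols = width_um // step_um + 1
    n_rows = height_um // step_um + 1
    return [(col * step_um, row * step_um)
            for row in range(n_rows)
            for col in range(n_cols)]

# Hypothetical region drawn around a nodule section: 3000 um x 2000 um
positions_75 = raster_positions(3000, 2000, step_um=75)
positions_150 = raster_positions(3000, 2000, step_um=150)

print(len(positions_75), "spectra at 75 um steps")
print(len(positions_150), "spectra at 150 um steps")
```

Halving the step size roughly quadruples the number of spectra, which is worth keeping in mind when trading spatial resolution against acquisition time and file size.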


3.4 Data Processing

1. Once the data are collected, they can be viewed in ImageQuest or exported to another software program. To visualize the data in ImageQuest, use the average spectra within a selected area tool to view an average spectrum of a certain area of the sample. In the bottom window of ImageQuest, there should be a spectrum from the sample. Figure 2 shows example spectra averaged over the nodules for peptide imaging results with DHB, CHCA, and SA matrices.

2. Look through the peaks in the collected spectrum, zooming in as appropriate. To visualize the distribution of a certain peak in the tissue, select add new data set. Select the single dataset option with plot type Mass Range/TIC. Use the m/z for the mass range and select the desired tolerance window (e.g., 5 ppm). Repeat as necessary to visualize other m/z values in the sample. Under the 2D tab, there are other color bar options as well as smoothing options.

3. To view the data in MSiReader [18], export the data from ImageQuest in imzML format, keeping the data in profile mode.

4. Load the imzML file into MSiReader and select the desired mass tolerance, image smoothing, and color bar parameters. Enter an m/z that is localized across the sample to visualize the sample (a suitable m/z can be found in ImageQuest). Normalize to the total ion count (TIC). To pull out m/z values unique to the sample, use the polygon tool to create interrogated and reference zones. Outline the sample to create an interrogated zone, then create a matrix-only region for the reference zone.

5. Use the extract peaks unique to the interrogated zone tool to create a list of m/z values present in the image. Set the percentage threshold that an m/z needs to exceed in the interrogated zone and the percentage threshold that it needs to stay below in the reference zone to be added to the list. Also set the algorithm for peak centroid calculation (typically parabolic centroid works well).

6. Once the list has been created, use the generate an image for each peak in a list tool to create images for all the m/z values. Manually go through the images and remove any bad images (i.e., images that have signal in the matrix as well as the sample or do not appear to have any signal anywhere). Figure 3 shows example MALDI-MSI images generated from peptide imaging of root nodules with either DHB or CHCA as the matrix. Different distributions across the root and root nodules are observed.
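The logic behind steps 4 and 5 (TIC normalization, then keeping m/z values that are common in the interrogated zone and rare in the matrix-only reference zone) can be sketched as follows. This is not MSiReader's implementation: the function names, the presence criterion (intensity above zero), the threshold values, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def tic_normalize(intensities):
    """Divide each pixel's spectrum by its total ion count (rows = pixels, columns = m/z bins)."""
    intensities = np.asarray(intensities, dtype=float)
    tic = intensities.sum(axis=1, keepdims=True)
    normalized = np.zeros_like(intensities)
    np.divide(intensities, tic, out=normalized, where=tic > 0)  # empty pixels stay at zero
    return normalized

def unique_peaks(intensities, in_sample, in_reference,
                 min_sample_frac=0.5, max_reference_frac=0.1):
    """Indices of m/z bins present in enough sample pixels and in few reference pixels."""
    present = np.asarray(intensities) > 0
    sample_frac = present[in_sample].mean(axis=0)
    reference_frac = present[in_reference].mean(axis=0)
    keep = (sample_frac >= min_sample_frac) & (reference_frac <= max_reference_frac)
    return np.flatnonzero(keep)

# Hypothetical data: 4 pixels x 3 m/z bins; pixels 0-1 are tissue, pixels 2-3 are matrix only.
toy = np.array([[10.0, 0.0, 5.0],
                [8.0, 0.0, 4.0],
                [0.0, 6.0, 5.0],
                [0.0, 7.0, 3.0]])
in_sample = np.array([True, True, False, False])
in_reference = ~in_sample

normalized = tic_normalize(toy)
selected = unique_peaks(normalized, in_sample, in_reference)
print(selected)  # bin 0 is tissue-specific; bin 1 is matrix-only; bin 2 occurs in both zones
```

As in step 6, any list produced this way still needs manual inspection of the resulting images.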


Fig. 2 Example spectra averaged over the entire root nodules for MALDI-MSI on the root nodules with different matrices. The matrices are CHCA (a), DHB (b), and SA (c)


4 Notes

1. For best results, select nodules that are red in color and elongated (rod-shaped) rather than round. These are the nodules in which the symbiosis is well developed.

2. To make the sectioning process easier, ensure that the nodule is as flat as possible with the root in line with the nodule. This will help to get both the root and the nodule in the same plane when sectioning.

3. If the nodule is not completely frozen when covered in gelatin, it will not stick to the bottom of the cup and instead will float up to the middle or top of the cup. This makes the nodule harder to find and may result in the positioning of the nodule being lost. After adding the gelatin, the cup should be kept level while waiting for the rest of the gelatin to freeze. If the gelatin freezes at an angle, it will be harder to level the nodule while sectioning to get both the root and root nodule in a single section. Avoid air bubbles close to the nodule when adding the gelatin, as these also make the nodules harder to section.

4. OCT compound is beneficial as it provides a way to “glue” the sample to the cryostat chuck during sectioning. However, it is a polymeric species and will suppress analyte signal if it comes into contact with the sample. Thus, care should be taken to ensure that the OCT compound does not come into contact with the sample or with the blade or stage of the cryostat. By placing a small drop of OCT on the back of the gelatin-surrounded sample, where the OCT does not contact the sample or blade, the sample can be secured onto the chuck without any interference from the OCT.

Fig. 3 MALDI-MSI images of peptides with either DHB (a, b) or CHCA (c, d, e) as the matrix. The images are generated at ±5 ppm


5. For plant root nodules, our lab typically uses 16 μm sections, but other section thicknesses between 8 and 35 μm [19] can be used.

6. After sectioning and before matrix application, washing steps to remove highly abundant lipid species can increase signal intensity and the number of observed protein peaks [20]. For protein imaging, ethanol washes and potentially a Carnoy wash are typically used to remove the lipid species that can suppress protein signal. For endogenous peptide imaging, washes may (or may not) remove the target peptides, depending on the chemical properties of the peptides. Thus, care should be taken when using washing techniques with peptides to ensure that they are not being removed in the washing steps.

7. Here the TM Sprayer is used to apply the matrix evenly across the sample. It is important that the matrix is applied in a homogeneous manner at all points on the tissue so that matrix inhomogeneity does not skew the results. A matrix application method should be reproducible run-to-run to ensure that results remain consistent. Other automatic sprayers can be used (e.g., home-built or the Bruker ImagePrep). Other matrix application techniques include the airbrush and sublimation [21]. Airbrush application can be achieved easily with minimal expense; however, user-to-user variation can be high and reproducibility can be a challenge. Sublimation provides very small crystal size and good imaging results for metabolomics studies, but due to the dry application, the method requires further recrystallization steps for analysis of larger molecules (i.e., peptides and proteins) [22].

8. The temperature of the TM Sprayer should be about 5 °C below the temperature at which the “puffing” sound starts. This sound indicates that the matrix is not being sprayed in a consistent manner. If run at a temperature at which the solvent is “puffing,” the matrix will not cover the sample homogeneously, which will negatively affect results.

9. There are many different matrices to choose from. DHB and CHCA are both common matrices and can be used for a variety of analytes. Other matrices may be used primarily for larger peptides and proteins (e.g., SA) or primarily for negative mode (e.g., 9-aminoacridine). Matrices other than DHB and CHCA may work well depending on your desired analyte.

10. If this method is too wet, you can cut the flow rate in half and double the number of passes to achieve the same matrix density but with a drier spray.

11. The preferred scanning method depends on the sample and time considerations. For the nodules, scanning in with the camera on the instrument provides good alignment and image quality, but this takes 25 min per slide. For larger tissues, the scanner separate from the instrument works well and saves time.

12. To check the alignment of the image to the slide in the instrument, click a point on the image and check the cursor position in the camera box on the tune page to see where the actual position is. It can also be helpful to check the outside of the boxes to ensure the sample is not being cut off.

Acknowledgments

This work was supported in part by funding from the National Science Foundation (NSF) Division of Integrative Organismal Systems (IOS) RESEARCH PGR award #1546742, the University of Wisconsin-Madison Graduate School and the Wisconsin Alumni Research Foundation (WARF). The MALDI-Orbitrap and Q-Exactive instruments were purchased through an NIH shared instrument grant (NCRR S10RR029531). LL acknowledges a Vilas Distinguished Achievement Professorship and Charles Melbourne Johnson Professorship with funding provided by the WARF and University of Wisconsin-Madison School of Pharmacy.

References

1. Caprioli RM, Farmer TB, Gile J (1997) Molecular imaging of biological samples: localization of peptides and proteins using MALDI-TOF MS. Anal Chem 69:4751–4760

2. Goodwin RJ, Pennington SR, Pitt AR (2008) Protein and peptides in pictures: imaging with MALDI mass spectrometry. Proteomics 8:3785–3800

3. Buchberger AR, DeLaney K, Johnson J et al (2018) Mass spectrometry imaging: a review of emerging advancements and future insights. Anal Chem 90:240–265

4. Ye H, Gemperline E, Venkateshwaran M et al (2013) MALDI mass spectrometry-assisted molecular imaging of metabolites during nitrogen fixation in the Medicago truncatula-Sinorhizobium meliloti symbiosis. Plant J 75:130–145

5. Gemperline E, Jayaraman D, Maeda J et al (2015) Multifaceted investigation of metabolites during nitrogen fixation in Medicago via high resolution MALDI-MS imaging and ESI-MS. J Am Soc Mass Spectrom 26:149–158

6. Chen RB, Li L (2010) Mass spectral imaging and profiling of neuropeptides at the organ and cellular domains. Anal Bioanal Chem 397:3185–3193

7. Chaurand P, Norris JL, Cornett DS et al (2006) New developments in profiling and imaging of proteins from tissue sections by MALDI mass spectrometry. J Proteome Res 5:2889–2900

8. Lee YJ, Perdian DC, Song Z et al (2012) Use of mass spectrometry for imaging metabolites in plants. Plant J 70:81–95

9. Gemperline E, Keller C, Jayaraman D et al (2016) Examination of endogenous peptides in Medicago truncatula using mass spectrometry imaging. J Proteome Res 15:4403–4411

10. Poth AG, Mylne JS, Grassl J et al (2012) Cyclotides associate with leaf vasculature and are the products of a novel precursor in petunia (Solanaceae). J Biol Chem 287:27033–27046

11. Cavatorta V, Sforza S, Mastrobuoni G et al (2009) Unambiguous characterization and tissue localization of Pru P 3 peach allergen by electrospray mass spectrometry and MALDI imaging. J Mass Spectrom 44:891–897

12. Grassl J, Taylor NL, Millar AH (2011) Matrix-assisted laser desorption/ionisation mass spectrometry imaging and its development for plant protein imaging. Plant Methods 7:11

13. Tavormina P, De Coninck B, Nikonorova N et al (2015) The plant peptidome: an expanding repertoire of structural features and biological functions. Plant Cell 27:2095–2118

14. Batut J, Mergaert P, Masson-Boivin C (2011) Peptide signalling in the rhizobium-legume symbiosis. Curr Opin Microbiol 14:181–187

15. Van de Velde W, Zehirov G, Szatmari A et al (2010) Plant peptides govern terminal differentiation of bacteria in symbiosis. Science 327:1122–1126

16. Mortier V, Den Herder G, Whitford R et al (2010) CLE peptides control Medicago truncatula nodulation locally and systemically. Plant Physiol 153:222–237

17. Mortier V, De Wever E, Vuylsteke M et al (2012) Nodule numbers are governed by interaction between CLE peptides and cytokinin signaling. Plant J 70:367–376

18. Robichaud G, Garrard KP, Barry JA et al (2013) MSiReader: an open-source interface to view and analyze high resolving power MS imaging files on Matlab platform. J Am Soc Mass Spectrom 24:718–721

19. Qin L, Zhang Y, Liu Y et al (2018) Recent advances in matrix-assisted laser desorption/ionisation mass spectrometry imaging (MALDI-MSI) for in situ analysis of endogenous molecules in plants. Phytochem Anal 29:351–364

20. Seeley EH, Oppenheimer SR, Mi D et al (2008) Enhancement of protein sensitivity for MALDI imaging mass spectrometry after chemical treatment of tissue sections. J Am Soc Mass Spectrom 19:1069–1077

21. Gemperline E, Rawson S, Li L (2014) Optimization and comparison of multiple MALDI matrix application methods for small molecule mass spectrometric imaging. Anal Chem 86:10030–10035

22. Yang J, Caprioli RM (2011) Matrix sublimation/recrystallization for imaging proteins by mass spectrometry at high spatial resolution. Anal Chem 83:5728–5734


Chapter 26

Cystatin Activity–Based Protease Profiling to Select Protease Inhibitors Useful in Plant Protection

Marie-Claire Goulet, Frank Sainsbury, and Dominique Michaud

Abstract

Protease inhibitors of the cystatin protein superfamily show potential in plant protection for the control of herbivorous pests. Here, we describe a cystatin activity–based profiling procedure for the selection of potent cystatin candidates, using single functional variants of tomato cystatin SlCYS8 and digestive Cys proteases of the herbivorous insect Colorado potato beetle as a case study. The procedure involves the capture of target Cys proteases with biotinylated versions of the cystatins, followed by the identification and quantitation of captured proteases by mass spectrometry. An example is given to illustrate the usefulness of the approach as an alternative to current procedures for recombinant inhibitor selection based on in vitro assays with synthetic peptide substrates. A second example is given showing its usefulness as a tool to compare the affinity spectra of inhibitor variants toward different subsets of target protease complements.

Key words Plant protease inhibitors, Herbivorous insect digestive proteases, Cystatin activity–based protease profiling, Cys protease capture, Biotinylated cystatins

1 Introduction

Many authors have discussed the potential of plant protease inhibitors to protect crops from herbivorous pests [1, 2], and the implementation of protease inhibitor–expressing plant lines in agricultural fields has been documented in recent years [3, 4]. These proteins act as competitive pseudosubstrate inhibitors that enter the active site cleft of target proteases and prevent peptide bond hydrolysis [5]. Inhibited enzymes in the pest midgut may no longer process plant proteins, causing dietary protein wastage, amino acid shortage, developmental delays, and eventual death of the herbivore [6].

Significant efforts have been deployed over the years to improve the potency of protease inhibitors for pest control, mostly involving the rational design of inhibitor variants with improved activity toward model target proteases and/or the engineering of hybrid inhibitor fusions integrating multiple functional domains

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_26, © Springer Science+Business Media, LLC, part of Springer Nature 2020


[7]. At present, a complementary task to harness the full potential of these proteins in plant protection consists of developing analytical tools adapted to the functional characterization and proper selection of potent inhibitor candidates. Current procedures to compare the potency of protease inhibitors against herbivore digestive proteases typically rely on in vitro protease inhibitor assays with synthetic peptide substrates to calculate dissociation constants (Kd) toward model proteases or to determine threshold inhibitory concentrations (e.g., IC50) for the inhibition of specific protease families in midgut extracts [8]. Such measurements, although giving basic information about the relative inhibitory potency of inhibitor candidates, say little about the eventual value of these proteins in actual plant–pest contexts. Peptide substrates for diagnostic purposes are selected based on their specificity toward well-defined protease families, but their resistance to some isoforms of these families, or their susceptibility to isoforms of other protease families, cannot be ruled out a priori when assessing complex protease complements such as those usually found in herbivorous arthropods [9, 10]. Most importantly, in vitro assays to monitor the activity of model protease subsets do not consider the whole complement of protease targets in the herbivore, and hence the full range of protease isoforms eventually staying active in the midgut after inhibitor intake [11].

In practice, a straightforward way to select protease inhibitors among a collection of possible variants may be not only to test their inhibitory potency against a few selected protease models, but also to compare their effective binding range against the whole range of eventual protease targets in the pest midgut. To this end, we devised an activity-based functional proteomics approach that allows for a direct comparison of inhibitor affinity profiles toward whole midgut protease complements in crude protein extracts of herbivorous insects [12]. The approach involves the capture of insect cysteine (Cys) proteases with Cys-type inhibitors of the plant cystatin protein family [13], followed by liquid chromatography–tandem mass spectrometry (LC-MS/MS) peptide analysis of the captured proteases. Unlike in vitro activity assays with peptide substrates, a picture of direct protease–inhibitor interactions in source extracts is obtained, with no masking or confounding effects to generate over- or underestimation of target protease binding ranges [11]. A step-by-step protocol is described here for the procedure, using single functional variants of tomato cystatin SlCYS8 and digestive proteases of the major coleopteran pest Colorado potato beetle as a “plant (inhibitor)–insect (proteases)” model of agronomic significance [14].


2 Materials

2.1 Biotinylated Plant Cystatins

1. Biotinylated cystatins expressed in Escherichia coli for cystatin–protease complex enrichment on avidin-embedded agarose beads [12, 15] (see Note 1).

2.2 Insect Midgut Proteins

1. Insect proteins from snap-frozen Colorado potato beetle (Leptinotarsa decemlineata) fourth-instar larvae reduced to a fine powder in liquid nitrogen (Praxair) [11].

2.3 Laboratory Tools and Materials

1. Reduced glutathione–Sepharose agarose beads (GE Healthcare).

2. Novagen™ human factor Xa (EMD Millipore).

3. Pierce NeutrAvidin™ agarose beads (Thermo Fisher Scientific).

4. Bio-Rad Protein Assay Kit™ (Bio-Rad).

5. Sequencing grade trypsin (Promega).

6. Mini-PROTEAN Tetra Cell™ unit for protein 1D gel electrophoresis (Bio-Rad).

7. Multi-Mix™ tube rotator (VWR).

8. Gel scanner and image analysis software for protein densitometry in polyacrylamide slab gels (see Note 2).

9. Temperature-controlled shaker.

10. Refrigerated centrifuge.

11. Centrifugal vacuum concentrator.

12. Basic UV–visible spectrophotometer.

2.4 Media, Buffers and Other Solutions

Culture media, buffers, and solvents are made up as aqueous solutions with ultrapure water and analytical grade reagents. All solutions are stored at 4 °C unless otherwise indicated. Working buffers, reagents, and standard step-by-step protocols for SDS-PAGE are used as described in [16].

1. Luria–Bertani (LB) medium: 10 g/L tryptone, 5 g/L yeast extract, 5 g/L NaCl, pH 7.0.

2. 10 μg/mL chloramphenicol (Sigma-Aldrich).

3. 100 μg/mL carbenicillin (Sigma-Aldrich).

4. 1 mM isopropyl β-D-1-thiogalactopyranoside (Sigma-Aldrich).

5. 50 μM D-biotin (Sigma-Aldrich).

6. Phosphate-buffered saline (PBS), pH 7.4.

7. 100 mM citrate phosphate buffer, pH 6.0.

8. Agarose beads washing buffer: 100 mM citrate phosphate, pH 6.0, supplemented with 250 mM NaCl and 10 mM L-cysteine.

9. 100 mM ammonium bicarbonate.

10. 50% (v/v) acetonitrile.

11. 10 mM dithiothreitol (Sigma-Aldrich).

12. 55 mM iodoacetamide (Sigma-Aldrich).

13. 2.0% (v/v) acetonitrile/1.0% (v/v) formic acid.

14. 50% (v/v) acetonitrile/1.0% (v/v) formic acid.

15. 0.1% (v/v) formic acid.

3 Methods

3.1 Capture of Target Proteases with Biotinylated Cystatins

The whole procedure in this chapter includes two steps consisting of (1) capturing cystatin-sensitive target protease isoforms with biotinylated AviTagged cystatins (this section) (Fig. 1); and (2) quantifying the captured proteases by LC-MS/MS analysis (Subheading 3.2).

3.1.1 Heterologous Expression and Purification of the AviTagged Cystatins

1. Grow 5-mL E. coli cultures overnight at 37 °C in LB medium supplemented with 10 μg/mL chloramphenicol and 100 μg/mL carbenicillin (see Notes 3 and 4).

2. Transfer the overnight cultures into 500 mL of LB medium containing 100 μg/mL carbenicillin.

3. Allow the bacteria to multiply at 37 °C under agitation until reaching an OD600 of 0.4–0.6.

4. Add 1 mM isopropyl β-D-1-thiogalactopyranoside (Sigma-Aldrich) to induce protein expression, and 50 μM D-biotin (Sigma-Aldrich) to induce AviTag peptide biotinylation (see Note 5).

5. Grow bacteria for 16 h at 37 °C under agitation (see Note 6).

6. Centrifuge the cultures at 6000 × g for 5 min at 4 °C, and discard the supernatants.

7. Submit the pellets to several freeze/thaw cycles at −20 °C to break bacterial cells. A minimum of three freeze–thaw cycles is needed to obtain proper lysis of the bacteria.

8. Affinity-purify the AviTagged cystatins with reduced glutathione–Sepharose agarose beads (GE Healthcare) as described by the provider (see Note 3).

9. Remove the GST moiety by cleavage with Novagen™ human factor Xa (EMD Millipore) as described by the provider.

356 Marie-Claire Goulet et al.

10. Check the overall quality of the purified inhibitor preparations on Coomassie blue-stained polyacrylamide slab gels following 12% (w/v) SDS-PAGE [16].

11. Quantify the purified cystatins by densitometric analysis of Coomassie blue-stained bands using a high-resolution gel scanner and appropriate image analysis software, after generating a protein standard curve with bovine serum albumin as a reference.
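The standard-curve quantitation in step 11 can be sketched as a simple linear fit. This is a minimal illustration only: it assumes band intensity is linear with protein amount over the working range, the function names are hypothetical, and the BSA intensities below are invented placeholder values, not data from this chapter.

```python
# Minimal sketch of BSA standard-curve quantitation (assumption: linear response).
# All numeric values are hypothetical placeholders, not measurements.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_amount(intensity, slope, intercept):
    """Invert the standard curve: band intensity -> protein amount (ug)."""
    return (intensity - intercept) / slope


# BSA standards loaded on the gel (ug) and their band intensities (arbitrary units).
bsa_ug = [0.5, 1.0, 2.0, 4.0]
bsa_intensity = [1200.0, 2200.0, 4200.0, 8200.0]

slope, intercept = fit_line(bsa_ug, bsa_intensity)
cystatin_ug = estimate_amount(5200.0, slope, intercept)
print(f"Estimated cystatin in band: {cystatin_ug:.2f} ug")
```

Bands whose intensity falls outside the range covered by the BSA standards should be re-run at a different load rather than extrapolated.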

3.1.2 Binding of AviTagged Cystatins to NeutrAvidin™ Agarose Beads

1. Centrifuge the NeutrAvidin™ agarose beads for 2 min at 500 × g to remove commercial storage buffer (see Note 7).

2. Wash the beads with one volume of PBS, pH 7.4, and centrifuge for 2 min at 500 × g.

3. Repeat step 2 twice.

4. Add biotinylated AviTagged cystatins (Subheading 3.1.1) in excess concentration (see Note 8) and incubate the beads at 20 °C for 30 min with gentle agitation on a VWR Multi-Mix tube rotator.

5. Wash the beads with 10 volumes of PBS, pH 7.4, and centrifuge for 2 min at 500 × g.

6. Repeat step 5 twice.

7. Submit 5 μL of the cystatin-embedded beads to 12% (w/v) SDS-PAGE and stain with Coomassie blue to confirm proper cystatin binding. The beads can be stored at 4 °C until use, pending adequate stability of the recombinant inhibitor.

Fig. 1 Schematic overview of the cystatin activity-based protease capture procedure. Phase 1: Bacterially expressed AviTagged inhibitors (e.g., AviTagged cystatins) are ligated to D-biotin by the action of a biotin ligase, in vivo during their expression in E. coli or in vitro following their recovery from the bacteria (Subheading 3.1.1). Phase 2: The biotinylated inhibitors (cystatins) are incubated with NeutrAvidin™ agarose beads to generate cystatin-embedded agarose matrices (or beads) for protease capture (Subheading 3.1.2). Phase 3: The beads are incubated with a crude protein extract of the putative target proteases (Subheading 3.1.3) to capture those protease isoforms that show affinity for the tested inhibitor variants (Subheading 3.1.4). These beads bound to the inhibitor–target protease complexes then serve as source material for LC-MS/MS analyses (Subheading 3.2)

3.1.3 Extraction of Target Proteases

1. Extract insect powder proteins in two volumes (e.g., 2 mL buffer/g fresh powder) of 100 mM citrate phosphate buffer, pH 6.0.

2. Keep the mixture on ice for 10 min.

3. Discard insoluble material by centrifugation at 20,000 × g for 10 min at 4 °C.

4. Assay soluble proteins in the supernatant using the Bio-Rad Protein Assay™ kit (Bio-Rad), as described by the provider.

5. Use the supernatant as freshly prepared for Subheading 3.1.4, or keep it at −80 °C until use.

3.1.4 Target Protease Capture on Cystatin-Embedded Agarose Beads

1. Incubate 20 μL of cystatin-embedded beads (Subheading 3.1.2) with 5.5 mg of insect proteins (Subheading 3.1.3) in 900 μL of 100 mM citrate-phosphate buffer, pH 6.0, for 40 min at 20 °C with gentle agitation (see Note 9).

2. Collect the beads by centrifugation for 2 min at 1000 × g.

3. Wash by resuspension in 900 μL of agarose beads washing buffer (see Note 10).

4. Centrifuge for 2 min at 1000 × g.

5. Repeat steps 3 and 4 twice.

6. Add 10 μL of concentrated (4×) SDS-PAGE loading buffer [16] and 10 μL of agarose beads washing buffer to 20 μL of agarose beads.

7. Heat for 5 min at 95 °C.

8. Submit 25-μL samples of the resulting mixtures to 12% (w/v) SDS-PAGE and stain the resolved proteins with Coomassie blue [16].

9. Using a scalpel, collect protein band(s) in the gel corresponding to cystatin-captured proteases (see Note 11). A representative example of protein fraction profiles during the enrichment process is shown in Fig. 2 for AviTagged SlCYS8-captured proteases in Colorado potato beetle crude protein extracts.
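The volumes in steps 1 and 6 above lend themselves to a quick arithmetic check. The sketch below assumes that the 900 μL in step 1 is the total liquid volume of the binding reaction and uses a hypothetical extract concentration of 10 mg/mL; both are illustrative assumptions, and the function names are not part of the protocol.

```python
# Minimal sketch of the volume arithmetic behind steps 1 and 6.
# Assumptions: 900 uL is the total reaction volume; extract concentration is hypothetical.

def extract_volume_ul(target_mg, conc_mg_per_ml):
    """Volume of crude extract (uL) that contains target_mg of protein."""
    return target_mg / conc_mg_per_ml * 1000.0


def final_fold(stock_fold, stock_ul, total_ul):
    """Final concentration (x-fold) of a concentrated buffer after dilution."""
    return stock_fold * stock_ul / total_ul


REACTION_UL = 900.0           # step 1
TARGET_MG = 5.5               # step 1
extract_conc_mg_per_ml = 10.0  # hypothetical value from the protein assay

extract_ul = extract_volume_ul(TARGET_MG, extract_conc_mg_per_ml)
buffer_ul = REACTION_UL - extract_ul
if buffer_ul < 0:
    raise ValueError("Extract too dilute: 5.5 mg does not fit in 900 uL (see Note 9)")
print(f"Extract: {extract_ul:.0f} uL, citrate-phosphate buffer: {buffer_ul:.0f} uL")

# Step 6: 10 uL of 4x loading buffer + 10 uL washing buffer + 20 uL beads.
loading = final_fold(4.0, 10.0, 10.0 + 10.0 + 20.0)
print(f"Final loading buffer concentration: {loading:.1f}x")
```

With these numbers the loading buffer ends up at 1×, which is consistent with the volumes given in step 6.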

3.2 Mass Spectrometric Analysis of Captured Proteases

Gel slices collected at Subheading 3.1.4 are used as source material to identify and quantify cystatin-captured protease isoforms. This part of the procedure first involves protein sample preparation for mass spectrometry, followed by the LC-MS/MS analysis per se, peptide-based identification of the captured proteases, and their quantitation based on peptide spectral count sampling statistics.

3.2.1 Sample Preparation for Mass Spectrometry

1. Wash the gel slices for 5 min in water and destain proteins three times with equal volumes of 100 mM ammonium bicarbonate and 50% (v/v) acetonitrile.

2. Dry the gel slices by washing for 10 min in 50% (v/v) acetonitrile.

3. Reduce and alkylate entrapped proteins with 10 mM dithiothreitol and 55 mM iodoacetamide, respectively.

4. Hydrolyze the proteins for 18 h at 37 °C with 125 nM sequencing-grade trypsin (Promega) as described by the provider.

5. Extract resulting peptides from the gel matrix by incubation for 10 min in 2% (v/v) acetonitrile/1.0% (v/v) formic acid.

6. Perform a second extraction in 50% (v/v) acetonitrile/1.0% (v/v) formic acid.

7. Pool the two extractions and dry peptides in a centrifugal vacuum concentrator.

8. Resuspend in 12 μL of 0.1% (v/v) formic acid, from which 5 μL are taken for LC-MS/MS analysis.

[Fig. 2 gel image not reproduced. Lane labels: Beads (empty); Beads + biotin; Crude extract; Flow, Wash, and Beads for biotin/SlCYS8 and for biotin/Q47P. Mr markers: 200, 116, 97, 66, 45, 31, 21, 14.4, and 6.5 kDa. Boxes a, b, and c mark the bands discussed in the caption.]

Fig. 2 Avidin-affinity enrichment of Colorado potato beetle Cys proteases captured using biotinylated AviTagged SlCYS8. Q47P, a single functional variant of SlCYS8 with limited inhibitory activity against papain-like Cys proteases [17], was here used as a negative control for the protease capture step. Biotinylated cystatins bound to the avidin beads were incubated with the insect protein extract for target protease capture (Subheading 3.1.4). Proteins in test and control (Q47P) samples were visualized by Coomassie Blue staining following 12% (w/v) SDS-PAGE. The crude (Crude extract), flow-through (Flow), washing (Wash), and beads-bound (Beads) protein fractions are shown on the gel. A 30-kDa protein was readily detected in the Beads fraction using wild-type SlCYS8 (Box a), corresponding to the previously described Cys protease LdP30 purified from Colorado potato beetle midgut extracts by affinity chromatography with the model plant cystatin OCI as a ligand [18]. Boxes b and c correspond to avidin and AviTagged SlCYS8 recovered from the affinity beads, respectively. Mr, on the left, refers to molecular weight protein markers (kDa)

3.2.2 LC-MS/MS Analysis

Our LC-MS/MS analyses are performed at the Proteomics Platform of CHU de Quebec Research Center (http://proteomique.ulaval.ca), Quebec, QC, Canada. In brief, peptide samples produced at Subheading 3.2.1 are resolved by online reversed-phase nanoscale capillary LC and analyzed by electrospray MS/MS. An Eksigent ekspert™ nanoLC425 System is used, coupled to a Triple-TOF 5600 plus mass spectrometer equipped with a nanoelectrospray ion source (Sciex). Peptide separations take place in self-pack PicoFrit columns [75 μm ID/15 μm tip] (New Objective) packed with Reprosil-Pur C18 AQ media composed of 3-μm particles with pores of 120 Å (Dr. Maisch, Woburn, MA, USA). The peptides are eluted at 300 nL/min over 35 min along a 5–35% (v/v) acetonitrile–0.1% (v/v) formic acid linear gradient. Full-scan mass spectra [400–1250 m/z] are acquired under a data-dependent acquisition mode using the Analyst software, version 1.7 (Sciex). The 20 most intense ions are selected for collision-induced dissociation, with the dynamic exclusion period set at 20 s and a peptide ion mass tolerance of 100 ppm.
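Two of the settings above can be made concrete with a short calculation: the acetonitrile percentage at a given time along the 5–35% linear gradient, and the width in Da of a 100-ppm tolerance window at a given m/z. The sketch below is illustrative only; the function names are hypothetical and it does not reproduce any instrument software.

```python
# Minimal sketch: linear gradient composition and ppm-to-Da conversion.
# Function names are illustrative; defaults mirror the values stated in the text.

def percent_acetonitrile(t_min, start=5.0, end=35.0, duration_min=35.0):
    """Acetonitrile % (v/v) at time t_min along a linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time outside the gradient window")
    return start + (end - start) * t_min / duration_min


def ppm_to_da(mz, ppm):
    """Absolute mass tolerance (Da) for a relative tolerance in ppm at a given m/z."""
    return mz * ppm / 1e6


print(percent_acetonitrile(17.5))  # halfway through the gradient
print(ppm_to_da(400.0, 100.0))     # lower end of the full-scan range
print(ppm_to_da(1250.0, 100.0))    # upper end of the full-scan range
```

Because a ppm tolerance scales with m/z, the absolute window is about three times wider at 1250 m/z than at 400 m/z.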

3.2.3 Identification of Captured Proteases

MGF peak list files are generated with the Protein Pilot software, version 4.5 (Sciex) and analyzed using the Mascot software, version 2.5.1 (Matrix Science) to search the Uniprot protein sequence database (http://www.uniprot.org/). Search parameters for protein matching are set as follows: a fragment ion mass tolerance of 0.1 Da; a parent ion tolerance of 0.1 Da; iodoacetamide derivatives of Cys residues as fixed modification; oxidized Met residues as variable modification; and a maximum of two missed trypsin cleavages. MS/MS-based peptide and protein identifications are validated using the SCAFFOLD software, version 4.7.1 (Proteome Software). A false discovery rate of 1%, as determined with the Scaffold Local FDR algorithm, is applied for both peptides and proteins. Proteins that contain similar peptides and cannot be differentiated based on the MS/MS spectra are grouped to satisfy the principle of parsimony.
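To illustrate what "a maximum of two missed trypsin cleavages" means for the peptide search space, the sketch below performs a simple in silico tryptic digest. It assumes the common rule of cleavage after K or R except when the next residue is P; the chapter does not state which enzyme definition was selected in Mascot, so this rule, the function name, and the example sequence are assumptions for illustration, not the search engine's implementation.

```python
# Minimal sketch of an in silico tryptic digest with missed cleavages.
# Assumption: cleave C-terminal to K/R unless followed by P.

def tryptic_peptides(sequence, max_missed=2, min_length=1):
    """Return peptides with up to max_missed missed cleavage sites."""
    # Cleavage positions as slice boundaries, including both protein termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for i in range(len(sites) - 1):
        # j - i - 1 is the number of missed cleavages in the peptide.
        for j in range(i + 1, min(i + 2 + max_missed, len(sites))):
            peptide = sequence[sites[i]:sites[j]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


example = "AAKGGRCCKPDDKEE"  # arbitrary toy sequence, not a real protease
print(tryptic_peptides(example, max_missed=0))
print(len(tryptic_peptides(example, max_missed=2)))
```

Allowing missed cleavages enlarges the list of candidate peptides, which is why the limit is part of the reported search parameters.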

3.2.4 Quantitation of Captured Protease Peptides

Quantitative analysis of MS spectra is performed using spectral count sampling statistics [19] on those peptides that correspond to the digestive Cys proteases (or intestains) of Colorado potato beetle [12]. Differential numbers of captured proteases for the SlCYS8 variants are discriminated statistically with a significance threshold of 5%, considering spectral count mean values greater than 4 for at least one inhibitor variant [20].
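The filtering and normalization steps around the spectral count data can be sketched as follows. This is a minimal illustration under stated assumptions: the spectral counts are invented placeholder values, the function names are hypothetical, and the statistical test of ref. [19] is not reproduced here. Only the "mean greater than 4 for at least one variant" filter and the expression of counts relative to the wild-type mean (as in Figs. 3 and 4) are shown.

```python
# Minimal sketch of spectral-count filtering and normalization.
# Counts are hypothetical; the significance test itself is not implemented.
from statistics import mean


def passes_count_filter(counts_by_variant, min_mean=4.0):
    """True if at least one inhibitor variant has a mean spectral count > min_mean."""
    return any(mean(reps) > min_mean for reps in counts_by_variant.values())


def relative_to_reference(counts_by_variant, reference="WT"):
    """Express each replicate count relative to the mean count of the reference variant."""
    ref_mean = mean(counts_by_variant[reference])
    if ref_mean == 0:
        raise ValueError("reference mean is zero; relative counts are undefined")
    return {
        variant: [count / ref_mean for count in reps]
        for variant, reps in counts_by_variant.items()
    }


# Three insect replicates per variant (placeholder numbers).
counts = {"WT": [10, 12, 8], "P2V": [30, 28, 32], "T6R": [9, 11, 10]}

if passes_count_filter(counts):
    relative = relative_to_reference(counts)
    for variant, reps in relative.items():
        print(variant, round(mean(reps), 2))
```

By construction the reference variant has a mean relative value of 1, matching the way the figures in this chapter are scaled.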

3.3 Working Examples

Spectral count data in Subheading 3.2.4 may be used to address questions of practical or scientific interest. We used the approach in recent years to identify potent inhibitor variants for Colorado potato beetle control [11, 12, 21] (Subheading 3.3.1). We also used it to address basic questions about the evolution and structure/function relationships of protease–inhibitor interactions in plant/insect systems, again taking the Colorado potato beetle as a model [10, 22] (Subheading 3.3.2).

3.3.1 Example 1: The Protease Capture Approach as a Decision Tool to Select Cystatins Useful in Herbivore Pest Control

Attempts to implement resistance to Colorado potato beetle in potato using recombinant protease inhibitors have been hampered by the onset of multiple compensatory responses in this insect, like the expression of “insensitive” proteases or an increased consumption of leaf tissue to counterbalance the loss of digestive protease functions following inhibitor intake [23]. Despite obvious constraints in practice, protein engineering efforts have led over the years to the development of improved recombinant inhibitors potentially useful in plant protection [7], such as the SlCYS8 variants P2V and T6R, both shown to exhibit improved inhibitory potency against Colorado potato beetle cathepsin L-like and cathepsin B-like midgut protease activities [14]. Unexpectedly, transgenic potato lines engineered to express P2V showed strong detrimental effects against Colorado potato beetle fourth-instars, while T6R-expressing plant lines showed no effect for similar levels of recombinant cystatin in leaves [11]. Such an apparent discrepancy between the in vivo (feeding assay) and in vitro (protease assay) data could be explained using the cystatin-based protease capture approach, which indicated a broader affinity range for P2V toward Colorado potato beetle proteases despite similar inhibitory activities measured for the two cystatin variants using synthetic peptide substrates (Fig. 3).

3.3.2 Example 2: The Protease Capture Approach as an Analytical Tool to Address Basic Questions on the Evolution and Protease Binding Preferences of Plant Cystatins

Complex protease inhibitor complements in plants are the result of evolutionary processes often involving gene duplication and positive selection of nonsynonymous mutations at functionally significant amino acid sites [24]. A well-documented case is the 8-domain potato multicystatin, an 88-kDa protease inhibitor induced in leaf tissue by Colorado potato beetle feeding [25]. The eight domains of this protein present hypervariable amino acid sites at conserved protease inhibitory motifs, assumed to be instrumental in its broad inhibitory range against insect digestive proteases [26]. An unsolved question at this point is the influence of structural constraints to amino acid variability on the contribution of positive selection to cystatin function. Amino acid substitutions retained in plant cystatins during their evolution often involved closely related residues [22] and the actual effects of positively selected amino acids on cystatin functional diversity remain to be fully explored.

Toward this goal, we used the cystatin-based protease capture approach to compare the affinity spectra of SlCYS8 single variants bearing a leucine (L), an isoleucine (I) or a valine (V) in place of the original proline-2 (P2) at positively selected amino acid site 2 in the N-terminal region [22] (see Fig. 3a for a visual representation of P2 on SlCYS8). L, I, and V differ from each other only by the spatial orientation of their terminal methyl groups and/or the distance between these functional groups and the α-carbon atom. We previously reported roughly similar inhibitory spectra for the P2I, P2L, and P2V variants toward Colorado potato beetle midgut proteases, based on in vitro assays with diagnostic peptide substrates for cathepsin L-like and cathepsin B-like activities [14]. A closer look at protease targets of the three variants using the protease capture approach, in fact, revealed an altered Cys protease binding profile for P2L [22]. Whereas wild-type SlCYS8, and single variants P2I and P2V, showed a net preference for the insect “intestain B” (IntB) protease subfamily, P2L showed a well-balanced dual affinity pattern for isoforms of the IntB and “intestain D” (IntD) subfamilies (Fig. 4). These observations showing different protease isoform preferences for P2I, P2L, and P2V pointed to an effective contribution of closely related amino acids to the positive selection-driven diversification of plant cystatin function. They also suggested, from an experimental standpoint, the usefulness of our protease capture approach to assess basic scientific questions about the evolution, structure, and function of these ubiquitous plant proteins.

[Fig. 3 image not reproduced. Recoverable labels: panels A–C; structure model annotated with Loop 1, Loop 2, P2, and T6; “Inhibition rate (%)” plotted against “SlCYS8 variant (1 µM)” and against “Inhibitor (nM)” for Z-Phe-Arg-MCA and Z-Arg-Arg-MCA; “Relative no. of spectra” plotted against “SlCYS8 variant” (WT, T6R, P2V).]

Fig. 3 Affinity spectra of tomato SlCYS8 and single functional variants T6R and P2V toward Colorado potato beetle digestive Cys proteases. (a) Structure model for SlCYS8 (GenBank Accession No. AF198390) showing the approximate position of residues Pro-2 (P2) and Thr-6 (T6) targeted to produce P2V and T6R. Details for the in silico modeling are given in ref. [10]. (b) Z-Phe-Arg-methylcoumarin (MCA) (cathepsin L-like) and Z-Arg-Arg-MCA (cathepsin B-like) hydrolyzing activities in larval midgut extracts preincubated with the three cystatin variants. Data on this panel were inferred from ref. [14]. Each bar is the mean of three independent (insect replicate) values ± SE. (c) Relative spectral counts for digestive Cys protease (intestain) peptides captured with biotinylated SlCYS8, P2V, or T6R in midgut extracts of fourth-instars. Data on this panel were inferred from ref. [11]. Spectral counts are expressed relative to total spectra counted for wild-type SlCYS8 (mean value of 1). Each bar is the mean of three independent (insect replicate) values ± SE

4 Notes

1. As an example, we here use functional variants of tomato cystatin SlCYS8 [14] bearing a D-biotin-bound AviTag peptide (SGGLNDIFEAQKIEWHE∗ [15]) at the C-terminus (see ref. [12]).

[Fig. 4 image not reproduced. Recoverable labels: panel A, “Relative # of spectra” for Total, IntB, and IntD intestains captured with WT, P2I, P2L, and P2V; panel B, IntB vs IntD pie charts for WT, P2I, P2V, and P2L.]

Fig. 4 Affinity spectra of wild-type SlCYS8 and single variants P2I, P2L, and P2V toward Colorado potato beetle digestive Cys proteases. (a) Relative spectral counts for intestain peptides captured with biotinylated wild-type SlCYS8 (WT), P2I, P2L, or P2V in midgut extracts of fourth instars. Spectral counts are expressed relative to wild-type SlCYS8 (mean value of 1) for all detected intestains (Total, corresponding to subfamilies IntA–F) or for peptides specific to major intestain subfamilies IntB and IntD [12]. Each bar is the mean of three independent (insect replicate) values ± SE. (b) Intestain subfamily preference patterns of wild-type SlCYS8, P2I, P2L, and P2V for major intestain families IntB and IntD. Pie charts illustrate the relative proportions of IntB- vs IntD-specific peptides detected in the insect crude extract. Data on this figure were inferred from ref. [22]

2. Several image analysis software programs may be used for protein densitometry, available on the market or freely distributed. We here use the Phoretix 2-D Expression software, v. 2005 (NonLinear USA).

3. Wild-type tomato cystatin SlCYS8 and single functional variants of this inhibitor (P2I, P2L, P2V, and T6R [14]) are here used for the demonstrations. All cystatin variants are expressed in and purified from E. coli, strain AVB101 (Avidity LLC) using the glutathione S-transferase (GST) gene fusion system (GE Healthcare). Gene constructs for the GST fusions are described in refs. [12, 22]. AVB101 E. coli cells express a biotin ligase, BirA, driving the in vivo biotinylation of AviTag peptides (see Note 4).

4. As an alternative to AVB101 E. coli cells, biotinylation can be performed in vitro following cystatin affinity purification using a commercial preparation of the BirA biotin ligase (EC 6.3.4.15) (Avidity LLC). This procedure is typically completed within 1 h. Detailed protocols for in vitro biotin ligation of AviTag peptides are available on the provider’s website (https://www.avidity.com/resources/protocols).

5. D-biotin is prepared as a 5 mM stock solution in warm 10 mM bicine, pH 8.3, and filter-sterilized through a 0.2-μm filter before use.

6. Optimal conditions for heterologous expression may vary from one protein to another. Temperature may be reduced to 20 °C at this step if the protein tends to form inclusion bodies.

7. Different resins and agarose beads are available for avidin-based enrichment. We previously worked with the TetraLink Avidin™ resin from Promega [12], but this product has not been available commercially in recent years. NeutrAvidin™ was here selected given its high specificity and strong affinity for biotin (Kd = 10⁻¹⁵ M). Strong denaturing conditions are required for protein elution, which ensures retention of biotinylated proteins on agarose beads throughout the protease capture process.

8. Ten microliters of NeutrAvidin™ agarose beads can support approximately 4 μg of purified cystatin, corresponding to approximately 320 pmol of inhibitor. If inhibitors of different molecular weights are compared in a given experiment, amounts applied to the beads must be adjusted so as to use equimolar concentrations of inhibitor. In the present case, we added 8 μg of cystatin per 10 μL of beads, that is, twice their binding capacity, to ensure saturation with the biotinylated inhibitors. Sufficient volumes of reaction mixture must be prepared at this step to ensure proper mixing of the solution during the 30-min incubation.

9. The volume of protein extract added must be optimized based on the abundance of Cys proteases in the source extract. A prefiltration or precipitation step may be necessary before protease capture for those extracts (e.g., plant leaf extracts) that contain dilute amounts of proteases. The pH of the binding reaction could also require adjustment for some extracts given its possible influence on inhibitor–protease interactions. The binding reaction was here performed at pH 6.0, corresponding to the overall pH optimum of Colorado potato beetle midgut Cys proteases.

10. The washing step must be optimized so as to minimize nonspecific binding while maintaining target protease binding to the immobilized inhibitor. The agarose beads washing buffer was here supplemented with 250 mM NaCl to minimize nonspecific binding and with 10 mM L-cysteine to provide reducing conditions for Cys protease activity. L-Cysteine may also be included in the binding reaction mixture to maintain target proteases in an active form.

11. AviTagged inhibitors and captured proteases could, in some cases, exhibit similar molecular weights. Optimization of electrophoretic conditions before protease band recovery might be indicated in such cases to avoid the masking of captured proteases following Coomassie blue staining and eventual interference by the inhibitor, found in large amounts in the bead eluate, during the MS/MS analysis.
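The quantities given in Notes 5 and 8 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 500-mL culture volume is taken from Subheading 3.1.1, and the apparent molecular weight is simply what the "4 μg ≈ 320 pmol" figure in Note 8 implies, not an independently reported value.

```python
# Minimal sketch of the arithmetic behind Notes 5 and 8.
# Function names are illustrative; inputs are the values stated in the text.

def stock_volume_ml(stock_mM, final_mM, final_volume_ml):
    """Volume of stock needed (C1*V1 = C2*V2) to reach final_mM in final_volume_ml."""
    return final_mM * final_volume_ml / stock_mM


def apparent_mw_g_per_mol(mass_ug, amount_pmol):
    """Molecular weight implied by a mass (ug) and a molar amount (pmol)."""
    return (mass_ug * 1e-6) / (amount_pmol * 1e-12)


def mass_ug_for_pmol(amount_pmol, mw_g_per_mol):
    """Mass (ug) of inhibitor needed to load a given molar amount (pmol)."""
    return amount_pmol * mw_g_per_mol * 1e-6


# Note 5: 5 mM D-biotin stock, 50 uM final in a 500-mL culture.
print(stock_volume_ml(5.0, 0.050, 500.0), "mL of stock")

# Note 8: 4 ug of cystatin on 10 uL of beads corresponds to about 320 pmol.
mw = apparent_mw_g_per_mol(4.0, 320.0)
print(round(mw), "g/mol implied by Note 8")

# Loading twice the binding capacity (640 pmol) of an inhibitor of that size.
print(mass_ug_for_pmol(640.0, mw), "ug per 10 uL of beads")
```

For an inhibitor of a different molecular weight, passing its own value to `mass_ug_for_pmol` gives the mass needed to keep the load equimolar, as recommended in Note 8.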

Acknowledgments

Work supported by Discovery and Discovery Accelerator Supplement grants from the Natural Sciences and Engineering Research Council of Canada to D.M.

References

1. Schluter U, Benchabane M, Munger A et al (2010) Recombinant protease inhibitors for herbivore pest control: a multitrophic perspective. J Exp Bot 61:4169–4183

2. Macedo MLR, de Oliveira CFR, Costa PM et al (2015) Adaptive mechanisms of insect pests against plant protease inhibitors and future prospects related to crop protection: a review. Protein Pept Lett 22:149–163

3. Chen M, Shelton A, Ye GY (2011) Insect-resistant genetically modified rice in China: from research to commercialization. Annu Rev Entomol 56:81–101

4. Li Y, Hallerman EM, Liu Q et al (2016) The development and status of Bt rice in China. Plant Biotechnol J 14:839–848

5. Birk Y (2003) Plant protease inhibitors. Springer, New York, NY

6. Broadway RM (2000) The adaptation of insects to protease inhibitors. In: Michaud D (ed) Recombinant protease inhibitors in plants. CRC Press, Boca Raton, FL, pp 80–88

7. Sainsbury F, Benchabane M, Goulet MC, Michaud D (2012) Multimodal protein constructs for herbivore insect control. Toxins 4:455–475

8. Michaud D, Nguyen-Quoc B (2000) Using natural and modified protease inhibitors. In: Michaud D (ed) Recombinant protease inhibitors in plants. CRC Press, Boca Raton, FL, pp 114–127

9. Srinivasan A, Giri AP, Gupta VS (2006) Structural and functional diversities in lepidopteran serine proteases. Cell Mol Biol Lett 11:132–154

10. Vorster J, Rasoolizadeh A, Goulet MC et al (2015) Positive selection of digestive Cys proteases in herbivorous Coleoptera. Insect Biochem Mol Biol 65:10–19

11. Rasoolizadeh A, Munger A, Goulet MC et al (2016) Functional proteomics-aided selection of protease inhibitors for herbivore insect control. Sci Rep 6:38827

12. Sainsbury F, Rheaume AJ, Goulet MC et al (2012) Discrimination of differentially inhibited cysteine proteases by activity-based profiling using cystatin variants with tailored specificities. J Proteome Res 11:5983–5993

13. Benchabane M, Schluter U, Vorster J et al (2010) Plant cystatins. Biochimie 92:1657–1666

14. Goulet MC, Dallaire C, Vaillancourt LP et al (2008) Tailoring the specificity of a plant cystatin toward herbivorous insect digestive cysteine proteases by single mutations at positively selected amino acid sites. Plant Physiol 146:1010–1019

15. Beckett D, Kovaleva E, Schatz PJ (1999) A minimal peptide substrate in biotin holoenzyme synthetase-catalyzed biotinylation. Protein Sci 8:921–929

16. Smith BJ (1984) SDS polyacrylamide gel electrophoresis of proteins. In: Walker JM (ed) Methods in molecular biology, Proteins, vol 1. Humana Press, Clifton, NJ, pp 41–55

17. Arai S, Watanabe H, Kondo H et al (1991) Papain-inhibitory activity of oryzacystatin, a rice seed cysteine proteinase inhibitor, depends on the central Gln-Val-Val-Ala-Gly region conserved among cystatin superfamily members. J Biochem 109:294–298

18. Visal-Shah SD, Vrain TC, Yelle S et al (2001) An electroblotting, two-step procedure for the detection of proteinases and the study of proteinase/inhibitor complexes in gelatin-containing polyacrylamide gels. Electrophoresis 22:2646–2652

19. Zhang B, VerBerkmoes NC, Langston MA et al (2006) Detecting differential and correlated protein expression in label-free shotgun proteomics. J Proteome Res 5:2909–2918

20. Old WM, Meyer-Arendt K, Aveline-Wolf L et al (2005) Comparison of label-free methods for quantifying human proteins by shotgun proteomics. Mol Cell Proteomics 4:1487–1502

21. Oppert B, Rasoolizadeh A, Michaud D (2014) The coleopteran gut and targets for pest control. In: Hoffmann K (ed) Insect molecular biology and ecology. CRC Press, Boca Raton, FL, pp 291–317

22. Rasoolizadeh A, Goulet MC, Sainsbury F et al (2016) Single substitutions to closely related amino acids contribute to the functional diversification of an insect-inducible, positively selected plant cystatin. FEBS J 283:1623–1635

23. Cingel A, Savic J, Lazarevic J et al (2016) Extraordinary adaptive plasticity of Colorado potato beetle: “ten-striped spearman” in the era of biotechnological warfare. Int J Mol Sci 17:1538

24. Christeller JT (2005) Evolutionary mechanisms acting on proteinase inhibitor variability. FEBS J 272:5710–5722

25. Bouchard E, Cloutier C, Michaud D (2003) Oryzacystatin I expressed in transgenic potato induces digestive compensation in an insect natural predator via its herbivorous prey feeding on the plant. Mol Ecol 12:2439–2446

26. Kiggundu A, Goulet MC, Goulet C et al (2006) Modulating the proteinase inhibitory profile of a plant cystatin by single mutations at positively selected amino acid sites. Plant J 48:403–413

Chapter 27

A Pipeline for Metabolic Pathway Reconstruction in Plant Orphan Species

Cristina Lopez-Hidalgo, Monica Escandon, Luis Valledor, and Jesus V. Jorrin-Novo

Abstract

In the era of high-throughput biology, it is necessary to develop a simple pipeline for metabolic pathway reconstruction in plant orphan species. However, obtaining a global picture of plant metabolism may be challenging, especially in nonmodel species. Moreover, the use of bioinformatics tools and statistical analyses is required. This chapter describes how to use different software and online tools for the reconstruction of metabolic pathways of plant species using existing pathway knowledge. In particular, Quercus ilex omics data are employed to develop the present pipeline.

Key words Metabolic pathways, Enzymes, Metabolomics, Proteomics, Transcriptomics

1 Introduction

Plants have an extraordinary level of metabolic diversity. This diversity has provided a vast source of natural products that are indispensable resources for humans, especially for our health and survival. Emphasizing this, Rai et al. [1] reported that over 60% of the drugs introduced in the past 20 years are based on plant extracts or their close derivatives. Owing to its importance, the study of the diversity of plant metabolism is essential. To achieve this objective, the elucidation of metabolic pathways and their reconstruction cannot be avoided. Metabolic pathways are defined as the full set of biochemical reactions that occur sequentially in biological systems. The substrates and products of these reactions are the metabolites, whose transformations are catalyzed by enzymes. Nevertheless, the discovery of full metabolic pathways and metabolites in plants is far from complete [2]. Reconstruction of metabolic pathways is vital to achieve this. In fact, despite the availability of several complete plant genomes, such as those of the model plant Arabidopsis thaliana (GCA_000001735.2; https://www.ncbi.nlm.nih.gov/assembly/GCF_000001735.4) and the forest tree Q. suber (GCA_002906115.1; https://www.ncbi.nlm.nih.gov/assembly/GCF_002906115.1), and the growing amount of transcriptomic, proteomic, and metabolomic data currently available, making sense of all these data at the metabolic level remains a major issue for plant scientists.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_27, © Springer Science+Business Media, LLC, part of Springer Nature 2020

Some of the obstacles to effective metabolic pathway reconstruction are the total number of metabolites in the plant kingdom, estimated at between 100,000 and 200,000 [3], and the high degree of compartmentation in plants [4]. In addition, metabolic pathway elucidation becomes more difficult considering the impressive range of secondary metabolites produced to escape from biotic or abiotic stressors, and the alteration of plant metabolic composition under different physiological and environmental conditions [5]. Finally, plant metabolomes can also reflect different genetic backgrounds, due to metabolic changes related to the environmental conditions at the origin of the population [6]. This difficulty is aggravated in nonmodel species, such as the vast majority of trees, due to incomplete or nonsequenced genomes, poor availability of structurally and functionally annotated databases [7], and the lack of optimized protocols for wet and in silico analyses that allow for the acquisition of reliable genetics, transcriptomics, proteomics, and metabolomics data.

Despite their difficulty as orphan and recalcitrant plant species, forest trees have been studied at the whole-system level, as have model plants [6, 8, 9]. These works have involved the use of multidisciplinary approaches, from visual phenotype to molecular -omics, through physiological and biochemical approaches. In fact, knowledge of the biosynthetic and metabolic pathways of tree natural products is still largely incomplete, but genomic and metabolomic information is expected to give clues to missing enzymes and reactions for the biosynthesis of diverse chemical substances, including those with medicinal and nutritional value, in addition to elucidating vital mechanisms underlying cellular physiology by deciphering relationships between genotype and phenotype [10].

In this direction, trying to fill this gap by combining the available high-throughput -omics and implementing the required methodology, we aim to provide a model workflow for the reconstruction of metabolic pathways of plant species using existing pathway knowledge, starting with some software guidelines, followed by the metabolic pathway image representation. This method was implemented in [11].

2 Materials

The reconstruction of metabolic pathways requires information on the multiple molecular levels that constitute a metabolic pathway, such as enzymes (transcript or protein sequences) and metabolites (substrates and products of the metabolic reactions). For this, the use of different omics technologies such as transcriptomics, proteomics, and metabolomics is essential. The workflow begins with the omics analysis of the tissues of interest. The omics data obtained (Subheading 2.1), consisting of sets of identified transcripts, proteins, or annotated metabolites, will be integrated into several metabolic pathways. Different software and online tools will be employed to carry out this integration (Subheading 2.2). These tools provide the visual representation of metabolic pathways. The workflow is shown in Fig. 1.

2.1 Datasets

The data employed in this work belong to previously published works [11–13].

2.1.1 Transcriptomics Datasets

The transcripts FASTA format file is obtained as indicated in [13]. In that work, a complete annotation of the Q. ilex transcriptome is carried out using the Sma3s v2 annotator (http://www.bioinfocabd.upo.es/node/11). Further information is given in Chapter 4 of this book.

2.1.2 Proteomics Datasets

The protein FASTA format file is obtained as indicated in [13]. Once the proteins are identified, the Proteome Discoverer software, version 2.2, allows the amino acid sequences to be exported as a FASTA format file.
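Before submitting an exported FASTA file to downstream annotation tools, it can be useful to check that it parses cleanly. The sketch below is a minimal, generic FASTA reader written for illustration; it is not part of Proteome Discoverer, the function name is hypothetical, and the record identifiers and sequences in the example are invented placeholders.

```python
# Minimal sketch of a FASTA reader (generic format, not tied to any vendor export).
# Example records are hypothetical placeholders.

def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict {identifier: sequence}."""
    records = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""  # identifier = first token of the header
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


example = [
    ">protein_1 hypothetical description",
    "MKTAYIAK",
    "QRQISFVK",
    "",
    ">protein_2",
    "MSDNE",
]
for identifier, sequence in read_fasta(example).items():
    print(identifier, len(sequence))
```

To read a file on disk, the same function can be called with an open file handle, for example `read_fasta(open("proteins.fasta"))`, where the file name is a placeholder.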

2.1.3 Metabolomics Datasets

The metabolites were obtained as indicated in [11]. Several pipelines are available for metabolite identification, including both commercial software such as Compound Discoverer 3.0 (Thermo Scientific™) and Progenesis QI (Nonlinear Dynamics) and open and free software packages such as MZmine2 [14], XCMS [15], and MS-DIAL [16]. The former group of software identifies compounds using online database search tools, including mzCloud™, ChemSpider™, KEGG, and METLIN [17], and local or in-house databases. In the data employed here, the raw data are analyzed with AMDIS (http://www.amdis.net/) and metabolites are “tentatively assigned” based on GC retention times (RT) and m/z values through searches in different databases, including the Golm Metabolome Database [18], Alkane, Fiehn library 1 and 2 [19], GC-TSQ, MoSys, and the NIST/EPA/NIH Mass Spectral Library.

The annotated metabolites are named using the KEGG compound reference database. For MapMan visualization, the names of the metabolites must be compatible with the MapMan metabolite identifiers. These identifiers can be found in “MappingMetabolitesdownload”, located in the MapMan store (https://mapman.gabipd.org/mapmanstore), or in the result file “Mercator_result” when the Mercator transcript or protein annotation is carried out.

Fig. 1 The workflow for metabolic pathway reconstruction is divided into four steps: omics data collection, bioinformatics for (semi)quantitative analyses, bioinformatics for annotation, and data visualization. Employed software and tools are referenced (transcriptomics, proteomics, and metabolomics)


2.2 Integration Tools

Different resources and web applications are employed to integrate the multiple omics information. The first one, Mercator [20], is an online tool to batch classify protein or gene sequences into MapMan functional plant categories. This tool allows for the automatic structuring of whole plant transcriptomes and/or proteomes. Once the annotation and functional plant categorization have been conducted, KEGG (Kyoto Encyclopedia of Genes and Genomes; http://www.genome.jp/kegg/) and MapMan (http://mapman.gabipd.org/) [21, 22] are used to visualize the data in different plant metabolic pathways.

3 Methods

3.1 Functional Plant Categorization

1. Go to the MERCATOR sequence annotation website (http://www.plabipd.de/portal/mercator-sequence-annotation).

2. Upload the FASTA format file (transcript or protein sequences) (Fig. 2a) (see Note 1).

3. Press START.

4. On the following page, the process status is displayed (Fig. 2b). This can take several minutes.

5. Once the process is completed, you will see the functional categories pie chart (Fig. 3).

6. Moreover, you can download the result. This result consists of a simple table (txt format file) which lists the classified transcripts or proteins in the different functional categories together with their descriptions. The DESCRIPTION column contains the annotated transcript or protein description, with information as shown in Table 1.

7. Now, extract the enzyme-related transcripts or proteins from the fourth column of the result table (DESCRIPTION column). These enzymes usually have an Enzyme Commission (EC) number (http://www.enzyme-database.org/) (see Note 2).

8. Create a txt format file with a single column containing the EC numbers (transcripts or proteins) and C numbers (metabolites).
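Steps 7 and 8 can also be scripted. The following is a minimal Python sketch, not the authors' procedure: it assumes the Mercator result is a tab-separated table whose fourth column is DESCRIPTION, that values may be wrapped in single quotes, and that EC numbers appear in the form "EC 4.1.1.39" as in Table 1. The function names, the sample rows, and the KEGG compound identifier used in the usage lines are hypothetical illustrations. Whether the KEGG mapper expects any prefix before each identifier is not stated in this chapter and should be checked on the tool's page.

```python
import re

# Assumption: EC numbers are written as "EC 4.1.1.39"; trailing fields may be "-".
EC_PATTERN = re.compile(r"\bEC[\s:]*(\d+(?:\.(?:\d+|-)){3})")


def extract_ec_numbers(table_text, description_column=3):
    """Return the sorted, unique EC numbers found in the DESCRIPTION column.

    description_column is zero-based, so 3 is the fourth column.
    """
    found = set()
    for line in table_text.splitlines():
        # Assumption: tab-separated fields, optionally wrapped in single quotes.
        fields = [field.strip().strip("'") for field in line.split("\t")]
        if len(fields) <= description_column:
            continue
        found.update(EC_PATTERN.findall(fields[description_column]))
    return sorted(found)


def build_kegg_list(ec_numbers, compound_ids):
    """Build the single-column text: EC numbers first, then KEGG C numbers."""
    return "\n".join(list(ec_numbers) + list(compound_ids)) + "\n"


if __name__ == "__main__":
    # Hypothetical two-row excerpt of a Mercator result table.
    sample = (
        "BINCODE\tNAME\tIDENTIFIER\tDESCRIPTION\tTYPE\n"
        "'1.3.1'\t'PS.calvin cycle'\t'contig_1'\t"
        "'Ribulose bisphosphate carboxylase large chain precursor (EC 4.1.1.39)'\tT\n"
        "'8.1.9'\t'TCA'\t'contig_2'\t'malate dehydrogenase (EC 1.1.1.37)'\tT\n"
    )
    print(build_kegg_list(extract_ec_numbers(sample), ["C00158"]), end="")
```

The printed text can be saved as the txt file described in step 8; EC numbers that are not assigned to any pathway will still be reported as not found by the mapper (see Note 3).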

3.2 KEGG Metabolic Pathways

The KEGG metabolic pathway database contains a collection of pathway maps that allow for the representation of molecular interactions and reactions. Both transcripts and proteins can be employed. To see the presence of both transcripts and proteins related to enzymes, the process must be conducted twice, once for each dataset.

1. Copy the EC and C numbers into the KEGG mapper (https://www.genome.jp/kegg/tool/map_pathway1.html) or upload the previously generated file with the list of these numbers (Fig. 4).

2. Select the organism to search against (press org and type ath for Arabidopsis thaliana (thale cress), or the code of another species, such as pop for Populus trichocarpa).

3. Press execute.

4. On the following page, the result is displayed (Fig. 5a) (see Note 3).

5. Choose the metabolic pathway by clicking on its name (e.g., ath00020 Citrate cycle (TCA cycle) - Arabidopsis thaliana (thale cress)).

6. A picture of the metabolic pathway with the metabolic reactions is shown (Fig. 5b). The detected items are highlighted in red.

Fig. 2 (a) Screenshot of the Mercator sequence annotator. This tool performs Blast searches against Arabidopsis TAIR 10, Swiss-Prot, and Uniref90. In the picture, other databases can be shown. The results are filtered by a threshold (Blast_cutoff). (b) Screenshot of the Mercator sequence annotator status process. (c) Screenshot of the Mercator finished status. When the status process indicates that it is finished, the results can be downloaded


Fig. 3 (a) Functional categorization and distribution in percentage of the proteins or genes, according to the categories established by MERCATOR. The pie chart shows different functional categories: PS (Photosynthesis), major CHO metabolism, minor CHO metabolism, glycolysis, fermentation, gluconeogenesis/glyoxylate cycle, OPP (Oxidative Pentose Phosphate), TCA/org transformation, mitochondrial electron transport/ATP synthesis, cell wall, lipid metabolism, N-metabolism, amino acid metabolism, S-assimilation, metal handling, secondary metabolism, hormone metabolism, cofactor and vitamin metabolism, tetrapyrrole synthesis, stress, redox, polyamine metabolism, nucleotide metabolism, biodegradation of xenobiotics, C1-metabolism, miscellaneous, RNA, DNA, protein, signaling, cell, micro RNA, natural antisense, development, transport, and not assigned. (b) Screenshot of the functional categorization result file (txt format), which lists for each gene or protein the BINCODE in the first column, the BINCODE name in the second column, the FASTA file identifier in the third column, the description of the annotated gene or protein in the fourth column, and the type of molecular component in the fifth column (T is transcript; P is protein; and M is metabolite). The description contains the information indicated in Table 1


Table 1 Example of the information indicated in the DESCRIPTION column

Annotated transcript/protein: (p48715|rbl_sinal:98.2) Ribulose bisphosphate carboxylase large chain precursor (EC 4.1.1.39) (RuBisCO large subunit) (Fragment) - Sinapis alba (White mustard) (Brassica hirta) & (atcg00490: 97.1) large subunit of RUBISCO. Protein is tyrosine-phosphorylated, and its phosphorylation state is modulated in response to ABA in Arabidopsis thaliana seeds. RBCL
Functions in: Ribulose-bisphosphate carboxylase activity
Involved in: Response to cadmium ion, carbon fixation, peptidyl-cysteine S-nitrosylation, response to abscisic acid stimulus
Located in: In 10 components
Expressed in: 24 plant structures
Expressed during: 14 growth stages
Contains InterPro domain/s: Ribulose bisphosphate carboxylase, large subunit, C-terminal (InterPro:IPR000685), Ribulose bisphosphate carboxylase, large subunit, ferredoxin-like N-terminal (InterPro:IPR017443), Ribulose bisphosphate carboxylase, large subunit, N-terminal (InterPro:IPR017444), Ribulose bisphosphate carboxylase, large chain, active site (InterPro:IPR020878)
BEST Arabidopsis thaliana protein match: Ribulose bisphosphate carboxylase large chain, catalytic domain (TAIR:AT2G07732.1). & (reliability: 194.2) & (original description: 526 nucleotides)
Component type: T

The annotated transcript/protein information contains the EC numbers of enzyme-related proteins and transcripts

3.3 MapMan Metabolic Representation

1. Start MapMan 3.6.0RC1 (https://mapman.gabipd.org/mapman-download) (see Note 4).

2. Load your mapping. The mapping file is the MERCATOR annotation file. To do so, choose the side panel Mapping (Fig. 6), right-click on Mapping, and press “New Mapping.” Choose “From file” and select the MERCATOR annotation file.

3. Once it has been loaded, import the list of metabolites and transcripts or proteins into the Experiment folder (Fig. 6). This must be a txt format file with two columns (“identifier” and “value”). For qualitative data, the value is one (Fig. 6). This way, all the omics items appear with the same color. For quantitative data, the values should contain log fold changes between a treatment and a reference (see Note 5).

4. Choose a pathway from the Pathways folder to visualize the metabolic pathway of interest.

5. A picture of the citrate cycle pathway is shown (Fig. 7).
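The two-column experiment file described in step 3 can be prepared with a short script. The sketch below is only an illustration under stated assumptions: the columns are tab-separated with a header row named as in step 3, the logarithm base is 2 (the text only says "log fold changes"), and the function names and identifiers are hypothetical. Identifiers must match those of the mapping file (see Note 5).

```python
import math


def qualitative_rows(identifiers):
    """Qualitative data: every detected item gets the value 1."""
    return [(identifier, 1) for identifier in identifiers]


def log2_fold_change_rows(treatment, reference):
    """Quantitative data: log2(treatment / reference) per identifier.

    Items that are missing or non-positive in either condition are skipped,
    because the ratio or its logarithm is undefined for them.
    """
    rows = []
    for identifier, treated in treatment.items():
        ref = reference.get(identifier)
        if ref is None or ref <= 0 or treated <= 0:
            continue
        rows.append((identifier, math.log2(treated / ref)))
    return rows


def to_experiment_text(rows):
    """Serialize rows as a two-column table (assumed tab-separated)."""
    lines = ["identifier\tvalue"]
    lines += [f"{identifier}\t{value}" for identifier, value in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical identifiers and abundances, for illustration only.
    print(to_experiment_text(qualitative_rows(["contig_1", "citrate"])), end="")
    print(to_experiment_text(
        log2_fold_change_rows({"contig_1": 8.0}, {"contig_1": 2.0})), end="")
```

Using the constant value 1 reproduces the single-color display described for qualitative data, whereas the fold-change variant lets the color scale reflect up- or down-regulation.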

All the tools and software mentioned in this chapter allow for the creation and visualization of different metabolic pathways. This integration is qualitative, but there are other tools that allow for quantitative and qualitative multi-omics data integration. As examples, Omics Visualizer with Cytoscape [23] and pRocessomics (https://github.com/Valledor/pRocessomics) are bioinformatics platforms for visualizing molecular interaction networks, allowing for the integration of high-throughput data sets. These graphical representations are useful for the biological interpretation of metabolic pathways and for making metabolic sense of the multiple levels of omics data.

Fig. 4 Screenshot of the Search Pathway mapping tool. This tool searches the given objects (genes, transcripts, proteins, and metabolites) against KEGG pathway maps


Fig. 5 (a) Screenshot of the KEGG pathways mapper results. The results consist of a list of the transcripts/proteins and the metabolites assigned to each KEGG metabolic pathway. (b) Screenshot of the citrate cycle pathway. The detected transcripts and metabolites are indicated in red


Fig. 6 Workflow with screenshots of the steps that must be carried out to visualize the data on maps of biological processes. The data must be uploaded to the Experiment and Mapping folders


Fig. 7 Screenshots of different ways to visualize the citrate cycle pathway in MapMan. (a) Core metabolism overview. (b) Metabolites. (c) TCA representation. Each red square represents a metabolite or a transcript/protein. More details can be found in [22]


4 Notes

1. It is strongly advised to check the FASTA format file beforehand with the Mercator4 Fasta Validator tool (http://plabipd.de/portal/mercator-fasta-validator).

2. The EC numbers can be extracted with different software, such as Excel or R.

3. Often, the EC and C numbers are not found. This is because many enzymes and metabolites are not associated with any pathway. It may also be because the EC number is deprecated.

4. The installation instructions are available at https://mapman.gabipd.org/web/guest/mapman-download-instructions.

5. It is important that the transcript and metabolite identifiers are the same as those that appear in the mapping file (third column, IDENTIFIER (Fig. 3b)).

References

1. Rai A, Saito K, Yamazaki M (2017) Integrated omics analysis of specialized metabolism in medicinal plants. Plant J 90:764–787

2. Viant MR, Kurland IJ, Jones MR et al (2017) How close are we to complete annotation of metabolomes? Curr Opin Chem Biol 36:64–69

3. Ernst M, Silva DB, Silva RR et al (2014) Mass spectrometry in plant metabolomics strategies: from analytical platforms to data acquisition and processing. Nat Prod Rep 31:784

4. Allen DK, Libourel IGL, Shachar-Hill Y (2009) Metabolic flux analysis in plants: coping with complexity. Plant Cell Environ 32:1241–1257

5. Fiehn O (2002) Metabolomics - the link between genotypes and phenotypes. Plant Mol Biol 48:155–171

6. Meijon M, Feito I, Oravec M et al (2016) Exploring natural variation of Pinus pinaster Aiton using metabolomics: is it possible to identify the region of origin of a pine from its metabolites? Mol Ecol 25:959–976

7. Valledor L, Carbo M, Lamelas L et al (2018) When the tree let us see the forest: systems biology and natural variation studies in forest species. In: Progress in botany. Springer, Berlin, Heidelberg, pp 345–367

8. Correia B, Valledor L, Hancock RD et al (2016) Integrated proteomics and metabolomics to unlock global and clonal responses of Eucalyptus globulus recovery from water deficit. Metabolomics 12:141

9. Pascual J, Canal MJ, Escandon M et al (2017) Integrated physiological, proteomic and metabolomic analysis of UV stress responses and adaptation mechanisms in Pinus radiata. Mol Cell Proteomics 16:485–501

10. Qi Q, Li J, Cheng J (2014) Reconstruction of metabolic pathways by combining probabilistic graphical model-based and knowledge-based methods. BMC Proc 8:1–10

11. Lopez-Hidalgo C, Guerrero-Sanchez VM, Gomez-Galvez I et al (2018) A multi-omics analysis pipeline for the metabolic pathway reconstruction in the orphan species Quercus ilex. Front Plant Sci 9:1–16

12. Guerrero-Sanchez VM, Maldonado-Alconada AM, Amil-Ruiz F et al (2017) Holm oak (Quercus ilex) transcriptome. De novo sequencing and assembly analysis. Front Mol Biosci 4:70

13. Guerrero-Sanchez VM, Maldonado-Alconada AM, Amil-Ruiz F et al (2019) Ion torrent and Illumina, two complementary RNA-seq platforms for constructing the holm oak (Quercus ilex) transcriptome. PLoS One 7454228:1–18

14. Pluskal T, Castillo S, Villar-Briones A et al (2010) MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics 11:395

15. Gowda H, Ivanisevic J, Johnson CH et al (2014) Interactive XCMS online: simplifying advanced metabolomic data processing and subsequent statistical analyses. Anal Chem 86:6931–6939

16. Tsugawa H, Cajka T, Kind T et al (2015) MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat Methods 12:523–526

17. Guijas C, Montenegro-Burke JR, Domingo-Almenara X et al (2018) METLIN: a technology platform for identifying knowns and unknowns. Anal Chem 90(5):3156–3164

18. Nielsen J, Jewett M (2007) Metabolomics. A powerful tool in systems biology. Springer, Heidelberg

19. Kind T, Wohlgemuth G, Lee DY et al (2009) FiehnLib – mass spectral and retention index libraries for metabolomics based on quadrupole and time-of-flight gas chromatography/mass spectrometry. Anal Chem 81:10038–10048

20. Lohse M, Nagel A, Herter T et al (2014) Mercator: a fast and simple web server for genome scale functional annotation of plant sequence data. Plant Cell Environ 37:1250–1258

21. Thimm O, Bläsing O, Gibon Y et al (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37:914–939

22. Usadel B, Poree F, Nagel A et al (2009) A guide to using MapMan to visualize and compare Omics data in plants: a case study in the crop species, maize. Plant Cell Environ 32:1211–1229

23. Shannon P, Markiel A, Ozier O et al (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:6


Chapter 28

Detection of Plant Low-Abundance Proteins by Means of Combinatorial Peptide Ligand Library Methods

Egisto Boschetti and Pier Giorgio Righetti

Abstract

The detection and identification of low-abundance proteins from plant tissues is still a major challenge. Among the reasons are the low protein content, the presence of a few very high-abundance proteins, and the presence of massive amounts of other biochemical compounds. In the last decade numerous technologies have been devised to resolve the situation, in particular methods based on solid-phase combinatorial peptide ligand libraries. This methodology, allowing for an enhancement of low-abundance proteins, has been extensively applied with the advantage of deciphering the proteome composition of various plant organs. This general methodology is here described extensively along with a number of possible variations. Specific guidelines are suggested to cover peculiar situations or to comply with other associated analytical methods.

Key words Plant proteome, Low-abundance proteins, Combinatorial peptide ligand library

1 Introduction

In the last 10 years plant proteomics has experienced fast growth, especially thanks to the development or optimization of relevant techniques, allowing for an in-depth discovery of proteins present in various organs. In contrast to animal proteomics, there are specific difficulties that hamper proper discoveries [1–3]. One of the major drawbacks is that in some tissues (like leaves) a few proteins dominate the landscape and prevent proper discovery of low-abundance polypeptides. This is further aggravated by the presence of various plant constituents (polyphenols, polysaccharides) that strongly interfere with various sample manipulations, such as protein capture via various chromatographic means and analyses via different electrophoretic methodologies. In spite of the relative paucity of genomic data, extensive progress has been made. However, additional efforts could be useful to assist scientists in tackling the sequencing of more and more plant genomes (most of the papers published so far deal with the proteomes of Arabidopsis thaliana and rice, Oryza sativa, and focus on profiling organs, tissues, cells, or subcellular proteomes). The present chapter deals with an emerging and powerful methodology for the detection of low-abundance proteins already extensively adopted in animal proteomics [4–6], namely, the combinatorial peptide ligand library (CPLL) technique.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_28, © Springer Science+Business Media, LLC, part of Springer Nature 2020

2 Plant Proteins: a Minor Component of Plant Extracts with Specific Properties

The protein content in plant cells is about 20 times lower than in animal cells since the major part of the biomass is constituted of thick polysaccharidic cell walls. Proteins present in various organs are of very different molecular mass, up to very large constructs comprising a number of hydroxyproline molecules [7]. Due to their intricate combination with cell wall polymers (e.g., cellulose, lignins, hemicelluloses, and pectins), the solubilization of plant proteins can be challenging, requiring for instance the presence of calcium ions in aqueous buffers as well as other salts or chaotropes [8] or even complex deglycosylation processes [9]. Often, stringent extraction methods are necessary to dissociate proteins tightly bound to cell walls, entailing the use of complex procedures such as sequential washings with a 0.2 M CaCl2 buffered solution followed by boiling extraction with 62 mM Tris–HCl buffer, pH 6.8, 2% SDS, 10% v/v glycerol, and 100 mM dithiothreitol [10]. In addition to all these differences from animal proteins, plant proteins are relatively resistant to commonly used proteases, impairing their analysis with classical methods based on polypeptide breakdown and peptide sequencing. Many low-concentration enzymes are present (proteases, lipases, nucleases, oxidases, and hydrolases involved in signaling), not only suggesting a great level of dynamic roles but also necessitating specific preliminary treatments like delipidation, nucleic acid hydrolysis, and sugar extraction.

Although proteins are scarce in plant tissues, some of them are much more abundant than others, thus creating an extremely large dynamic concentration range. This situation is organ-dependent. A typical case is RuBisCO in leaf extracts, which does not allow for detecting the presence of many other low-abundance leaf species [11]. In seeds, a massive protein presence is related to storage proteins [12]. Among high-abundance proteins, prolamins and gliadins in wheat [13], vicilin in maize embryo [14], and beta-conglycinin and glycinin (dominant proteins in soybean seeds) can be mentioned [15].

The above-described situations necessitate a drastic reduction of the protein dynamic concentration range to access low-abundance proteins (LAPs) otherwise masked in mass spectrometry by strong signals generated by high-abundance proteins (HAPs). In electrophoretic separation techniques the large surface area occupied by concentrated proteins (spots in 2-DE and thick bands in SDS-PAGE) overlaps with LAPs and prevents their detection. However, prior to resolving the question of the reduction of the dynamic concentration range, pretreatments of plant extracts are mandatory most of the time.

3 Pretreatments of Plant Extracts to Eliminate Interfering Material

Since the initial plant crude extracts comprise many interfering substances incompatible with the use of CPLLs, special treatments are necessary to prepare a “clean” protein solution [16]. Improved protein extraction from recalcitrant tissues, extraction of nonproteinaceous material, and protein precipitation are the most popular strategies.

For plant tissue extraction several points have to be considered. Plant cells are rich in proteases, which requires the presence of inactivating agents, and rich in polysaccharides (many of them polyanionic) that interact directly with CPLLs and thus interfere with protein capture. The presence of various pigments, lipids, polyphenols, and secondary metabolites completes the list of products that may be problematic for protein separation and analyses [17].

The following general rules should be adopted: (1) the aqueous extraction should be performed at relatively low ionic strength to prevent the solubilization of nucleic acids; (2) with highly viscous material, such as latex and honey, a dilution is recommended; (3) when dealing with proteins that are engaged within the cell wall, such as pollen proteins, some amounts of nonionic detergent (less than 0.5–1%) and urea (less than 3 M) should be used at a concentration compatible with CPLLs. Examples are given in the literature with detailed technical information [18]. A preliminary lipid removal step is particularly recommended with plant seeds such as soya beans, peanuts, corn, sunflower, and many others. Simple extractions with nonpolar organic solvents are possible, with some risk of protein denaturation. Other methods involving a sequence of operations are also described [19]. Elimination of pigments and polyphenols can be obtained with a phenol treatment associated with some amounts of polyvinylpyrrolidone.

Precipitation is an essential step in the preparation of the protein solution. Trichloroacetic acid (TCA), alone or associated with acetone and reducing agents, allows for precipitating most proteins, leaving plenty of undesired materials in the supernatant [20]. Pellets containing proteins are then separated by centrifugation and redissolved in the selected buffer.

Protein precipitation can also be performed with ammonium sulfate or with polyethylene glycol to collect precipitates that are free or almost free of CPLL-interfering substances. Ammonium sulfate precipitation of all proteins present in a plant extract is performed at 80–90% saturation. Naturally, at the end of this operation the salt has to be removed. This task is accomplished by simple dialysis at a very low molecular cutoff (e.g., 3500 Da), by centrifugation using appropriate filtration-integrated devices, or even by desalting chromatography.

Another possible approach is acidic precipitation, which addresses some categories of proteins; it can be performed by acidifying the solution with acetic acid to pH 3–4.

To complete the picture, a less popular method is protein precipitation by a chloroform–methanol mixture with water in 1:4:3 proportions [21].

Several variants of the abovementioned methods are also described to precipitate proteins [22]; however, the subsequent protein solubilization protocol is not always easy and frequently necessitates the presence of zwitterionic surfactants and chaotropes.

Combining the elimination of undesirable material with protein precipitation can sometimes be a good option. For instance, pigments are eliminated by using a Tris-HCl solution saturated with phenol followed by a protein precipitation with ammonium sulfate in the presence of methanol [23]. This possibility is especially recommended when the analysis of proteins is based on two-dimensional electrophoresis (2-DE), but it depends on the plant organ from which the protein extract is obtained.

All the above sample pretreatments contribute to obtaining significantly better analytical results, especially when using two-dimensional electrophoresis and related methods. However, not all plant-derived biological material needs to be submitted to a preliminary treatment; wine is one example [24].

All the above-described preliminary operations not only remove undesired materials but also contribute to concentrating many proteins that are present in very low amounts.

In spite of the many available cleanup protocols specific to plant extracts, some of them are not compatible with CPLLs. Figure 1 illustrates possible options for four typical plant extracts used in conjunction with CPLLs.

4 The Reduction of Protein Dynamic Range with Low-Abundance Protein Enhancement

In spite of the presence of many proteins, plant proteomic analysis suffers from the very low expression level of many genes and the presence of individual proteins with particularly high concentration compared to all others. This situation results in proteomes where the concentration difference between individual components spans several orders of magnitude. Many proteins are present only in a few copies and are consequently very difficult to detect. In this context depletion and enrichment procedures have been devised to improve the situation, as in animal proteomes. Precipitation [25, 26], fractionation [27], depletion [28], and enrichment [29, 30] are the major approaches.

Affinity-based selective separation methods (e.g., for the analysis of phosphoproteomes) are another way to enrich for protein categories. As an example for the analysis of the phosphoproteome, a labeling of phosphate groups on serine and threonine residues by a biotin tag followed by a separation using avidin affinity chromatography has been described [31]. Unfortunately the abovementioned approaches are labor-intensive and carry an intrinsic risk of losing very low-abundance proteins due to the multiple manipulations. Moreover, they do not concentrate all the polypeptides present in trace amounts together. The method described in this chapter drastically reduces the dynamic range of protein concentration, allowing for a large analytical scan of the entire proteome. It has proved its efficiency with a number of plant extracts such as maize seeds [32], spinach [33] and Arabidopsis thaliana leaves [34], rubber plant latex [35], and fruits [36].

In the last decade a method has been devised and progressively optimized to compress the dynamic concentration range in order to decrease the concentration of high-abundance proteins, thus reducing the signal coverage and at the same time concentrating the rare species. This process is operated by the so-called combinatorial peptide ligand libraries or CPLLs, which have been extensively described for a number of biological extracts including plants [37–39].

Fig. 1 Summary of plant extract pretreatment possibilities as a function of their source


CPLLs are a mixture of small beads (ca. 65 μm diameter) to which hexapeptides are covalently linked, commercialized under the trade name ProteoMiner. The number of different peptides reaches several million depending on the number of amino acids used for the synthesis; however, each bead carries a single type of peptide in a large number of copies. This is thus a mixed bed of beads that differ from each other and are individually capable of capturing a protein or a group of proteins. When a plant protein extract is exposed to such a solid phase under large overloading conditions, each bead with affinity for an abundant protein will rapidly become saturated and the vast majority of that protein will remain unbound. In contrast, trace proteins will not saturate the corresponding partner beads unless the sample volume is large enough to provide increasing amounts of proteins. Once the excess of unbound proteins is eliminated by filtration or centrifugation, all captured proteins can be harvested by elution at a much lower dynamic concentration range than in the original biological sample. Proteins present at trace levels thus become detectable by current analytical methods.

To succeed, two essential technical conditions have to be met: (1) each single bead must contain copies of one unique hexapeptide ligand (one-bead-one-peptide) [40] and (2) an oversaturated loading condition [41] is required. In agreement with these statements, Huhn et al. [42] and Rivers et al. [43] showed that the loading volume is critical for the reduction of the dynamic range, the number of protein identifications increasing with the sample volume.
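The saturation principle and the effect of the loading volume can be illustrated with a deliberately simplified numerical sketch. It is a toy model, not a description of real binding behavior: it assumes that every protein has its own bead population with the same fixed capacity, that capture is simply the smaller of the loaded amount and that capacity, and it ignores affinity differences, competition, pH, and ionic strength. All names, concentrations, and the capacity value are hypothetical.

```python
def capture(loaded_amounts, bead_capacity):
    """Toy capture rule: each protein binds up to the capacity of its beads."""
    return {name: min(amount, bead_capacity)
            for name, amount in loaded_amounts.items()}


def load(concentrations, volume):
    """Amount offered to the beads = concentration x sample volume."""
    return {name: conc * volume for name, conc in concentrations.items()}


def dynamic_range(amounts):
    """Ratio between the most and the least abundant detected protein."""
    positive = [amount for amount in amounts.values() if amount > 0]
    return max(positive) / min(positive)


if __name__ == "__main__":
    # Hypothetical extract, arbitrary units per mL.
    extract = {"abundant": 1_000_000.0, "medium": 1_000.0, "trace": 1.0}
    capacity = 100.0  # hypothetical capacity of each bead population

    print("initial range:", dynamic_range(extract))
    for volume in (1.0, 10.0, 100.0):
        eluate = capture(load(extract, volume), capacity)
        print(f"volume {volume:g} mL -> range {dynamic_range(eluate):g}")
```

In this sketch the beads for the abundant protein are saturated at every volume, while the trace protein keeps accumulating as the loaded volume grows, so the ratio between them shrinks. This mirrors the qualitative observation that larger sample volumes favor the detection of rare species.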

The entire protein adsorption mechanism is regulated by several physicochemical parameters such as pH [44], buffer ionic strength, presence of competitors, temperature, and protein concentration.

To date a large number of applications are available. Considering only plant proteomics, this technique has been applied to the elucidation of proteome compositions from various organs [36], the detection of allergens [6], and the study of differential protein expression under stress conditions [34].

5 Materials and Methods

Ammonium bicarbonate, ammonium sulfate, CHAPS (3-[(3-cholamidopropyl)-dimethylammonio]-1-propanesulfonate), chloroform, citric acid, dithiothreitol, ethylene glycol, formic acid, guanidine, glycine, iodoacetamide, methanol, potassium phosphate monobasic, sodium chloride, sodium dodecyl sulfate, sodium phosphate dibasic dihydrate, thiourea, tris(hydroxymethyl) aminomethane, and urea are chemicals and biochemicals of high purity grade from Sigma-Aldrich, Saint Louis, MO. Protease inhibitor cocktail is from Roche Diagnostics, Basel, CH, or from Sigma-Aldrich, Saint Louis, MO. Rapigest SF is from Waters Corp., Milford, MA.

ProteoMiner (a CPLL), a solid-phase combinatorial peptide ligand library as a mixed bed, is available from Bio-Rad Laboratories, Hercules, CA, USA (see Note 1).

Vortex and benchtop centrifuge are from Thermo Fisher Scientific. Centricon centrifugal filters with a cutoff of 3000 or 10,000 Da are from Millipore Corp., Milford, MA.

6 Protein Capture with Concomitant Dynamic Range Reduction

The capture of proteins by CPLLs is operated according to specific situations. For instance, when attempting to analyze the low-abundance proteome, the recommended general procedure is to use physiological buffer conditions of pH and ionic strength at room temperature. By modulating these generic capture conditions it is possible to target the reduction of the dynamic concentration range on various categories of proteins (see Fig. 2 and detailed descriptions below). A number of other preliminary considerations have to be taken into account in order to reach optimized results. To this end refer to Notes 1–7.

Another point of interest is the nature of the plant sample. To prevent possible negative interferences with CPLLs the protein extract must be clear, without products in suspension. It also has to be DNA-free and should not contain lipids, pigments, polysaccharides, or other chemicals that prevent a proper capture of proteins. For complete information see Subheading 3.

6.1 General Capture Method under Physiological Conditions

The concentration of neutral salt needed to reach physiological conditions is 150 mM. Most frequently the buffer used is a 25 mM phosphate buffer containing 0.15 M sodium chloride, pH 7.2 (PBS). Such buffers mimic the conditions that prevail within a cell; by definition these conditions fully preserve the biological functions of proteins.
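As a small worked example, the mass of sodium chloride needed for a given volume of this buffer follows from mass = molarity × molar mass × volume. The sketch below only covers the NaCl component (molar mass 58.44 g/mol); the proportions of the monobasic and dibasic phosphate salts needed to reach pH 7.2 are not given in the text and are therefore not computed. The function name is a hypothetical illustration.

```python
NACL_MOLAR_MASS = 58.44  # g/mol


def grams_for_molarity(molarity_mol_per_l, molar_mass_g_per_mol, volume_l):
    """Mass of solute (g) needed to reach a molarity in a given volume."""
    return molarity_mol_per_l * molar_mass_g_per_mol * volume_l


if __name__ == "__main__":
    # 0.15 M NaCl, as stated for the PBS above, for 1 L and for 250 mL.
    for volume_l in (1.0, 0.25):
        grams = grams_for_molarity(0.15, NACL_MOLAR_MASS, volume_l)
        print(f"{volume_l} L: {grams:.2f} g NaCl")
```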

(a) Equilibrate the clear plant protein sample with the selected physiological buffer (e.g., PBS). This operation can be performed in different ways. For instance, if the protein sample is a lyophilized material, dissolve the powder in the buffer and clarify by centrifugation. Otherwise dialysis, diafiltration, desalting chromatography, or even desalting by centrifugation using dedicated membrane devices can be adopted. For information on protein amounts and concentrations, see Notes 3, 8–11. If the protein sample contains proteases (this is frequently the case), add a tablet of a protease inhibitor cocktail.

(b) Equilibrate the CPLLs using a physiological buffer; this is the same buffer used to dissolve the protein sample. Then drain out the excess liquid by low-speed centrifugation at about 1250 × g at 20 °C for a few min.

(c) Put the protein sample in contact with the CPLL beads; stir gently to keep the beads suspended within the liquid. The contact time should be at least 2–3 h, or overnight. Room temperature incubation is the most common option; however, the protein capture can be performed at a different temperature, for instance at 4 °C (see Note 12).

(d) Eliminate the excess supernatant by centrifugation. Wash the CPLL beads with the equilibration buffer to remove the excess proteins and store at 4 °C while waiting for the protein elution (see Subheading 7).
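The loading guidance that step (a) points to (Notes 8, 9, and 11) can be turned into a quick bench calculation. The Python sketch below is only illustrative: the function name `plan_loading` and its parameter names are assumptions made for this example, while the thresholds it encodes (at least 50 mg of protein per 100 μL of beads, 1–10 mg/mL, smallest usable bead volume of about 10 μL) are the ones quoted in the Notes.

```python
# Illustrative helper; the function and constant names are hypothetical.
# The thresholds are the ones quoted in Notes 8, 9, and 11.
MIN_PROTEIN_MG_PER_UL_BEADS = 50.0 / 100.0  # Note 9: >= 50 mg per 100 uL of beads
CONC_RANGE_MG_PER_ML = (1.0, 10.0)          # Note 8: 1-10 mg/mL
MIN_BEAD_VOLUME_UL = 10.0                   # Note 11: smallest practical bead volume


def plan_loading(bead_volume_ul, protein_conc_mg_per_ml):
    """Return the minimum protein load and the matching sample volume."""
    if bead_volume_ul < MIN_BEAD_VOLUME_UL:
        raise ValueError("bead volume is below the ~10 uL limit given in Note 11")
    low, high = CONC_RANGE_MG_PER_ML
    min_protein_mg = bead_volume_ul * MIN_PROTEIN_MG_PER_UL_BEADS
    return {
        "min_protein_mg": min_protein_mg,
        "min_sample_volume_ml": min_protein_mg / protein_conc_mg_per_ml,
        "conc_in_recommended_range": low <= protein_conc_mg_per_ml <= high,
    }


plan = plan_loading(bead_volume_ul=100, protein_conc_mg_per_ml=5)
print(plan)
```

For 100 μL of beads and a 5 mg/mL extract, the sketch reports a minimum load of 50 mg, which corresponds to 10 mL of sample.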

6.2 Protein Capture in Low-Ionic Strength

The reduction of ionic strength of the capture buffer promotes or intensifies electrostatic interactions. Weakly charged proteins can thus be more easily attracted by the electrical charge of the beads. The result of the decrease of ionic strength is an increase of binding capacity, as is the case in ion exchange chromatography. Under these conditions the amount of proteins offered to the beads should at least be doubled to reach optimized conditions for the reduction of the dynamic protein concentration range.

The technical methodology remains exactly the same as for protein capture under physiological conditions (see Subheading 6.1).

[Figure 2 schematic. Labels: initial clear aqueous protein extract; CPLLs used with low ionic strength, PBS, acidic pH, alkaline pH, or lyotropic salts; large collection, stringent collection, acidic proteins, alkaline proteins, hydrophobic proteins; low-abundance proteins]

Fig. 2 Protein capture phase of the dynamic range reduction process by means of CPLLs. The operation is most generally performed under physiological conditions; however, the reduction of ionic strength allows for capturing more proteins, especially those that have weak affinity for the CPLLs. Specific conditions can be used to enhance the capture of acidic, alkaline, or hydrophobic low-abundance proteins

6.3 The Capture of Dominantly Acidic Proteins

Proteins carry electrical charges of both signs depending on the pH. At low pH the positive charge is exacerbated and proteins that are negatively charged at neutral pH will reverse their electrical sign. In this case proteins that were captured by CPLLs because of an attractive electrical sign could be repelled by them. This is why perfect control of pH is mandatory for good reproducibility. Since proteins are also captured by the beads through other interactions, the variation of pH may weaken the interaction intensity with certain protein species to the point that no capture occurs.

The range of pH values where the CPLLs can be operated is between 3 and 10 [44]. Beyond these limits virtually all proteins are charged, respectively, either positively or negatively.

As a general rule, when the operation is performed in acidic conditions the capture of anionic proteins is enhanced.

(a) Adjust the protein extract to the desired acidic pH (most generally pH 4) by adding dropwise either acetic acid or citric acid until the pH stabilizes. This operation can also be performed by buffer exchange (dialysis, diafiltration, or gel filtration). Remove possible materials in suspension.

(b) Equilibrate the CPLL beads using an acidic buffer of the same pH selected for the protein extract. Then drain out the excess liquid by low-speed centrifugation (about 1250 × g at 20 °C for a few min).

(c) Mix the protein sample and the CPLL beads; stir gently to maintain the beads in suspension for at least 2–3 h or overnight at constant room temperature. A majority of acidic proteins are captured.

(d) The extent of protein capture will also depend on the ionic strength of the buffer as described in Subheading 4.

(e) Eliminate the excess supernatant by centrifugation.

(f) Wash the CPLL beads 2–3 times with the equilibration buffer to remove the excess proteins and proceed to the elution of captured proteins by one of the methods described in Subheading 7.
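The centrifugation steps in these protocols are quoted as relative centrifugal force (× g), whereas many benchtop instruments are set in rpm. A minimal conversion sketch follows, using the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² with r in cm. The rotor radius of 8 cm is an assumption for illustration only and must be replaced by the value for the rotor actually used; the function names are hypothetical.

```python
import math

# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_to_rpm(rcf, rotor_radius_cm):
    """Convert relative centrifugal force (x g) to rotor speed in rpm."""
    return math.sqrt(rcf / (RCF_CONSTANT * rotor_radius_cm))


def rpm_to_rcf(rpm, rotor_radius_cm):
    """Convert rotor speed in rpm back to relative centrifugal force (x g)."""
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


ROTOR_RADIUS_CM = 8.0  # assumption: check the rotor specification sheet

# Forces quoted in the capture and elution steps of this chapter
for rcf in (1250, 2000, 10000):
    print(f"{rcf} x g -> about {rcf_to_rpm(rcf, ROTOR_RADIUS_CM):.0f} rpm")
```

Because the speed scales with the square root of the force, a modest error in the assumed radius changes the rpm setting only slightly, but it is still worth checking for the high-speed clarification spins.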

6.4 The Capture of Dominantly Cationic Proteins

As stated above (Subheading 6.3), when the buffer pH is varied, proteins acquire a different net electrical charge. In alkaline conditions the dominant charge is negative for proteins having an isoelectric point below the environmental pH.


As a general rule, when the operation is performed in alkaline conditions the capture of cationic proteins is enhanced.

(a) Adjust the protein extract to the desired alkaline pH (most generally pH 9) by adding dropwise a solution of ammonium hydroxide or of Tris base until the pH stabilizes. This operation can also be performed by buffer exchange (dialysis, diafiltration, or gel filtration). Remove possible material in suspension.

(b) Equilibrate the CPLL beads using an alkaline buffer of the same pH selected for the protein extract. Then drain out the excess liquid by centrifugation at about 1250 × g at 20 °C for a few min.

(c) Mix the protein sample and the CPLL beads; stir gently to maintain the beads in suspension for at least 2–3 h or overnight at constant room temperature. A majority of cationic proteins are captured while the majority of anionic proteins may stay in solution. The extent of capture will also depend on the ionic strength of the buffer as described in Subheading 4.

(d) Eliminate the excess supernatant by centrifugation.

(e) Wash the CPLL beads 2–3 times with the equilibration buffer to remove the excess proteins and proceed to the elution of captured proteins by one of the methods described in Subheading 7.
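The reasoning of Subheadings 6.3 and 6.4 rests on the sign of the net protein charge relative to the buffer pH: a protein is net positive below its isoelectric point and net negative above it. The short Python sketch below only illustrates that rule. The protein names and pI values are hypothetical placeholders, not data from this chapter, and the sketch says nothing about the other interactions that also contribute to capture.

```python
# Operating pH window quoted in Subheading 6.3 for CPLLs
CPLL_PH_RANGE = (3.0, 10.0)


def net_charge_sign(isoelectric_point, buffer_ph):
    """Return the sign of the net charge of a protein at a given buffer pH."""
    low, high = CPLL_PH_RANGE
    if not low <= buffer_ph <= high:
        raise ValueError("pH outside the 3-10 operating range quoted for CPLLs")
    if buffer_ph < isoelectric_point:
        return "positive"
    if buffer_ph > isoelectric_point:
        return "negative"
    return "neutral"


# Hypothetical proteins with made-up isoelectric points, for illustration only
example_pis = {"protein A": 4.5, "protein B": 6.8, "protein C": 9.5}

# pH 4 (acidic capture), pH 7.2 (PBS), pH 9 (alkaline capture)
for ph in (4.0, 7.2, 9.0):
    signs = {name: net_charge_sign(pi, ph) for name, pi in example_pis.items()}
    print(f"pH {ph}: {signs}")
```

Tabulating the expected charge sign of known abundant proteins at the planned capture pH can help anticipate which species may change behavior when moving away from physiological conditions.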

6.5 Focus on Hydrophobic Protein Capture

Among the natural amino acids composing proteins, the most hydrophobic are isoleucine, leucine, valine, and phenylalanine. They contribute to confer a certain degree of hydrophobicity on the entire polypeptide construct. A typical method of separating this category of proteins is hydrophobic chromatography [45]. This method is based on the use of structuring salts selected from the Hofmeister series. The most common process is to equilibrate the columns by using a buffer comprising at least 1 M ammonium sulfate. Under these conditions the most hydrophobic proteins are adsorbed by the CPLL beads and are thus subtracted from the protein solution. Electrostatic interactions are minimized because of the presence of strong salt ions. Within the present context of an entire proteome, the capture of hydrophobic proteins by CPLLs can easily be enhanced. The technical details are as follows:

(a) To the protein extract add the desired amount of lyotropic salt (generally 1 M ammonium sulfate final concentration). In case of difficulties with protein precipitation, refer to Note 13. The protein sample can alternatively be equilibrated with a buffer comprising the lyotropic salt by buffer exchange (dialysis, diafiltration, or gel filtration). Cloudy material may appear in the supernatant and should be removed by centrifugation at 10,000 × g for 10 min.

(b) Equilibrate the CPLL beads using a buffer containing the same amount of ammonium sulfate adopted for sample conditioning. Then drain out the excess liquid by low-speed centrifugation (about 1250 × g at 20 °C for a few min).

(c) Mix the protein sample and the CPLL beads; stir gently, just enough to maintain the beads in suspension, for at least 2–3 h or overnight at constant room temperature. A majority of hydrophobic proteins are captured.

(d) Eliminate the excess supernatant by centrifugation. Wash the CPLL beads 2–3 times with the equilibration buffer containing the lyotropic salt to remove the excess biological material and proceed to the elution of captured proteins by one of the methods described in Subheading 7.
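When the lyotropic salt of step (a) is added from a concentrated stock rather than as a solid, the volume to add follows from a simple mass balance: v = V × C_target / (C_stock − C_target). The sketch below assumes a hypothetical 3 M ammonium sulfate stock, additive volumes, and a sample that contains no ammonium sulfate to begin with; none of these values come from the chapter, and the function name is illustrative.

```python
def stock_volume_to_add(sample_volume_ml, stock_molar, target_molar):
    """Volume of stock (mL) that brings a salt-free sample to the target molarity.

    Assumes additive volumes: target = stock * v / (sample + v).
    """
    if target_molar >= stock_molar:
        raise ValueError("the stock must be more concentrated than the target")
    return sample_volume_ml * target_molar / (stock_molar - target_molar)


STOCK_MOLAR = 3.0   # hypothetical ammonium sulfate stock
sample_ml = 1.0     # hypothetical sample volume

added_ml = stock_volume_to_add(sample_ml, STOCK_MOLAR, target_molar=1.0)
final_molar = STOCK_MOLAR * added_ml / (sample_ml + added_ml)

print(f"add {added_ml:.2f} mL of stock to {sample_ml:.2f} mL of sample")
print(f"final ammonium sulfate concentration: {final_molar:.2f} M")
```

Note that this route dilutes the proteins (here by a factor of 1.5), which should be kept in mind against the concentration guidance of Note 8; adding the stock slowly may also help limit the local precipitation discussed in Note 13.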

7 Recovery Protocols of Plant Protein from CPLLs

Interaction forces between proteins and CPLL beads are of different natures. The dominant forces are electrostatic interactions, hydrophobic associations, hydrogen bonding, and van der Waals interactions. They can act singly or collectively according to the sequence of the hexapeptides.

Electrostatic interactions are probably the most representative forces; they depend on the environmental pH and can be challenged by the presence of salt ions. These forces are attractive or repulsive depending on the sign of the electrical charge. Changes in temperature influence the intensity of electrostatic interactions: for instance, a decrease in temperature increases the interaction.

Hydrophobic associations are only attractive. They are attributed to the presence of hydrophobic amino acids throughout the protein sequence. This interaction results from an association of cooperative molecules capable of repelling water. As a result, water molecules around these associations are particularly structured, thus contributing to strengthen the association with a global reduction of entropy. Hydrophobic associations are modulated by the environmental temperature (an increase of temperature up to a certain level reinforces the molecular association) and by the presence of structuring salts.

The dissociation of hydrophobic interactions (this is the purpose of this section) is produced by competing molecules such as heavy alcohols, glycols, and detergents, and by water-destructuring molecules (chaotropic agents such as urea and guanidine). A simple reduction of ionic strength may also contribute to decreasing the strength of weak hydrophobic associations.

Hydrogen bonding is largely present in polypeptide structures. It takes its origin from two electronegative atoms that share the same hydrogen atom. For instance, protonated glutamic and aspartic acid side chains act as donor groups and contribute to the creation of hydrogen bonds. These interactions occur when the distance between molecular species is short: the shorter the distance, the stronger the hydrogen bond.

Hydrogen bonding is quite sensitive to pH changes, competitors, and water-destructuring agents (guanidine and urea). In certain cases, analog molecules (arginine and citrulline) could act as competitors of hydrogen bonding.

Within the context of protein interactions with CPLLs, grafted hexapeptides may comprise chemical groups capable of interacting in a mixed mode. In this case, concomitant electrostatic interactions, hydrophobic associations, and hydrogen bonding may be present. This situation is to be considered when designing a proper elution protocol for protein harvesting. Global protein elution from CPLLs is the most common option; however, alternative fractionated elution may facilitate the analytical process by delivering more detailed information in terms of proteome composition. A summary of various options is given in Fig. 3.

7.1 Global Protein Harvesting

Global protein elution from CPLLs is the most frequent option in protein harvesting for proteomics analysis. To this end all the interaction forces involved have to be challenged. It has frequently been observed that after elution some proteins are still present on the beads. They are polypeptides retained with high association constants, among them low-abundance proteins. If they are not eluted they escape the proteomics analysis, with a significant reduction in the efficiency of the CPLL treatment. It is within this context that several global elution methods can be devised.

7.1.1 Global Protein Elution with SDS-Containing Buffers

This is one of the most efficient elution methods. It involves sodium dodecyl sulfate (SDS) as repeatedly described [46]. SDS is known in electrophoresis to confer a similar global charge on proteins by sticking to them via hydrophobic associations, thus exposing strong sulfate groups. With this profound restructuring, proteins desorb from the solid CPLL phase. This operation is performed in the presence of dithiothreitol, which prevents the formation of disulfide bonds while enhancing the solubility of proteins. In addition, the high temperature of the treatment (boiling water bath) accelerates the elution procedure to just a few minutes.

(a) Prepare an aqueous solution of 3% SDS (this concentration could be as high as 10%) and add dithiothreitol (DTT) up to a final concentration of 25 mM.

(b) To 100 μL of CPLL beads loaded with proteins add 200 μL of SDS-DTT solution. Mix gently while preventing the formation of foam and then put in a boiling bath for 10 min.

(c) Cool down the bead suspension and separate the supernatant by low-speed centrifugation (e.g., 2000 × g for 10 min).

(d) Make another protein extraction with an additional 200 μL of SDS-DTT solution under the same conditions and separate the supernatant. Pool the latter with the first eluate and store in the cold while waiting for proteomic analyses. For compatibility with the following analytical determinations see Subheading 8. In a number of cases SDS present in the eluate must be eliminated; Note 14 gives detailed instructions for protein precipitation. To check that all proteins are desorbed from CPLL beads, a recommendation is given in Note 15.
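The reagent quantities of steps (a), (b), and (d) can be worked out as follows. This is a minimal sketch under stated assumptions: the SDS percentage is read as weight per volume, the molar mass of DTT is taken as 154.25 g/mol, and the eluent volume is scaled linearly with bead volume from the 200 μL per 100 μL of beads given in the protocol. The function names are hypothetical.

```python
DTT_MW_G_PER_MOL = 154.25  # molar mass of dithiothreitol


def sds_dtt_recipe(volume_ml, sds_percent_wv=3.0, dtt_mm=25.0):
    """Return (grams of SDS, milligrams of DTT) for a given solution volume.

    Assumes the SDS percentage is weight/volume (g per 100 mL).
    """
    sds_g = sds_percent_wv / 100.0 * volume_ml
    dtt_mol = dtt_mm / 1000.0 * volume_ml / 1000.0
    dtt_mg = dtt_mol * DTT_MW_G_PER_MOL * 1000.0
    return sds_g, dtt_mg


def eluent_needed_ul(bead_volume_ul, extractions=2, ul_per_100ul_beads=200.0):
    """Total eluent for the two successive extractions of steps (b) and (d).

    Linear scaling with bead volume is an assumption of this sketch.
    """
    return extractions * ul_per_100ul_beads * bead_volume_ul / 100.0


sds_g, dtt_mg = sds_dtt_recipe(volume_ml=10.0)
print(f"10 mL of eluent: {sds_g:.2f} g SDS, {dtt_mg:.1f} mg DTT")
print(f"eluent for 100 uL of beads: {eluent_needed_ul(100.0):.0f} uL")
```

For 10 mL of 3% SDS with 25 mM DTT this gives 0.30 g of SDS and about 38.6 mg of DTT; the two extractions of 100 μL of beads consume 400 μL in total.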

7.1.2 Global Protein Elution with Guanidine Hydrochloride Solutions

Another efficient agent capable of desorbing proteins from complex affinity columns is guanidine hydrochloride. Such a solution is used at a quite high concentration. It easily competes against electrostatic interactions. Guanidine is a strong chaotropic agent able to weaken hydrogen bonds and hydrophobic associations. The final result is the total desorption of proteins that are captured by CPLLs. Naturally, after exposure to guanidine hydrochloride, desorbed proteins are destructured and hence denatured.

(a) Prepare an aqueous solution of 6 M guanidine and adjust the pH to 6 by addition of 3–6 M hydrochloric acid.

(b) To 100 μL of CPLL beads loaded with proteins add 200 μL of the guanidine elution solution and mix gently for 10 min.

(c) Separate the proteins that are in solution in the supernatant by low-speed centrifugation (e.g., 2000 × g for 10 min). Then repeat the operation with the recovered CPLL pellet to be sure that all proteins located within the bead pores are extracted. The second supernatant, also recovered by centrifugation, is pooled with the first one.

(d) The pooled eluate is not directly analyzable by current methods because of the presence of high concentrations of guanidine. The protein solution must thus be dialyzed against any appropriate buffer and, if necessary, concentrated and precipitated.

(e) To check that all proteins are desorbed from CPLL beads, a recommendation is given in Note 15.

Fig. 3 Protein elution phase from CPLLs. A variety of protein desorption methods can be combined either as global protein elution or as fractionated elution. The latter can be composed of two, three, or more desorption steps. When more than one desorption is involved, proteins are collected as a function of elution stringency or by challenging individually the elemental molecular interactions
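Step (d) requires the guanidine to be dialyzed away before analysis. As a rough planning aid, the residual concentration after repeated buffer changes can be estimated with an idealized equilibrium model in which each change dilutes the small solute by V_sample / (V_sample + V_buffer). The sketch below is only that: it assumes complete equilibration at every change and no volume change of the sample, and the 400 mL buffer volume is a hypothetical figure, not a recommendation from the chapter.

```python
def residual_concentration(initial_molar, sample_volume_ml, buffer_volume_ml, changes):
    """Idealized residual solute concentration after a number of buffer changes.

    Assumes full equilibration at each change and a constant sample volume.
    """
    dilution_per_change = sample_volume_ml / (sample_volume_ml + buffer_volume_ml)
    return initial_molar * dilution_per_change ** changes


# Pooled eluate from two 200 uL extractions (0.4 mL) of 6 M guanidine,
# dialyzed against a hypothetical 400 mL of buffer per change.
for n_changes in (1, 2, 3):
    residual = residual_concentration(6.0, 0.4, 400.0, n_changes)
    print(f"after {n_changes} change(s): {residual:.2e} M guanidine")
```

In practice equilibration is never complete, so the figures are lower bounds on the residual guanidine; the model is mainly useful for comparing one large buffer volume against several smaller changes.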

7.2 Fractionated Elution Approaches

Several sequenced elution methods have been reported. They are also detailed in a dedicated book where variations are described [47]. The principle is to start with a relatively mild elution step followed by other desorption steps, each performed with chemical agents or displacers of increased stringency. The reason behind this approach is, first, to be sure that all proteins are desorbed and, second, that each fraction is populated by a lower number of species compared to a global protein elution, thus facilitating the subsequent analytical procedures.

7.2.1 Two-Step Elution with Increased Stringency

(a) Prepare two different desorbing aqueous solutions. The first is composed of 4 M urea, 1% CHAPS, and 5% acetic acid; the second is 6 M guanidine-HCl, pH 6.0.

(b) To 100 μL of CPLL beads loaded with proteins add 200 μL of the first elution solution. Mix gently for about 10 min.

(c) Separate proteins that are in the supernatant from the beads by low-speed centrifugation (e.g., 2000 × g for 10 min). Treat the CPLL beads a second time under exactly the same conditions and pool the two supernatants. Store this first eluate in the cold.

(d) Then mix the CPLL bead pellet with 200 μL of guanidine-HCl solution and gently shake for 10 min. Separate the supernatant by centrifugation and repeat the operation. Separate the second supernatant by low-speed centrifugation and pool with the first one. Store this second eluate in the cold.

(e) The two eluates are ready for protein analysis by chromatography, mass spectrometry, or electrokinetic methodologies.

7.2.2 Three-Step Elution with Increased Stringency (Option 1)

(a) Prepare three different desorbing aqueous solutions. The first is composed of 2 M thiourea, 7 M urea, and 2% CHAPS (here named TUC). The second solution is composed of 9 M urea acidified to pH 3 by acetic acid or citric acid (here named UCA). The third solution is a mixture of acetonitrile, isopropanol, 20% ammonia, and water (6, 12, 10, and 72%, respectively) (here named AIAW).

(b) To 100 μL of CPLL beads loaded with proteins add 200 μL of TUC solution; mix gently for 10 min.

(c) Separate proteins that are in the supernatant from the beads by low-speed centrifugation (e.g., 2000 × g for 10 min). Treat the CPLL beads a second time under exactly the same conditions and pool the two supernatants. Store this first eluate in the cold.

(d) Proceed similarly to obtain the second and the third eluates.

(e) Store the three eluates in the cold.

(f) The eluates are ready for protein composition analysis by mass spectrometry, chromatography, or electrokinetic methodologies. For compatibility with analytical determinations see Subheading 8.

7.2.3 Three-Step Increased Stringency Elution (Option 2)

(a) Prepare three different desorbing aqueous solutions. The first is composed of 1 M sodium chloride, the second of 3 M guanidine-HCl, pH 6.0, and the third of 9 M urea titrated with citric acid to pH 3–3.5.

(b) Proceed as for the three-step elution described above in Subheading 7.2.2.

(c) Store the three eluates in the cold.

(d) The eluates are ready for protein composition analysis by mass spectrometry, chromatography, or electrokinetic methodologies (see Subheading 8).
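The three fractionated schemes of Subheadings 7.2.1–7.2.3 share the same mechanics: each eluent is applied twice at 200 μL per 100 μL of beads and the two supernatants are pooled into one fraction. The Python sketch below lays this out as a simple work list. The dictionary keys, the function name `worklist`, and the linear scaling of eluent volume with bead volume are assumptions of this example; the eluent compositions are copied from the protocols above.

```python
# Eluent sequences as given in Subheadings 7.2.1-7.2.3 (order = increasing stringency)
ELUTION_SCHEMES = {
    "two-step (7.2.1)": [
        "4 M urea, 1% CHAPS, 5% acetic acid",
        "6 M guanidine-HCl, pH 6.0",
    ],
    "three-step option 1 (7.2.2)": [
        "TUC: 2 M thiourea, 7 M urea, 2% CHAPS",
        "UCA: 9 M urea, pH 3 (acetic or citric acid)",
        "AIAW: acetonitrile/isopropanol/20% ammonia/water (6/12/10/72%)",
    ],
    "three-step option 2 (7.2.3)": [
        "1 M sodium chloride",
        "3 M guanidine-HCl, pH 6.0",
        "9 M urea, pH 3-3.5 (citric acid)",
    ],
}


def worklist(scheme, bead_volume_ul=100.0, extractions_per_step=2):
    """List (fraction, round, eluent, volume in uL) tuples for one scheme."""
    volume_ul = 200.0 * bead_volume_ul / 100.0  # assumed linear scaling
    steps = []
    for fraction, eluent in enumerate(ELUTION_SCHEMES[scheme], start=1):
        for round_no in range(1, extractions_per_step + 1):
            steps.append((fraction, round_no, eluent, volume_ul))
    return steps


for fraction, round_no, eluent, volume_ul in worklist("three-step option 1 (7.2.2)"):
    print(f"fraction {fraction}, extraction {round_no}: {volume_ul:.0f} uL of {eluent}")
```

Each fraction therefore ends up as roughly 400 μL of pooled eluate per 100 μL of beads, which is worth knowing when choosing tubes and downstream concentration devices.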

7.3 Direct on-Bead Protein Digestion

When the analysis of captured proteins is performed by the so-called shotgun approach, the most direct way to proceed is to digest the captured proteins directly on the beads. The method is derived from the in-solution digestion of proteins [48]. The operation requires some excess of trypsin, since part of it will be captured by the CPLL beads. Basically the process is as follows:

(a) After protein capture on the peptide library beads (whatever the method or the physicochemical conditions), the beads are rapidly washed twice with 200 μL of 100 mM ammonium bicarbonate containing 0.1% Rapigest (this is not mandatory, but it facilitates the proteolysis process; the solution is obtained by adding 1 mL of 100 mM ammonium bicarbonate to the lyophilizate in the 1 mg Rapigest vial and shaking gently for a few minutes). The bead suspension is then vortexed for a few min.

(b) Add 300 μL of 10 mM DTT and heat the bead suspension at 65 °C for 1 h under gentle stirring or occasional shaking.

(c) Add 300 μL of 55 mM iodoacetamide, mix, and store in the dark for 60 min at room temperature.

(d) Add 60 μL of 0.2 μg/μL sequencing-grade trypsin.

(e) Vortex the bead suspension and incubate overnight at 37 °C under gentle shaking.

(f) Add 200 μL of 500 mM formic acid, vortex for a few seconds, and incubate for about 40 min at room temperature.

(g) Then recover the supernatant by filtration (30,000 MWCO) under centrifugation (e.g., 10,000 × g for 20 min) in order to separate insoluble material and beads.

(h) In order to fully extract the remaining peptides, wash the beads under centrifugation once with 50 μL of 500 mM formic acid and combine with the previous filtrate.

(i) Stripped beads can then be kept at −20 °C for possible further analysis.

(j) The solution of peptides is then dried by speedvac and redissolved in 20 μL HPLC solvent for LC–mass spectrometry analysis.
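The reagent amounts implied by steps (b), (c), (d), and (f) can be checked with a few lines of arithmetic. The sketch below is a rough bookkeeping aid only: it ignores the liquid retained by the beads after the washes of step (a), which is an assumption, so the final formic acid concentration it reports is an upper estimate. Variable names are illustrative.

```python
# Volumes (uL) and concentrations as listed in steps (b), (c), (d), and (f)
dtt_ul, dtt_mm = 300.0, 10.0
iaa_ul, iaa_mm = 300.0, 55.0
trypsin_ul, trypsin_ug_per_ul = 60.0, 0.2
fa_ul, fa_mm = 200.0, 500.0

# uL x mM gives nmol; divide by 1000 to obtain umol
dtt_umol = dtt_ul * dtt_mm / 1000.0
iaa_umol = iaa_ul * iaa_mm / 1000.0
iaa_to_dtt_ratio = iaa_umol / dtt_umol

# Mass of trypsin added in step (d)
trypsin_ug = trypsin_ul * trypsin_ug_per_ul

# Liquid volume after acidification; bead-associated liquid is ignored (assumption)
final_volume_ul = dtt_ul + iaa_ul + trypsin_ul + fa_ul
fa_final_mm = fa_mm * fa_ul / final_volume_ul

print(f"DTT: {dtt_umol:.1f} umol, iodoacetamide: {iaa_umol:.1f} umol "
      f"(molar ratio {iaa_to_dtt_ratio:.1f})")
print(f"trypsin: {trypsin_ug:.0f} ug")
print(f"volume after acidification: {final_volume_ul:.0f} uL, "
      f"formic acid about {fa_final_mm:.0f} mM")
```

The numbers show that iodoacetamide is in 5.5-fold molar excess over DTT, that 12 μg of trypsin are added, and that the digest is acidified to roughly 116 mM formic acid in about 860 μL before filtration.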

8 Compatibility Between Protein Elution from CPLLs and Analysis

Proteomic analysis of samples obtained after treatment with CPLLs may not be directly streamlined. However, there are situations in which the proteomics analysis can be applied directly after protein harvesting.

Proteomics analysis frequently starts with SDS-PAGE separation, for which the sample composition is critical. In this respect, protein harvesting by boiling the CPLL beads with a solution of sodium dodecyl sulfate in the presence of reducing agents appears fully compatible with SDS-PAGE with no preliminary formulation [49]. In other circumstances the elution of proteins is operated by a mixture of chemical agents that are compatible with isoelectric focusing. This is the elution with TUC (see Subheading 7.2.2). After this first protein separation dimension it is then possible to extend to two-dimensional electrophoresis and then to protein spot identification. Nevertheless, the treatment of CPLLs with TUC solutions may not elute 100% of proteins from the beads and should be completed by another orthogonal desorption operation. To circumvent this situation the TUC solution could include some cysteic acid, which produces an almost exhaustive desorption of proteins. In this case, due to the very low pI value of cysteic acid, which collects at the anode, two-dimensional electrophoresis can also be easily performed [50].


When 2D-DIGE is used for two-dimensional electrophoresis analysis, the only elution that is compatible with this technique uses 20 mM Tris buffer containing 7 M urea, 2 M thiourea, and 4% CHAPS, pH 8.5 (sodium carbonate could also be used instead of Tris).

For direct ELISA-based assays of eluted proteins, the denaturing desorption agents that can be compatible are 0.2 M glycine-HCl, 2% NP-40, pH 2.4; 0.1 M acetic acid, 2% NP-40; 1 M NaCl, 2% NP-40; or 0.1 M acetic acid containing 40% ethylene glycol. In case a single eluent does not desorb all proteins from the beads, these solutions could be used in sequence and the eluates pooled.

In all other circumstances, the protein solution collected from CPLL beads needs to be treated in order to equilibrate it in appropriate buffers by diafiltration, extensive dialysis, or gel filtration.

It is recalled here that protein elution from beads may not be necessary. This is the case when trypsin digestion is operated directly on the beads and the peptides obtained are directly analyzed by LC-MS/MS [41]. This approach is recommended especially when dealing with small samples involving small volumes of beads, as it saves time and largely reduces protein losses.

9 Practical Application Examples of CPLL-Treated Plant Extracts

Examples are numerous in the literature and a general review of the subject is beyond the scope of this chapter. Essential application examples are focused on the analysis of plant proteomes [51], the discovery of specific proteins [52], the detection of expression differences under specific conditions [53], and the discovery of plant allergens [54]. Overall, extracts of various plant organs have thus been analyzed with the intervention of combinatorial peptide ligand libraries, and a few illustrative examples are given.

Within the domain of low-abundance proteome investigations, studies on particularly recalcitrant plant proteomes should be mentioned. This is exemplified by the analysis of avocado and banana pulps [54, 55]. In both cases, about 1% total protein is embedded either in solid oil (avocado) or in huge amounts of polysaccharides (banana). In order to improve the discovery of low-abundance species, in parallel with the standard, native-condition extraction, a denaturing solubilization protocol has been implemented, based on 3% boiling SDS (an anathema in CPLL treatments, since it would completely inhibit the protein capture). This issue has been circumvented in two ways: (1) SDS removal by the classical acetone–methanol precipitation and (2) the dilution of SDS from 3 to 0.1% in the presence of another CPLL-compatible surfactant, like 0.5% CHAPS. This procedure allowed for identifying 1012 unique proteins; 174 of them were in common with the control, untreated sample and 190 were present only in the control. Overall 648 new proteins have been detected via CPLLs. In the case of banana, out of a total number of 1131 proteins identified, 849 were attributed to the CPLL technology.
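The protein counts quoted above are mutually consistent, which can be verified with simple set arithmetic: the identifications found only after CPLL treatment equal the total minus those shared with the control and those found only in the control. The short Python check below uses only the numbers given in the text; the variable names are illustrative, and the percentages are derived here rather than quoted from the cited studies.

```python
# First data set quoted above (from context, the avocado study)
total_identified = 1012
shared_with_control = 174
control_only = 190
cpll_only = total_identified - shared_with_control - control_only

# Banana data set
banana_total = 1131
banana_cpll = 849

print(f"CPLL-only identifications: {cpll_only}")
print(f"share of CPLL-only identifications: {100 * cpll_only / total_identified:.1f}%")
print(f"banana, share attributed to CPLLs: {100 * banana_cpll / banana_total:.1f}%")
```

The subtraction returns the 648 new proteins reported in the text, i.e., close to two thirds of all identifications in that data set, and about three quarters in the banana data set.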

From olive fruit pulp [56], where only native extraction was applied, the number of unique gene products found was only 252, but already much higher than what was known from the literature. Examples of the analysis of fruit proteins before and after treatment with CPLLs are illustrated in Fig. 4.

In addition to the large list of known protein allergens from plants, there are molecules that are below the detection limits. They can be evidenced after treatment with CPLLs. One of the most representative examples is the discovery of low-abundance allergens from cypress pollen [61]. From patient serum exposure the list of cypress pollen allergens has been enriched with several new, never-described species such as chaperone protein HSP104, a Sigma factor SigB regulation protein (a hydrolase involved in stress regulation mechanisms), and a Rab-like protein. A number of other allergens have been discovered using CPLLs in Hevea latex [35], mango [62], and banana [55].

Fig. 4 SDS–polyacrylamide gel electrophoresis analysis of various fruit pulp protein extracts before and after treatment with combinatorial peptide ligand library. (a) banana pulp [55]; (b) mango pulp [57]; (c) lemon pulp [58]; (d) orange pulp [59]; (e) wolfberry pulp [60]; (f) avocado pulp [54]; (g) olive pulp [56]. Courtesy of Boschetti and Righetti [36]


The use of CPLLs to detect plant protein markers induced by unexpected environmental factors has been extensively reviewed [36]. Biotic (e.g., pathogen attacks) and abiotic factors (temperature changes, flooding, drought, contact with heavy metals, etc.) have been described. In most cases defense mechanisms involving signaling proteins as well as antioxidative complexes are involved.

10 Notes

1. ProteoMiner, a CPLL (combinatorial peptide ligand library), is a beaded and porous mixed-bed affinity-like solid phase designed for proteomics applications. It is commercialized by Bio-Rad Laboratories, Hercules, USA.

Prior to use, commercial CPLLs need to be conditioned for optimal efficiency. When CPLLs are delivered dry, they need to be fully rehydrated to recover gel pores compatible with free protein diffusion. These beads, carrying different hexapeptides (from very hydrophilic to highly hydrophobic sequences), do not all have the same swelling properties. To comply with various situations it is first advised to slurry 100 mg of dry beads in 2 mL methanol for 30 min while shaking gently and then add 2 mL of phosphate buffer (e.g., 25 mM, pH 7). The rehydration is to be extended overnight at room temperature. The rehydrated beads are then washed extensively with the buffer selected for the capture of proteins, as described above for the aqueous slurry. Rehydrated and buffer-equilibrated beads can be stored in the cold at 4 °C and used within the day.

2. It is recommended not to reuse hexapeptide beads because (1) some level of carryover may appear, with consequent misinterpretation of data, and (2) some hexapeptides may have been modified as a consequence of stringent elution conditions from previous operations.

3. The sample should be clear and not contain lipids in suspension. Large amounts of nucleic acids or viscous polysaccharides, when present, should also be removed using current methods. Samples should not contain a large amount of detergents or denaturing agents. For example, nonionic detergents are tolerated at concentrations not exceeding 0.5% (wt/vol); urea is also tolerated at a concentration not exceeding 3 M.

The method can be applied to a large variety of plant protein extracts after appropriate elimination of interfering biopolymers. Nevertheless, specific aspects of optimization might have to be considered according to the issues encountered. If, for instance, the protein concentration is below 0.1 mg/mL it may be useful to perform a preliminary concentration. This would improve the capture by CPLLs in case the affinity is too low. Possible concentration methods include dialysis followed by lyophilization, or membrane concentration under centrifugation.

The presence of proteases, relatively frequent in plant extracts, is deleterious for the integrity of proteins. Their activity must be stopped with selected inhibitors or inactivation agents prior to contact with CPLLs.

4. It may happen that bead aggregation occurs during the capture stage. In this case the supernatant must be separated by high-speed centrifugation and the collected solid material washed extensively with PBS under strong shaking (e.g., vortex) until the beads dissociate from each other. Chemical agents are not recommended since they may desorb captured proteins.

5. An insufficient decrease of high-abundance proteins after treatment with CPLLs may mean that the amount of proteins in the initial sample was not sufficient to saturate the beads. This could be resolved either by increasing the amount of sample or by decreasing the volume of CPLL beads.

6. Note that the enrichment of low-abundance proteins renders the sample to be analyzed more complex (many more proteins are detectable). The subsequent analytical operations might become difficult to interpret. To facilitate the analysis it is advised to fractionate the collected proteins or to elute the captured proteins sequentially.

7. Generally CPLL treatments are highly reproducible; however, if results are not exactly the same from one experiment to another, it is advised to check the ionic strength and the pH of the initial sample. Even small modifications of these parameters alter the affinity of proteins for the hexapeptide baits grafted on CPLL beads, with consequent modification of the molecular interaction process.

8. The protein concentration to offer to CPLLs should be between 1 and 10 mg/mL. Lower concentrations may render the capture of very low-abundance proteins challenging when the dissociation constant is too high.

9. The total amount of plant protein from the sample should be at least 50 mg per 100 μL of hexapeptide ligand library beads.

10. To increase the probability of finding low-copy species, the loading should be increased.

11. When the volume of the sample is very small the volume of CPLLs should be decreased; however, the smallest volume of beads usable without losing too much selectivity is around 10 μL.

12. Incubation of plant proteins with CPLLs should be performed at room temperature. An increase of temperature may engender stronger hydrophobic associations; a decrease of temperature may result in an acceleration of electrostatic interactions. Low temperatures also increase the viscosity of the sample, which makes diffusion within the pores of the gel beads more difficult. Temperature fluctuations between serial experiments may compromise reproducibility. Always use exactly the same incubation temperature throughout similar experiments.

13. The presence of ammonium sulfate in the plant protein sample may engender partial protein precipitation with consequent protein losses. To prevent this phenomenon, the concentration of lyotropic salts has to be adjusted case by case below the critical level of precipitation.

14. The precipitation of proteins by methanol–chloroform, used to eliminate sodium dodecyl sulfate, can be performed by adding four volumes of cold pure methanol to the protein solution while stirring vigorously for a few minutes. Then three volumes of pure cold chloroform are added with continuous stirring. Finally, three additional volumes of deionized water are added. The protein precipitation process is complete within 10–20 min at room temperature. Proteins are collected by centrifugation at 15,000 × g for about 5 min at 4 °C (aggregated proteins will be located at the liquid interface). The aqueous layer is then pipetted out and discarded. Four more volumes of methanol are added while stirring for a few minutes. The supernatant is removed again after centrifugation at 15,000 × g for about 5 min at 4 °C, without disturbing the protein precipitate. A last wash with acetone may facilitate the removal of methanol. Protein pellets are dissolved in an appropriate buffer compatible with subsequent operations.

15. After elution the CPLL beads are theoretically free of proteins. To check for the absence of protein, 100 μL of the “eluted” beads is mixed with 10% SDS solution containing 25 mM DTT and boiled for 10 min. The supernatant is then recovered and directly analyzed by SDS-PAGE. Staining must be very sensitive (e.g., silver staining). The presence of protein bands indicates incomplete protein desorption.
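The loading guidance in Notes 8, 9, and 11 can be turned into a quick planning calculation. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes that the "50 mg per 100 μL of beads" figure scales linearly with bead volume, which the notes do not state explicitly.

```python
def plan_cpll_load(bead_volume_ul, sample_conc_mg_per_ml):
    """Return (minimum protein in mg, sample volume in mL) for a CPLL bead volume.

    Assumption: the 50 mg per 100 uL ratio from Note 9 scales linearly.
    """
    if bead_volume_ul < 10:
        # Note 11: about 10 uL is the smallest usable bead volume
        raise ValueError("bead volume below about 10 uL loses selectivity")
    if not 1 <= sample_conc_mg_per_ml <= 10:
        # Note 8: recommended protein concentration window
        raise ValueError("protein concentration should be 1-10 mg/mL")
    min_protein_mg = 50.0 * bead_volume_ul / 100.0
    sample_volume_ml = min_protein_mg / sample_conc_mg_per_ml
    return min_protein_mg, sample_volume_ml


if __name__ == "__main__":
    protein_mg, volume_ml = plan_cpll_load(bead_volume_ul=100, sample_conc_mg_per_ml=5)
    print(f"Load at least {protein_mg} mg of protein in {volume_ml} mL of sample")
```

Loading more than the computed minimum is consistent with Note 10, which recommends a higher load when low-copy species are the target.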

Low-Abundance Plant Proteomics 401

References

1. Jorrın-Novo JV, Maldonado AM, Echevarrıa-Zomeno S et al (2009) Plant proteomicsupdate (2007–2008): second-generation pro-teomic techniques, an appropriate experimen-tal design, and data analysis to fulfill MIAPEstandards, increase plant proteome coverageand expand biological knowledge. J Proteome72:285–314

2. Agrawal GK, Rakwal R (2008) Plant proteo-mics: technologies, strategies, applications.Wiley, Hoboken

3. Agrawal GK, Job D, Zivy M et al (2011) Timeto articulate a vision for the future of plantproteomics - a global perspective: an initiativefor establishing the international plant proteo-mics. Proteomics 11:1559–1568

4. Boschetti E, Hernandez-Castellano LE, Righ-etti PG (2019) Progress in farm animal prote-omics: the contribution of combinatorialpeptide ligand libraries. J Proteome 197:1–13

5. Boschetti D’AA, Candiano G, Righetti PG(2018) Protein biomarkers for early detectionof diseases: the decisive contribution of CPLLs.J Proteome 188:1–14

6. Boschetti E, Fasoli E, Righetti PG (2015) Thediscovery of low-abundance allergens by prote-omics analysis involving combinatorial peptideligand libraries. Jacobs J Allergy Immunol2:015

7. Hijazi M, Velasquez SM, Jamet E et al (2014)An update on post-translational modificationsof hydroxyproline-rich glycoproteins: toward amodel highlighting their contribution to plantcell wall architecture. Front Plant Sci5:395–405

8. Millar DJ, Whitelegge JP, Bindschedler LVet al(2009) The cell wall and secretory proteome ofa tobacco cell line synthesising a secondarywall. Proteomics 9:2355–2372

9. Xu MS, Chen S, Wang WQ et al (2013)Employing bifunctional enzymes for enhancedextraction of bioactives from plants: flavonoidsas an example. J Agric Food Chem61:7941–7948

10. Cho WK, Hyun TK, Kumar D et al (2015)Proteomic analysis to identify tightly-boundcell wall protein in rice calli. Mol Cells38:685–696

11. Demirevska-Kepova K, Simova-Stoilova L,Kjurkchiev S (1999) Barley leaf RuBisCO,RuBisCO-binding protein and RuBisCO acti-vase and their protein/protein interactions.Bulg. J Plant Physiol 25:31–44

12. Li G, Nallamilli BR, Tan F et al (2008)Removal of high-abundance proteins for

nuclear subproteome studies in rice (Oryzasativa) endosperm. Electrophoresis29:604–617

13. Ribeiro M, Nunes-Miranda JD, Branlard G(2013) One hundred years of grain omics:identifying the glutens that feed the world. JProteome Res 12:4702–4716

14. Xiong E, Wu X, Yang L et al (2014)Chloroform-assisted phenol extractionimproving proteome profiling of maizeembryos through selective depletion of high-abundance storage proteins. PLoS One 9:e112724

15. Tavakolan M, Alkharouf NW, Matthews B et al(2014) SoyProLow: a protein databaseenriched in low abundant soybean proteins.Bioinformation 10:599–601

16. Carpentier SC, Panis B, Vertommen A et al(2008) Proteome analysis for non-modelplants: a challenging but powerful approach.Mass Spectrom Rev 27:354–377

17. Gengenheimer P (1990) Preparation ofextracts from plants. Methods Enzymol182:174–193

18. Boschetti E, Bindschedler L, Tang C et al(2009) Combinatorial peptide ligand librariesand plant proteomics: a winning strategy at aprice. J Chromatogr A 1216:1215–1222

19. Wang W, Vignani R, Scali M et al (2004)Removal of lipid contaminants by organic sol-vents from oilseed protein extract prior to elec-trophoresis. Anal Biochem 329:139–141

20. Mechin V, Damerval C, Zivy M (2007) Totalprotein extraction with TCA-acetone. Meth-ods Mol Biol 355:1–8

21. Wessel D, Flugge UI (1984) A method for thequantitative recovery of proteins in dilute solu-tions in the presence of detergents and lipids.Anal Biochem 138:141–143

22. Isaacson T, Damasceno CM, Saravanan RS et al(2006) Sample extraction techniques forenhanced proteomic analysis of plant tissues.Nat Protoc 1:769–774

23. Faurobert M, Pelpoir E, Chaıb J (2007) Phe-nol extraction of proteins for proteomic studiesof recalcitrant plant tissues. Methods Mol Biol355:9–14

24. Cereda A, Kravchuk AV, D’Amato A et al(2010) Proteomics of wine additives: miningfor the invisible via combinatorial peptideligand libraries. J Proteome 73:1732–1739

25. Kim YJ, Wang Y, Gupta R et al (2015) Prot-amine sulfate precipitation method depletesabundant plant seed-storage proteins: a case

402 Egisto Boschetti and Pier Giorgio Righetti

study on legume plants. Proteomics15:1760–1764

26. Lee HM, Gupta R, Kim SH et al (2015) Abun-dant storage protein depletion from tuber pro-teins using ethanol precipitation method:suitability to proteomics study. Proteomics15:1765–1769

27. Alam I, Sharmin S, Kim KH et al (2013) Animproved plant leaf protein extraction methodfor high resolution two-dimensional polyacryl-amide gel electrophoresis and comparative pro-teomics. Biotech Histochem 88:61–75

28. Mortezai N, Harder S, Schnabel C et al (2010)Tandem affinity depletion: a combination ofaffinity fractionation and immunoaffinitydepletion allows the detection oflow-abundance components in the complexproteomes of body fluids. J Proteome Res9:6126–6134

29. Mithoe SC, Menke FL (2015) Phosphopeptideimmuno-affinity enrichment to enhance detec-tion of tyrosine phosphorylation in plants.Methods Mol Biol 1306:135–146

30. Wu XN, Xi L, Pertl-Obermeyer H et al (2017)Highly efficient single-step enrichment of lowabundance phosphopeptides from plant mem-brane preparations. Front Plant Sci 8:1673

31. Kwon SJ, Choi EY, Seo JB et al (2007) Isola-tion of the Arabidopsis phosphoproteomeusing a biotin-tagging approach. Mol Cells24:268–275

32. Fasoli E, Pastorello EA, Farioli L et al (2009)Searching for allergens in maize kernels viaproteomic tools. J Proteome 72:501–510

33. Fasoli E, D’Amato A, Kravchuk AVet al (2011)Popeye strikes again: the deep proteome ofspinach leaves. J Proteome 74:127–136

34. Frohlich A, Gaupels F, Sarioglu H et al (2012)Looking deep inside : detection oflow-abundant proteins in leave extracts of Ara-bidopsis thaliana and phloem exudates ofCucurbita maxima. Plant Physiol159:902–914

35. D’Amato A, Bachi A, Fasoli E et al (2010)In-depth exploration of Hevea brasiliensislatex proteome and “hidden allergens” viacombinatorial peptide ligand libraries. J Prote-ome 73:1368–1380

36. Righetti PG, Boschetti E (2016) Global pro-tein expression analysis in plants by means ofpeptide libraries. J Proteome 143:3–14

37. Nguyen-Kim H, San Clemente H, Balliau Tet al (2016) Arabidopsis thaliana root cellwall proteomics: increasing the proteome cov-erage using a combinatorial peptide ligandlibrary and description of unexpected Hyp in

peroxidase amino acid sequences. Proteomics16:491–503

38. Zhu W, Xu X, Tian J et al (2016) Proteomicanalysis of Lonicera japonica immature flowerbuds using combinatorial peptide ligandlibraries and polyethylene glycol fractionation.J Proteome Res 15:166–181

39. Ye Z, Zhou S, Thannhauser TW et al (2014)Identification of drought-induced leaf pro-teomes in switchgrass. Proc Plant AnimalGenome Conference, San Diego

40. Righetti PG, Boschetti E (2013) Combinato-rial peptide libraries to overcome the classicalaffinity-enrichment methods in proteomics.Amino Acids 45:219–229

41. Thulasiraman V, Lin S, Gheorghiu L et al(2005) Reduction of the concentration differ-ence of proteins in biological liquids using alibrary of combinatorial ligands. Electrophore-sis 26:561–3571

42. Huhn C, Ruhaak LR, Wuhrer M et al (2012)Hexapeptide library as a universal tool for sam-ple preparation in protein glycosylation analy-sis. J Proteome 75:1515–1528

43. Rivers J, Hughes C, McKenna T (2011) Asym-metric proteome equalization of the skeletalmuscle proteome using a combinatorial hexa-peptide library. PLoS One 6:e28902

44. Fasoli E, Farinazzo A, Sun CJ et al (2010)Interaction among proteins and peptidelibraries in proteome analysis: pH involvementfor a larger capture of species. J Proteome73:733–742

45. Eriksson KO, Belew M (2011) Hydrophobicinteraction chromatography. Methods Bio-chem Anal 54:165–181

46. Candiano G, Dimuccio V, Bruschi M et al(2009) Combinatorial peptide ligand librariesfor urine proteome analysis: investigation ofdifferent elution systems. Electrophoresis30:2405–2411

47. Boschetti E, Righetti PG (2013)Low-abundance protein discovery: state of theart and protocols. Elsevier, Waltham

48. Fonslow BR, Carvalho PC, Academia K et al(2011) Improvements in proteomic metrics oflow abundance proteins through proteomeequalization using ProteoMiner prior to Mud-PIT. J Proteome Res 10:3690–3700

49. Righetti PG, Boschetti E, Zanella A et al(2010) Plucking, pillaging and plundering pro-teomes with combinatorial peptide ligandlibraries. J Chromatogr A 1217:893–900

50. Farinazzo A, Fasoli E, Kravchuk AV et al(2009) En bloc elution of proteomes fromcombinatorial peptide ligand libraries. J Prote-ome 72:725–730

Low-Abundance Plant Proteomics 403

51. Jorrın-Novo JV, Valledor-Gonzalez L, Castil-lejo-Sanchez MA et al (2018) Proteomics anal-ysis of plant tissues based on two-dimensionalgel electrophoresis, in Advances in Plant Eco-physiology Techniques

52. Campos NA, Swennen R, Carpentier SC(2018) The plantain proteome, a focus onallele specific proteins obtained from plantainfruits. Proteomics 18:1700227

53. Singh P, Pitambara, Rajput RS et al (2018)Proteomics approaches to study host pathogeninteraction. J Pharmacogn Phytochem7:1649–1654

54. Esteve C, D’Amato A, Marina ML et al (2012)Identification of avocado (Persea americana)pulp proteins by nanoLC-MS/MS via combi-national peptide ligand libraries. Electrophore-sis 33:2799–2805

55. Esteve C, D’Amato A, Marina ML et al (2013)In-depth proteomic analysis of banana (Musaspp.) fruit with combinatorial peptide ligandlibraries. Electrophoresis 34:207–214

56. Esteve C, D’Amato A, Marina ML et al (2012)Identification of olive (Olea europaea) seed andpulp proteins by nLC-MS/MS via combinato-rial peptide ligand libraries. J Proteome75:2396–2403

57. Fasoli E, Righetti PG (2013) The peel and pulpof mango fruit: a proteomic samba. BiochimBiophys Acta 1834:2539–2545

58. Fasoli E, Colzani M, Aldini G et al (2015)Lemon peel and Limoncello liqueur: A proteo-mic duet. Biochim Biophys Acta1834:1484–1491

59. Lerma-Garcıa MJ, D’Amato A, Simo-AlfonsoEF et al (2016) Orange proteomic fingerprint-ing: from fruit to commercial juices. FoodChem 196:739–749

60. D’Amato A, Esteve C, Fasoli E et al (2013)Proteomic analysis of Lycium barbarum(Goji) fruit via combinatorial peptide ligandlibraries. Electrophoresis 34:1729–1736

61. Shahali Y, Senechal H, Poncet P (2018) Theuse of combinatorial hexapeptide ligand library(CPLL) in allergomics. Methods Mol Biol1871:393–403

62. Gomez Cardona EE, Heathcote K, Teran MLet al (2018) Novel low-abundance allergensfrom mango via combinatorial peptide librariestreatment: a proteomics study. Food Chem269:652–660

404 Egisto Boschetti and Pier Giorgio Righetti

Chapter 29

iTRAQ-Based Proteomic Analysis of Rice Grains

Marouane Baslam, Kentaro Kaneko, and Toshiaki Mitsui

Abstract

Cereal proteins have formed the basis of the human diet worldwide, and their level of consumption is expected to increase. Knowledge of the protein composition and variation of cereal grains is helpful for characterizing cereal varieties and for identifying biomarkers of tolerance mechanisms. Grains produce a wide array of proteins that differ with growth conditions. Quantitative proteomics is a powerful approach allowing the identification of proteins expressed under defined conditions, which may contribute to understanding the complex biological systems of grains. Isobaric tags for relative and absolute quantitation (iTRAQ) is a mass spectrometry–based quantitative approach that allows simultaneous protein identification and quantification from multiple samples with high coverage. One of the challenges in identifying grain proteins is the relatively high content (~90–95%) of carbohydrate (starch) and the low protein (~4–10%) and lipid (~1%) fractions. In this chapter, we present a robust workflow to carry out iTRAQ quantification of starchy rice grains.

Key words Protein biomarkers, Chalkiness, Isobaric tags for relative and absolute quantification (iTRAQ), Oryza sativa, Seed proteomics

1 Introduction

Although starch is the dominant component of the rice grain, it does not explain all variation in grain quality between rice cultivars. Total protein content also influences rice grain quality, but likewise does not account for all known variation. Variation in rice grain protein composition influences the taste and texture of cooked rice. Differences in protein abundance are associated with different genotypic and phenotypic traits [1–4]. Proteomics can directly and globally explore protein levels and their respective posttranslational modifications [5]. It could therefore be a powerful tool to better understand the genetic basis of plant responses to environmental cues, by directly comparing protein abundance under stress conditions between genotypes differing in their stress responses. Mass spectrometry (MS)–based quantitative proteomics has recently become indispensable for gaining insights into biological systems at the molecular level. Isobaric tags for relative and absolute quantitation (iTRAQ) is one of the most popular chemical tagging approaches, allowing up to eight samples to be multiplexed in a single run with high coverage. The iTRAQ method screens for global proteomic changes to identify differentially regulated proteins and the activated transduction pathways. The candidate proteins may then be used to elucidate the molecular mechanism underlying the response of plants to a particular environmental condition. This field of research aims to identify molecular features that can be developed as biomarkers for crop improvement, and it provides genetic resources related to grain chalkiness, one of the principal targets for the improvement of rice characteristics. iTRAQ technology has been applied to rice [6–8], wheat [9–13], cotton [14], and other crop species. Recently, this technique has been employed in studies of grain development [15–17] and of chalkiness under high temperature to identify potential sources of tolerance for variety improvement [6]. Such studies therefore provide an excellent starting point for further elucidating the molecular and biochemical basis of grain traits and crop improvement. While several recent studies have examined the quantitative proteomics of leaves, roots, and stems, it has been challenging to focus on the “subdiscipline” of grains such as rice, owing to their complexity, their relatively low protein content, and their usually high amount of interfering compounds, mainly starch (and others, e.g., rigid cell wall and phenolic compounds). To overcome the problem of the starchy endosperm, we have optimized the conditions for rice grain proteome analysis by iTRAQ LC-MS/MS. The experimental workflow presented here lays the basis for further in-depth grain studies in the field of proteomics.

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8_29, © Springer Science+Business Media, LLC, part of Springer Nature 2020

2 Materials

For accurate mass spectrometry analysis, it is recommended to use chromatography- and mass spectrometry-grade reagents and to prepare all solutions in ultrapure water (18 MΩ·cm resistivity at 25 °C).

2.1 Plant Material

Seeds from rice plants (Oryza sativa L. cv. Koshihikari) grown in paddy field or controlled (Biotron LPH-1.5PH-NCII, Nihon-ika, Tokyo, Japan) conditions (see Note 1).

2.2 Protein Extraction and Quantification

1. Rice grain grader (RGQI20A, Satake, Hiroshima, Japan).

2. Viewer (Fujicolor lightbox New-5000 Inverter, Fuji film Co., Tokyo, Japan).

3. Grain huller.

4. Coffee mill (MJ-51, Melitta Japan).

406 Marouane Baslam et al.

5. Rice milling machine (KETT Electric Laboratory, Tokyo, Japan).

6. Razor blade.

7. Mortar and pestle.

8. High-speed microcentrifuge (Himac CF-RXII, HITACHI).

9. Mixer (Delta mixer Se-08, TAITEC).

10. Extraction solution: 7 M urea, 2 M thiourea, 1% (w/v) CHAPS, 1% (w/v) Triton X-100, 10 mM dithiothreitol (DTT).

11. Methanol.

12. Chloroform.

13. Pierce bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific Pierce, Rockford, IL).

14. Pierce™ 660 nm Protein Assay Reagent (Thermo Fisher Scientific Pierce, Rockford, IL).

15. Bovine serum albumin (BSA) protein standard.
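As a worked example for the extraction solution in item 10, the Python sketch below converts the stated concentrations into amounts to weigh for a chosen final volume. The function name and the 10 mL example volume are illustrative assumptions and not part of the original protocol; the molar masses are standard values (urea 60.06, thiourea 76.12, DTT 154.25 g/mol).

```python
# Standard molar masses in g/mol
MOLAR_MASS = {"urea": 60.06, "thiourea": 76.12, "DTT": 154.25}


def extraction_solution(volume_ml):
    """Amounts needed for 7 M urea, 2 M thiourea, 1% (w/v) CHAPS,
    1% (w/v) Triton X-100 and 10 mM DTT in the given final volume."""
    litres = volume_ml / 1000.0
    return {
        "urea_g": 7.0 * MOLAR_MASS["urea"] * litres,
        "thiourea_g": 2.0 * MOLAR_MASS["thiourea"] * litres,
        "CHAPS_g": 0.01 * volume_ml,        # 1% (w/v) = 1 g per 100 mL
        "Triton_X100_g": 0.01 * volume_ml,  # 1% (w/v) = 1 g per 100 mL
        "DTT_mg": 0.010 * MOLAR_MASS["DTT"] * litres * 1000.0,
    }


if __name__ == "__main__":
    for reagent, amount in extraction_solution(10).items():
        print(f"{reagent}: {amount:.3f}")
```

For 10 mL this gives roughly 4.2 g of urea, 1.5 g of thiourea, 0.1 g each of CHAPS and Triton X-100, and about 15 mg of DTT.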

2.3 Protein Digestion and iTRAQ Labeling

1. Block Bath (CB-100A, AS ONE Corporation, Osaka, Japan).

2. Urea, 8.0 M.

3. Endoproteinase Lys-C (Wako, Tokyo, Japan), 1 μg μL−1 (see Note 2).

4. Trypsin (Wako, Tokyo, Japan), 1 μg μL−1 (see Note 2).

5. iTRAQ Reagent Multiplex kit (4-plex: 114, 115, 116, 117; AB SCIEX, Foster City, CA).

6. iTRAQ Dissolution buffer provided in iTRAQ kit.

7. Absolute Ethanol.

8. Reducing buffer: 50 mM Tris-(2-carboxyethyl) phosphine (TCEP) (see Note 2).

9. Alkylating solution: 200 mM methyl methanethiosulfonate (MMTS) (see Note 2).

2.4 Cation Exchange Liquid Chromatography and Peptide Desalting

1. 0.5 mL Syringe (Hamilton).

2. Speed vac (CC-105, TOMY, Tokyo, Japan).

3. MonoSpin® C18 columns (GL science).

4. ICAT cation exchange buffer pack (Applied Biosystems), including the elution, loading, cleaning, and storage buffers, which contain acetonitrile.

5. Formic acid 1% (v/v).

6. Acetonitrile: 5% (v/v) in 0.1% (v/v) formic acid.

7. Activation solution: 80% acetonitrile in 1% (v/v) formic acid.

Proteomics of Starchy Rice Grains 407

2.5 Mass Spectrometry

1. Liquid chromatography system (EASY-nLC 1000, and DiNa-A, KYA Tech Corporation).

2. ESI nano stage (KYA Tech Corporation).

3. LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific).

4. MonoCap C18 High Resolution 2000; 0.1 mm i.d. × 2000 mm (GL Science).

5. Solvent A: 2% (v/v) acetonitrile in 0.1% (v/v) formic acid.

6. Solvent B: 80% (v/v) acetonitrile in 0.1% (v/v) formic acid.

3 Methods

3.1 Sample Preparation

1. Husk rice seeds with a grain huller (see Note 3).

2. Polish 10 g of grain samples for 30–40 min using a rice milling machine in order to remove the embryo and aleurone layer.

3.2 Protein Extraction

1. Resuspend 200 mg of powdered sample of starchy grain in 0.4 mL of extraction buffer (see Notes 4 and 5) by vortexing for 15 s at high speed.

2. Centrifuge at 20,000 × g for 10 min (4 °C). Transfer the resulting supernatant to an Eppendorf tube.

3. Add 400 μL of methanol to 100 μL of supernatant and mix by vortexing.

4. Add 100 μL of chloroform and 300 μL of ultrapure water and vortex for 5 s.

5. Centrifuge for 1 min at 10,000 × g at 4 °C and remove the upper aqueous phase.

6. Add 400 μL of methanol and vortex thoroughly.

7. Centrifuge at 10,000 × g at 4 °C for 15 min. Discard the supernatant and keep the pellet.
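Steps 3–6 above use methanol, chloroform, and water at 4:1:3 volumes relative to the supernatant, followed by a second 4-volume methanol wash. The Python sketch below scales these volumes to other supernatant volumes. It is a minimal illustration under the assumption that the ratios scale linearly; the protocol itself only specifies the 100 μL case, and the function names are hypothetical.

```python
def precipitation_volumes(supernatant_ul):
    """Scale the methanol/chloroform/water volumes of steps 3-6 (4:1:3, then 4)."""
    return {
        "methanol_ul": 4 * supernatant_ul,       # step 3
        "chloroform_ul": 1 * supernatant_ul,     # step 4
        "water_ul": 3 * supernatant_ul,          # step 4
        "methanol_wash_ul": 4 * supernatant_ul,  # step 6
    }


def volume_before_phase_separation(supernatant_ul):
    """Total liquid in the tube at step 5, useful for choosing the tube size."""
    v = precipitation_volumes(supernatant_ul)
    return supernatant_ul + v["methanol_ul"] + v["chloroform_ul"] + v["water_ul"]


if __name__ == "__main__":
    print(precipitation_volumes(100))
    print(volume_before_phase_separation(100), "uL in the tube at step 5")
```

With 100 μL of supernatant the tube holds 900 μL at step 5, so a standard 1.5 mL tube is sufficient; larger inputs may need to be split across tubes.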

3.3 Protein Digestion

1. Immediately resuspend the protein pellet in 8 M urea. If necessary, solubilize the protein sample completely by incubation overnight at 4 °C or, alternatively, by sonication.

2. Determine the protein concentration with the Pierce 660 nm Protein Assay kit (Thermo Fisher Scientific) using bovine serum albumin (BSA) as a standard.

3. Add to the protein solution (50 μg total protein) 2 μL of dissolution buffer from the iTRAQ kit and 2 μL of reducing buffer (TCEP); mix well by vortexing for 15 s, and spin down briefly.

4. Incubate the mixture at 60 °C for 1 h, and spin down the solution.

5. Alkylate by adding 1 μL of the cysteine blocking reagent from the iTRAQ kit. Mix well by vortexing for 15 s and spin down.

6. Incubate at 37 °C for 1 h (see Note 6), and spin down.

7. Dilute with an equal volume of the iTRAQ dissolution buffer provided in the kit. For protein digestion, add 5 μL of endoproteinase Lys-C (1 μg μL−1) and incubate at 37 °C for 3–4 h. Dilute 10 times with ultrapure water (see Note 7). Vortex for 30 s and spin down the solution.

8. Add 5 μL of the trypsin solution to each sample tube for further digestion. Vortex for 1 min to mix and spin down briefly to bring all the solution to the bottom. Incubate at 37 °C for 12–16 h (overnight) (see Note 8).
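The two dilutions in step 7 lower the urea concentration before trypsin is added, and the enzyme amounts imply a fixed enzyme-to-protein ratio. The Python sketch below works through that arithmetic. It is a rough estimate only: it ignores the few microliters of reagents added in steps 3, 5, and 7, and the function names are hypothetical.

```python
def urea_after_dilutions(start_molar, dilution_factors):
    """Approximate urea molarity after successive dilutions."""
    conc = start_molar
    for factor in dilution_factors:
        conc /= factor
    return conc


def protein_to_enzyme_ratio(protein_ug, enzyme_ul, enzyme_ug_per_ul=1.0):
    """Protein:enzyme mass ratio (w/w) for a given enzyme addition."""
    return protein_ug / (enzyme_ul * enzyme_ug_per_ul)


if __name__ == "__main__":
    # 8 M urea, diluted 2x with dissolution buffer, then 10x with water
    print(urea_after_dilutions(8.0, [2, 10]), "M urea at the trypsin step")
    # 50 ug of protein digested with 5 uL of a 1 ug/uL enzyme stock
    print(protein_to_enzyme_ratio(50, 5), ": 1 protein to enzyme (w/w)")
```

Under these assumptions, Lys-C acts in about 4 M urea and trypsin in about 0.4 M urea, each at a protein-to-enzyme ratio of 10:1 (w/w).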

3.4 iTRAQ Peptide Labeling

1. Take the iTRAQ reagents for peptide labeling, provided as a set of four (iTRAQ® Reagent 114, 115, 116, and 117), out of the freezer and bring them to room temperature. Spin down to bring the solution to the bottom of the vial.

2. Add 500 μL of the absolute ethanol provided in the iTRAQ kit to each vial of iTRAQ Reagent. Vortex each vial for 30 s and then spin down the solution.

3. Transfer the entire contents of each freshly prepared iTRAQ reagent to its respective tryptic peptide sample tube. Vortex each tube for 30 s to mix, then spin.

4. Incubate the iTRAQ labeling reaction tubes for 1 h at room temperature (see Notes 9–12).

5. Add 400 μL of ultrapure water, vortex for 30 s, and spin down.

6. Combine the content of each iTRAQ reagent-labeled sampletube into one tube.

7. Vortex to mix, then spin.
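Keeping track of which reporter channel received which sample is essential when the labeled tubes are pooled in step 6. The Python sketch below records that mapping and encodes the pH window and incubation times given in Notes 11 and 12. The sample names and function names are hypothetical and serve only as an illustration.

```python
ITRAQ_4PLEX_CHANNELS = ["114", "115", "116", "117"]
INCUBATION_HOURS = {4: 1, 8: 2}  # Note 12: 4-plex 1 h, 8-plex 2 h


def assign_channels(sample_names):
    """Map each sample to one 4-plex reporter channel, in order."""
    if len(sample_names) > len(ITRAQ_4PLEX_CHANNELS):
        raise ValueError("a 4-plex kit labels at most four samples")
    return dict(zip(ITRAQ_4PLEX_CHANNELS, sample_names))


def labeling_ph_ok(ph):
    """Note 11: labeling is efficient only between pH 7.5 and 8.5."""
    return 7.5 <= ph <= 8.5


if __name__ == "__main__":
    # Hypothetical sample names for two control and two treated grain lots
    design = assign_channels(["control_1", "control_2", "heat_1", "heat_2"])
    print(design)
    print("pH 8.0 acceptable:", labeling_ph_ok(8.0))
    print("Incubate for", INCUBATION_HOURS[4], "h")
```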

3.5 Cation Exchange Chromatography

1. Set the cation exchange column in a 0.5 mL syringe and clean it with 1 mL of cleaning buffer to condition the cartridge. Keep the injection flow in this and the following steps at 1 drop per second. Divert to waste.

2. Inject 2 mL of the Cation Exchange Buffer-Load. Divert to waste.

3. Slowly inject (1 drop/s) the mixed iTRAQ-labeled peptide samples onto the cation-exchange cartridge and collect the flow-through in a sample tube (see Notes 13 and 14).

4. Wash with 1 mL of loading buffer (see Note 15).

5. To elute the peptides, slowly inject (1 drop/s) 500 μL of elution buffer. Collect the eluted peptides as a single fraction in an Eppendorf tube.

6. Add 1 mL of cleaning buffer to wash the undigested proteins (e.g., trypsin) from the cation-exchange cartridge and collect the flow-through in two fractions of 0.5 mL each.

7. Dry the concentrated iTRAQ-labeled peptide samples (50 μL) in a speed vac for further fractionation.

8. Wash the column with 2 mL of storage buffer. Seal the column with Parafilm to avoid drying out. Store the cartridge at 2–8 °C.

3.6 C18 Spin Columns and Peptide Desalting

1. Add 500 μL of 1% (v/v) formic acid to acidify the peptide solution.

2. Add 100 μL of activation solution (80% acetonitrile in 1% formic acid) to the C18 cartridge.

3. Centrifuge the C18 cartridge at 5000 × g for 2 min.

4. Equilibrate the C18 column by adding 1% formic acid to the cartridge. Centrifuge at 10,000 × g for 2 min.

5. Transfer the iTRAQ-labeled peptides completely onto the equilibrated C18 cartridge. Centrifuge at 10,000 × g for 2 min and collect the flow-through. Load the flow-through fraction again onto the C18 column and centrifuge at 10,000 × g for 1 min (see Note 16).

6. Wash the column by adding 1.5 mL of 1% formic acid (see Note 17).

7. Elute the peptides by adding 600 μL of 80% acetonitrile in 1% formic acid. Collect the eluted peptides in a new Eppendorf tube by centrifugation at 10,000 × g for 2 min.

8. Dry the desalted peptides in a speed vac for further analyses by MS/MS.

3.7 Mass Spectrometry

1. Reconstitute the iTRAQ-labeled peptides in 20 μL of 2% acetonitrile in 0.1% formic acid.

2. Load the iTRAQ-labeled peptides (20 μL) onto a trap column (HiQ sil C-18W-3; 0.5 mm i.d. × 1 mm, 3 μm particle size) with solvent A using a DiNa-A system (KYA Tech., Tokyo, Japan).

3. For peptide separation, apply a linear gradient from 0 to 33% solvent B over 600 min, followed by another linear gradient from 33 to 100% solvent B over 10 min, and return to 0% solvent B in 15 min.

4. Load the peptides eluted from the HiQ sil C-18W-3 column directly onto a separation column (MonoCap C18 High Resolution 2000; 0.1 mm i.d. × 2000 mm). The separated peptides are then introduced into an LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific) at a flow rate of 300 nL/min and an ionization voltage of 1.7–2.5 kV. The LTQ Orbitrap XL mass spectrometer includes an octupole acting as a collision cell, which enables an alternative peptide fragmentation mode termed higher energy collision-induced dissociation (HCD).

5. Operate the liquid chromatography-MS/MS (LC-MS/MS) system using Xcalibur 2.0 software (Thermo Fisher Scientific). The mass range for the MS scan is set to 350–1600 m/z, and the top three peaks are subjected to MS/MS analysis. The full MS scan is detected in the Orbitrap, and the MS/MS scans are detected in the linear ion trap and the Orbitrap. The normalized collision energy for MS/MS is set to 35 eV for collision-induced dissociation (CID) and 45 eV for higher-energy C-trap dissociation (HCD). The resolution of the Fourier transform mass spectrometer (FTMS) is maintained at 60,000.

Divalent or trivalent ions are subjected to MS/MS analysis in dynamic exclusion mode. Because the peaks obtained from LC-MS are detected as divalent or trivalent ions, a mass difference of 1 Da appears as an m/z difference of only 0.5 or about 0.33. Such a small difference cannot be detected accurately in the LC-MS scan alone. Therefore, the peptides containing Asn/Gln are analyzed by MS/MS to distinguish between deamidated peptides and isomerized peptides.
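The effect of charge state on the observed spacing can be checked with a few lines of Python. The sketch below is illustrative only: the function names and the example neutral mass are hypothetical, and the proton mass (1.007276 Da) is the standard value.

```python
PROTON_MASS = 1.007276  # Da


def mz(neutral_mass_da, charge):
    """m/z of a peptide ion carrying `charge` protons."""
    return (neutral_mass_da + charge * PROTON_MASS) / charge


def mz_shift(mass_difference_da, charge):
    """Observed m/z difference produced by a given mass difference."""
    return mass_difference_da / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"z = {z}+: a 1 Da mass difference appears as {mz_shift(1.0, z):.3f} m/z")
    # Hypothetical peptide of 1500.0 Da and the same peptide 1 Da heavier, both 2+
    print(mz(1501.0, 2) - mz(1500.0, 2))
```

The higher the charge state, the closer the two species sit on the m/z axis, which is why the text relies on MS/MS rather than on the precursor scan to resolve them.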

Proteins are identified with Proteome Discoverer v. 1.4 software (Thermo Fisher Scientific) and the SEQUEST HT and MS Amanda [18] search tools, using the UniProt (http://www.uniprot.org/) O. sativa subsp. japonica database (63,535 proteins) with the following parameters: enzyme, trypsin; maximum missed cleavage sites, 2; peptide charge, 2+ or 3+; MS tolerance, 5 ppm; MS/MS tolerance, ±0.5 Da; dynamic modifications, carboxymethylation (C), oxidation (H, M, W), iTRAQ 4-plex (K, Y, N-terminus). It has been suggested that a higher proportion of the proteome can be quantified by using multiple search engines [19]. The false discovery rate must be <1%.
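The precursor tolerance above is relative (ppm) whereas the fragment tolerance is absolute (Da). The Python sketch below converts a ppm tolerance into an absolute m/z window and tests whether an observed value falls inside it. The function names and example values are hypothetical; this illustrates the unit conversion only and is not how the search engines are implemented.

```python
def ppm_to_da(mz_value, ppm):
    """Absolute width (in m/z units) of a ppm tolerance at a given m/z."""
    return mz_value * ppm * 1e-6


def within_ppm(observed_mz, theoretical_mz, ppm=5.0):
    """True if the observed m/z lies within +/- ppm of the theoretical m/z."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, ppm)


if __name__ == "__main__":
    # At m/z 1000, a 5 ppm tolerance corresponds to 0.005 m/z units
    print(ppm_to_da(1000.0, 5.0))
    print(within_ppm(1000.004, 1000.0))  # inside the 5 ppm window
    print(within_ppm(1000.010, 1000.0))  # outside the 5 ppm window
```

Across the 350–1600 m/z scan range, the 5 ppm precursor window therefore spans only about 0.002 to 0.008 m/z units, far narrower than the ±0.5 Da allowed for fragments.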

4 Notes

1. Seed samples can be stored in a dry, cool room at 4–10 °C until they are used.

2. Prepare fresh immediately before use.

3. This step can be omitted if seed samples are limited.

4. When the rice flour becomes a rice cake, add 0.8 mL of water (two volumes of the extraction buffer) (Fig. 1).

5. This step should be carried out on ice.

6. Do not heat above 37 °C, since urea degradation products might modify amino acid residues of proteins.

7. For effective trypsin digestion, dilute each protein preparation so that the final concentrations of the detergents and other reagents do not inhibit trypsin activity.

8. The total volume of the digestion mixture must be less than 300 μL. If it is higher, lyophilize and reconstitute with 300 μL of dissolution buffer.

9. Allow iTRAQ vials to warm up first.

10. Each iTRAQ vial may be provided in different volumes by the vendor, so the final volumes may not be the same for all the tags.

11. If the pH of the peptide/iTRAQ mixture is less than 7.5, the labeling efficiency is significantly reduced. For optimal labeling efficiency, the pH must be between 7.5 and 8.5.

12. Labeling with the 4-plex iTRAQ reagent takes 1 h, whereas the 8-plex reagent requires a reaction time of 2 h.

13. Samples should be loaded at a relatively slow flow rate to maximize the binding of the peptides to the column.

14. Test the pH of the sample by placing 0.5 μL of the solution onto pH paper. If the pH is not between 2.3 and 3.3, adjust by adding more Cation Exchange Buffer-Load.

15. The eluted solution should be collected in a new tube to avoid unforeseen trouble.

Fig. 1 Photograph of starchy grain aspects during the steps of protein extraction process

16. The flow-through loaded for the second time should be collected in a fresh tube to check the recovery rate of the peptides.

17. The washed fractions should be collected in a new Eppendorf tube. In the case of the sample eluted with Clean Buffer (1.0 M), 2 mL of 0.1% formic acid should be added.

Acknowledgments

This research was supported by KAKENHI Grants-in-Aid for Scientific Research (A) (15H02486) from the Japan Society for the Promotion of Science, the Strategic International Collaborative Research Program of the Japan Science and Technology Agency (JST SICORP), and a Grant for Promotion of KAAB Projects (Niigata University) from the Ministry of Education, Culture, Sports, Science, and Technology, Japan.

References

1. Tsutsui K, Kaneko K, Hanashiro I et al (2013) Characteristics of opaque and translucent parts of high temperature stressed grains of rice. J Appl Glycosci 60:61–67

2. Wakasa Y, Yasuda H, Oono Y et al (2011) Expression of ER quality control-related genes in response to changes in BiP1 levels in developing rice endosperm. Plant J 65:675–689

3. Lin CJ, Li CY, Lin SK et al (2010) Influence of high temperature during grain filling on the accumulation of storage proteins and grain quality in rice (Oryza sativa L.). J Agric Food Chem 58:10545–11055

4. Lin SK, Chang MC, Tsai YG et al (2005) Proteomic analysis of the expression of proteins related to rice quality during caryopsis development and the effect of high temperature on expression. Proteomics 5:2140–2156

5. Ralhan R, DeSouza LV, Matta A et al (2008) Discovery and verification of head-and-neck cancer biomarkers by differential protein expression analysis using iTRAQ labeling, multidimensional liquid chromatography, and tandem mass spectrometry. Mol Cell Proteomics 7:1162–1173

6. Kaneko K, Sasaki M, Kuribayashi N et al (2016) Proteomic and glycomic characterization of rice chalky grains produced under moderate and high-temperature conditions in field system. Rice 9:26

7. Wang SZ, Chen WY, Xiao WF et al (2015)Differential proteomic analysis using iTRAQ

reveals alterations in hull development in rice(Oryza sativa L.). PLoS One 10:10 e0133696

8. Wang ZQ, Xu XY, Gong QQ et al (2014) Root proteome of rice studied by iTRAQ provides integrated insight into aluminum stress tolerance mechanisms in plants. J Proteome 98:189–205

9. Fu Y, Zhang H, Mandal SN et al (2016) Quantitative proteomics reveals the central changes of wheat in response to powdery mildew. J Proteome 130:108–119

10. Kang GZ, Li GZ, Wang LN et al (2014) Hg-responsive proteins identified in wheat seedlings using iTRAQ analysis and the role of ABA in Hg stress. J Proteome Res 14:249–267

11. Alvarez S, Choudhury SR, Pandey S (2014) Comparative quantitative proteomics analysis of the ABA response of roots of drought-sensitive and drought-tolerant wheat varieties identifies proteomic signatures of drought adaptability. J Proteome Res 13:1688–1701

12. Ge P, Hao PC, Cao M et al (2013) iTRAQ-based quantitative proteomic analysis reveals new metabolic pathways of wheat seedling growth under hydrogen peroxide stress. Proteomics 13:3046–3058

13. Ford KL, Cassin A, Bacic A (2011) Quantitative proteomic analysis of wheat cultivars with differing drought stress tolerance. Front Plant Sci 2:44

14. Liu J, Pang CY, Wei HL et al (2015) iTRAQ-facilitated proteomic profiling of anthers from a photosensitive male sterile mutant and wild type cotton (Gossypium hirsutum L.). J Proteome 126:68–81

15. Cui Y, Yang MM, Dong J et al (2017) iTRAQ-based quantitative proteome characterization of wheat grains during filling stages. J Integr Agric 16:20156–22167

16. Yang MM, Yang J, Dong WC et al (2016) Characterization of proteins involved in early stage of wheat grain development by iTRAQ. J Proteome 136:157–166

17. Ma CY, Zhou JW, Chen GX et al (2014) iTRAQ-based quantitative proteome and phosphoprotein characterization reveals the central metabolism changes involved in wheat grain development. BMC Genomics 15:1029

18. Dorfer V, Pichler P, Stranzl T et al (2014) A universal identification algorithm optimized for high accuracy tandem mass spectra. J Proteome Res 13:3679–3684

19. Elias JE, Haas W, Faherty BK et al (2005) Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat Methods 2:667–675

INDEX

A

Algae ...............................................................82, 197–210

Allele specific proteins......................................4, 297–305

Allelic variance...........................................................4, 158

Apoplast .................................................4, 80, 83, 86, 107

Apoplastic fluid..........................................................79–86

Arabidopsis thaliana .......................................79–86, 170,

179, 187, 225, 233, 236, 239, 242, 260, 274,

310, 326, 367, 372, 374, 381, 385

AtPrx47 ...............................................327–330, 333, 337

AtPrx64 ...............................................327–329, 333, 337

B

Bioinformatics ............................................ 2, 3, 7, 12, 58,

141, 325, 370, 375

Biomarkers....................................5, 22, 25, 35, 135, 406

Biotinylated cystatins ........................................... 355–359

Bottom-up................................................. 5, 57, 157, 242

C

Cell walls.................................................79, 86, 179, 373,

382, 383, 406

Chalkiness ......................................................................406

Chlamydomonas reinhardtii...........................12, 197, 200

Chloroplasts......................................................... 4, 69–78,

119, 207, 242

Cocoa pod ............................................................ 133–145

Co-immunoprecipitation (co-IP)........................ 273, 290

Combinatorial peptide ligand library..............4, 381–401

Confidence parameters .................................................166

Custom protein databases ........................................59, 65

Cys proteases ...................................................6, 354, 359,

361–363, 365

Cystatin activity-based protease profiling ........... 353–365

Cytoscape..............................................23, 25, 29–33, 35,

38–41, 43–48, 55, 375

D

Data acquisition ..................................... 5, 140, 169–177,

214, 217, 221, 266, 345

Databases .................................................... 3, 22, 58, 102,

112, 142, 158, 184, 207, 214, 229, 253, 266,

294, 300, 310, 326, 360, 368, 411

Data dependent acquisition (DDA)....................... 5, 127,

141, 170–172, 174–176, 192, 214–223, 232

Data Integration Analysis for Biomarker discovery using a

Latent component method for Omics studies

(DIABLO) .................................22, 25, 35–38, 52

Data validation ..........................................................2, 7–8

Detergent-resistant membrane (DRM)................. 90, 91,

95–98, 101, 103, 104

Dimethyl labeling................................................ 133–145,

183, 242, 243, 245, 248–249

E

Electron transfer dissociation (ETD)................. 193, 194,

227, 229, 232, 233, 236–238

Endoplasmic reticulum (ER).............................. 117–129,

225, 325

Experimental design............................143, 148, 199, 215

F

Forest species.................................................................158

14-3-3 proteins ........................................... 275, 289, 290

Fragmentation .................................................12, 89, 127,

134, 163, 185, 192, 194, 199, 224–239, 253,

266, 411

Functional proteomics ..............................................6, 354

Fungal disease................................................................136

G

Genomes....................................................... 3, 50, 53, 57,

121, 142, 257, 289, 298, 309, 367, 381

GFP-trap............................................................... 257–269

Glycoproteomics ...........................................................227

H

Higher-energy collisional dissociation .........................411

Holm oak............................................. 8, 57–67, 157–167

Homeolog .....................................................................298

Horseradish peroxidase (HRP) .......................... 120, 124,

261, 265, 327, 328, 333, 337

Hydrophilic interaction liquid chromatography

(HILIC) ................................................... 227, 228,

230, 231, 238, 243–246, 249

Hydrophobic proteins ...................................90, 390–391

Hypothetical structure ......................................... 325–338

Jesus V. Jorrin-Novo et al. (eds.), Plant Proteomics: Methods and Protocols, Methods in Molecular Biology, vol. 2139, https://doi.org/10.1007/978-1-0716-0528-8, © Springer Science+Business Media, LLC, part of Springer Nature 2020

I

Identification ....................................................... 3, 23, 57,

69, 80, 89, 109, 127, 133, 148, 158, 170, 183,

207, 215, 226, 244, 257, 274, 290, 297, 310,

325, 359, 369, 386

Immunoaffinity ...............................................5, 260, 261,

276, 281, 286

Immunoaffinity purification ................................... 5, 260,

276, 281, 286

Immunoprecipitation (IP) .................................. 241–255,

259, 280–283, 290–293, 295

In gel digestion............................................90–93, 97–99,

102, 291, 293, 294

Inhibitors ............................................................. 6, 74, 77,

81, 82, 119, 128, 150, 151, 180, 188, 189, 198,

201, 209, 228, 260, 276, 281, 285, 292,

353–365, 386, 387, 400

In silico analysis ....................................... 6, 325–338, 368

In solution digestion..........................................90, 93, 98,

102, 103, 110, 111, 113, 273, 277, 283, 284, 395

Interaction networks.................................. 5, 21–55, 257,

259, 267, 375

Interactome ....................................................................... 4

In vivo cross-linking............................................. 273–286

Isobaric tags for relative and absolute quantification

(iTRAQ)...............................................3, 118, 134,

148, 183, 405–413

Isolation.......................................................11–20, 70–76,

108, 117–129, 163, 173, 193, 194, 206, 218,

221, 223, 253

Isotopic variants ...........................................133–135, 144

L

Label-free.........................................................3, 5, 6, 118,

119, 125, 148, 183, 197–210, 215, 257–269,

300–302, 310

Label-free quantification (LFQ)............................. 5, 148,

197–210, 215, 257–269, 301

Ligand binding-sites .....................................................326

Lipids ....................................................7, 12–16, 89, 107,

147, 179, 268, 349, 373, 383, 387, 399

Liquid chromatography coupled to tandem mass

spectrometry (LC MS/MS) ................. 5, 89–105,

112, 125, 137, 139, 144, 147, 149, 160,

169–174, 176, 199, 200, 202, 206, 229, 232,

238, 252–253, 260, 262, 266, 284, 290, 291,

294, 295, 310, 317, 354, 356, 357, 360, 397, 411

Low-abundance protein......................180, 310, 381–401

Lysine acetylation........................................ 148, 242, 253

M

Mascot ..................................................57, 121, 127, 128,

142–144, 193, 202, 207, 317–319, 360

Mass spectrometry (MS)..................................... 5, 69, 90,

108, 118, 133, 147, 157, 169, 189, 204, 226,

242, 259, 273, 310, 341, 354, 382, 405

Mass spectrometry imaging (MSI) ..................... 341–350

Matrix-assisted laser desorption/ionization

(MALDI) .................................................. 341–350

MaxQuant .................................109, 112, 113, 214–219,

221, 250, 253, 259, 260, 266, 267

Medicago truncatula ............................................ 341–350

Membrane trafficking................................................80, 90

Metabolic pathways..........................................8, 367–379

Metabolites ............................................. 7, 12–16, 22–24,

30, 32–37, 179, 241, 310, 341, 367–371, 373,

375, 376, 378, 379, 383

Metabolomics ................................. 23, 30, 349, 368–370

Microalgae ....................................................7, 11–20, 197

Microdomain...........................................................89–105

Moniliophthora roreri ....................................................136

Multiple co-inertia analysis (MCIA) .............................. 22

N

Nano-LC-MS/MS.........................................89–105, 125

N-linked glycans................................................... 225, 226

Non-model species........................................................158

Nucleus ....................................................70, 71, 124, 242

O

Offline fractionation............................................. 241–255

Orbitrap ...........................................................3, 109, 112,

127, 144, 163, 172, 174–177, 192–194, 202,

206, 213–216, 219, 229, 266, 269, 292, 294,

345, 408, 410, 411

Orphan plant species........................................8, 157–167

P

Parallel reaction monitoring (PRM) ...................... 5, 170,

213–224

Partial least squares (PLS) ...........................22, 25, 30–33

Partial least square-discriminant analysis

(PLS-DA).......................................................22, 35

Peptides ........................................................ 4, 65, 86, 89,

108, 121, 133, 147, 158, 169, 182, 198, 213,

230, 242, 260, 274, 291, 299, 310, 325, 342,

353, 382, 409

Peroxidases class III ............................................. 325–338

Phenol protein extraction.................................... 314, 316

Phosphopeptides ........................................ 148, 149, 151,

153, 155, 180, 181, 189, 191–193, 198–201,

205, 206, 208, 213–224

Phosphoproteome...................................... 181, 183, 184,

198, 199, 202, 242, 385

Phosphoproteomics ................................... 148, 179–194,

197–210

Phosphorylation ..................................... 7, 135, 147–155,

180, 184, 198, 199, 207, 215, 223, 224, 241

Pigments ........................................................7, 12–16, 19,

147, 383, 384, 387

Pinus ................................................................................ 58

Plasma membrane ...........................................79, 89–105,

107–114, 119, 327

Pollen .......................................................... 274, 275, 279,

285, 286, 383, 398

Polyacrylamide gel electrophoresis (PAGE) .................97,

158, 259, 261, 357, 358, 398

Polyethylene glycol (PEG) ........................ 109, 312–314,

316, 383

Polyploidy......................................................................298

Post-translational modification (PTM)................ 4–6, 54,

58, 70, 135, 147–149, 179, 180, 194, 198, 213,

225, 226, 241, 259, 298, 325–338, 405

Principal components analysis (PCA) ......................21, 22

Protease inhibitors .............................................. 6, 74, 77,

81, 82, 119, 128, 188, 201, 209, 260, 276, 281,

285, 353–365, 386

Proteases ....................................................... 6, 16, 74, 77,

81, 82, 92, 99, 119, 128, 134, 179, 180, 188,

201, 204, 207, 209, 253, 260, 274, 276, 281,

285, 353–365, 382, 383, 386, 387, 400

Protein networks .......................................................42, 49

Protein-protein interaction...............................25, 28, 37,

40, 49, 70, 225, 259, 267, 268, 273, 274, 289

Proteogenomics ...............................................6, 309–323

Purification ................................................. 3, 5, 6, 16–19,

69–78, 81, 83–84, 90–91, 93–96, 100–101, 183,

202, 205, 210, 259, 273, 276, 281, 284, 286,

356, 357, 364

Q

Quantitative proteomics .......................... 6, 80, 133–145,

147–149, 169–177, 405, 406

Quercus ilex.......................................... 8, 57–67, 157–167

R

Rice (Oryza sativa) .......................................57, 107–114,

118, 225, 309, 382, 405–413

RNA-seq analysis..........................24, 26, 49, 57, 60, 298

Root nodules ........................................................ 341–350

S

Secretion ................................................................. 80, 117

Seed.............................................158, 161, 165, 260, 290

Sequence database.........................................................360

SEQUEST .......................................................57, 65, 160,

163, 294, 295, 411

Shotgun .......................................................................3, 25

Signaling .................................................70, 71, 107, 108,

111, 180, 198, 241, 289, 373, 382, 399

Single amino acid polymorphisms (SAAP).................298,

302, 304

Skyline......................................... 171–174, 214, 216–222

Sodium dodecyl sulfate polyacrylamide gel electrophoresis

(SDS-PAGE)..................................... 77, 110, 111,

123, 161, 190, 263–264, 273, 275, 279–282,

291, 293, 314, 355, 359, 383, 396, 401

Sparse partial least squares (sPLS)........30–34, 44, 45, 47

Stable-isotope labeling......................................... 133, 148

Subcellular ....................................................4, 69–78, 118

Substrate ....................................................... 80, 136, 223,

242, 354, 362, 367

Substrate channel analysis.................................... 333, 336

Substrates.............................................................. 325–338

Sweetpotato......................................................6, 309–323

SYPRO Ruby............................................... 291, 293, 295

T

Tandem mass tags (TMT) ..................134, 147–155, 183

Targeted data acquisition (TDA) ....................... 170–172,

174–176

Targeted quantification................................171, 213–224

Tertiary structure ....................................... 326, 328, 331,

332, 334, 335

TiO2-based phosphopeptide enrichment ........... 199, 202

Tomato ...................................5, 274, 289–295, 354, 362

Topologies ...................................... 47, 54, 326, 327, 330

Transcriptomics .................................. 6, 7, 23, 30, 49, 57–67, 198, 213, 310,

322, 368–370

Transmembrane domains ..............................90, 102, 107

Tropical fruits ....................................................... 179–194

Trypsin ...................................................... 89, 92, 93, 100,

108–111, 113, 114, 121, 125, 127, 134, 136,

138, 142, 150, 152, 160, 162, 172, 180, 189,

191, 200, 201, 204, 207, 228, 230, 233, 237,

243, 245, 248, 253, 254, 259, 260, 262, 265,

273, 277, 284, 286, 291, 293–295, 299, 355,

360, 395–397, 407, 409–412

Two-dimensional gel electrophoresis (2-DE) ....... 89, 90,

157, 382, 384

Two-phase partitioning ...................................91, 95, 104

U

Un-targeted quantification...........................................170
