PROTEOMICS OF DIATOMS: DISCOVERY OF POLYAMINE
MODIFICATIONS IN BIOSILICA-ASSOCIATED PROTEINS
DISSERTATION
submitted for the academic degree of
Doctor of Philosophy
(Ph.D.)
to the
Bereich Mathematik und Naturwissenschaften
of the Technische Universität Dresden
by
M.Sc. Alexander Milentyev
born on 12 February 1988 in Leninsk, Kazakhstan.
Submitted on 1 July 2018
The dissertation was prepared between 6 January 2014 and 6 January 2018
at the Max-Planck-Institut für molekulare Zellbiologie und Genetik.
SUMMARY
Diatoms are eukaryotic unicellular algae that employ highly specialized proteins called
silaffins for making nanopatterned silica-based cell walls. These proteins share little
or no homology across diatom species and are extensively post-translationally modified.
Apart from conventional modifications (e.g., phosphorylation and glycosylation), the
lysine residues of silaffins bear polyamine chains with highly heterogeneous molecular
structures. The latter appear to be specific to silicifying organisms and are therefore
hypothesized to play a key role in biosilica synthesis. However, polyamine modifications
of lysines, modified proteins, and modification sites remain poorly characterized.
To address these questions, we developed a method to quantify polyamines and iden-
tify sites of polyamine modifications in proteins from phylogenetically closely related,
yet morphologically distinct diatoms Thalassiosira pseudonana, T. oceanica, and Cyclotella
cryptica. We demonstrated that the overall pattern of polyamines followed the phyloge-
netic proximity across these diatom species and showed that polyamine modifications
occurred at consensus sites even in proteins showing no sequence similarity.
[Graphical abstract. Recoverable labels: Diatom Biosilica; Consensus sites; Modified proteins; Modified peptides; species T. oceanica, C. cryptica, and T. pseudonana; a percentage axis (0–50%) over QAC-derivatized lysine PTMs (m/z 161–427, carrying 1–4 QAC groups); an MS/MS spectrum (relative abundance vs. m/z, 100–1300) of the peptide D-V-T-E-M-S-M-A-K-A-G-K with annotated a-, b-, and y-ions; a sequence logo (bits) of the local context of modified lysines.]
ZUSAMMENFASSUNG
Diatoms are eukaryotic unicellular algae that produce highly specific proteins (so-called
silaffins) to build nanopatterned silica cell walls. These proteins show little or no
homology across diatom genera and are extensively post-translationally modified. In
contrast to conventional modifications (e.g., phosphorylation and glycosylation), lysine
residues of silaffins bear polyamine chains with very heterogeneous molecular structures.
These modifications are specific to diatoms and are therefore hypothesized to play a role
in biosilica synthesis. However, polyamine modifications of lysines, modified proteins,
and modified sites are barely characterized. To address this question, we developed a
method to quantify polyamines and to identify the positions of polyamine modifications in
proteins from closely related yet morphologically distinct diatoms (Thalassiosira
pseudonana, T. oceanica, and Cyclotella cryptica). We showed that the overall pattern of
polyamines follows the phylogenetic proximity of these diatom species and that these
polyamine modifications occurred at consensus sites even in proteins showing no sequence
similarity.
[Graphical abstract with German labels: Diatomeen-Biosilica (diatom biosilica), Konsensusstellen (consensus sites), Modifizierte Proteine (modified proteins), Modifizierte Peptide (modified peptides), Lysin-Modifikationen (lysine modifications).]
CONTENTS
Summary
Zusammenfassung
List of figures
List of tables
Abbreviations
1 Introduction
1.1 Diatoms
1.2 Diatom biosilica
1.2.1 Biosilicification in nature
1.2.2 Diatom biosilica structure and cell cycle
1.2.3 The cell biology of biosilica morphogenesis
1.3 The role of polyamine PTMs in diatom biosilicification
1.3.1 Identifying biomolecules associated with diatom biosilica
1.3.2 PTM complexity of biosilica-associated proteins
1.3.3 Lysine ε-polyamine PTMs in biosilica-associated proteins
1.4 Mass spectrometry in PTM discovery
1.4.1 Modification-specific proteomics
1.4.2 Analysis of polyamine-modified lysines by MS
1.4.3 Fractionation of proteins and peptides prior to MS
1.4.4 MS/MS analysis in modification-specific proteomics
1.4.5 Bioinformatics tools for modification-specific proteomics
1.5 Rationale of the thesis
2 Aim of the thesis
3 Results and discussion
3.1 A method for analysis of ε-polyamine PTMs
3.1.1 Establishing a method to analyse ε-polyamines
3.1.2 Method applicability for lysine PTM profiling
3.1.3 Profiling of lysine PTMs in silaffin-3
3.2 Profiling lysine PTMs in biosilica extracts
3.2.1 Lysine PTM profile and characteristic fragments
3.2.2 Elucidation of phosphopolyamine structures
3.2.3 Lysine PTM profiles of AFSM extracts
3.2.4 Comparison of AFIM and AFSM profiles in T. pseudonana
3.2.5 Phylogenetic relationship across three diatom species
3.3 PTM localization and discovery of consensus motifs
3.3.1 Multiple protease strategy for mapping lysine PTMs
3.3.2 Selection of deprotection technique
3.3.3 Mapping lysine PTMs on tpSil3 using iterative search strategy
3.3.4 Deconvolution of raw MS/MS spectra
3.3.5 PTM mapping by polyamine-specific fragments
3.3.6 Identification of consensus motifs harboring lysine PTMs
4 Conclusions and outlook
5 Materials and methods
5.1 Synthesis of polyamine standards
5.2 Isolation of biosilica-associated proteins
5.3 Expression of tpSil3 from synthetic gene
5.4 HCl hydrolysis
5.5 AQC-derivatization of amino acids and polyamines
5.6 LC-MS/MS analysis of QAC-derivatives
5.7 Amino acid measurement using UV-detection
5.8 Direct infusion MS/MS analysis
5.9 Acetylation of phosphopolyamines
5.10 31P NMR measurements
5.11 Deglycosylation with TFMS
5.12 Treatment with HF·pyridine soluble complex
5.13 Anhydrous HF-treatment
5.14 Protein analysis by GeLC-MS/MS
5.15 Proteomics data processing
A Appendix
A.1 Analytical data for synthetic standards
A.2 XICs of QAC-derivatives
B Bibliography
Acknowledgments
Publications
Declaration / Erklärung
LIST OF FIGURES
Figure 1.1 Images of the cell walls of 5 diatoms
Figure 1.2 Structure of the diatom frustule and cell cycle
Figure 1.3 Hypothetical mechanism for catalysis of silicic acid condensation
Figure 1.4 PTM complexity of silaffins
Figure 1.5 Chemical structures of modified lysine residues
Figure 1.6 Proteomics approaches
Figure 1.7 AQC derivatization chemistry
Figure 1.8 Schematic view of the LTQ Orbitrap Velos
Figure 1.9 Principles of DDA and peptide fragmentation
Figure 1.9 Phylogenetic tree and SEM images of three diatoms
Figure 3.1 Generic structure of lysine PTMs
Figure 3.2 Calibration curves
Figure 3.3 MS spectrum of acidic hydrolysate of tpSil3
Figure 3.4 AA content and lysine PTMs profile of tpSil3
Figure 3.5 Schematic diagram of ε-polyamine fragmentation
Figure 3.6 HCD MS/MS of synthetic standards
Figure 3.7 Fragment spectra of isomeric lysine derivatives PTM 303
Figure 3.8 Phosphopolyamine MS/MS spectra
Figure 3.9 31P-NMR spectrum of T. pseudonana biosilica hydrolysate
Figure 3.10 Full lysine PTM profiles of AFSM biosilica extracts
Figure 3.11 Venn diagram and phylogenetic tree
Figure 3.12 Hypothetical routes for lysine modifications
Figure 3.13 Coverage for: (a) native tpSil3; (b) tpSil3 expressed in E. coli
Figure 3.14 Peptide coverage obtained for tpSil3 treated with: TFMS, HF·pyridine complex, anhydrous HF
Figure 3.15 Gel images
Figure 3.16 Silaffin mapping
Figure 3.17 Deconvolution of MS/MS spectra
Figure 3.18 MS/MS spectra of modified peptides with characteristic ions
Figure 3.19 MS/MS spectra of modified peptides with characteristic ions
Figure 3.20 Silaffin mapping
Figure 3.21 Graphical representations of the local protein contexts of modified lysines
Figure 3.22 Sequence logos of local protein contexts of PTM sites separately for each diatom species
Figure 3.23 Local protein contexts of modified lysines in KXXK motifs
Figure 4.1 Mapped PTMs
Figure 5.1 Chemical structures and synthesis of internal standards
Figure 5.2 Sequence design of tpSil3 expressed from a synthetic gene
Figure A.1 Reactions of AQC, which might occur in buffered aqueous solutions and/or during storage
Figure A.2 Calibration curves for amino acids
Figure A.3 Number of amino acid residues. Experimental and theoretical amino acid content of tpSil3
Figure A.4 Sequences of biosilica-associated proteins
Figure A.5 XICs of phosphopolyamines
Figure A.6 Full lysine PTM profiles of AFSM biosilica extracts
Figure A.7 XICs of QAC-derivatized lysine derivatives
Figure A.8 Fragment spectra of ornithine derivative PTM 275-orn (internal standard)
Figure A.9 Fragment spectra of lysine derivative m/z 161
Figure A.10 Fragment spectra of lysine derivative m/z 175
Figure A.11 Fragment spectra of lysine derivative m/z 189
Figure A.12 Fragment spectra of lysine derivative m/z 232
Figure A.13 Fragment spectra of lysine derivative m/z 275
Figure A.14 Fragment spectra of lysine derivative m/z 289
Figure A.15 Fragment spectra of lysine derivative m/z 317a
Figure A.16 Fragment spectra of lysine derivative m/z 317b
Figure A.17 Fragment spectra of lysine derivative m/z 331a
Figure A.18 Fragment spectra of lysine derivative m/z 331b
Figure A.19 Fragment spectra of lysine derivative m/z 205
Figure A.20 Fragment spectra of lysine derivative m/z 319
Figure A.21 Fragment spectra of lysine derivative m/z 333
Figure A.22 Fragment spectra of lysine derivative m/z 347
Figure A.23 Fragment spectra of lysine derivative m/z 399
Figure A.24 Fragment spectra of lysine derivative m/z 413
Figure A.25 Fragment spectra of lysine derivative m/z 427
LIST OF TABLES
Table 1.1 Overview of silaffin PTMs
Table 3.1 Calculated m/z values for ε-polyaminated lysines
Table 3.2 Catalogue of lysine polyamine modifications and their characteristic fragments
Table 3.3 Tabular representation of the data from Fig. 3.10
Table 5.1 (a) Chemicals and reagents; (b) materials; (c) instrumentation
Table 5.2 HPLC gradient used for the analysis of QAC-derivatives
Table 5.3 Cleavage specificity of the proteases used in the thesis
Table 5.4 HPLC gradient used for the analysis of peptides
Table 5.5 Mascot search parameters
Table A.1 Calculated N×QAC-derivatization groups for ε-polyamines
Table A.2 Sequences of identified post-translationally modified proteins
Table A.3 Contingency tables for Fisher’s exact test
ABBREVIATIONS
AAA amino acid analysis
ACN acetonitrile
AGC automatic gain control
AFIM ammonium fluoride insoluble material
AFSM ammonium fluoride soluble material
AIF all-ion fragmentation
AMQ 6-aminoquinoline
AQC 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate
BAP biosilica-associated protein
BLAST Basic Local Alignment Search Tool
BSA bovine serum albumin
CID collision-induced dissociation
CIAP calf intestinal alkaline phosphatase
CTC chlorotrityl chloride
DAD diode array detector
DBU 1,8-diazabicyclo[5.4.0]undec-7-ene
DCM dichloromethane
DIAD diisopropyl azodicarboxylate
DIPEA N,N-diisopropylethylamine
DMF dimethylformamide
DTT dithiothreitol
DDA data-dependent acquisition
DNA deoxyribonucleic acid
cDNA complementary DNA
DMSO dimethyl sulfoxide
EDTA ethylenediamine tetraacetate
ER endoplasmic reticulum
ESI electrospray ionization
ESAW enriched artificial seawater
ETD electron-transfer dissociation
ECD electron-capture dissociation
FA formic acid
FDR false discovery rate
FT MS Fourier transform mass spectrometry
FWHM full width at half maximum
GeLC-MS/MS gel electrophoresis liquid chromatography tandem mass spectrometry
GFP green fluorescent protein
GO gene ontology
HCD higher-energy collisional dissociation
HILIC hydrophilic interaction chromatography
HPLC high-performance liquid chromatography
HRMS high resolution mass spectrometry
HSQC N-hydroxysuccinimidyl 6-quinolinyl carbamate
HSQC heteronuclear single quantum coherence spectroscopy
IM immonium ion
IPTG isopropyl β-d-1-thiogalactopyranoside
IT ion trap
IAA iodoacetamide
LCPA long-chain polyamine
LC liquid chromatography
LC-MS/MS liquid chromatography coupled with tandem mass spectrometry
LTQ Linear Trap Quadrupole
MRM multiple reaction monitoring
MS mass spectrometry
MS1 full scan
MS2 MS/MS scan
MS/MS tandem mass spectrometry
MW molecular weight
MWCO molecular weight cut-off
NHS N-hydroxysuccinimide
NMR nuclear magnetic resonance
nCE normalized collision energy
PBS phosphate-buffered saline
PCR polymerase chain reaction
PMSF phenylmethylsulfonyl fluoride
PSM peptide-spectrum match
PST peptide sequence tag
PTM post-translational modification
QAC 6-quinolinylaminocarbonyl
RPLC reversed-phase liquid chromatography
RT room temperature
SDS sodium dodecyl sulfate
SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis
SDV silica deposition vesicle
SEM scanning electron microscope
SFLP silaffin-like protein
SIT silicic acid transporter protein
natSil1A silaffin-1A from C. fusiformis
natSil2 silaffin-2 from C. fusiformis
tpSil1/2 silaffin-1/2 from T. pseudonana
tpSil3 silaffin-3 from T. pseudonana
tpSil4 silaffin-4 from T. pseudonana
TBAI tetrabutylammonium iodide
TEA triethylamine
THF tetrahydrofuran
TFMS trifluoromethanesulfonic acid
TFA trifluoroacetic acid
TIC total ion chromatogram
TOCSY total correlation spectroscopy (two-dimensional nuclear magnetic resonance spectroscopy)
UPLC ultra performance liquid chromatography
UV ultraviolet
XIC extracted-ion chromatogram
1 INTRODUCTION
1.1 Diatoms
Diatoms are unicellular, eukaryotic, photosynthetic algae that produce micro- and
nano-scale silicified cell walls [1]. Diatoms occur in almost every aquatic and moist
environment on Earth, inhabiting not only oceans, seas, lakes, and streams, but also
soil and wetlands. These organisms have enormous biogeochemical and ecological
importance, since they are responsible for around one-fifth of the world's net primary
production [2–4]. Their ocean-wide dominance is reflected by large marine sea-floor
sediments of silica, including diatomaceous earth and most cherts, and by a considerable
fraction of current fossil fuel reserves [5–8]. According to the dating record of these
fossils, diatoms emerged relatively recently in geological time (about 180 mya) [8–10].
Since then, their diversity has exploded into ~250 living diatom genera, with more than
200 000 species estimated to exist at present, although only half of them have been
described and classified by their unique morphologies [11] (see Fig. 1.1).
The ecological and evolutionary importance of diatoms has motivated researchers to
analyse their genomes [12–15]. These sequencing projects shed light on the unusual
evolutionary history of diatoms. Whole-genome comparison has revealed a remarkably rapid
and wide evolutionary divergence between Thalassiosira pseudonana and Phaeodactylum
tricornutum, comparable to that between fishes and mammals [16]. More recent sequencing
studies revealed that diatom genomes are highly chimeric and contain multiple genes
acquired through horizontal transfer events [16]. Diatoms provide an intriguing example
of combining genes from different sources, which contributes to many unusual
physiological features that are believed to underlie their evolutionary and ecological
success. For instance, diatoms possess an ornithine-urea cycle, which is similar to that
of animals but is absent in other plants [17, 18]. This metabolic coupling seems to be
fundamental for diatom physiology, because it affects the precursors of long-chain
polyamines (LCPAs), which are thought to be directly involved in the formation of
ornately patterned biosilica cell walls, the most conspicuous and spectacular feature
of these organisms.
It has been argued that the evolutionary success and incredible variety of diatoms are
largely due to their ability to build silicified cell walls, which may serve as armour
against phytoplankton predators [20–22] and are energy-efficient to produce compared
with equivalent organic structures [23]. How these evolutionary trade-offs relate to the
evolutionary success and morphological diversity of diatoms is currently under debate
[24]. It is clear, however, that the advanced mechanical properties and robust structure
of biosilica are relevant to their adaptability, which fits well in the context of
diatom evolutionary efficiency. Indeed, when designing buildings and aircraft, architects
and engineers have applied the same structural principles that diatoms use to create
their minute shells. Nowadays, biosilica structures attract increasing attention from a
broad array of researchers, ranging from fundamental biologists to applied materials
scientists [25, 26]. The present study focuses on the fundamental mechanistic basis of
biosilicification processes.
Figure 1.1 Images of the cell walls of five different diatom species: (a) the circular shape of the radial centric diatom Actinoptychus senarius; (b) the rhomboid shape of the polar centric diatom Biddulphia antediluviana; (c) the rhomboid shape of the pennate diatom Pleurosigma angulatum; (d) the ovoid shape of the pennate diatom Surirella fastuosa; (e) the triangular shape of the polar centric diatom Triceratium favum (images taken from [19]).
1.2 Diatom biosilica
1.2.1 Biosilicification in nature
During evolution, many organisms (e.g., diatoms, sponges, radiolaria) have acquired
the ability to build specifically structured, silica-based exo- or endoskeletons using
silicon (Si), the second most abundant element in the Earth’s crust [27, 28]. These
intricately shaped biomineral structures are produced via biosilicification, the process
by which inorganic silicon is incorporated into living organisms as biosilica (i.e.,
‘biogenic silica’). Interestingly, diatoms produce these structures under benign ambient
and physiological conditions (4 to 40 °C, atmospheric pressure), and silica formation in
diatoms is around 10^6 times faster than the corresponding abiotic process [29]. In
contrast, industrial syntheses of silica in vitro are typically accomplished under
extreme temperature, pressure, and pH. Amorphous silica is a widespread biologically
produced inorganic material and, owing to its abundance and physical properties, is also
widely used as a basic raw material in semiconductors, glass, plastics, ceramics,
optical fibers, insulators, detergents, cosmetics, and chromatographic materials such as
resins. It is not surprising that the exquisite features of diatom biosilica have been
regarded as a paradigm for future silica nanotechnology [30–35], mainly owing to the
unique structural features of the biosilica cell wall, which are discussed in the next
section.
1.2.2 Diatom biosilica structure and cell cycle
The diatom silica-based cell wall, or frustule, ranges from 2 to 2000 µm in size and
shows three-dimensional morphologies on the micro- and nano-scale that are precisely
reproduced over generations. These hierarchical porous structures are characterized by
levels of symmetry and complexity far beyond the capabilities of the best technologies
available to date. Frustules display an incredible variety of shapes and forms across
different diatom species [36], which have attracted scientists with their extraordinary
beauty ever since the earliest microscopical observations [37].
Based on the shape and symmetry of their frustules, diatoms are traditionally divided
into two main groups: the centrics and the pennates (see Fig. 1.1, [1]). Centric diatoms
can be classified into two subgroups based on their type of symmetry: radial centrics
have a circular center of symmetry in the middle of the valve, while polar centrics have
bi- or multipolar valves with an elongated or distorted center of symmetry. In contrast
to centric diatoms, pennate species are bilaterally symmetrical, and their shells are
typically elongated parallel to the longitudinal axis of symmetry.
Typically, the diatom frustule consists of two almost identically structured overlapping
halves (thecae), hence the taxon name.¹ The slightly larger top half (epitheca) overlaps
the bottom one (hypotheca), allowing them to fit together much like a Petri dish and its
lid (see Fig. 1.2a). Each theca consists of a valve and several girdle bands that span
the boundary of a diatom cell. The terminal girdle bands in the overlapping region of
both thecae are termed pleural bands. Valves usually display lace-like patterns of
nanometre-scale pores, while girdle bands exhibit far less decorative diversity.
Diatoms primarily reproduce asexually through binary fission, where each new
daughter cell receives either the epi- or hypotheca from the parent cell (refer to Fig. 1.2b).
This form of division results in a size reduction of the daughter cell that receives the
smaller frustule from the parent and therefore the average cell size of a diatom popu-
lation decreases over time. To avoid the significant size reduction, diatoms are capable
1 the word ‘diatom’ originates from Greek diá-tom-os (= dichó-tom-os) meaning ‘cut in half’
6 introduction
(7)
(8) (1)
(2)
(3)
(4)(5)
(6)
new
girdle bands
new hypovalve
new hypovalve
valve
SDV
girdle
band SDV
epitheca
hypotheca
protoplast
pleural band
girdle bands
valve
plasma membrane
(a) Diatom cell wall structure
(b) Diatom cell cycle
Figure 1.2 Structure of the diatom frustule and cell cycle. (a) Diatom cell wall structure. Thecell wall is made up of two half shells, named the epitheca and hypotheca, which together fullyenclose the protoplast. Each theca consists of a valve and one or more girdle bands that runlaterally along the outline of the cell. The terminal girdle bands of each theca constitute theoverlap region of the cell wall in which the slightly larger epitheca overlaps the hypotheca. (b)diatom cell cycle: (1) cytokinesis and formation of a valve silica deposition vesicle (SDV) ineach daughter protoplast; (2) and (3) expansion of the SDV and formation of a new hypovalvewithin each SDV; (4) exocytosis of SDV contents; (5) separation of daughter cells; (6) formationof the first girdle band SDV; (7) consecutive formation and secretion of girdle bands; (8) DNAreduplication. Figure adapted from Kröger and Poulsen [33].
of sexual reproduction, in which meiotic cell divisions and gamete fusion result in the
formation of an auxospore with an augmented cell volume [38].
1.2.3 The cell biology of biosilica morphogenesis
Silicon is an essential nutrient for biosilica formation, while its limitation in diatom cul-
tures induces a cell cycle arrest. The formation of diatom biosilica takes place in special-
ized intracellular compartments termed SDVs. The SDV is considered to be a cellular
‘reaction vessel’ in which all the chemical steps of silica formation and patterning take
place. Although the immediate monomeric precursor for silica polycondensation inside
the SDV is unknown, orthosilicic acid (Si(OH)4), which occurs in natural habitats
in concentrations between 3 and 70 µM [39], represents the original source for cell wall
biogenesis [33]. Si(OH)4 is transported into the diatom cell by specific Na+-dependent
transporter proteins, termed silicic acid transporters (SITs) [40–44].
As indicated in Fig. 1.2b, valves are formed only during cell division, while girdle
bands are produced during interphase. Therefore, biogenesis of the diatom cell wall
requires two different types of SDVs that are present at different stages of the cell cycle.
During cell division, each sibling cell produces a valve SDV, which gradually grows as
more and more silica is deposited. When valve formation is complete, the SDV
fuses with the cell membrane, depositing the newly formed biosilica structure on the
cell surface. Immediately after cell cleavage, each daughter cell initiates the synthesis
of a new hypovalve. As the cell volume increases during interphase, the stepwise
formation of girdle bands takes place. When the cell volume has reached the required
size, a new round of the cell division cycle begins (different stages of the diatom cell cycle
are shown in Fig. 1.2b). Studies on other silicifying organisms have demonstrated that
SDVs are not a speciality of diatoms but rather appear to be the general organelles for
silica biogenesis [45].
However, despite being very common in nature, biosilicification remains a poorly
understood phenomenon. Species-specific diatom biosilica structures are precisely
reproduced over generations, presumably indicating that biosilica morphogenesis takes
place under strict morphogenetic control, which, in turn, implies the existence of
specialized proteins that guide the integration of silica precursors into protein-based
organic templates. It is currently hypothesized that SDVs contain protein-based organic
matrices that control silica formation, resulting in species-specifically nanopatterned
biosilica, an organic-inorganic composite material. Therefore, a better understanding
of the molecular mechanisms of biosilica morphogenesis should be achieved through
the identification and characterization of proteins that are intimately associated with
the cell wall and are directly involved in silica polycondensation. In recent decades,
significant insight into the molecular mechanism of silica biomineralization has been
obtained by structural and functional analysis of biomolecules that are involved in
diatom biomineralization [46–49]. Furthermore, genome sequencing has provided an
important resource for investigating the biosilica-forming machinery [12–16].
However, the search for biomolecules that determine biosilica patterns has turned out to
be extremely challenging.
1.3 The role of polyamine post-translational modifications in diatom biosilicification
The diatom frustule represents an inorganic-organic hybrid material that is mainly
composed of nanopatterned inorganic silica as well as various specific organic
macromolecules (proteins and/or peptides, polysaccharides, long-chain polyamines; for
review refer to [33, 46, 49, 50]). It is assumed that protein-based organic templates are
directly involved in the biosilicification process and that these organic components
are therefore embedded within biosilica structures [48–50]. Given that biosilica is usually
very robust and resistant to most chemical and physical treatments, it is very challenging to
extract molecules (and especially proteins) from these composites without degrading
or chemically modifying them. In pioneering works by Nakajima and Volcani in
the early 1970s, the uncommon amino acids 3,4-dihydroxyproline [51] and
ε-N,N,N-trimethyl-δ-hydroxylysine [52] were isolated from acidic hydrolysates of purified
cell walls. Since this first biochemical evidence for the presence of post-translationally
modified proteins in diatom biosilica, more components have been discovered, as extraction
methods have become more exhaustive and also less chemically aggressive.
1.3.1 Identifying biomolecules associated with diatom biosilica
The general biochemical approach to identify biosilica constituents is to separate the
intracellular organic material from biosilica, and then extract the biosilica-embedded
components from purified cell walls. This approach led to the identification of a number
of protein families (i. e., frustulins² [53, 54], pleuralins³ [55]) that are tightly associated
with the biosilica but were later demonstrated to be incorporated after SDV exocytosis;
therefore, none of these proteins is actively involved in silica biogenesis [56].
Long-chain polyamines (LCPAs), another class of biosilica-associated organic molecules,
were discovered upon complete dissolution of the diatom biosilica in liquid HF [57], a
treatment also known to cleave O-glycosidic and phosphate ester bonds, whereas peptide
bonds remain intact [58, 59]. After subjecting LCPAs to strong acidic hydrolysis,
their molecular masses remained unaffected, excluding the presence of peptide bonds
in their structures [57]. The ESI-MS study further indicated that long-chain polyamines
are linear chains of up to 20 repeated propyleneimine units [57], the longest polyamine
chains found in nature. It was later shown that each diatom species displays a
wide variety of LCPA structures, which differ in overall chain length, the degree of
N-methylation, and, unexpectedly, the site-specific incorporation of quaternary amines [31,
60, 61]. It was subsequently hypothesized that various biosilica patterns can be generated by
polyamines of different chain lengths and structures (for review see [62]). Mass
spectrometric and NMR spectroscopic analyses revealed the presence of LCPAs in other
silicifying organisms such as sponges [6], further corroborating their involvement in
biosilicification. All LCPAs identified to date have either a propylenediamine, putrescine,
or spermidine basis molecule to which linear oligo-propyleneimine chains are attached.
Furthermore, in vitro experiments have shown that polyamines of different chain lengths
induce rapid silica precipitation from a silicic acid solution [63], which is enhanced or made
species-specific by a synergistic action with highly specialized peptides and proteins,
as discussed below.
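The chain-length heterogeneity described above translates directly into a ladder of molecular masses. The following minimal sketch estimates the monoisotopic mass of a neutral, linear LCPA from the number of propyleneimine units and N-methyl groups. It assumes a putrescine basis molecule, a net gain of C3H7N per propyleneimine unit and of CH2 per N-methylation; the function names are hypothetical, and quaternary ammonium groups or other basis molecules are not modelled.

```python
# Monoisotopic atomic masses in Da (standard values).
ATOM_MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048}


def formula_mass(c: int, h: int, n: int) -> float:
    """Monoisotopic mass of a neutral molecule or increment with formula CcHhNn."""
    return c * ATOM_MASS["C"] + h * ATOM_MASS["H"] + n * ATOM_MASS["N"]


PUTRESCINE = formula_mass(4, 12, 2)     # C4H12N2, assumed basis molecule
PROPYLENEIMINE = formula_mass(3, 7, 1)  # net gain per -CH2-CH2-CH2-NH- unit
METHYL = formula_mass(1, 2, 0)          # net gain per N-methylation (H replaced by CH3)


def lcpa_mass(units: int, methyl_groups: int = 0, base: float = PUTRESCINE) -> float:
    """Mass of a linear polyamine: basis molecule + propyleneimine units + N-methyl groups."""
    if units < 0 or methyl_groups < 0:
        raise ValueError("units and methyl_groups must be non-negative")
    return base + units * PROPYLENEIMINE + methyl_groups * METHYL


if __name__ == "__main__":
    # Unmethylated vs. fully chain-methylated ladder for a few chain lengths.
    for n in (1, 5, 10, 20):
        print(n, round(lcpa_mass(n), 4), round(lcpa_mass(n, methyl_groups=n), 4))
```

As a sanity check of the increments, one propyleneimine unit on putrescine reproduces the mass of spermidine (C7H19N3).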
Upon full dissolution of the diatom silica with anhydrous HF, novel peptides termed
silaffins⁴ were extracted from the diatom C. fusiformis and characterized [64]. The first
2 from ‘frustule’, a diatom silicified cell wall
3 from ‘pleural band’, the overlap region of hypotheca and epitheca
4 from ‘silica affinity’
discovered peptides, silaffin-1A and silaffin-1B from C. fusiformis, were thoroughly
characterized in a follow-up study [65], which showed that these peptides are highly
post-translationally modified. To avoid the harsh treatment with anhydrous hydrogen
fluoride, the diatom biosilica was dissolved in an acidified ammonium fluoride solution
(NH4F/HCl, pH ~5.0) [66]. This method allows the extraction of silaffins in their
native state, keeping O-linked modifications intact. Furthermore, Poulsen and Kröger
employed the same approach to characterize silica-associated organic material from
T. pseudonana and identified three bands by SDS-PAGE corresponding to higher molecular
weight silaffin polypeptides, and a single band corresponding to LCPAs [67].
The first identified silaffins were subjected to N-terminal Edman sequencing, and
database sequence searches allowed the identification of silaffin-encoding genes [64,
67]. Altogether, one silaffin from C. fusiformis and four from T. pseudonana were
described and characterized [64–69]. Analysis of the protein sequences revealed the same
gene organization, namely the presence of a 22-amino acid signal peptide for
co-translational import into the endoplasmic reticulum (ER) [70], which is flanked by an
N-terminal RXL spacer (sequences and UniProt entries are listed in Fig. A.4a–A.4e). This
similarity suggests the operation of analogous processing pathways for this silica-associated
protein family. However, it was also found that silaffins do not share significant sequence
similarity, thus preventing the use of homology-based tools for the identification of
related proteins in diatom genome databases.
The lack of sequence conservation prompted a genome-based bioinformatics min-
ing of other putative biosilica-associated proteins. Scheffel et al. developed an amino
acid composition-based bioinformatics approach, which enabled the identification of
86 silaffin-like proteins (SFLPs) in the genome of the diatom T. pseudonana [71]. A
group of six W- or Y-rich proteins (listed in Fig. A.4i–A.4h) that exhibited highly
repetitive sequence structures with silaffin-like motifs (KXXK) were demonstrated by
GFP-tagging to be directly associated with the girdle band region of biosilica. These
proteins, hence called cingulins⁵, could not be purified from T. pseudonana cell walls
using established biosilica extraction approaches (see Section 1.3.1). Each cingulin
contains one RXL-containing domain, which starts (Fig. A.4i–A.4k) or ends (Fig. A.4f–A.4h)
with the tripeptide sequence RXL. This motif is also present in the precursors of
biosilica-associated diatom proteins, where it serves as the recognition site for
5 from ‘cingulum’, the girdle band region of a frustule
proteolytic cleavage at the C-terminus of the leucine residue. Nevertheless, no other
biosilica-associated proteins were identified in these functional genomic studies.
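The RXL recognition rule described above (cleavage at the C-terminus of the leucine in an R-X-L tripeptide) can be sketched as a simple sequence scan. This is only an illustration under the assumption that every RXL occurrence is cleaved; the function names and the toy sequence are hypothetical and do not correspond to any real silaffin or cingulin precursor.

```python
import re

# Zero-width lookahead so that overlapping R-X-L tripeptides are all reported.
RXL = re.compile(r"(?=R.L)")


def rxl_cleavage_sites(sequence: str) -> list[int]:
    """1-based positions of the Leu in each R-X-L; cleavage is assumed right after it."""
    return [match.start() + 3 for match in RXL.finditer(sequence)]


def cleave_after_rxl(sequence: str) -> list[str]:
    """Split the sequence after every R-X-L tripeptide (idealized, complete cleavage)."""
    bounds = [0] + rxl_cleavage_sites(sequence) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


if __name__ == "__main__":
    toy_precursor = "MKARSLAAKSGK"  # hypothetical sequence for illustration only
    print(rxl_cleavage_sites(toy_precursor))
    print(cleave_after_rxl(toy_precursor))
```

In practice, whether a given RXL site is actually processed would have to be confirmed experimentally, for example by N-terminal sequencing of the mature protein.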
Another protein component that is tightly associated with diatom biosilica, polypeptides
called silacidins⁶, was co-purified with silaffins after mild dissolution of T. pseudonana
biosilica [72]. These polypeptides were enriched in phosphorylated serine and acidic
amino acids (hence the name), and it was hypothesized that these highly acidic
low-molecular-weight peptides assist silaffins and LCPAs in silica precipitation [73]. Later,
several homologues of the gene encoding the silacidin protein in T. pseudonana were
found in different centric diatom species [74], which may suggest their involvement
in the biosilicification process. However, sequence conservation for silacidins appears to be
rather an exception among diatom biosilica-embedded proteins. Despite the presence of
multiple repetitive motifs (highlighted in Fig. A.4), silaffins appear to completely lack
α-helices and β-sheets and have a largely random coil structure, similar to natively
unfolded proteins [75].
The past decades of diatom research have provided significant insight into the molecular
components of the biosilica-forming machinery, particularly proteins and peptides that
may act both as structural templates and mechanistic catalysts for the silica
polycondensation reaction (for review refer to [46]). To explain the mechanism of silica
morphogenesis by both silaffins and LCPAs, several models have been proposed [76].
It was argued that only the polyamine moieties, but not the phosphate groups, are
directly involved in catalysis of silicic acid polycondensation [34]. The ammonium and
amino groups of the oligo-propylamine chains of silaffins and LCPAs are believed to act
as acid-base catalysts for the condensation of silicic acid [77]. The mechanisms of silica
formation in the silaffin-1A from C. fusiformis (natSil1A) and the LCPA/phosphate
systems appear to be very similar. In both cases, electrostatic interactions between
polyamine chains and phosphate groups lead to the formation of supramolecular
aggregates [66, 78]. These aggregates appear to be responsible for accelerating the
condensation of oligo-silicic acid molecules [78]. Fig. 1.3 shows the proposed mechanism
for catalysis of silicic acid condensation by oligo-propyleneimine-containing molecules.
This suggests that conservation of post-translational modification patterns, rather than
conservation of the amino acid sequence, may be essential for silaffin function.
6 from “silica” and “acidic” nature of these peptides
Figure 1.3 Hypothetical mechanism for the silica polycondensation reaction catalyzed by polyamines present in silaffins (adapted from Kröger and Sandhage [34]). Two propyleneimine units (hereafter denoted as propylamines) within a polyamine chain contain the amino group and the ammonium group (R = H or CH3), which bind silicic acid molecules by a hydrogen-bonding interaction: (1) protonation of the amino group and deprotonation of the ammonium group by silanol (–Si–OH) results in formation of a reactive silanolate ion (–Si–O⁻) and an oxonium ion (–Si–O⁺H2); (2) the silanolate group reacts with the neighbouring silicon atom, resulting in siloxane bond (–Si–O–Si–) formation through the elimination of water (H2O); (3) newly formed silica is replaced by two other silicic acid molecules, and the catalytic cycle commences.
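Steps (1) and (2) of the caption can be written as a compact reaction scheme. The notation below is a sketch based only on the species named in the caption; the polyamine amino/ammonium pair that mediates the proton transfer is indicated above the arrow rather than drawn explicitly, and the symbol ≡Si stands for a silicon atom with its three remaining substituents.

```latex
\begin{align}
\equiv\!\mathrm{Si{-}OH} \;+\; \mathrm{HO{-}Si}\!\equiv
  &\;\xrightleftharpoons{\;\text{amine / ammonium}\;}\;
  \equiv\!\mathrm{Si{-}O^{-}} \;+\; \mathrm{H_2O^{+}{-}Si}\!\equiv
  && \text{(1) proton transfer} \\
\equiv\!\mathrm{Si{-}O^{-}} \;+\; \mathrm{H_2O^{+}{-}Si}\!\equiv
  &\;\longrightarrow\;
  \equiv\!\mathrm{Si{-}O{-}Si}\!\equiv \;+\; \mathrm{H_2O}
  && \text{(2) siloxane bond formation}
\end{align}
```

Summed over both steps, two silanol groups condense into one siloxane bond with the release of one water molecule, and the amino/ammonium pair is regenerated, which is what makes the polyamine a catalyst rather than a reagent.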
1.3.2 PTM complexity of biosilica-associated proteins
The set of proteins expressed in a diatom cell and embedded into the diatom biosilica,
here termed the biosilicome, comprises highly post-translationally processed proteins
and/or peptides. Post-translational modification (PTM) complexity is the most
remarkable feature of biosilica-associated proteins. In the course of intracellular
maturation, silaffin precursors undergo extensive post-translational processing,
including the proteolytic cleavage of the N-terminal signal peptide [70] and the covalent
attachment of different chemical moieties at multiple amino acid residues. The
latter results in extremely complex protein structures bearing numerous PTMs. Silaffin
PTMs range from ubiquitous modifications such as phosphorylation, which is found in all
eukaryotic species, to unique modifications such as polyamines attached to the ε-amino
groups of lysines. Additionally, complex glycosylation and sulfation were reported for
several silaffins from T. pseudonana and C. fusiformis. The identification and chemical
characterization of multiple PTMs on the same polypeptide remain challenging [79–82].
Although protein sequences can be deduced from nucleotide sequences,
post-translational modifications, in general, cannot. As will be presented below, the current
knowledge of silaffin PTMs is limited to a few proteins, while many more remain
uncharacterized.
Figure 1.4 PTM complexity of biosilica-associated proteins. Biosilica-associated proteins can be modified by a wide array of PTMs. An overview of PTMs identified from different diatom species is provided in Table 1.1. The site specificity of most PTMs remains unknown. Legend entries of the figure: signal peptide (SP), KXXK repeat (K-(X/S)-X-K), polyamine PTMs, phosphorylation, sulfation, glycosylation, unknown PTMs.
Despite the progress in research on the PTMs of biosilica-associated proteins, to date only
three glycoproteins have been identified from the diatoms T. pseudonana and C. fusiformis
[67, 83]. It was shown that silaffin-2 from C. fusiformis (natSil2) is a highly
glycosylated and sulfated protein [83]. After deglycosylation with trifluoromethane-
sulfonic acid [84, 85] both glycosylation and sulfation modifications are completely
removed; it is not clear, however, whether sulfation is directly linked to the polypep-
tide backbone of natSil2 or to protein-bound glycans. The carbohydrate composition of
protein-bound oligosaccharides appeared to be rather complex: galactose, rhamnose,
glucuronic acid, fucose, glucosamine, and a monomethylated deoxyhexose. Presumably
due to the abundance of glucuronic acid, natSil2 is the only component of the
C. fusiformis ammonium fluoride extract that is stained by the polycationic carbocyanine
dye ‘Stains-all’, which indicates a highly negative net charge [86, 87]. HF treatment
converts natSil2 into a strongly positively charged protein, indicating that the high
negative charge density of natSil2 results solely from its HF-sensitive PTMs. As mentioned
previously, the sequence of natSil2 remains unknown.
The T. pseudonana silaffins are highly glycosylated and sulfated acidic proteins, thus
resembling natSil2 from C. fusiformis. Silaffin-1/2 from T. pseudonana (tpSil1/2),
which occurs as high (tpSil1/2H) and low (tpSil1/2L) molecular weight isoforms, and
silaffin-3 from T. pseudonana (tpSil3) have rather different carbohydrate compositions:
tpSil3 has a substantial amount of glucuronic acid, whereas both tpSil1/2 isoforms lack
it entirely. Additionally, both tpSil1/2 and tpSil3 contain some unidentified
monosaccharides [67]. HF treatment [58, 59] of tpSil3 resulted in a single band on SDS-PAGE
with an apparent molecular weight of 35 kDa, which is considerably higher than the
predicted molecular weight of the mature polypeptide (21.2 kDa) due to the presence of
HF-insensitive modifications [67, 68]. Similarly, after treatment with HF, both isoforms of
tpSil1/2 yielded two bands on SDS-PAGE with apparent molecular weights
much lower than in untreated samples, which again is attributed to the presence of PTMs
resistant to HF treatment. Consequently, an exceptionally high negative charge imparted
by the carbohydrate and sulfate moieties to regulatory silaffins makes them incapable
of precipitating silica alone. However, it was also found that deglycosylated natSil2
possesses intrinsic silica precipitation activity in vitro [83]. This suggests
that glycosylation and sulfation may autoinhibit silica formation by modulating
silaffin function. Poulsen et al. speculated that regulatory silaffins may be able
to influence silica morphogenesis by means of their interaction with silica-forming
molecules, although the mechanism remains unclear [67, 83].
All silaffins are presumed to be phosphorylated to a significant extent. The first
identified and characterized silaffin appeared to be an extensively phosphorylated
protein. These phosphate groups affect SDS-PAGE migration significantly, increasing the
apparent molecular weight from ∼3 to 6.5 kDa. The attachment sites of the phosphate groups
within natSil1A were analysed by 31P-NMR spectroscopy, because confirmation
by tandem mass spectrometry (MS/MS) analysis was difficult. It was shown that
eight phosphate groups are linked to silaffin-1A from C. fusiformis, of which seven are bound
to serines and one to an ε-N,N,N-trimethyl-δ-hydroxylysine [66]. Total phosphate analysis
of the silaffins natSil2 from C. fusiformis and tpSil3 and tpSil1/2 from T. pseudonana
demonstrated that these proteins are also substantially phosphorylated. However, none of
the phosphate groups was mapped directly to the polypeptide sequences of these silaffins [67, 83].
The phosphorylation of modified lysines will be discussed in Section 1.3.3.
Another phosphorylated peptide from T. pseudonana is silacidin, a highly acidic
low-molecular-weight peptide that consists mainly of serine residues, more than 60 % of
which are phosphorylated [72, 73]. As in the case of natSil1A, these phosphates
were identified by 31P-NMR with no direct mapping by mass spectrometry. Nevertheless,
it is clear that phosphorylation affects numerous serine residues and plays an
essential role in biosilica formation. The presence of phosphate groups makes natSil1A
able to precipitate silica in the absence of phosphate buffer, whereas dephosphorylated
natSil1A completely lacks silica precipitation activity. If the phosphate group is
not present on the protein, it has to be supplied in the buffer and is used up
stoichiometrically in the process. These results strongly support the hypothesis that phosphates
in biosilica-associated proteins serve as polyanions required in vivo for silica formation
directed by LCPAs and polyamine PTMs present in silaffins. The phosphate moieties on
biomineralization proteins play an important role in mineral formation, yet the kinases
catalyzing the phosphorylation of these proteins are poorly characterized. Recently,
a membrane-associated serine/threonine kinase has been identified in T. pseudonana
based on the similarity of its expression pattern to that of tpSil3 [88]. However, it
phosphorylates only a fraction of all silaffins and accounts for only ~25 % of all silaffin
kinase activity, indicating that other kinases are also active.
1.3.3 Lysine ε-polyamine modifications in biosilica-associated proteins
Lysine PTMs in silaffins exhibit more complex and elaborate modification patterns
than O-linked modifications. Silaffins contain multiple lysine residues, which can be
modified by covalent attachment of polyamine chains. These polyamines consist of
multiple linearly linked propyleneimine units, exhibiting variations in chain length and
degree of N-methylation. Thus, even within one type of PTM, multiple subtypes exist,
greatly expanding the scope of silaffin lysine modification. Being positively
charged at physiological pH, lysine polyamine modifications significantly
increase the cationic net charge of silaffins, which is essential for these proteins to exert
their silica precipitation activity under the acidic pH of the diatom silica deposition vesicle
lumen [64]. Such a modification allows for a combination of cationic and hydrogen-bonding
interactions to bind tightly to the surfaces of silica particles. Although any
amine possesses inherent silica precipitation activity, diatoms may employ complex
patterns of variable lysine PTMs for the fine regulation of the silica precipitation process.
However, there is only scarce information available on the structure of
ε-polyamine-modified lysines in silaffins. All lysine PTMs known to date were reviewed by Kröger
and Poulsen [33] and are depicted in Fig. 1.5. Interestingly, beyond histone proteins,
the proteome-wide extent of lysine modifications is largely not covered by the
most recent reviews in the PTM research field [89], presumably due to the limited set
of currently available chemical structures.
(a) Poly-N-methylpropylamine attached to the ε-amine of lysine
(b) ε-N,N,N-trimethyl-δ-hydroxylysine
Figure 1.5 Chemical structures of modified lysine residues with the basic structural units of post-translational modifications present in silaffins (reviewed by Kröger and Poulsen [33]). The basis of each structure is the lysine moiety (highlighted in black). Phosphorylation of the hydroxyl group of hydroxylysine (highlighted in red) was also described elsewhere [52, 66]. Polyamine modifications of lysine residues represent oligo-propyleneimine residues attached to the ε-amino group of lysine (highlighted in blue) [64, 65]. Some of the propyleneimine modifications are N-methylated (highlighted in green).
Indeed, besides modifications of natSil1A and tpSil3, very little information is available
regarding PTMs in silaffins. The analysis of the first discovered silaffin, natSil1A
from C. fusiformis, revealed the presence of polyamine modifications resistant to HCl
hydrolysis [57, 64]. This 3.5 kDa silaffin peptide contains three different types of modified
lysine residues: ε-N,N-dimethyllysine, ε-N,N,N-trimethyl-δ-hydroxylysine, and
ε-polyamine-modified lysines. The latter modification is composed of 4–9 linearly linked
propyleneimine units, in which each N-atom except the first one is methylated (Fig. 1.5a).
Additionally, 31P-NMR analysis of natSil1A has shown that phosphorylation affects
the side-chain hydroxyl group of ε-N,N,N-trimethyl-δ-hydroxylysine [66] (Fig. 1.5b). This
modification and its non-phosphorylated counterpart were first discovered in 1970 by
Nakajima and Volcani in diatom cell walls of N. pelliculosa [52]. Later, a similar modification
was also found in hydrolysates of T. pseudonana biosilica [90, and current work],
suggesting that δ-hydroxylysine phosphorylation may be important for silaffin function.
Lysine ε-amino groups of another protein, the 24 kDa silaffin-3 from T. pseudonana (tpSil3),
in which 30 of the 33 lysines are embedded in a KXXK motif, are modified by
ε-N,N-dimethylation and polyamine chains [67, 68]. Based on the PTM mapping results in tpSil3,
Sumper et al. formulated empirical rules, referred to as the ‘polyamine code’ (as a
linguistic equivalent to the concept of the ‘histone code’) [68]. According to one of these rules,
in each K(A/S/Q)XK motif the N-terminal lysine carries two aminopropyl units, while the
C-terminal lysine becomes ε-N,N-dimethylated. The existence of such regularity
suggests the presence of a sophisticated multi-step enzymatic machinery for silaffin
post-translational modification. However, it would be premature to stipulate the presence of
rules for enzymatic modification based on a single mapped protein. Given the lack of
sequence conservation among known silaffins, the presumption that silaffin function
does not depend on a specific polypeptide fold, but rather requires a particular
arrangement of conserved post-translational modifications, leads to a logical and testable
hypothesis of a structure-function relationship. The lack of sequence conservation between
silaffins may also reflect the large phylogenetic distance between the diatom species from
which they originate (T. pseudonana and C. fusiformis). However, at the moment it is not
possible to draw any specific conclusion on silaffin similarity due to the limited set of
silaffin sequences available. Investigating this intriguing question would require a larger
set of biosilica-associated proteins with mapped lysine PTMs from different diatom
species.
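The single ‘polyamine code’ rule quoted above lends itself to a small sequence-annotation sketch. The code below scans a protein sequence for K(A/S/Q)XK motifs and labels the two lysines according to that rule. It is an illustration only: the function name, the label strings, and the example sequences are hypothetical, and, as noted above, the rule was derived from a single mapped protein, so the output is a prediction to be tested rather than an established assignment.

```python
import re
from collections import defaultdict

# K, then A/S/Q, then any residue, then K; lookahead keeps overlapping motifs.
MOTIF = re.compile(r"(?=K[ASQ][A-Z]K)")

N_TERMINAL_LABEL = "two aminopropyl units"   # predicted for the first lysine
C_TERMINAL_LABEL = "epsilon-N,N-dimethyl"    # predicted for the second lysine


def predict_lysine_ptms(sequence: str) -> dict[int, list[str]]:
    """Map 1-based lysine positions to the modifications predicted by the rule.

    A lysine shared by two overlapping motifs receives both labels, which
    flags a position where the rule alone is ambiguous.
    """
    predictions = defaultdict(list)
    for match in MOTIF.finditer(sequence.upper()):
        first_lysine = match.start() + 1
        predictions[first_lysine].append(N_TERMINAL_LABEL)
        predictions[first_lysine + 3].append(C_TERMINAL_LABEL)
    return dict(predictions)


if __name__ == "__main__":
    for toy in ("AKSGKPE", "KSAKAEK", "KGTK"):  # hypothetical test sequences
        print(toy, predict_lysine_ptms(toy))
```

Positions carrying both labels are exactly the cases where additional rules or experimental site mapping would be needed to decide between the two modifications.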
To the best of our knowledge, polyamine-modified proteins are largely unique to
biomineralizing organisms⁷, and the pathway for their modification remains enigmatic.
In particular, it is unclear which enzymes are responsible for catalyzing the
individual steps in silaffin processing: sequential transfer of propyleneimine units to
ε-amino groups of lysine, and methylation of primary and secondary amines [97,
98]. T. pseudonana polyamines exist both in a lysine-bound form and as free
long-chain polyamines [57], which, much like the silaffins, are intimately associated with
biosilica (as discussed above in Section 1.3.1). Each LCPA is typically a ∼0.6 to
1.5 kDa molecule, based on putrescine or spermidine and comprising several
propyleneimine units, which are usually N-methylated. Thus, the chemical structures of
polyamine-modified lysine residues in silaffins are very similar to the
oligo-N-methylpropylamine units of LCPAs, implying a commonality in LCPA biosynthesis and
post-translational modification of silaffins. The analysis of the T. pseudonana genome
revealed the presence of a group of N-aminopropyltransferases [99], sometimes fused to
eukaryotic Tudor domains, which bind histones at N-methylated lysines [100]. Thus,
these putative multi-domain enzymes may be involved in post-translational
modification of silaffin proteins in a targeted and site-specific way, but the details are only
beginning to be elucidated.
7 Protein polyamine modifications also occur in sponges and silicifying haptophytes [44, 91, 92]. Additionally, the unusual amino acid hypusine (a molecule comprised of hydroxyputrescine and lysine) was found in all eukaryotes and in some archaea [93, 94]. The only known proteins containing hypusine are eukaryotic translation initiation factor 5A (eIF5A) and a similar protein found in archaebacteria [95, 96].
All silaffin PTMs known to date are summarized in Table 1.1. Apparently, lysine
polyamine modifications, unlike other PTMs, are present in all silaffins characterized
so far. Consequently, specific ε-polyamine modifications of lysines may be a characteristic
feature of proteins involved in silicon biomineralization. A molecular definition of silaffins
as a protein class in the absence of sequence conservation may therefore be based on the
presence of lysine polyamine chains with varying length and methylation degree. The
site-specific location and spacing of positively charged lysine modifications in silaffins
may be crucial for their silica-binding function. At the same time, the potential variety
of lysine PTMs may be associated with the phylogenetic relationships among diatom
species and their incredible morphological diversity. Hence, elucidation of the chemical
modifications of silicifying proteins should provide a more complete mechanistic
understanding of biomineralization processes in diatoms.
Over the last decades, the chemical understanding of lysine modifications in silaffins
has substantially advanced; however, there is still a significant gap between the currently
known modifications and the full complexity of endogenous silaffin PTMs. To put this
work in the context of previous studies, we point out methodological limitations of most
proteomic analyses of silaffin proteins. Until now, the discovery of biosilica-embedded
proteins has relied either on laborious biochemical analyses of purified silaffins [65–68]
or on indirect methods such as whole-genome expression profiling under different stress
conditions [49, 103–106]. Although these studies identified many protein candidates
potentially involved in silica formation, none of the approaches used is suitable for
silaffin PTM characterization. Despite great biological interest in lysine PTMs, our
knowledge of modification sites is limited to a few proteins in a couple of evolutionarily
distant species (including T. pseudonana and C. fusiformis). This substantially hampers
or even precludes the comparison of protein-bound polyamine structures in the context
of evolutionary and functional conservation. Addressing these questions is challenging
at both the biological and methodological level, and prompts the development of new
analytical strategies and chemical methods for PTM characterization. High-throughput
approaches for the identification of PTMs are now being developed. Recent advances
in MS instrumentation, coupled with the development of analytical methods over the
past several years, now allow us to investigate the biosilicome on a global scale.

Table 1.1 Overview of silaffin PTMs identified from different diatom species. See [33, 46, 101] for review.
Diatom | Silaffin | At lysine (ε-amino group) | At lysine (δ-position) | At proline | At hydroxyl amino acids | References
C. fusiformis | natSil1A, natSil1B | Methylation, ε-polyamine chains | Hydroxylation, phosphorylation | Not present | Phosphorylation | [57, 64, 66]
C. fusiformis | natSil2 | Methylation, ε-polyamine chains | Unknown | Hydroxyproline | Phosphorylation, glycosylation, sulfation | [83]
T. pseudonana | tpSil1/2 | Methylation, ε-polyamine chains | Unknown | Hydroxyproline, dihydroxyproline | Phosphorylation, glycosylation, sulfation | [67]
T. pseudonana | tpSil3 | Methylation, ε-polyamine chains | Hydroxylation | Not present | Phosphorylation, glycosylation, sulfation | [67, 68]
T. pseudonana | tpSil4 | Unknown | Unknown | Unknown | Unknown | [69]
N. pelliculosa | Unknown | Methylation | Hydroxylation, phosphorylation | Hydroxyproline, dihydroxyproline | Unknown | [51, 52]
E. zodiacus | Unknown | Methylation, ε-polyamine chains | Unknown | Unknown | Unknown | [102]
1.4 Mass spectrometry in PTM discovery
The characterization of protein post-translational modifications (PTMs) remains one of
the major challenges of MS-based proteomics. Historically, one of the first applications
of mass spectrometry in protein research was the mapping of PTMs on a single protein
[107]. Although mass spectrometers have evolved substantially since then, the
basic operating principles of these instruments remain conceptually the same. This
section discusses recent advances in analytical approaches, instrumentation, and
bioinformatics analyses, as well as their implications for the characterization of
silaffin PTMs.
1.4.1 Modification-specific proteomics
In general, mass spectrometric detection of PTMs can be achieved via three strategies:
top-down, middle-down, and bottom-up approaches (Fig. 1.6a–1.6c) [108]. The bottom-up
proteomics approach (Fig. 1.6c) is by far the most commonly applied strategy
for the chemical characterization of protein modifications [109]. It refers
to the analysis of modified peptides released from the protein by enzymatic cleavage.
In this approach, peptides are usually obtained by digesting the protein with
one or more site-specific proteolytic enzymes, typically trypsin. Proteins can be
digested in solution, or pre-fractionated by sodium dodecyl sulfate polyacrylamide gel
electrophoresis (SDS-PAGE) followed by in-gel digestion [110, 111]. The latter method
removes low-molecular-weight contaminants already at the electrophoresis step and
increases resolution in analytical separations. The resulting protein digest is separated
by reversed-phase liquid chromatography (RPLC), which is followed by MS/MS
fragmentation. In most cases, the mass shift observed in a peptide mass spectrum indicates
a certain PTM type. By searching for the corresponding mass shift, modified peptides
can be identified and the PTM sites mapped back to the protein sequence.
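The mass-shift search described above can be illustrated with a minimal Python sketch. It is not the search procedure used in this work: the function names, the example peptide and the 10 ppm tolerance are hypothetical choices, and only the residue masses and PTM mass shifts are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the amino acids used in the example.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "K": 128.09496, "R": 156.10111}
WATER = 18.01056

# Monoisotopic mass shifts (Da) of a few common PTMs.
PTM_SHIFT = {"methyl": 14.01565, "dimethyl": 28.03130,
             "trimethyl": 42.04695, "phospho": 79.96633}

def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER

def match_ptm(sequence, observed_mass, tol_ppm=10.0):
    """Return the PTMs whose mass shift explains the observed peptide mass."""
    base = peptide_mass(sequence)
    hits = []
    for name, shift in PTM_SHIFT.items():
        expected = base + shift
        if abs(observed_mass - expected) / expected * 1e6 <= tol_ppm:
            hits.append(name)
    return hits

# Hypothetical observation: the peptide SGKSR carrying one phosphate group.
observed = peptide_mass("SGKSR") + 79.96633
print(match_ptm("SGKSR", observed))
```

A real search engine additionally has to consider several modifications per peptide and confirm the site by MS/MS, which this sketch leaves out.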
In contrast to the bottom-up approach, a ‘top-down’ analysis (Fig. 1.6a) can provide a
global view of the PTMs present in intact proteins. PTM characterization by a top-down
approach may be achieved with nonergodic fragmentation techniques such as ETD and
ECD. However, the top-down approach is less sensitive than bottom-up, and data
interpretation may be non-trivial due to the higher complexity of both MS1 and MS2 spectra
from multiply charged precursor ions [112]. Here, a middle-down strategy (Fig. 1.6b), in
which proteins are digested into peptides commonly in the 3 to 9 kDa range, might
represent an appropriate compromise, combining sensitivity with a global overview
of silaffin PTM complexity. However, similarly to the top-down approach, longer
peptides (>3 kDa) generated in a middle-down fashion have much wider charge-state
distributions than bottom-up peptides, which reduces the overall signal sensitivity.
Therefore, the bottom-up approach, which usually involves high-performance
liquid chromatography (HPLC) separation of in-gel-digested proteins, offers the best
sensitivity for mapping of silaffin PTMs. However, the use of a conventional
trypsin-based bottom-up approach for lysine PTM mapping appears to be
premature, because it implies that both the ‘intact’ protein sequence and the PTMs are
exactly known.
Figure 1.6 Proteomics approaches. (a) Top-down proteomics: a protein mixture is separated and intact proteins (≤ 50 kDa) are analyzed by LC-MS (intact protein masses) and MS/MS (protein sequences). (b) Middle-down proteomics: proteins are digested (Asp-N, Glu-C, etc.) and the separated peptides (~2–20 kDa) are analyzed by LC-MS (intact peptide masses) and MS/MS (peptide sequences). (c) Bottom-up proteomics: proteins are digested (trypsin) and the separated peptides (~0.5–3 kDa) are analyzed by LC-MS (intact peptide masses) and MS/MS (peptide sequences).
We therefore shifted a ‘classical’ bottom-up paradigm towards a prior analysis of
modified lysines. Biosilica-embedded proteins can be broken down to amino acids by
hydrolysis, while methylated and polyamine-modified lysines are stable to both acid
and alkali treatment. Cleavage with hydrochloric acid (HCl) is the most common hy-
drolysis method [113], which was first applied in diatom research by Nakajima and
Volcani [51, 52] and further utilized by the groups of Sumper and Kröger to identify
polyamine-modified lysines in isolated silaffins [64–68, 83]. Surprisingly, full
qualitative and quantitative lysine PTM profiling has never been performed in total diatom
biosilica extracts, although it is a prerequisite for any further PTM mapping studies.
The important advantage of this straightforward approach lies in its simplicity and
its ability to characterize lysine modifications in total biosilica hydrolysates without
any additional treatment. However, the analysis of ε-polyamine-modified lysines poses
multiple analytical challenges, as discussed below.
1.4.2 Analysis of polyamine-modified lysines by mass spectrometry
Identification of amino acids in acidic hydrolysates is a classical protein analysis method,
which has been referred to as amino acid analysis (AAA) [114]. Therefore, identifica-
tion of lysines along with their PTMs represents a variation of AAA, though it is
specifically focused on profiling of post-translationally modified lysines. Hydrolysis
using hydrochloric acid (HCl) is currently universally applied to AAA, because HCl
can cleave peptide bonds completely independent of the amino acid sequence and
PTMs [113]. Hence, AAA is applicable to highly modified proteins such as silaffins,
which are difficult to analyze by enzymatic proteolysis. Most amino acids are obtained
quantitatively from protein hydrolysates by HCl, which can easily be removed
by evaporation afterwards. Therefore, HCl hydrolysis provides a generic and straight-
forward method, which combines simplicity, accuracy, and wide applicability.
After HCl cleavage of biosilica-associated proteins, ε-polyamine-modified lysines
have to be analyzed in total biosilica hydrolysates. RPLC has traditionally been used
to separate, identify and quantify components in complex mixtures. However,
conventional RPLC separation of underivatized polyamines, or (in our case)
ε-polyamine-modified and methylated lysines, is challenging because hydrophilic and
highly charged polyamines are poorly retained and prone to severe peak tailing [115].
Hence, polyamine liquid chromatography (LC) requires either ion-pairing techniques,
which are generally poorly compatible with MS, or hydrophilic interaction
chromatography (HILIC) alternatives, which suffer from lower separation efficiency than RPLC.
For these reasons, polyamines are commonly analyzed after reaction with derivatizing
agents [116]. Derivatization attaches bulky hydrophobic groups, which enhances the
hydrophobicity of the derivatized compounds so that they can be separated on a
reversed-phase column with higher sensitivity than the underivatized molecules.
Most of these derivatization reagents, however, have certain disadvantages and
limitations, including derivative instability, inconsistent production of derivatives,
inability to derivatize secondary amino groups, the need to remove excess reagent to
avoid rapid RPLC column deterioration, and poor compatibility with ESI-MS [117–124].
Figure 1.7 AQC derivatization chemistry (reaction scheme: 1° or 2° amine + AQC → QAC-derivative + NHS, at pH ~9). Primary and secondary amines react with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), yielding 6-quinolinylaminocarbonyl-amines (QAC-derivatives) and N-hydroxysuccinimide (NHS).
In contrast to other reagents, derivatization with 6-aminoquinolyl-N-hydroxysuccin-
imidyl carbamate (AQC)8 was demonstrated to be a simple, fast and reproducible
pre-column derivatization [126–134]. AQC increases chromatographic retention and
improves electrospray ionization (ESI) of hydrophilic molecules, thus making them
directly amenable for RPLC. The AQC reagent uses the N-hydroxysuccinimide (NHS)-
activated heterocyclic carbamate to quantitatively derivatize primary and secondary
amino groups, converting them to stable hydrophobic 6-quinolinylaminocarbonyl (QAC)-
derivatives (the reaction chemistry is depicted in Fig. 1.7). The excess of AQC reagent
reacts with water to form 6-aminoquinoline (AMQ), which can be easily separated
from the QAC-derivatives (see Fig. A.1 for potential degradation pathways). Therefore, this
fluorescent derivatizing reagent, originally designed for UV-absorbance detection [127],
can be applied for relative quantification by ESI-MS with higher sensitivity and
specificity. Consequently, the ease of AQC derivatization and the stability of
QAC-derivatives make this approach well suited for accurate profiling of lysine polyamines
in crude biosilica extracts.
8 According to the IUPAC recommendations for the nomenclature of organic compounds [125], it is also called N-hydroxysuccinimidyl 6-quinolinyl carbamate (HSQC), CAS registry number 148757-94-2 [126]. For simplicity and in line with common use in the literature, it is referred to as AQC within this thesis.
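As a rough illustration of what the derivatization chemistry described above means for the measured masses, the sketch below computes the m/z of a QAC-derivatized amine. It assumes that each derivatized amino group gains a net C10H6N2O (AQC, C14H11N3O4, minus the NHS leaving group, C4H5NO3); the function names and the choice of free lysine as the analyte are illustrative and not taken from this thesis.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

def mono_mass(formula):
    """Monoisotopic mass from an element-count dictionary."""
    return sum(ATOM[el] * n for el, n in formula.items())

# Assumed net gain per derivatized amine: the 6-quinolinylaminocarbonyl group.
QAC_TAG = mono_mass({"C": 10, "H": 6, "N": 2, "O": 1})

def qac_mz(analyte_formula, n_amines, charge=1):
    """m/z of the protonated analyte carrying n_amines QAC tags."""
    neutral = mono_mass(analyte_formula) + n_amines * QAC_TAG
    return (neutral + charge * PROTON) / charge

lysine = {"C": 6, "H": 14, "N": 2, "O": 2}
print(round(QAC_TAG, 4))            # mass added per derivatized amine
print(round(qac_mz(lysine, 2), 4))  # lysine derivatized at both amino groups
```

Polyamine-modified lysines carry more derivatizable amino groups, so `n_amines` would have to be chosen per structure.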
After all the QAC-derivatives of polyamine-modified lysines are catalogued and
quantified in total biosilica hydrolysates by LC-MS/MS, lysine PTMs can be mapped to
specific sites in biosilica-associated proteins by the bottom-up proteomic strategy. The
latter usually implies pre-fractionation of intact proteins and enzymatically cleaved
peptides before mass spectrometric analysis [109], which will be covered in the follow-
ing section.
1.4.3 Fractionation of proteins and peptides prior to mass spectrometry analysis
Pre-fractionation of proteins by sodium dodecyl sulfate polyacrylamide gel electro-
phoresis (SDS-PAGE) followed by in-gel digestion is the most commonly used sample
preparation technique in proteomics to date [110, 111]. This gel-based approach allows
reliable elimination of common contaminants (such as detergent and salts), meaning
that essentially any chemicals can be used for sample preparation prior to the gel.
Following in-gel digestion or acidic hydrolysis of silaffin proteins, the resulting com-
plex peptide or amino acid mixture needs to be (at least partly) fractionated prior to
introduction into the mass spectrometer. For this, reversed-phase liquid chromatogra-
phy (RPLC) separates individual compounds mainly by their hydrophobicity, making
it by far the preferred method used for separation of peptides released by enzymatic
digestion. Typically, octadecyl (C18)-bonded silica is used for the column
packing (‘stationary phase’), whereas acidified water and acetonitrile serve as solvents
(‘mobile phase’) that form a concentration gradient, i. e., the concentration of one
solvent is increased relative to the other during the LC-MS acquisition. The
ionization of non-volatile peptides is usually achieved by soft ionization techniques such
as electrospray ionization (ESI) and nESI (nano-ESI), where a high voltage is used to
create an electrostatically charged spray that triggers the desolvation of peptides from
solvent droplets into the gas phase [135]. Due to direct compatibility, RPLC can be
coupled online to ESI ion source of a mass spectrometer. RPLC separation efficiency
depends on many factors such as column parameters (dimensions, adsorbent surface
chemistry, material particle size and packing, etc.), separation conditions (column tem-
perature, solvent flow rate, etc.), eluent types and composition, and the chemical nature
of the components. Therefore, selectivity and sensitivity can be optimized for each
individual compound of interest by properly adjusting these parameters. Next, the
extensively pre-fractionated protein digest is subjected to tandem mass
spectrometry (MS/MS) analysis.
1.4.4 Tandem mass spectrometry analysis in modification-specific proteomics
Mass spectrometry (MS) has been widely recognized as a superior analytical technique
in proteomics and has gained an important role in the analysis of PTMs. To date, a
wide variety of different mass spectrometry instrument configurations, that differ in
performance capabilities (i. e., ionization method, scan speed, resolution, sensitivity,
and mass range) have been developed for proteomics. A major breakthrough was the
introduction of Orbitrap™ (Thermo Fisher Scientific) mass spectrometers that are
recognized as a gold standard for mass spectrometry-based proteomics at the moment.
Most of the experiments reported in this thesis were carried out using a ‘hybrid’ tandem
mass spectrometer, the Linear Trap Quadrupole (LTQ) Orbitrap Velos, which combines a
linear ion trap with the Orbitrap analyzer [136]. This instrument can routinely achieve
high resolution and mass accuracy, providing confident identification of PTMs with
high sensitivity and throughput. The principal scheme of this instrument is shown in
Fig. 1.8.
Figure 1.8 Schematic view of the LTQ Orbitrap Velos mass spectrometer. The hybrid configuration couples a linear ion trap (IT) to a high-resolution mass analyzer (Orbitrap). The instrument is equipped to perform both collision-induced dissociation (CID) and higher-energy collisional dissociation (HCD) modes of peptide fragmentation. (a) electrospray ion source; (b) stacked ring ion guide (S-lens); (c) high pressure IT (6.7 × 10⁻³ mbar); (d) low pressure IT (4.7 × 10⁻⁴ mbar) for CID; (e) the curved ion trap (C-trap); (f) HCD collision cell; (g) Orbitrap. Figure is adapted from Olsen et al. [136].
Typically, a mass spectrometer measures the mass-to-charge ratio (m/z) of gas-phase
molecular ions with a certain resolution and within a certain m/z range. Gas-phase
ionized molecules travel from the ESI source (a) through the ion optics (b) to the
linear ion trap (IT), or Linear Trap Quadrupole (LTQ). Trapped ions of a defined m/z
range can be isolated and analyzed using tandem mass spectrometry (MS/MS) in the
high pressure IT (c), where fragmentation is achieved upon collisions with molecules
of a neutral gas, hence termed collision-induced dissociation (CID). The resulting
fragments are then detected in the low pressure IT (d) as they hit a detector.
Alternatively, the precursor ions can be transported
all the way from the IT to the C-trap (e) and further into the external collision cell (f), where
higher-energy collisional dissociation (HCD) fragmentation takes place. The ions are
then returned to the C-trap, from where they are ejected into the Orbitrap mass
analyzer (g) by a high-energy pulse. This energy forces the ions to circulate around the
central rod electrode and oscillate along it with axial frequencies that are inversely
proportional to the square root of their m/z. However, the acquisition speed of
Orbitrap-HCD is about half of that of IT-CID fragmentation spectra. Despite this
limitation, the resolution and mass accuracy of MS/MS spectra provided by the Orbitrap
mass analyzer9 exceed those of CID spectra acquired in the IT. Moreover, IT-CID
fragmentation does not allow trapping of fragments below the ‘low mass cut-off’, the
so-called ‘1/3 rule’ (∼28% of the precursor m/z). In contrast to IT-based CID, HCD
with Orbitrap detection is less affected by the low mass cut-off limitation and supports
detection in the lower m/z region10. This is highly advantageous for the fragmentation
of modified peptides, where a covalent modification may give rise to ‘characteristic’
low-mass fragments, also referred to as ‘marker’, ‘diagnostic’ or ‘reporter’ ions. These
characteristic fragments are analytically very useful, since their occurrence is a
reliable indicator of the presence of the corresponding modified amino acid residue
(for review refer to [138]). Alternatively, in some cases, the peptide modification may
give rise to a strong neutral loss, where all charge is retained on one dominant
fragment ion, suppressing the relative ion abundance of other fragmentation events.
Here HCD demonstrates another clear advantage over CID, allowing multiple
fragmentation events and resulting in richer MS/MS spectra of modified peptides,
where the neutral loss may be unproductive.
9 An m/z range from 100 to 2000 can be measured in 1.3 s at a target resolution of 130 000 at m/z 400 (R at m/z 400); the resolution declines as 1/√(m/z).
10 The C-trap scheme is better for transmission and detection of low-molecular-weight fragments than the quadrupole IT [137]: it is limited only by the rf-amplitude of the C-trap, which cuts off m/z below ~1/20 of the precursor m/z, and therefore does not compromise the detection of low-molecular-weight fragments.
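The numbers quoted above and in footnotes 9 and 10 can be turned into a small worked example. The sketch below simply applies the stated rules of thumb (a cut-off at about 28% of the precursor m/z for IT-CID, about 1/20 for HCD via the C-trap, and an Orbitrap resolution that scales as 1/√(m/z) from 130 000 at m/z 400); the function names and the example precursor m/z are illustrative assumptions.

```python
import math

def low_mass_cutoff(precursor_mz, fraction):
    """Lowest fragment m/z retained, as a fraction of the precursor m/z."""
    return precursor_mz * fraction

def orbitrap_resolution(mz, r_ref=130_000, mz_ref=400.0):
    """Resolution at a given m/z, assuming it scales as 1/sqrt(m/z)."""
    return r_ref * math.sqrt(mz_ref / mz)

precursor = 776.35  # hypothetical precursor m/z
print(low_mass_cutoff(precursor, 0.28))    # ion trap CID, ~28% rule
print(low_mass_cutoff(precursor, 1 / 20))  # HCD via the C-trap, ~1/20 rule
print(orbitrap_resolution(1600.0))         # resolution at four times the reference m/z
```

For this precursor, fragments below roughly m/z 217 would be lost in IT-CID, whereas HCD would still retain fragments down to about m/z 39, which is the range where low-mass marker ions are expected.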
As seen from above, the hybrid mass spectrometer combines complementary frag-
mentation by IT-CID and Orbitrap-HCD, presenting the advantages of speed for the
former and accurate measurements within wide m/z range for the latter. A common op-
erational mode to control the MS/MS acquisition process is based on data-dependent
acquisition (DDA), where the most abundant precursor ions are selected for MS/MS
analysis. This process is depicted in Fig. 1.9b. In this strategy, the mass spectrometer is
programmed to select the ions with predefined features (e. g., charge state) in the full
scan (MS1) for fragmentation in a cyclic way, so that N of the most intense precursors
(TopN) are subsequently subjected either to CID or HCD MS/MS scan (MS2). The
m/z values of fragmented precursors are placed into the dynamic ‘exclusion’ list for a
certain period of time so that the next most abundant precursors can be fragmented in
the following round. Typically, each MS1 scan is followed by several MS2 scans,
alternating back-to-back CID and HCD fragmentation of the same precursor ion, after
which a new DDA cycle commences. DDA thus attempts to optimize the productivity of
MS/MS by minimizing redundant selection of peptide precursors and maximizing the number
of peptide identifications.
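The selection logic of a TopN cycle with dynamic exclusion can be sketched in a few lines of Python. This is a simplified illustration, not the instrument firmware: the function name, the 30 s exclusion window, the m/z tolerance and the survey peaks are all hypothetical.

```python
def dda_cycle(ms1_peaks, exclusion, now, top_n=3, exclusion_s=30.0, mz_tol=0.01):
    """Pick up to top_n most intense precursors that are not currently excluded.

    ms1_peaks: list of (mz, intensity); exclusion: dict mapping m/z -> expiry time (s).
    """
    # Drop exclusion entries whose time window has expired.
    for mz in [m for m, expiry in exclusion.items() if expiry <= now]:
        del exclusion[mz]
    selected = []
    for mz, _intensity in sorted(ms1_peaks, key=lambda p: p[1], reverse=True):
        if any(abs(mz - ex) <= mz_tol for ex in exclusion):
            continue  # fragmented recently, skip
        selected.append(mz)
        exclusion[mz] = now + exclusion_s
        if len(selected) == top_n:
            break
    return selected

# Hypothetical MS1 survey scan: (m/z, relative intensity).
survey = [(558.33, 100.0), (660.32, 80.0), (776.35, 60.0), (525.27, 40.0), (620.30, 20.0)]
excluded = {}
first = dda_cycle(survey, excluded, now=0.0)
second = dda_cycle(survey, excluded, now=1.0)
print(first, second)
```

In the second cycle the three most intense precursors are still on the exclusion list, so the less abundant ones are fragmented instead.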
Analysis of MS/MS spectra provides information on the masses of the fragment ions
and enables deduction of the peptide sequence and the position of PTM sites [141].
In the LTQ Orbitrap Velos, peptide ions are fragmented by one of two collision-based
techniques: collision-induced dissociation (CID) or higher-energy collisional
dissociation (HCD). Here the peptide backbone cleavage occurs through the
minimum-energy path via breakage of amide CO–NH bonds (marked in Fig. 1.9c). The
most common peptide fragments produced by low-energy collisions are b- and
y-ions, highlighted in red in Fig. 1.9c. The mass differences between consecutive ions
of the y-ion series indicate the amino acid sequence, which can be read from the C- to
the N-terminus. Typically, HCD spectra (in contrast to CID) contain ions in the low
m/z range, including y1, y2 and immonium ions (IMs) of modified residues. In addition,
a-ions can occur, which represent a CO loss from the b-ions11. If the peptide is
modified, ion series containing the modification will show the corresponding mass
shift. Additionally, for certain modifications characteristic fragments or neutral
losses occur. E. g., H3PO4-loss is typically observed for phosphorylation (−98 Da) and
CH3SOH-loss for methionine oxidation (−64 Da).
Figure 1.9 Principles of data-dependent acquisition (DDA) and peptide fragmentation: (a) total ion chromatogram (TIC) of an Asp-N digest of biosilica AFSM extract (relative abundance, %, versus retention time, min); (b) full scan (MS1) survey and the DDA cycle of the Top3 most intense precursors (relative abundance, %, versus m/z), each of which is subjected to MS/MS before the next MS1 survey; these precursors are excluded from fragmentation in the next cycle; (c) fragment ion nomenclature [139, 140], showing N-terminal a, b and c ions and C-terminal x, y and z ions for a seven amino acid peptide.
As shown in Fig. 1.9c, neighboring fragments within the b- or y-ion series differ from
each other by the mass of one amino acid residue. Consequently, it is possible to derive
sequence information from peptide fragment spectra. However, experimental MS/MS spectra
often show incomplete ion series, interfering peaks from co-isolated precursors,
etc. Because MS acquisition is performed over the course of an HPLC run, it generates
complex datasets containing a large number of MS/MS spectra, and the subsequent
analysis has to handle a large amount of data. The presence of PTMs causes a
combinatorial explosion in the number of potential protein modification states, which
greatly complicates the analysis of proteomics data and leads to a high number of
false-positive identifications [143, 144]. Therefore, modification-specific analysis of
silaffins requires a toolkit of dedicated computational and statistical methods, which
will be reviewed in Section 1.4.5.
11 This loss is usually observed for b2-ion and generates HCD-characteristic a2/b2-pair in the lower massrange [142] (see Fig. 1.9c)
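How a modification shifts the b- and y-ion series can be seen in a short worked example. The sketch below computes singly charged b- and y-ions for a hypothetical peptide with and without a methylated lysine; the peptide, the modification site and the function name are illustrative assumptions, and only standard monoisotopic masses are used.

```python
# Monoisotopic residue masses (Da) for the residues used in the example.
RESIDUE = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
           "V": 99.06841, "K": 128.09496, "R": 156.10111}
PROTON, WATER = 1.00728, 18.01056

def fragment_ions(sequence, mods=None):
    """Singly charged b- and y-ion m/z values.

    mods maps a 0-based residue position to the mass shift of its modification.
    """
    mods = mods or {}
    masses = [RESIDUE[aa] + mods.get(i, 0.0) for i, aa in enumerate(sequence)]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions

plain_b, plain_y = fragment_ions("GASKVR")
meth_b, meth_y = fragment_ions("GASKVR", mods={3: 14.01565})  # hypothetical methyl-Lys

# Only fragments that contain the modified lysine are shifted.
b_shift = [round(m - p, 4) for m, p in zip(meth_b, plain_b)]
y_shift = [round(m - p, 4) for m, p in zip(meth_y, plain_y)]
print(b_shift)
print(y_shift)
```

The position at which the shift first appears in each series brackets the modified residue, which is the basis of site localization.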
1.4.5 Bioinformatics tools for modification-specific proteomics
Bioinformatics represents an essential aspect of MS-based proteomics, especially due
to the complex datasets produced by modification-specific proteomic analysis. PTM
annotation using MS and MS/MS data can be achieved using automated search tools.
Prior to the search against protein sequence database, the peptide-derived spectral data
is converted into a peak list containing the m/z value of the precursor and its fragments.
Several algorithms have been developed to carry out this analysis, where peptide can-
didates are assigned to an experimental spectrum within the shortest possible time
and ranked by an empirical or statistical peptide-spectrum match (PSM) score. Among
the most popular search engines are Mascot and SEQUEST, while many others exist
(for review refer to [145–147]). However, the identification rate for each type of soft-
ware is limited, irrespective of its sophistication and algorithmic basis, due to MS/MS
spectra from co-isolated precursors, unspecific or unexpected protease cleavage, incor-
rect monoisotopic peak assignments or charge state determinations. Hence, database
searching using tandem mass spectra is particularly challenging, especially due to the
complex nature of multiple coexisting lysine PTMs in silaffin proteins. This effect is
much more noticeable for larger peptides, as their long sequences can carry several
different modifications at once, which greatly complicates database searches.
Therefore, specialized computational approaches need to be implemented
to allow for a comprehensive assignment of silaffin PTMs.
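To give a feeling for how quickly the search space grows with peptide length, the following sketch enumerates all assignments of variable lysine modifications to a peptide. The peptide sequence, the set of three methylation states and the function name are hypothetical; real silaffin peptides carry a far larger repertoire of polyamine modifications.

```python
from itertools import product

def modification_states(sequence, residue="K",
                        variants=("methyl", "dimethyl", "trimethyl")):
    """Enumerate every assignment of a variable modification (or none) to each target residue."""
    sites = [i for i, aa in enumerate(sequence) if aa == residue]
    options = (None,) + tuple(variants)
    return [dict(zip(sites, combo)) for combo in product(options, repeat=len(sites))]

# Hypothetical lysine-rich peptide with four lysines.
states = modification_states("SKKSGSYSGSKGSK")
print(len(states))
```

With four options per lysine the number of candidate forms grows as 4 to the power of the number of lysines, so every additional site multiplies the number of candidates a search engine has to score.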
Commonly, a modification-specific MS-data analysis strategy can be divided into
three major steps: pre-processing of the MS output data, PTM search against pro-
tein sequence database, and statistical validation of PSM and PTM assignment results.
The goal of the pre-processing step is to increase the quality of the subsequent database
search results; it may include cleaning, deisotoping and deconvolution of raw
peptide MS/MS spectra [148–151]. In a second step, before initiating the database
search, the user is asked to specify a list of PTMs that can be searched as ‘fixed’ or
‘variable’ [152]. Fixed modifications are applied universally, while variable
modifications are those which may or may not be present. Finally, the database search
results typically represent a mix of correct and false PSMs. A reliable way to filter
for confident PSM hits is the use of a composite target-decoy database, in which decoy
sequences, created by reversing or scrambling the protein sequences of the target
database, are appended to it. Under the assumption that random decoy PSMs and target matches follow the
same distribution, a score cut-off corresponding to certain false discovery rate (FDR)
can be estimated [153]. In addition to FDR evaluation, another goal of the validation
step of the modification-specific analysis is the precise localization of PTMs within a
peptide [152]. To this end, the corresponding algorithms can assess the cumulative bi-
nomial probability of correct site localization using Mascot Delta Score (MD-score) [154]
or the “localization probability score,” which is integrated into MaxQuant [155]. However,
the validation of site-specific PTM assignments very often relies on visual inspection
of the search results, which therefore frequently have to be verified manually.
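The target-decoy idea can be illustrated with a minimal sketch. It assumes a reversed-sequence decoy database and estimates the FDR at a given score threshold as the ratio of decoy to target hits above that threshold, which is one common estimator among several; the function names, the `REV_` prefix and the example scores are hypothetical.

```python
def reverse_decoys(proteins):
    """Build a composite target-decoy database by appending reversed sequences."""
    decoys = {f"REV_{name}": seq[::-1] for name, seq in proteins.items()}
    return {**proteins, **decoys}

def score_cutoff(psms, max_fdr=0.01):
    """Lowest score threshold at which decoys/targets stays within max_fdr.

    psms: list of (score, is_decoy) pairs from a search against the composite database.
    Returns None if no threshold satisfies the requested FDR.
    """
    cutoff = None
    targets = decoys = 0
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff

database = reverse_decoys({"P1": "MKSGSK"})
psms = [(90, False), (85, False), (80, False), (70, True), (65, False), (60, True)]
print(database)
print(score_cutoff(psms, max_fdr=0.25))
```

PSMs scoring at or above the returned cut-off would be accepted; site localization still has to be assessed separately.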
1.5 Rationale of the thesis
The characterization of silaffin PTMs is one of the major tasks that has to be accom-
plished in proteomics of diatom biosilica. As discussed in Section 1.3.2, silaffin pro-
teins undergo multiple post-translational processing events, which include proteolytic
cleavage and covalent addition of modifying groups to various amino acid residues.
Lysine modifications represent the most remarkable class of silaffin PTMs, because they
are hypothesized to modulate chemical properties of biosilica-associated proteins and
regulate their silica-precipitation function. Our knowledge of lysine PTMs is limited to
a few proteins from phylogenetically distant species, which precludes any conclusions
about their structural relationship and functional importance.
Since silaffin lysine PTMs are highly heterogeneous, it is necessary first to create a cat-
alogue of polyamine-modified lysines present in all studied diatom species. Although
the profiling of lysine modifications may be achieved by the developed technique in
any biosilica extract, the further localization of pre-profiled PTMs to weakly homolo-
gous protein sequences (refer to Section 1.3.1) ultimately requires the availability of a
complete genome or at least a substantial part of cDNA sequences. Fortunately, recent
advances in genomic sequencing have created a unique opportunity to localize silaffin
PTMs to specific sites in biosilica-associated proteins. Currently, six diatom genome
sequences have been published [13–16], providing resources for deeper research into
the proteome of these organisms. Therefore, the current study focused on
three centric diatom species with sequenced genomes: Thalassiosira pseudonana [13],
T. oceanica [14] and Cyclotella cryptica [15]. Their biosilica cell wall structures, presented
in Fig. 1.9, display heterogeneous morphology that reflects their phylogenetic
proximity, where T. pseudonana is more closely related to C. cryptica than to T. oceanica [156,
157].
Figure 1.9 Phylogenetic tree [156] and scanning electron microscope (SEM) images of the cell walls of three centric diatom species: (a) phylogenetic tree for the three diatom species; (b) T. pseudonana (valve view); (c) T. pseudonana (girdle view); (d) C. cryptica (valve view); (e) C. cryptica (girdle view); (f) T. oceanica (valve view); (g) T. oceanica (girdle view). Scale bars: 1 μm. These diatom species have circular valves with varying ornamentation between the central and rim regions of the valve face (Fig. 1.9b, 1.9d and 1.9f), while their girdle bands have less structural complexity and look similar to each other (Fig. 1.9c, 1.9e and 1.9g). Biosilica architectures of these diatom species demonstrate clear differences: T. oceanica (Fig. 1.9f) has a much smoother valve surface throughout most of the valve area in contrast to T. pseudonana (Fig. 1.9b) and C. cryptica (Fig. 1.9d). At the same time, all three diatom species possess tube-like features at the valve rim; the T. oceanica shell, however, almost entirely lacks elevated mesh-like ridges on the valve surface, which are present only in the rim area (Fig. 1.9f). SEM images are courtesy of D. Pawolski.
In this thesis, I set out to extend the profiling of lysine modifications to
several phylogenetically related species in order to compare their profiles.
Therefore, this work aimed to develop an analytical method for the analysis of
lysine polyamine modifications in biosilicifying proteins. After the
proof-of-concept, the method was to be applied to biosilica extracts from
the three diatom species. Furthermore, lysine PTMs needed to be localized in
biosilica-associated protein sequences. This effort would eventually result in the
determination of consensus modification sites, which is a key requirement for a
mechanistic understanding of the post-translational modification machinery in diatom biosilica.
2 AIM OF THE THESIS
The function of biosilica precipitating proteins is largely defined by the presence of
lysine post-translational modifications (PTMs), and exploring their diversity is critical
for a mechanistic understanding of the biomineralization process in diatoms.
The primary aim of this study is to define consensus motifs that comprise a ‘Rosetta
stone’ for the lysine modification code for biosilica-associated proteins.
Three goals were to be addressed during the research:
1. Establish an analytical method for lysine PTM profiling that shifts the conven-
tional bottom-up approach towards analysis of modified lysines in total biosilica
hydrolysates (Section 3.1).
2. Apply the method above for global profiling of lysine modifications in biosilica
extracts from three closely related diatom species: T. pseudonana, T. oceanica and
C. cryptica. To this end, we examine whether similarities at the molecular level
follow evolutionary proximity (Section 3.2).
3. Map pre-profiled polyamine PTMs back to protein sequences in order to
determine consensus motifs for polyamine modifications in biosilica-associated
proteins (Section 3.3).
3 RESULTS AND DISCUSSION
Contents
3.1 A method for analysis of ε-polyamine PTMs
3.1.1 Establishing a method to analyse ε-polyamines
3.1.2 Method applicability for lysine PTM profiling
3.1.3 Profiling of lysine PTMs in silaffin-3
3.2 Profiling lysine PTMs in biosilica extracts
3.2.1 Lysine PTM profile and characteristic fragments
3.2.2 Elucidation of phosphopolyamine structures
3.2.3 Lysine PTM profiles of AFSM extracts
3.2.4 Comparison of AFIM and AFSM profiles in T. pseudonana
3.2.5 Phylogenetic relationship across three diatom species
3.3 PTM localization and discovery of consensus motifs
3.3.1 Multiple protease strategy for mapping lysine PTMs
3.3.2 Selection of deprotection technique
3.3.3 Mapping lysine PTMs on tpSil3 using iterative search strategy
3.3.4 Deconvolution of raw MS/MS spectra
3.3.5 PTM mapping by polyamine-specific fragments
3.3.6 Identification of consensus motifs harboring lysine PTMs
3.1 A method for analysis of ε-polyamine PTMs: a proof-of-concept
3.1.1 Establishing a method to analyse ε-polyamine PTMs in biosilica hydrolysates
This study aimed to establish a selective and sensitive method for the analysis of polyamine-modified lysines. According to the previously identified compounds (refer to Fig. 1.5 in Section 1.3.3), lysine ε-polyamine modifications exhibit predictable structures consisting of repeating propyleneimine (hereafter denoted as propylamine) units and N-methyl groups. A general formula for lysine modifications is displayed in Fig. 3.1, where structural units are color-coded. Each structure consists of several propylamine units attached linearly to the ε-amine of the lysine residue (PA0, PA1, PA2, PA3, ...) and displays a different degree of N-methylation (Me1, Me2, Me3, ...). Additionally, the lysine side chain can be hydroxylated (Hydroxy), and this hydroxyl group can in turn be phosphorylated (Phospho).
The list of m/z values calculated according to the generic formula can be used for targeted MS/MS analysis of ε-polyamine-modified and methylated lysines (Table 3.1).
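The generic formula lends itself to a small calculator. The following Python sketch is an illustration only, not the code used in this work: it assumes standard monoisotopic mass increments for each structural unit, and all function and constant names are hypothetical. It further assumes that the number of chain nitrogens equals the number of propylamine units plus one (the ε-nitrogen), which reproduces the methylation limits seen in the rows of Table 3.1.

```python
# Monoisotopic increments in Da (standard atomic masses; names are illustrative).
LYS_MH = 147.11280       # [M+H]+ of unmodified lysine, C6H14N2O2
PROPYLAMINE = 57.05785   # one propylamine unit, C3H7N
METHYL = 14.01565        # one N-methyl group, net CH2
HYDROXY = 15.99491       # side-chain hydroxylation, net O
PHOSPHO = 79.96633       # phosphorylation of the hydroxyl, net HPO3


def lysine_ptm_mz(n_pa, n_me, hydroxy=False, phospho=False):
    """Return the [M+H]+ m/z of a lysine carrying n_pa propylamine units
    and n_me N-methyl groups, optionally hydroxylated/phosphorylated."""
    n_nitrogens = n_pa + 1  # assumption: epsilon-N plus one N per propylamine
    if not 0 <= n_me <= 2 * n_nitrogens + 1:
        raise ValueError("more methyl groups than the chain can carry")
    if phospho and not hydroxy:
        raise ValueError("phosphorylation requires the side-chain hydroxyl")
    mz = LYS_MH + n_pa * PROPYLAMINE + n_me * METHYL
    if hydroxy:
        mz += HYDROXY
    if phospho:
        mz += PHOSPHO
    return round(mz, 4)
```

For example, `lysine_ptm_mz(2, 2)` evaluates to approximately 289.2598, the value listed for Lys-PA1-PA2 with two methyl groups.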
Previously, direct infusion ESI-MS/MS was applied in a number of studies to characterize modified lysine residues cleaved from biosilica-associated proteins [6, 52, 57, 64, 67, 68, 83, 90]. Although simple and rapid, direct infusion analysis does not readily allow discrimination and quantification of multiple isobaric compounds corresponding to the structural formula in Fig. 3.1 (e. g., molecules that differ in the position of N-methyl groups). Until now, this limitation has largely precluded full qualitative and quantitative profiling of the lysine modifications present in total hydrolysates of whole biosilica extracts. Therefore, we set up a novel technique that aims not to replace the existing one, but rather to complement it with structural profiling and quantitation of isomeric ε-polyamine modifications.
Figure 3.1 Generic structure of the lysine post-translational modifications in biosilica-associated proteins. PA0, lysine residue; PA1, PA2, PA3, propylamine units; Me1–Me7, N-methylation positions (where R1–R7 = H or CH3); Hydroxy, δ-hydroxylation of lysine; Phospho, phosphorylation of the side-chain hydroxyl.

Backbone | Me0 | Me1 | Me2 | Me3 | Me4 | Me5 | Me6 | Me7 | ...
Lys-PA0 | 147.1128 | 161.1285 | 175.1442 | 189.1599 | - | - | - | - | ...
Lys-PA1 | 204.1706 | 218.1863 | 232.2020 | 246.2177 | 260.2334 | 274.2491 | - | - | ...
Lys-PA1-PA2 | 261.2284 | 275.2441 | 289.2598 | 303.2755 | 317.2912 | 331.3069 | 345.3226 | 359.3383 | ...
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
Hydroxy-Lys-PA0 | 163.1077 | 177.1234 | 191.1391 | 205.1548 | - | - | - | - | ...
Hydroxy-Lys-PA1 | 220.1655 | 234.1812 | 248.1969 | 262.2126 | 276.2283 | 290.2440 | - | - | ...
Hydroxy-Lys-PA1-PA2 | 277.2233 | 291.2390 | 305.2547 | 319.2704 | 333.2861 | 347.3018 | 361.3175 | 375.3332 | ...
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
Phospho-Hydroxy-Lys-PA0 | 243.0740 | 257.0897 | 271.1054 | 285.1211 | - | - | - | - | ...
Phospho-Hydroxy-Lys-PA1 | 300.1318 | 314.1475 | 328.1632 | 342.1789 | 356.1946 | 370.2103 | - | - | ...
Phospho-Hydroxy-Lys-PA1-PA2 | 357.1896 | 371.2053 | 385.2210 | 399.2367 | 413.2524 | 427.2681 | 441.2838 | 455.2995 | ...
... | ... | ... | ... | ... | ... | ... | ... | ... | ...
Table 3.1 Calculated m/z values of singly protonated molecular species for the ε-polyamine structures from Fig. 3.1. For the analysis, each structure was supplemented with the respective number of derivatization groups (N×QAC) attached to primary and secondary amines (shown in Table A.1). PA0, PA1, PA2, propylamine units; Me1–Me7, N-methyl groups; Hydroxy, δ-hydroxylation of lysine; Phospho, phosphorylation of the side-chain hydroxyl.

We established a method that includes acidic hydrolysis and derivatization with AQC followed by LC-MS/MS analysis. The analyzed polyamine structures display different degrees of N-methylation, allowing a maximum of 2n + 1 additional methyl groups per molecule (where n is the number of nitrogens in the polyamine backbone). As discussed in Section 1.4.2, AQC reagent quantitatively derivatizes primary
and secondary amines [127]. Consequently, the number of attached derivatization groups (N×QAC) indicates how many non-methylated nitrogens are present in the ε-polyamine chain. The N×QAC value therefore helps to resolve ambiguities between multiple isobaric molecules, allowing discrimination of structural isomers, including polyamine chains of varying structure. The resulting QAC-derivatives can be quantified by integrating the extracted-ion-chromatogram (XIC)
peaks of their protonated molecular species. LC separation of QAC-derivatives is followed by high-resolution mass spectrometry (HRMS), which can be targeted to the precalculated m/z values of the anticipated ε-polyamine lysines (from Table 3.1) with the corresponding number of QAC moieties attached. Tentatively assigned polyamines can be further confirmed by LC-MS/MS analysis, which provides detailed information on both the relative abundance and the structure of covalently polyamine-modified lysines. Finally, upon MS/MS fragmentation QAC-derivatives generate a pronounced fragment at m/z 171.0553 (C10H7N2O, as annotated in the fragment spectra of Fig. 3.7), which could be used for multiple reaction monitoring (MRM) experiments [130, 131].
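The target m/z of a QAC-derivative can be estimated from the underivatized value. The sketch below is a minimal illustration rather than the procedure used in this work. It assumes that each derivatized amine gains one QAC group of composition C10H6N2O (about 170.0480 Da) and that the ion carries as many protons as its charge; the function name is hypothetical.

```python
PROTON = 1.007276    # mass of a proton, Da
QAC = 170.048013     # assumed net gain per derivatized amine, C10H6N2O


def qac_derivative_mz(underivatized_mz, n_qac, charge=1):
    """Estimate the m/z of an N×QAC derivative.

    underivatized_mz: [M+H]+ value of the underivatized lysine PTM
    n_qac: number of QAC groups attached
    charge: number of protons carried by the derivative
    """
    neutral = underivatized_mz - PROTON          # back to the neutral molecule
    derivatized = neutral + n_qac * QAC          # add the QAC groups
    return (derivatized + charge * PROTON) / charge
```

Under these assumptions, PTM 303 (m/z 303.2755) with two QAC groups and two charges comes out near m/z 322.189, and with three QAC groups near m/z 407.213, in line with the precursor values quoted for the PTM 303 isomers in Fig. 3.7.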
However, the different number of QAC groups (N×QAC) attached to polyamine derivatives may affect their ionization efficiency, thus biasing the instrument response. Therefore, having established the analytical method, we examined its applicability to the analysis of synthetic and commercially available standards that contain different numbers of primary and secondary amines (Section 3.1.2). To further demonstrate the quantification accuracy, we applied the entire workflow to the well-characterized biosilica-associated protein silaffin-3 from T. pseudonana [67, 68] (Section 3.1.3).
3.1.2 Evaluation of the method applicability for profiling of lysine ε-polyamines
In order to investigate the response factors for lysine derivatives, calibration curves were built for molecules that react with different numbers of derivatization groups (N×QAC). Additionally, method evaluation addressed the completeness of the derivatization reaction and the stability of the resulting QAC-derivatives. For this purpose, stock solutions of synthetic and commercially available standards with different numbers of primary and secondary amines were used (all compounds are listed in Table 5.1 and Section 5.1 of Materials and Methods). The structure and corresponding number of QAC groups attached to each of the analytical standards are depicted in Fig. 3.2. The synthetic ornithine- and lysine-based polyamine derivatives, post-translational modification (PTM) 275-orn and PTM 289, respectively¹ (Fig. 3.2a and 3.2b), were synthesized by Marina Abacilar in the Armin Geyer laboratory (Philipps-Universität, Marburg, Germany). The polyamine chain of these molecules corresponds to the abundant lysine modification that has been characterized previously in protein silaffin-3 from T. pseudonana (tpSil3) [67, 68]. Ornithine was chosen as the basis of the internal standard because of its low abundance in diatom biosilica (concentrations are three orders of magnitude lower than those of lysine, unpublished results). Much like the polyamine spermidine (Fig. 3.2c), these standards react with three AQC molecules, resulting in 3×QAC-derivatives. Standards of unmodified lysine (Fig. 3.2d), ε-N-monomethyllysine (Fig. 3.2e), and δ-hydroxylysine (Fig. 3.2f) accepted two QAC moieties (2×QAC), whereas ε-N,N-dimethyllysine (Fig. 3.2g), ε-N,N,N-trimethyllysine (Fig. 3.2h), arginine (Fig. 3.2i), and proline (Fig. 3.2j) were 1×QAC-derivatized.
¹ For reader convenience, here and below all lysine derivatives in the text are annotated with the m/z values of their singly protonated molecular ions. Similarly, QAC-derivatized molecules are denoted with the m/z value followed by the respective number of QAC moieties attached in parentheses (N×QAC). In Section 3.3, lysine PTMs mapped to the protein sequence are denoted with the nominal m/z value for simplicity.
Previously, ε-polyamine chains were demonstrated to be stable towards acidic hydrolysis [57]. At the same time, primary and secondary amines have different reaction rates [122, 127]. Therefore, partial or incomplete derivatization of polyamines may bias the quantification accuracy. To ensure the completeness of derivatization, the synthetic polyamine-modified lysine consisting of two propylamine units with a dimethylated terminal amine attached to the ε-amine was used (PTM 289, Fig. 3.2b). The completeness of the derivatization reaction was assessed by the amount of incompletely derivatized ε-polyamine, which did not exceed 1 % of the total standard amount. QAC-derivatives are stable at room temperature, and the excess reagent does not affect the analysis. Moreover, during RPLC separation the sample is cleaned up by the mobile phase, thus eliminating the ion-suppression effect of the borate buffer, which is poorly compatible with ESI-mass spectrometry (MS). The QAC-derivatized ε-polyamine-modified lysine was stable at 4 °C in borate buffer for a week, with less than 10 % decomposition; nevertheless, analysis immediately after derivatization was preferred. Finally, the completeness of AQC-derivatization of polyamines in crude biosilica hydrolysates was confirmed by using the ornithine-based internal standard (PTM 275-orn, see Fig. 3.2a), which was spiked into each sample prior to the analysis.
Calibration curves produced for 2×QAC- and 3×QAC-derivatives were linear within three orders of magnitude (1000-fold dynamic range) and demonstrated similar response factors (see Fig. 3.2). Therefore, no individual standards are required for each type of lysine modification, because the instrument response is governed by the number of QAC groups attached. It is thus reasonable to assume that all 2×QAC and 3×QAC (and presumably 4×QAC, which were not tested) compounds demonstrate a similar response and that no correction factors are required. At the same time, the 1×QAC lysines demonstrated approximately 10-fold lower response factors (see Fig. 3.2a). Consequently, abundances of 2×QAC- and 3×QAC-derivatives, calculated from XICs, can be normalized to the spiked ornithine-based internal standard (Fig. 3.2a), while 1×QAC-derivatized molecules have to be quantified via external calibration. Additionally, calibration curves were produced for 17 physiological amino acids in order to determine their content in protein hydrolysates (Fig. A.2). 1×QAC-derivatized amino acids displayed dynamic ranges of four orders of magnitude, whereas a few amino acids exhibited linearity over two orders of magnitude.
[Figure 3.2: log-log plot of instrument response (a.u.) against amount loaded on-column (pmol) for 1×QAC-, 2×QAC- and 3×QAC-derivatives. Plot legend (R²): PTM 289, 0.9943; PTM 275, 0.9967; PTM 146, 0.9925; PTM 147, 0.9949; PTM 161, 0.9980; PTM 163, 0.9993; PTM 175, 0.9929; PTM 189, 0.9979; PTM 173, 0.9880; proline, 0.9959.]
Figure 3.2 Calibration curves of standard compounds with a different number of QAC moieties attached: (a) polyamine-modified ornithine (PTM 275-orn, 3×QAC); (b) polyamine-modified lysine PTM 289 (3×QAC); (c) spermidine (PTM 146, 3×QAC); (d) unmodified lysine (2×QAC); (e) ε-N-monomethyllysine (PTM 161, 2×QAC); (f) δ-hydroxylysine (PTM 163, 2×QAC); (g) ε-N,N-dimethyllysine (PTM 175, 1×QAC); (h) ε-N,N,N-trimethyllysine (PTM 189, 1×QAC); (i) arginine (1×QAC); (j) proline (1×QAC); QAC, derivatization group.
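The two quantification routes described above, external calibration on a log-log scale and normalization to a spiked internal standard, can be sketched as follows. This is an illustrative example with hypothetical function names and is not taken from the data-processing pipeline of this work; the least-squares fit is written out by hand to keep the block free of dependencies.

```python
import math


def fit_loglog(amounts_pmol, responses):
    """Least-squares line through (log10 amount, log10 response).
    Returns slope, intercept and the coefficient of determination R^2."""
    xs = [math.log10(a) for a in amounts_pmol]
    ys = [math.log10(r) for r in responses]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


def amount_from_calibration(response, slope, intercept):
    """External calibration: invert the log-log line for one response."""
    return 10 ** ((math.log10(response) - intercept) / slope)


def amount_from_internal_standard(area_analyte, area_standard, standard_pmol):
    """Internal standard: assumes equal response factors for analyte and
    standard, as reported above for 2×QAC- and 3×QAC-derivatives."""
    return area_analyte / area_standard * standard_pmol
```

In this sketch a 1×QAC compound would be passed through `fit_loglog` and `amount_from_calibration` with its own standard curve, whereas 2×QAC and 3×QAC compounds would use `amount_from_internal_standard` with the XIC area of the spiked standard.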
Therefore, the established method is applicable to quantitative analysis of lysine derivatives. The major advantages of this method are enhanced sensitivity and selectivity for all covalently modified lysines. To further demonstrate its applicability to the analysis of crude biosilica hydrolysates, we applied the entire workflow to the characterization of lysine modifications in the purified biosilica-associated protein tpSil3.
3.1.3 Profiling of lysine modifications in silaffin-3
As discussed in Section 1.3.2, protein silaffin-3 from T. pseudonana (tpSil3) is a well-characterized component of its biosilica extract with highly complex modifications [67, 68]. Therefore, tpSil3 was purified from T. pseudonana biosilica extract according to the protocol developed by Poulsen and Kröger (refer to Section 5.2 or [67]). ESI-MS of the underivatized acidic hydrolysate in positive ion mode revealed two peak clusters: abundant peaks at m/z 319.2704, 333.2860, and 347.3017, and less abundant peaks at m/z 275.2442, 289.2598, 303.2755, and 317.2911. Neighbouring peaks within each cluster differed by 14 Da, which corresponds to a CH2 unit (Fig. 3.3). These masses fit lysines modified with two propylamine units and different numbers of N-methyl groups (see Table 3.1), in agreement with previously published data [67]. These chemical structures were confirmed by high-resolution MS/MS spectra within 3 ppm mass accuracy (refer to Section 3.2.1 and Fig. A.13–A.21).
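Matching observed peaks to calculated m/z values within a ppm window can be written in a few lines. The sketch below is illustrative only; the function names and the default 3 ppm window (taken from the accuracy quoted above) are assumptions of this example, not a description of the software used.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_targets(observed_peaks, targets, tol_ppm=3.0):
    """Map each observed m/z to the name of the first target that lies
    within tol_ppm; peaks without a match are left out."""
    hits = {}
    for peak in observed_peaks:
        for name, target_mz in targets.items():
            if abs(ppm_error(peak, target_mz)) <= tol_ppm:
                hits[peak] = name
                break
    return hits
```

For instance, the peak annotated at m/z 289.2597 in Fig. 3.3 deviates from the calculated 289.2598 by roughly 0.35 ppm and would therefore be accepted.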
[Figure 3.3: mass spectrum, relative abundance against m/z (260–440). Annotated peaks (m/z): 275.2442, 289.2597, 303.2753, 317.2911, 319.2703, 333.2859, 347.3016, 399.2401, 413.2524, 431.2572.]
Figure 3.3 MS spectrum of the acidic hydrolysate of silaffin-3 from T. pseudonana (tpSil3).
Next, the HCl hydrolysate of the purified tpSil3 was subjected to AQC-derivatization. The resulting QAC-derivatives were injected into the LC-MS system for quantification of modified lysines and other amino acids from their XIC signals (see Section 5.6 for experimental details). The amino acids were quantified from calibration curves (Fig. A.2), while the molar amounts of ε-polyamine lysine derivatives were calculated relative to the amount of the ornithine-based internal standard (PTM 275-orn, Fig. 3.2a). The calculated molar amounts of both amino acids and ε-polyamines were normalized to the total amount of all QAC-derivatives. To verify the results obtained from this experiment, the same sample of tpSil3 was analyzed for amino acid content using UV-detection at 280 nm (refer to Section 5.7). Both results were compared to the theoretical amino acid composition of tpSil3. The full profile of lysine modifications and the amino acid content of this protein are displayed in Fig. 3.4.
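The normalization step can be illustrated with a short sketch. The function names, the key convention (modified lysines carry a "PTM" prefix) and the numbers in the usage note are invented for illustration and do not come from the tpSil3 measurements.

```python
def relative_abundance(molar_amounts):
    """Normalize molar amounts to the total of all QAC-derivatives (in %)."""
    total = sum(molar_amounts.values())
    return {name: 100.0 * value / total for name, value in molar_amounts.items()}


def modified_lysine_fraction(molar_amounts, free_key="Lys", ptm_prefix="PTM"):
    """Share (in %) of the lysine pool that carries a modification.
    Assumes modified lysines are stored under keys starting with ptm_prefix."""
    free = molar_amounts.get(free_key, 0.0)
    modified = sum(v for k, v in molar_amounts.items() if k.startswith(ptm_prefix))
    return 100.0 * modified / (free + modified)
```

With the made-up input `{"Gly": 50.0, "Ser": 30.0, "Lys": 5.0, "PTM 289": 10.0, "PTM 319": 5.0}`, free lysine makes up 5 % of all derivatives and 75 % of the lysine pool is modified.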
The polypeptide chain of tpSil3 contains a total of 33 lysine residues, which correspond to 16 % of the total amino acid content, whereas free (unmodified) lysine amounted to less than 5 % in the hydrolysate (see Fig. 3.4). At the same time, the relative abundances of the other amino acids corroborated the tpSil3 database sequence. Under the conditions of acidic hydrolysis (6 M HCl, 16–24 h at 110 °C) some amino acid residues undergo oxidation or complete degradation [113]. Upon HCl hydrolysis asparagine and glutamine were converted to aspartic acid and glutamic acid, respectively, and were therefore detected as the sum of both (Asx and Glx in Fig. 3.4). Tryptophan was completely degraded, whereas methionine and cysteine cannot be directly determined from the hydrolyzed samples due to oxidation, and therefore were not quantified. Serine, tyrosine, and threonine are partially degraded. Thus, considering the stable amino acid residues, LC-MS and UV-detection gave mutually corroborating results for amino acid determination; however, UV-detection failed to distinguish the different lysine modification species due to the lack of corresponding standards. In contrast, the developed MS-based method clearly indicated the presence of differently modified lysines. We concluded that the total amount of modified lysines corresponded to 80 %, which was in good agreement with previously reported data [67, 68]. The side-chain polyamines of the detected lysine modifications vary in the number of propylamine units and N-methyl groups. The number of derivatization groups (N×QAC) attached to primary and secondary amines of modified lysines was in accordance with the fragment spectra of the non-derivatized molecules (e. g. for PTM 303a, Fig. 3.4), which will be discussed in detail in Section 3.2.1.
[Figure 3.4: bar charts of relative abundance (%) by MS-detection, UV-detection and theoretical composition for Arg, His, Ser, Gly, Asx, Glx, Thr, Ala, Pro, Val, Leu, Ile, Phe, Tyr, Met, Lys and PTMs 175, 261, 275, 289, 303, 319, 333, 399, 413. Structure panels: (a) PTM 175 (3×QAC); (b) PTM 261 (3×QAC); (c) PTM 275 (4×QAC); (d) PTM 289 (3×QAC); (e) PTM 303a (2×QAC); (f) PTM 319 (3×QAC); (g) PTM 333 (2×QAC); (h) PTM 399 (3×QAC); (i) PTM 413 (2×QAC).]
Figure 3.4 Amino acid content and lysine PTM profile of tpSil3 by MS- and UV-detection. Validation of the developed method with the analysis of silaffin-3 from T. pseudonana (tpSil3). Only 25 % of free lysines were detected. About 75 % of total lysine content is modified with different ε-modifications, displayed in (a)–(i). Asx, aspartic acid or asparagine; Glx, glutamic acid or glutamine; QAC, derivatization group.
3.2 Profiling lysine ε-polyamine modifications in biosilica extracts from three diatom species
Silaffin proteins appear to be permanently associated with (or embedded within) the biosilica, as they are not extracted from diatom cell walls under rigorous extraction conditions (e.g., 2 % SDS at 95 °C, 8 M urea, or 6 M guanidinium·HCl) as long as the silica remains intact (discussed in Section 1.3.1). To increase the accessibility of silaffins, we employed the protocol by Kröger et al. [66], which was used previously for the characterization of LCPA, silacidins, and other biosilica-embedded components [66, 67, 72, 83]. Briefly, acidified ammonium fluoride solution (pH 4.5) was used to solubilize the biosilica-associated proteins; the resulting extract is therefore termed ammonium fluoride soluble material (AFSM).
The AFSM extracts from the three diatom species were hydrolyzed with HCl. The lysine modifications in the total acidic hydrolysates were fragmented in two alternative ways: (a) directly in the HCl hydrolysate without any pre-fractionation, and (b) in the course of an LC-MS/MS run with pre-column AQC-derivatization. The lysine ε-polyamine structures were obtained by combining information derived from both types of spectra, which are given in the Appendix (Fig. A.9–A.25). Each MS/MS spectrum was interpreted and annotated with compositions and chemical structures. Additionally, the number of derivatization adducts (N×QAC) helped to resolve isobaric molecular species using RPLC. The full catalogue of the detected lysine PTM structures from the three diatom species is summarized in Table 3.2.
3.2.1 Lysine PTM profile and characteristic fragments
Prior to the analysis of lysine ε-polyamines in total biosilica hydrolysates, the fragmentation of lysine- and ornithine-based standards bearing ε- and δ-polyamine modifications, respectively, was investigated (mass spectra are shown in Fig. 3.6). The MS/MS spectrum of the ornithine derivative with exact m/z 275.2442 (PTM 275-orn) revealed a series of fragments that corresponds to the fragmentation of the propylamine chain and corroborated the fragmentation pattern of the lysine ε-polyamine modification (cf. Fig. 3.6a and Fig. 3.6b). Cleavage positions that lead to the observed fragment ions are depicted in Fig. 3.5, where the corresponding fragments are indicated as m- and n-ion series. Optimal normalized collision energies (nCE) required for the complete fragmentation of both polyamine standards were about 25 % to 35 %; this range was applied for the fragmentation of the other modified lysines. Additionally, pronounced H2O losses were detected for ornithine fragments, where the formation of a stable six-membered lactam ring occurred (the so-called ornithine effect [158]); these were not observed for the lysine ε-polyamine standard. These hallmark fragments can be used to distinguish lysine from ornithine polyamine modifications, should the latter be present in diatom biosilica hydrolysates.
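Neutral losses such as the H2O loss of the ornithine standard can be flagged automatically by comparing precursor and fragment m/z values. The following sketch is an assumption-laden illustration: it treats all ions as singly charged, uses standard monoisotopic masses for the neutral species, and the function name and tolerance are hypothetical.

```python
# Monoisotopic masses of common neutral losses (Da).
NEUTRAL_LOSSES = {
    "H2O": 18.010565,
    "NH3": 17.026549,
    "HPO3": 79.966331,
    "H3PO4": 97.976896,
}


def annotate_neutral_losses(precursor_mz, fragment_mzs, tol=0.002):
    """Return (fragment m/z, loss name) pairs for singly charged fragments
    whose distance to the precursor matches a known neutral loss."""
    hits = []
    for fragment in fragment_mzs:
        delta = precursor_mz - fragment
        for name, mass in NEUTRAL_LOSSES.items():
            if abs(delta - mass) <= tol:
                hits.append((fragment, name))
    return hits
```

Applied to the PTM 275-orn spectrum, the observed precursor at m/z 275.2423 and the fragment at m/z 257.2320 differ by 18.0103, which this sketch reports as an H2O loss.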
Next, the acidic hydrolysates of biosilica extracts from the three diatoms were subjected to direct infusion MS/MS analysis. Altogether, 20 m/z values were detected within 3 ppm accuracy that matched ε-polyamine-modified lysine structures. In Table 3.2 each structure is referred to as ‘PTM’ with the nominal m/z of the underivatized singly charged ion (the exact m/z values are also provided). In order to validate the chemical structures of the underivatized molecular species, higher-energy collisional dissociation (HCD) MS/MS spectra were acquired for these m/z values. Direct infusion MS/MS analysis revealed the presence of multiple compounds with isobaric molecular masses. This approach, although simple and rapid, appeared to be insufficient for the analysis of complex biosilica matrices due to the presence of several structural isomers, i. e. PTMs 303, 317, and 331. Direct infusion analysis did not allow their discrimination and, consequently, it was not possible to unambiguously validate the corresponding structures from a single MS/MS spectrum. Additionally, identification and quantification of these lysine PTMs by direct MS analysis could be difficult due to significant differences in the ionization efficiencies of polyamine molecules, which would require analytical standards for each individual compound.
[Figure 3.5: ε-polyamine chain with cleavage positions m1–m5 and n1–n5.]
Figure 3.5 Schematic diagram of m- and n-ions representing fragmentation of the ε-polyamine chain (R1–R7 = H or CH3). For certain lysine modifications containing a phosphorylated δ-hydroxyl, neutral losses of H3PO4 (−98 Da) or HPO3 (−80 Da) can be observed.
To overcome this issue, LC-MS/MS was performed, which resolved the isobaric species and fragmented them separately. After pre-column derivatization each isoform reacted with a different number of AQC molecules, corresponding to the total number of primary and secondary amines present in the structure. In Table 3.2, next to each m/z value of the underivatized species, the corresponding number of derivatization adducts (N×QAC) is provided. Isobaric molecular species with different N×QAC groups were chromatographically well resolved and fragmented separately, thus providing an independent confirmation of isomers and other structures. For instance, the isomers of PTM 303 are shown in Fig. 3.7, and those of PTMs 317 and 331 in Fig. A.15–A.18. Additionally, the RPLC analysis of QAC-derivatized species revealed the presence of five low-abundant lysine modifications (PTMs 204, 218, 246, 248, and 261), which are present in biosilica hydrolysates from all three diatom species. These modifications were tentatively assigned a corresponding chemical structure based on the accurate mass and the number of QAC derivatization moieties (listed in Table 3.2).
[Figure 3.6: two spectra, relative abundance against m/z (80–300). (a) Fragment spectrum of synthetic ornithine δ-polyamine derivative PTM 275-orn (m/z 275.2442; 1+), structure fragments at m/z 86.0964, 103.1230, 116.0706, 143.1543, 160.1808, 173.1285, 230.1863; (b) Fragment spectrum of synthetic lysine ε-polyamine derivative PTM 289-lys (m/z 289.2598; 1+), structure fragments at m/z 86.0964, 103.1230, 130.0863, 143.1543, 160.1808, 187.1441, 244.2020.]
Figure 3.6 HCD MS/MS spectra (nCE 30 %) and chemical structures of synthetic standards of oligo-propylenediamine-substituted ε-lysine and δ-ornithine derivatives used for validation of the method and as internal standards: (a) ornithine δ-polyamine derivative (m/z 275.2442; 1+); (b) lysine ε-polyamine derivative (m/z 289.2598; 1+). Spectra are annotated with accurate masses, calculated chemical compositions (CHNO) and delta masses (in mmu). Lysine-specific immonium ions at m/z 84.0808 and m/z 129.1022 are not annotated. The pronounced H2O losses observed during fragmentation of PTM 275-orn are characteristic of ornithine δ-polyamines, where the formation of a stable six-membered lactam with m/z 257.2320 occurs (the so-called ornithine effect [158]); this has not been observed for lysine ε-polyamines.
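The link between the methylation pattern and the number of QAC groups can be made explicit with a small enumeration. The sketch below rests on a simplified model that is an assumption of this example, not a statement from the measurements: the α-amine is always derivatized, an interior chain nitrogen is derivatized only when it carries no methyl group, the terminal nitrogen is derivatized when it carries at most one methyl group, and interior and terminal nitrogens accept up to two and three methyl groups, respectively. Function names are hypothetical.

```python
from itertools import product


def qac_count(methyls_per_nitrogen):
    """Number of QAC groups for one isomer.

    methyls_per_nitrogen lists the methyl count on each chain nitrogen,
    from the epsilon-nitrogen to the terminal nitrogen."""
    *interior, terminal = methyls_per_nitrogen
    count = 1                                     # alpha-amine of the lysine
    count += sum(1 for m in interior if m == 0)   # unmethylated interior N is secondary
    count += 1 if terminal <= 1 else 0            # terminal N is primary or secondary
    return count


def enumerate_isomers(n_pa, n_me):
    """Group all methyl distributions of a Lys-PA(n_pa) chain with n_me
    methyl groups by the number of QAC groups they would accept."""
    n_nitrogens = n_pa + 1
    limits = [2] * (n_nitrogens - 1) + [3]
    grouped = {}
    for combo in product(*(range(limit + 1) for limit in limits)):
        if sum(combo) == n_me:
            grouped.setdefault(qac_count(combo), []).append(combo)
    return grouped
```

In this model, a lysine with two propylamine units and three methyl groups (PTM 303) yields only 2×QAC and 3×QAC isomers, which is compatible with the three isomers PTM 303a, 303b and 303c listed in Table 3.2, although the model alone cannot tell which methyl distribution belongs to which isomer.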
The fragmentation of mono-, di- and trimethylated lysine derivatives results in lysine-specific immonium ions of moderate intensity at m/z 115.1229, 129.1386 and 143.1543, respectively. In addition, the immonium ion with m/z 130.0867 has been observed for all methylated lysine species (Fig. A.9–A.11). A characteristic ion specific for monomethylated lysine was observed at m/z 98.0964, which corresponds to the immonium ion after loss of NH3 (IM−NH3, Fig. A.9). These results are in good agreement with previously published data on lysine methylation [138, 159–161].
In addition to the lysine-specific fragment ions at m/z 84.0808 and m/z 129.1022, several specific marker ions can be detected for ε-polyamine chains. For instance, the MS/MS spectra of PTMs 413 and 333 (a phosphopolyamine and its non-phosphorylated counterpart, respectively) contain a pronounced fragment with the exact m/z 143.1543 (refer to Fig. A.21 and Fig. A.24). As shown in the MS/MS spectra of tpSil3 peptides in Fig. 3.18, the fragmentation of ε-polyamine-modified lysine (PTM 289) also results in the formation of the abundant fragment ion of m/z 143.1543. Similarly, this characteristic ion occurs in the spectra of PTMs 303b and 317, which have the same ε-polyamine chain structure (Fig. 3.7 and A.16). The fragmentation of the ε-polyamines PTM 317a (isomer modified by 1×QAC, Fig. A.15), PTMs 331a and 331b (isomers with 1×QAC and 2×QAC, respectively, Fig. A.17 and A.18), and PTM 347 (Fig. A.22) resulted in a characteristic fragment with m/z 157.1699, whereas the lysine derivatives PTM 275 (Fig. A.13), PTM 303b (isomer with 3×QAC, Fig. 3.7), and PTM 319 (Fig. A.20) demonstrated the presence of the m/z 129.1386 diagnostic ion.
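The marker ions can be cross-checked against the elemental compositions annotated in the fragment spectra. The sketch below computes the m/z of a cation from its composition; it is an illustration with hypothetical names, assumes standard monoisotopic atomic masses, and treats the annotated compositions (for example C8H19N2 for m/z 143.1543 and C9H21N2 for m/z 157.1699 in Fig. 3.7) as those of the charged species.

```python
import re

# Monoisotopic atomic masses (Da) for the elements used in the annotations.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
}
ELECTRON = 0.00054858


def cation_mz(formula, charge=1):
    """m/z of a cation whose full elemental composition is given,
    e.g. 'C8H19N2'. The electron mass is subtracted once per charge."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return (mass - charge * ELECTRON) / charge
```

Under these assumptions `cation_mz("C8H19N2")` and `cation_mz("C9H21N2")` land within a fraction of a millidalton of the two marker ions quoted above.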
Figure 3.7 HCD MS/MS spectra of isomeric lysine derivatives PTM 303 (relative abundance against m/z): (a) fragment spectrum of the underivatized isomeric lysine ε-polyamines PTM 303 (m/z 303.2755; 1+), nCE 30 %; (b) fragment spectrum of the 2×QAC-derivatized lysine modification PTM 303a (m/z 322.1893; 2+), nCE 30 %; (c) fragment spectrum of the 3×QAC-derivatized lysine modification PTM 303b (m/z 407.2133; 2+), nCE 30 %; (d) fragment spectrum of the 2×QAC-derivatized lysine modification PTM 303c (m/z 322.1893; 2+), nCE 30 %. Fragment peaks are annotated with an accurate mass, the corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
Table 3.2 Catalogue of lysine modifications and their characteristic fragments. Cleavage positions that lead to the observed fragment ions are marked on each structure. Lysine-specific fragments at m/z 84.0808 and m/z 129.1022 are not listed.
PTM | m/z | n×QAC | reporter and characteristic fragments (m/z) | spectrum | note
Ornithine-based internal standard
PTM 275-orn (standard) | 275.2442 | 3×QAC | 86.0964, 103.1230, 116.0706, 143.1543, 160.1808, 173.1285, 230.1863 | Fig. A.8, p. 126 |
ε-methylated lysines (Fig. 3.10b)
PTM 161 | 161.1285 | 2×QAC | 98.0964, 115.1229 | Fig. A.9, p. 127 |
PTM 175 | 175.1441 | 1×QAC | 129.1386 | Fig. A.10, p. 128 | +28 K in Fig. 3.21b
PTM 189 | 189.1598 | 1×QAC | 144.1388 | Fig. A.11, p. 129 | +42 K in Fig. 3.21c
ε-polyaminated lysines (Fig. 3.10c)
PTM 204 | 204.1707 | 3×QAC | | no MS/MS | low-abundant derivative
PTM 218 | 218.1863 | 3×QAC | | no MS/MS | low-abundant derivative
PTM 232 | 232.2020 | 2×QAC | 103.1230, 130.0863, 161.1285, 201.1598 | Fig. A.12, p. 130 | +85 K in Fig. 3.21f
PTM 246 | 246.2176 | 2×QAC | | no MS/MS | low-abundant derivative
PTM 261 | 261.2285 | 4×QAC | | no MS/MS | low-abundant derivative
PTM 275 | 275.2442 | 4×QAC | 129.1386 | Fig. A.13, p. 131 |
PTM 289 | 289.2598 | 3×QAC | 86.0964, 103.1230, 130.0863, 143.1543, 160.1808, 187.1441, 244.2020 | Fig. A.14, p. 132 | +142 K in Fig. 3.21d
PTM 303a | 303.2755 | 2×QAC | 130.0863, 143.1543, 201.1598, 258.2176 | Fig. 3.7b, p. 51 |
PTM 303b | 303.2755 | 3×QAC | 117.1386, 130.0863, 157.1699, 187.1441, 232.2020 | Fig. 3.7c, p. 52 |
PTM 303c | 303.2755 | 2×QAC | 157.1699, 187.1441, 388.2343, 473.3235 | Fig. 3.7d, p. 52 |
PTM 317a | 317.2911 | 1×QAC | 130.0863, 157.1699, 201.1598, 272.2333 | Fig. A.15, p. 133 |
PTM 317b | 317.2911 | 2×QAC | 130.0863, 143.1543, 215.1754, 272.2333 | Fig. A.16, p. 134 |
PTM 331a | 331.3068 | 1×QAC | 157.1699 | Fig. A.17, p. 135 |
PTM 331b | 331.3068 | 2×QAC | 157.1699 | Fig. A.18, p. 136 |
δ-hydroxylated lysines (Fig. 3.10d)
PTM 163 | 163.1077 | 2×QAC | | no MS/MS |
PTM 205 | 205.1547 | 1×QAC | | Fig. A.19, p. 137 |
PTM 248 | 248.1969 | 2×QAC | | no MS/MS |
PTM 319 | 319.2704 | 3×QAC | 129.1386 | Fig. A.20, p. 138 |
PTM 333 | 333.2860 | 2×QAC | 143.1543, 188.2121 | Fig. A.21, p. 139 | +186 K in Fig. 3.21e
PTM 347 | 347.3017 | 2×QAC | 157.1699 | Fig. A.22, p. 140 |
Phosphopolyamines (Fig. 3.10e)
PTM 399 | 399.2367 | 3×QAC | 129.1386 | Fig. A.23, p. 141 |
PTM 413 | 413.2523 | 2×QAC | 143.1543 | Fig. A.24, p. 142 |
PTM 427 | 427.2680 | 2×QAC | 157.1699 | Fig. A.25, p. 143 |
An important outcome of the fragmentation study is the discovery of characteristic ions in the fragmentation spectra of the ε-polyamine modifications. These fragments are diagnostic for modified lysine residues and can be used for peptide-independent, modification-specific detection of ε-polyamine PTMs. The detected characteristic ions predominantly represent ε-side-chain fragments or IMs, whose m/z values are summarized in Table 3.2 (below each structure). As in the MS/MS spectra of other lysine modifications, the majority of modification-specific fragments fall in the low-mass region between m/z 50 and 300 and could therefore be detected with high resolution and sub-ppm mass accuracy (refer to spectra in Fig. A.8–A.25). These diagnostic modification-specific ions are important for the identification of modified peptides and for the validation of MS/MS-based PTM assignments, which will be discussed further in Section 3.3.5.
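The use of diagnostic ions described above can be illustrated with a minimal sketch. It screens a list of fragment m/z values for three characteristic ions taken from Table 3.2. The function names and the 5 ppm tolerance are illustrative assumptions, not the procedure used in this work.

```python
# Characteristic fragment m/z values copied from Table 3.2,
# with the PTMs that list them in that table.
DIAGNOSTIC_IONS = {
    129.1386: "PTMs 175, 275, 319, 399",
    143.1543: "PTMs 275-orn, 289, 303a, 317b, 333, 413",
    157.1699: "PTMs 303b, 303c, 317a, 331a, 331b, 347, 427",
}


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def find_diagnostic_ions(peaks_mz, tol_ppm=5.0):
    """Return (observed, reference, PTMs) for every peak matching a diagnostic ion.

    tol_ppm is a hypothetical tolerance chosen for illustration only.
    """
    hits = []
    for mz in peaks_mz:
        for ref, ptms in DIAGNOSTIC_IONS.items():
            if abs(ppm_error(mz, ref)) <= tol_ppm:
                hits.append((mz, ref, ptms))
    return hits


# Three of the peaks annotated in Fig. 3.8a (PTM 413).
hits = find_diagnostic_ions([86.0970, 143.1543, 188.2120])
```

Only the peak at m/z 143.1543 matches a listed ion here, which agrees with the entry for PTM 413 in Table 3.2.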
The accurate masses and MS/MS spectra of six lysine modifications corresponded to phosphorylated δ-hydroxylysines with ε-polyamine chains and their non-phosphorylated counterparts, which differ by the 80 Da mass shift of HPO3 (PTMs 399, 413 and 427, and PTMs 319, 333 and 347, respectively). However, it is highly surprising that an O-phosphoester bond can resist exhaustive acidic hydrolysis² (6 N HCl, 24 h at 110 °C). In the current work, these structures, collectively denoted as phosphopolyamines, were investigated in an elaborate structural study, which is discussed in the following section (Section 3.2.2).
3.2.2 Elucidation of phosphopolyamine structures resistant to acidic hydrolysis
Profiling of biosilica hydrolysates and purified tpSil3 revealed the presence of QAC-derivatized phosphopolyamines, which eluted earlier than their non-phosphorylated counterparts (see Fig. A.5). Two alternative structures were proposed, which contained either a phosphoester bond (C–O–P, Fig. 3.9b) or a phosphonate group (C–P, Fig. 3.9c). Both structures were consistent with the MS/MS spectrum of PTM 413 (m/z 413.2523, Fig. 3.8a); however, the observed intensity corresponding to the H3PO4 neutral loss (−98 Da) was unexpectedly weak. To test the C–P bond assumption, total biosilica hydrolysate was derivatized with acetic anhydride to detect the corresponding mass shifts for acetylated species (see Section 5.9). The resulting MS/MS spectrum, acquired from the doubly acetylated derivative at m/z 497.2735 and shown in Fig. 3.8b, does not, however, allow the C–P bond in the phosphopolyamine structure to be assigned unambiguously. The same problem was subsequently encountered in the analysis of the less abundant phosphorylated species (PTMs 399 and 427), which were detected in biosilica hydrolysates (displayed in Fig. 3.10e, phosphorylated).
2 On the other hand, phosphorylated structures were completely converted to non-phosphorylated ones by HF treatment (Fig. A.6).
(a) MS/MS of underivatized phosphopolyamine (m/z 413.2523). [Spectrum plot: relative abundance (%) versus m/z (100–500). Annotated peaks: 86.0970 (C5H12N, 0.0589 mmu); 98.0969 (C6H12N, 0.4630 mmu); 143.1543 (C8H19N2, −0.0078 mmu); 188.2120 (C10H26N3, −0.0819 mmu); 271.1051 (C11H11ON8, 0.0239 mmu); 315.2751 (C16H35O2N4, −0.8737 mmu); 333.2855 (C16H37O3N4, −0.4721 mmu); 413.2519 (C16H38O6N4P, −0.4093 mmu). Neutral losses: −79.9664 (HPO3, 0.0660 mmu); −97.9768 (H3PO4, −0.0885 mmu).]
(b) MS/MS of doubly acetylated phosphopolyamine (m/z 497.2735). [Spectrum plot: relative abundance (%) versus m/z (100–500). Annotated peaks: 86.0965 (C5H12N, −0.4393 mmu); 114.0912 (C6H12ON, −0.6790 mmu); 185.1644 (C10H21ON2, −0.9816 mmu); 230.2220 (C12H28ON3, −1.2135 mmu); 313.2040 (C16H30O2N2P, −0.4685 mmu); 399.2955 (C16H35O2N10, 1.0397 mmu); 417.3061 (C16H37O3N10, 1.0903 mmu); 497.2723 (C16H38O6N10P, 0.9930 mmu). Neutral losses: −79.9662 (HPO3, −0.0973 mmu); −97.9768 (H3PO4, −0.0794 mmu).]
Figure 3.8 Phosphopolyamine MS/MS spectra showing H3PO4 (−98 Da) or HPO3 (−80 Da) neutral losses.
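The neutral losses and the acetylation shift discussed for PTM 413 can be checked with simple mass arithmetic. The sketch below uses standard monoisotopic atomic masses and assumes that each acetylation adds C2H2O; the function name is a hypothetical helper introduced for illustration.

```python
# Monoisotopic atomic masses (Da).
H, C, O, P = 1.007825, 12.000000, 15.994915, 30.973762

HPO3 = H + P + 3 * O        # about 79.9663 Da
H3PO4 = 3 * H + P + 4 * O   # about 97.9769 Da
ACETYL = 2 * C + 2 * H + O  # C2H2O, about 42.0106 Da (assumed shift per acetyl group)


def neutral_loss_fragments(precursor_mz):
    """Expected singly charged fragment m/z after loss of HPO3 or H3PO4."""
    return {
        "-HPO3": precursor_mz - HPO3,
        "-H3PO4": precursor_mz - H3PO4,
    }


underivatized = neutral_loss_fragments(413.2523)   # PTM 413
doubly_acetylated_mz = 413.2523 + 2 * ACETYL       # expected precursor after acetylation
acetylated = neutral_loss_fragments(doubly_acetylated_mz)
```

Under these assumptions the HPO3 loss from m/z 413.2523 lands on the m/z of PTM 333 (333.2860), and two acetyl groups reproduce the precursor at m/z 497.2735 quoted in the text.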
(a) 31P-NMR spectrum of T. pseudonana biosilica hydrolysate. [Spectrum plot: axis δ(31P) in ppm; labelled values 0.00, −1.67 and −1.70 ppm.]
(b) C–O–P; −0.6 ppm
(c) C–P; 8.5 ppm
Figure 3.9 31P-NMR spectrum of T. pseudonana biosilica hydrolysate.
To further elucidate the structure of the phosphopolyamines, diatom biosilica hydrolysates were subjected to 31P-NMR analysis, which was carried out by Marcus Rauche in the laboratory of Eike Brunner (Technische Universität Dresden, Germany). The NMR spectrum of T. pseudonana biosilica hydrolysate (displayed in Fig. 3.9a) revealed one signal at −1.68 ppm. Phosphoserine, with a C–O–P bond, and N-phosphonomethylglycine (glyphosate), with a C–P bond, were measured independently as reference compounds to distinguish the chemical shifts of the two bond types (data not shown). The 31P-NMR spectra show that the chemical shift is about 8.5 ppm for the C–P bond (Fig. 3.9c) and about −0.6 ppm for the C–O–P bond (see Fig. 3.9b). The signal at −1.68 ppm in the 31P-NMR spectrum of the hydrolysate supports the assumption that the phosphorus in all phosphopolyamines is attached to the lysine residue via oxygen and is not bonded directly to a carbon (Fig. 3.9b). Comparison of the 1H-decoupled and non-decoupled 31P-NMR spectra shows that the signal at −1.68 ppm has a doublet structure due to J-coupling (approximately 7.3 Hz) to one neighboring 1H nucleus. Coupling constants around 7 Hz are typical for 3J(PH) couplings, which indicates that the phosphate residue is linked to a disubstituted C–H group. In contrast, the P–H coupling constant in glyphosate is much higher, approximately 12.71 Hz, which is typical for 2J(PH) couplings [162]. Consequently, both the chemical shift and the multiplet structure indicate that the phosphorus-containing compounds in T. pseudonana biosilica hydrolysate contain a phosphoester bond (C–O–P, see Fig. 3.9b).
Taken together, these results suggest that phosphopolyamine modifications comprise an abundant class of lysine post-translational modifications (see Fig. 3.10e and Fig. 3.11e). These modifications have not been described in the literature before; however, phosphorylation of the hydroxyl group of N-trimethylhydroxylysine was previously reported by Nakajima and Volcani in the diatom Navicula pelliculosa ([52], displayed in Fig. 1.5). Later, the same lysine modification was found in silaffin-1A from C. fusiformis [66]. The presence of these modifications in biosilica extracts from different diatom species indicates that phosphopolyamines may play an important role in the biosilicification process. Phosphopolyamines occurred in extracts of both T. pseudonana and C. cryptica (although with different abundances, see Fig. 3.10e), but were completely absent from the T. oceanica extract. This observation motivated a comparative study of lysine PTMs to analyze the similarities and differences in the ε-polyamine profiles of the three diatom species.
3.2.3 Lysine polyamine modification profiles of AFSM extracts
The AFSM biosilica hydrolysates were derivatized with AQC and subjected to LC-MS/MS analysis, in which QAC derivatives were detected and quantified by XICs of their protonated molecular ions (the exact experimental procedure is described in Section 5.6). The molar amounts of ε-polyamines were calculated relative to the amount of internal standard spiked into each sample (PTM 275-orn) and then normalized to the total molar amount of all ε-polyamine derivatives. Technical reproducibility was evaluated from three successive LC runs, and for biological reproducibility two biological replicates were averaged. Coefficients of variation for the AQC-derivatized internal standard, spiked into the biosilica extract samples before analysis, were within 3–14 %. Coefficients of variation for technical replicates were within 10 % for all measurements. The moderate reproducibility could be explained by biological variation in the diatom cultures (age variations, different growth rates, etc.).
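The quantification scheme described above (scaling to the spiked internal standard, normalization to the total, and coefficients of variation across replicates) can be sketched as follows. The function names and the example numbers are hypothetical, and equal response factors for all derivatives are assumed for illustration.

```python
from statistics import mean, stdev


def mol_percent(xic_areas, standard_area, standard_amount):
    """Convert XIC peak areas to mol % of the summed derivatives.

    Each area is scaled by the internal standard (area and spiked amount),
    then normalized to the total. Assumes equal response factors.
    """
    amounts = {
        name: area / standard_area * standard_amount
        for name, area in xic_areas.items()
    }
    total = sum(amounts.values())
    return {name: 100.0 * amount / total for name, amount in amounts.items()}


def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical areas for two derivatives and the spiked standard.
profile = mol_percent({"PTM 175": 300.0, "PTM 289": 100.0},
                      standard_area=200.0, standard_amount=10.0)
cv = coefficient_of_variation([9.0, 10.0, 11.0])
```

Because the profile is normalized to the total, the spiked amount cancels out of the mol % values; it matters only when absolute molar amounts are compared between samples.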
The content and abundance of modified lysines are shown in Fig. 3.10. To evaluate the total occupancy of lysine residues in each AFSM extract, the fraction of unmodified lysine residues, PTM 147 (2×QAC), out of total detected lysines was measured and is shown in Fig. 3.10a (unmodified lysine out of total amount). Notably, total lysine occupancy accounted for 75–85 %, which is consistent with the previous analysis of purified tpSil3 protein (~75 %, see Section 3.1.2). Altogether, 25 modified lysine QAC-derivatives were detected within 3 ppm accuracy, and their chemical structures were confirmed by high-resolution MS/MS. Upon dissociation of the QAC moieties, the reporter fragment at m/z 171.0564 is readily generated, while the remaining fragments correspond to fragmentation of the underivatized lysine modifications. Interpretation of these spectra in combination with the number of reacted derivatization moieties (N×QAC) aided the assignment of N-methylation in the ε-polyamine side chain and thus helped to resolve ambiguities between structural isomers. Independent proof for the assigned polyamine structures was obtained from high-resolution MS/MS spectra of the underivatized molecules, as discussed in Section 3.2.1. All quantitative and structural data obtained are summarized schematically in Fig. 3.10, where the corresponding chemical structures are provided next to the data bars.
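The relation between the m/z of an underivatized modification and the m/z of its N×QAC derivative can be sketched with a short calculation. It assumes that each QAC group adds C10H6N2O (170.0480 Da) and that every additional charge comes from one extra proton; both assumptions are consistent with the precursor values quoted in the captions of Fig. 3.7, but the function itself is only an illustration.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
H, C, N, O = 1.007825, 12.000000, 14.003074, 15.994915
PROTON = 1.007276

# Assumed mass added per QAC group: C10H6N2O.
QAC = 10 * C + 6 * H + 2 * N + O


def derivative_mz(ptm_mz, n_qac, charge=1):
    """m/z of the n×QAC derivative of a PTM given its singly charged m/z."""
    singly_charged = ptm_mz + n_qac * QAC
    return (singly_charged + (charge - 1) * PROTON) / charge


mz_303b = derivative_mz(303.2755, n_qac=3, charge=2)  # 3×QAC, 2+
mz_303c = derivative_mz(303.2755, n_qac=2, charge=2)  # 2×QAC, 2+
```

With these assumptions the two results agree with the precursor m/z values given for PTM 303b (407.2133) and PTM 303c (322.1893) to within about 0.1 mmu, which shows how the number of QAC groups separates isomers of identical underivatized mass.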
Mono-, di- and trimethylation of the lysine ε-amino group together represented the most abundant modification, accounting for 50–70 % of the total PTM abundance (m/z 161.1285, 175.1441, 189.1598; ε-methylated, Fig. 3.10b). Linear polyamine chains attached to the ε-amino groups of lysine residues represent the most structurally diverse subgroup of lysine PTMs (13 structures out of 25), whose polyamine side chains displayed different degrees of N-methylation (ε-polyaminated, Fig. 3.10c). Six ε-polyaminated and ε-methylated lysines were δ-hydroxylated (Fig. 3.10d). Additionally, the LC-MS/MS analysis revealed three phosphorylated hydroxylysine derivatives (PTMs 399, 413, and 427, Fig. 3.10e), whose non-phosphorylated counterparts with the corresponding mass difference of 80 Da were also observed (PTMs 319, 333 and 347, respectively; Fig. 3.10d). δ-Hydroxylysine (PTM 163, 2×QAC) and ε-N,N,N-trimethyl-δ-hydroxylysine (PTM 205, 1×QAC) have been reported previously in other diatom species [52]. A number of ε-polyamine-modified lysines were previously reported for silaffin proteins, e. g. PTMs 275, 289, 303, 333, which were also detected in tpSil3 hydrolysate (see [68] and Section 3.1.2). In addition to these seven already known lysine PTMs, 18 novel lysine modifications are reported here for the first time.
[Bar chart: mol. % (0–50 %) for each PTM, labelled by nominal m/z of PTM (N×QAC-groups), with chemical structures next to the bars. Panels: (a) unmodified lysine (out of total amount), PTM 147 (2×QAC), bar labels 19.4 ± 5.1 %, 31.5 ± 5.2 % and 29.65 ± 6.6 %; (b) ε-methylated: 161 (2×QAC), 175 (1×QAC), 189 (1×QAC); (c) ε-polyamines: 204 (3×QAC), 218 (3×QAC), 232 (2×QAC), 246 (2×QAC), 261 (4×QAC), 275 (4×QAC), 289 (3×QAC), 303 (2×QAC), 303 (3×QAC), 317 (1×QAC), 317 (2×QAC), 331 (1×QAC), 331 (2×QAC); (d) δ-hydroxy-polyamines: 163 (2×QAC), 205 (1×QAC), 248 (2×QAC), 319 (3×QAC), 333 (2×QAC), 347 (2×QAC); (e) phosphorylated: 399 (3×QAC), 413 (2×QAC), 427 (2×QAC). Legend: T. oceanica, C. cryptica, T. pseudonana.]
Figure 3.10 Structure and content of lysine post-translational modifications (PTMs) in hydrolysates of diatom biosilica AFSM extracts from TP, CC, and TO. Error bars for two biological replicates. Chemical structures of detected lysine modifications are shown with the respective number of QAC-groups; polyamine molecules are annotated with nominal m/z values of the singly protonated molecular ion. See also Table 3.2 for details.
The length of the polyamine chain in lysine modifications is restricted to two repeated propylamine units in all three diatom species, while the degree of methylation of the polyamine chains may vary substantially. For instance, the modification PTM 303 is present in all extracts as two structural isomers (see Fig. 3.7 in Section 3.2.1). These two isoforms, denoted as PTM 303a and PTM 303b in Table 3.2, were well separated by LC-MS/MS because they carry different numbers of QAC-derivatization groups (2×QAC and 3×QAC, respectively). The different number of derivatization groups indicates a different N-methylation pattern of their polyamine chains. The abundances of both isoforms differed significantly across the profiles of all three diatom species. The same was observed for PTM 331 (1×QAC and 2×QAC), whose isoforms were specific for T. oceanica and C. cryptica, respectively; here too, the number of reacted QAC-groups helped to resolve the structural isomers.
3.2.4 Comparison of AFIM and AFSM profiles in T. pseudonana
After dissolution of the mineral phase and extraction of AFSM, the organic matrix that remains insoluble after ammonium fluoride treatment was also isolated from T. pseudonana biosilica and analyzed. It has been shown previously that ammonium fluoride insoluble material (AFIM) contains proteins, polysaccharides and long-chain polyamines (LCPAs) [71, 163–165]; however, it has remained poorly characterized in any diatom species because of the laborious and inefficient isolation procedure. In the current work, the biochemical composition and, importantly, the ε-polyamine profile of the AFIM were addressed in T. pseudonana. The lysine ε-polyamine profiles of AFSM and AFIM are compared in Fig. 3.11. The qualitative composition of lysine PTMs is the same in both fractions; at the same time, AFSM and AFIM display striking quantitative differences in their modification profiles. On the one hand, the abundance of ε-methylated lysine species is decreased in the AFIM extract (Fig. 3.11b); on the other hand, the content of δ-hydroxylated and particularly phosphorylated species is substantially increased in the AFIM fraction (Fig. 3.11d–e).
[Bar chart for T. pseudonana, AFSM versus AFIM: mol. % (0–70 %) for each PTM, labelled by nominal m/z (N×QAC). Panels: (a) unmodified lysine (out of total amount), 147 (2×QAC), bar labels 14.9 % ± 0.6 and 29.6 % ± 6.6; (b) ε-methylated: 161 (2×QAC), 175 (1×QAC), 189 (1×QAC); (c) ε-polyaminated: 204 (3×QAC), 261 (4×QAC), 275 (4×QAC), 289 (3×QAC), 303 (2×QAC), 303 (3×QAC), 317 (2×QAC); (d) δ-hydroxylated: 163 (2×QAC), 205 (1×QAC), 319 (3×QAC), 333 (2×QAC); (e) phosphorylated: 399 (3×QAC), 413 (2×QAC).]
Figure 3.11 Comparison of AFSM and AFIM
The abundance of unmodified lysine (out of total amount) in AFIM is half of that in AFSM (Fig. 3.11a), and the profile in AFIM is shifted towards ε-polyamine species. It is important to mention that the proteins comprising the insoluble fraction differ from those in AFSM and consist mainly of cingulins and newly discovered proteins termed silicanins [90, 166]. The larger number of lysine PTMs in these proteins would be expected to enhance the silica formation activity of protein aggregates in AFIM, which might be important for biosilica morphogenesis in the girdle band region. However, PTM profiling of AFIM fractions in other diatom species remains to be investigated.
3.2.5 Phylogenetic relationship across three diatom species
The modification pool shared by all diatom species is qualitatively conserved for most of the modifications, although the relative abundances of lysine PTMs differed strongly across the profiles of all species. Some modifications are shared by all studied species, whereas several PTMs appear to be species-specific. To investigate this further, we compared the full profiles of lysine modifications in biosilica extracts from the three diatom species T. pseudonana, C. cryptica and T. oceanica. The presentation of the phylogenetic relationship is considerably simplified by the tabular view in Table 3.3, where the lysine modifications are clustered into boxes of species-specific modifications based on their abundance or occurrence.
Table 3.3 shows that T. pseudonana and C. cryptica shared almost the same polyamine modification pool, with three PTMs specific for C. cryptica, while multiple exceptions occurred for the phylogenetically more distant diatom T. oceanica. In total, six lysine modifications were specific for the T. oceanica extract (i. e., PTMs 218, 232, 246, 248, 317a and 331a). Polyamine PTM profiles therefore follow the phylogenetic proximity across the three diatom species, which is reflected by the phylogenetic tree on top of each column in Table 3.3 (adapted from Fig. 1.9a and [156, 157]).
Strong differences in almost all PTM abundances imply the existence of a tightly regulated enzymatic machinery responsible for biosilica formation in diatoms. Differently modified lysines may represent intermediates of successive silaffin processing steps, where the enzymatic machinery for post-translational modification should include methyltransferases, aminopropyltransferases, and hydroxylases (see Fig. 3.10b–d). There is a clear structural similarity between lysine ε-polyamine chains and LCPAs, indicating a common enzymatic pathway for the biosynthesis of these molecules (as previously hypothesized in [99, 167]). This scheme implies that hydroxylation and phosphorylation of lysine residues occur after ε-polyamination, which is supported by the presence of the corresponding ‘intermediates’. Based on the clustered data summarized in Table 3.3, it is possible to draw hypothetical routes for lysine post-translational modifications, which are shown in Fig. 3.12 (the direction of the hypothetical PTM pathway is marked with arrows).
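The mass increments that connect neighbouring PTMs in such a pathway can be assigned to candidate enzymatic steps by comparing m/z differences with exact elemental masses. The following sketch does this for four step types; the step names, the 0.002 Da tolerance, and the use of protonated lysine (m/z 147.1128) as the starting point are assumptions made for illustration.

```python
# Monoisotopic atomic masses (Da).
H, C, N, O, P = 1.007825, 12.000000, 14.003074, 15.994915, 30.973762

# Exact mass added by each assumed modification step.
STEPS = {
    "methylation (+CH2)": C + 2 * H,
    "aminopropylation (+C3H7N)": 3 * C + 7 * H + N,
    "hydroxylation (+O)": O,
    "phosphorylation (+HPO3)": H + P + 3 * O,
}


def classify_step(mz_from, mz_to, tol=0.002):
    """Name the step whose exact mass matches mz_to - mz_from, else None."""
    delta = mz_to - mz_from
    for name, mass in STEPS.items():
        if abs(delta - mass) <= tol:
            return name
    return None


# m/z values from Table 3.2; 147.1128 is protonated, unmodified lysine.
step_a = classify_step(147.1128, 161.1285)  # lysine -> PTM 161
step_b = classify_step(147.1128, 204.1707)  # lysine -> PTM 204
step_c = classify_step(189.1598, 205.1547)  # PTM 189 -> PTM 205
step_d = classify_step(333.2860, 413.2523)  # PTM 333 -> PTM 413
```

The nominal increments of 14, 57, 16 and 80 Da in Fig. 3.12 correspond to these four step types; matching on exact rather than nominal mass avoids confusing isobaric increments.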
Nevertheless, the surprising variability of polyamine modifications and the strong differences in the abundances of (almost) all lysine derivatives point to a set of differently modified proteins potentially involved in biosilica morphogenesis. Further elucidation of the site specificity of these protein modifications is therefore important for a mechanistic understanding of biosilicification processes in different diatom species. To follow up on this notion, it is necessary to locate the sites of these modifications in silaffin sequences and to perform an inter- and cross-species comparison of the corresponding PTM patterns, which is presented in Section 3.3.
[Scheme: lysine and PTMs 161, 175, 189, 204, 218, 232, 246, 261, 275, 289, 303 (2×QAC), 303 (3×QAC), 317 (1×QAC), 317 (2×QAC), 331 (1×QAC), 331 (2×QAC), 163, 205, 248, 319, 333, 347, 399, 413 and 427 connected by arrows labelled with mass increments of +14 Da, +16 Da, +28 Da, +57 Da and +80 Da; some arrows are marked with "?". Boxes: (a) All species; (b) TP and CC specific; (c) TO specific; (d) CC specific.]
Figure 3.12 Hypothetical routes for lysine post-translational modifications from the three diatom species, based on their occurrence and abundance. See also Table 3.3. TP, T. pseudonana; CC, C. cryptica; TO, T. oceanica.
Table 3.3 Tabular representation of the data from Fig. 3.10 (clustered by occurrence and abundance). The phylogenetic relationship between the three diatom species is shown by the tree, which is adapted from Alverson et al. [156]. The abundance profiles of modified lysines were similar in the phylogenetically more closely related T. pseudonana and C. cryptica, and both differed from the profile in the phylogenetically more distant T. oceanica. TP, T. pseudonana; CC, C. cryptica; TO, T. oceanica. Values are mol. %.
PTM (n×QAC) | TP | CC | TO
Occurred in TP, CC and TO
PTM 163 (2×QAC) | 0.7±0.3% | 0.2±0.0% | 4.6±0.5%
PTM 161 (2×QAC) | 4.2±0.9% | 9.7±1.6% | 3.3±1.7%
PTM 175 (1×QAC) | 30.7±4.2% | 23.2±7.3% | 44.1±0.6%
PTM 189 (1×QAC) | 2.5±0.4% | 24.9±1.4% | 17.4±2.6%
PTM 204 (3×QAC) | 0.8±0.2% | 0.4±0.1% | 1.0±0.1%
PTM 261 (4×QAC) | 1.5±0.7% | 0.2±0.1% | 2.3±0.3%
PTM 289 (3×QAC) | 16.4±4.5% | 12.4±2.8% | 2.8±0.9%
PTM 303a (2×QAC) | 2.2±0.1% | 1.6±0.3% | 6.2±1.4%
PTM 317b (2×QAC) | 0.5±0.5% | 0.4±0.1% | 0.9±0.2%
Occurred in TP and CC
PTM 205 (1×QAC) | 2.2±0.5% | 1.6±0.7% | 0.0±0.0%
PTM 275 (4×QAC) | 7.5±0.6% | 5.3±0.8% | 0.5±0.3%
PTM 303b (3×QAC) | 1.0±0.2% | 3.1±0.7% | 0.2±0.0%
PTM 319 (3×QAC) | 3.7±1.1% | 2.6±0.3% | 0.1±0.2%
PTM 333 (2×QAC) | 9.3±4.2% | 1.7±0.1% | 0.4±0.2%
PTM 399 (3×QAC) | 3.7±1.5% | 4.0±0.4% | 0.0±0.0%
PTM 413 (2×QAC) | 9.2±5.3% | 3.1±0.4% | 0.1±0.1%
Occurred in TO only
PTM 218 (3×QAC) | 0.0±0.0% | 0.0±0.0% | 0.8±0.5%
PTM 232 (2×QAC) | 0.0±0.0% | 0.1±0.1% | 10.7±0.3%
PTM 246 (2×QAC) | 0.0±0.0% | 0.0±0.0% | 1.5±0.3%
PTM 248 (2×QAC) | 0.0±0.0% | 0.0±0.0% | 1.1±0.2%
PTM 317a (1×QAC) | 0.1±0.0% | 0.1±0.1% | 2.5±2.0%
PTM 331a (1×QAC) | 0.0±0.0% | 0.0±0.0% | 1.3±0.2%
Occurred in CC only
PTM 331b (2×QAC) | 0.0±0.0% | 3.3±0.9% | 0.1±0.0%
PTM 347 (2×QAC) | 0.0±0.0% | 0.9±0.3% | 0.0±0.0%
PTM 427 (2×QAC) | 0.0±0.0% | 1.7±0.4% | 0.0±0.0%
3.3 Site-specific localization and discovery of consensus motifs for lysine polyamine PTMs
To investigate the site specificity of the post-translational modification machinery in diatom biosilica, the PTM profiles were followed up by accurate mapping of modification sites in biosilica-associated proteins from the three diatom species. Localization of the lysine modifications from the PTM profiles (displayed in Fig. 3.10) can be achieved through the ‘bottom-up’ approach [109]. The presence of characteristic fragments (summarized in Table 3.2) in MS/MS spectra of modified peptides is beneficial for further validation of the PSMs found. However, canonical bottom-up proteomics faces important limitations when it comes to the analysis of highly post-translationally modified proteins, such as silaffins. To address these challenges, the current study focused on tailoring existing proteomics methodologies towards the localization of highly heterogeneous lysine polyamine modifications in biosilica-associated proteins. The applicability of this specialized approach was first validated by analysis of a previously characterized protein, silaffin-3 from T. pseudonana (tpSil3) [67, 68]. Next, PTMs were mapped onto sequences of biosilica-associated proteins from three closely related diatom species (T. pseudonana, C. cryptica, T. oceanica). Finally, the modification sites found were aligned to reveal consensus motifs for post-translational modification.
3.3.1 Multiple protease strategy for mapping lysine PTMs
As discussed in Section 3.1.3, tpSil3 is a well-characterized protein from T. pseudonana biosilica extract [67, 68]. Previously, Sumper et al. attempted to map lysine modifications in tpSil3 using multiple proteolytic enzymes, chemical cleavage reagents (CNBr), and their combinations [68]. However, the information about lysine PTM sites obtained in that study is more suggestive than definitive. The major limitation was the use of low-resolution mass measurement for lysine modification mapping in the absence of proper MS/MS confirmation of the sequences of the detected modified peptides. We therefore aimed to demonstrate that our PTM localization results are consistent with, or improved over, previous tpSil3 mapping efforts. As shown above in Fig. 3.4, the amino acid content of the purified protein was analyzed and verified to be identical to the predicted one. Additionally, the purity of the native tpSil3 was examined by SDS-PAGE, which displayed a single intense band (see Fig. 3.15). To investigate the modification sites in tpSil3, the native protein was digested in-gel with several proteolytic enzymes having complementary cleavage specificity (Asp-N, chymotrypsin, Proteinase K, Glu-C and trypsin), and the resulting digests were analyzed by LC-MS/MS (refer to Section 5.14). Using more than one proteolytic enzyme with complementary cleavage specificity, a multiple protease strategy [168], increases the sequence coverage and, therefore, the chances of detecting PTM sites in biosilica-associated proteins. The combination of both highly selective and nonselective proteases improves protein and PTM coverage [169, 170]. Moreover, digesting silaffin proteins with multiple proteases also improves the chances of producing informative mass spectra, which facilitates modification assignment and helps to resolve potential PTM localization ambiguities. However, only one (modified) peptide was detected in the Asp-N digest, which resulted in drastically low sequence coverage (total ~5 %, refer to Fig. 3.13a).
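The idea of complementary cleavage specificity can be illustrated with a simplified in silico digest. The sketch below cleaves C-terminally of K/R for a trypsin-like rule and N-terminally of D for an Asp-N-like rule; these rules ignore missed cleavages and exceptions such as proline effects, and the function is a hypothetical helper rather than the search-engine settings used in this work.

```python
def digest(sequence, cleave_after="", cleave_before=""):
    """Split a protein sequence at simplified protease cleavage sites.

    cleave_after:  residues after which the backbone is cut (e.g. "KR").
    cleave_before: residues before which the backbone is cut (e.g. "D").
    """
    peptides, start = [], 0
    for i in range(1, len(sequence)):
        if sequence[i - 1] in cleave_after or sequence[i] in cleave_before:
            peptides.append(sequence[start:i])
            start = i
    peptides.append(sequence[start:])
    return peptides


# First 20 residues of the tpSil3 sequence shown in Fig. 3.13.
n_term = "EGHGGDHSISMSMHSSKAEK"
tryptic = digest(n_term, cleave_after="KR")   # trypsin-like rule
asp_n = digest(n_term, cleave_before="D")     # Asp-N-like rule
```

The two rules cut the same stretch at different positions, so peptides that are too short or too long for detection with one enzyme may be recovered with the other.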
tpSil3 sequence (205 amino acids, without signal peptide):
EGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMP
DEAGAKSAKMSMDTKSGKSEDAAAVDAKASKESHMSISGDMSMAKSHKAEAE
DVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKE
SSMPSSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSA
(a) native tpSil3 (11/205 amino acids, 5 % coverage)
(b) tpSil3 expressed from a synthetic gene (192/205 amino acids, 94 % coverage)
Figure 3.13 Peptide coverage obtained for: (a) native tpSil3 (natively purified protein, 5 %); (b) tpSil3 expressed from a synthetic chimeric gene (94 %, combined Asp-N and trypsin digests).
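The coverage values quoted in the figure are the fraction of residues that fall inside at least one identified peptide. A minimal sketch of this calculation is given below; the function name and the example peptides are hypothetical, and peptides are located by exact string matching only.

```python
def sequence_coverage(sequence, peptides):
    """Return (covered residues, percent coverage) for a list of peptides."""
    covered = [False] * len(sequence)
    for peptide in peptides:
        start = sequence.find(peptide)
        while start != -1:
            # Mark every residue of this occurrence as covered.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    n_covered = sum(covered)
    return n_covered, 100.0 * n_covered / len(sequence)


# Hypothetical example on the first 20 residues of tpSil3.
n_covered, percent = sequence_coverage("EGHGGDHSISMSMHSSKAEK", ["EGHGG", "AEK"])

# The residue counts in Fig. 3.13 correspond to the rounded percentages.
native_percent = round(100 * 11 / 205)
recombinant_percent = round(100 * 192 / 205)
```

Overlapping peptides are counted once per residue, so combining digests from several proteases can only increase the coverage.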
To demonstrate that the unmodified tpSil3 sequence is readily digestible, a synthetic chimeric gene was produced that encodes the tpSil3 sequence without a signal peptide³, concatenated with reference quantification peptides from protein standards (four from BSA and six from PhospB) and flanked by purification tags (Twin-strep-tag and His-tag); the full sequence is displayed in Fig. 5.2. This synthetic chimeric gene was inserted into a high-level expression vector and expressed in E. coli (for the cloning protocol refer to Section 5.3 or to [171]). The resulting band of overexpressed protein was digested in-gel with Asp-N and trypsin. LC-MS/MS analysis of the digest resulted in 94 % peptide coverage (Fig. 3.13b). We therefore concluded that tpSil3 bears a complex and abundant set of PTMs, which impedes the access of proteolytic enzymes to the protein backbone and thus decreases the sequence coverage. The analysis of these highly modified structures requires a specialized protein deprotection technique that gives proteases access to the backbone and allows mapping of the lysine modifications, which should themselves remain unaffected by deprotection.
3.3.2 Selection of deprotection technique
The most logical approach to improve the digestion efficiency and to maximize the number of detectable peptides per protein is to selectively remove non-lysine PTMs. Several enzymes are available for removing O-linked glycans and phosphorylation. However, dephosphorylation with calf intestinal alkaline phosphatase (CIAP) [172] was inefficient for tpSil3 deprotection (data not shown). Compared with enzymatic removal of protein phosphorylation or glycosylation, chemical deprotection has the advantage that all O-linked modifications can be removed regardless of their structure. However, harsh chemical methods can also cleave peptide bonds, which leads to unacceptable protein degradation. Several chemical treatments were therefore examined for reducing the modification complexity of tpSil3 while leaving the polypeptide backbone intact:
(a) treatment with trifluoromethanesulfonic acid (TFMS) [84, 85];
(b) treatment with soluble HF·pyridine complex [173–175];
(c) treatment with anhydrous HF [58, 59].
3 N-terminal signal peptide (17 amino acids) for cotranslational import into the endoplasmic reticulum [70] is cleaved out at RXL site and therefore was not considered for coverage evaluation.
The apparent molecular weight of native tpSil3 was around 60 kDa (0 h point in
Fig. 3.15a–3.15c), while the calculated mass of this protein is ~25 kDa (including the
mass of lysine modifications profiled in tpSil3, see Fig. 3.4 in Section 3.1.3). Therefore,
it was hypothesized that tpSil3 also bears multiple O-linked PTMs, which render it
highly negatively charged and decrease its electrophoretic mobility. Moreover,
tpSil3 is poorly stained with colloidal Coomassie, and for visualization the polycationic
carbocyanine dye ‘Stains All’ [87] was employed, which is also diagnostic of a high
negative net charge of the native protein. Altogether, the strikingly low electrophoretic
mobility and the negative charge are a clear indication of covalent modifications affecting
multiple amino acid residues in tpSil3. Indeed, complex O-linked glycosylation and
sulphation have been previously reported for this protein (described in Section 1.3.2, see
also [67]).
EGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMP
DEAGAKSAKMSMDTKSGKSEDAAAVDAKASKESHMSISGDMSMAKSHKAEAE
DVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKE
SSMPSSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSA
[highlighting of covered regions not reproduced; the same sequence is shown in each panel]
(a) TFMS-treated tpSil3 (93/205 amino acids, 45 % coverage)
(b) tpSil3 treated with HF·pyridine complex (12/205 amino acids, 6 % coverage)
(c) tpSil3 treated with anhydrous HF (168/205 amino acids, 82 % coverage)
Figure 3.14 Figure continued from p. 73. Peptide coverage obtained for: (a) treatment with trifluoromethanesulfonic acid (TFMS) (40 %); (b) tpSil3 treated with soluble HF·pyridine complex; (c) tpSil3 treated with anhydrous HF
Figure 3.15 Gel images stained with Coomassie and ‘Stains all’, demonstrating different treatments of tpSil3 protein: (a) TFMS (time points 0 h, ½ h, 2 h); (b) HF·pyridine complex (time points 0 h, ½ h, 1 h, 2 h, 3 h); (c) anhydrous HF (time points 0 h, 1 h).
Removal of the O-linked modifications (carbohydrate and/or phospho groups) by
these three reagents was performed for tpSil3 prior to in-gel digestion with the same
set of proteases (see above). The resulting digests were analyzed with LC-MS/MS, and
the obtained peptide coverage values were compared with the one for the native tpSil3 (5 %,
Fig. 3.13a). Sequence coverage achieved for each treatment is shown in Fig. 3.14a–3.14c.
Deglycosylation with TFMS allowed coverage of 14 lysine residues (and 40 % of the amino acid
sequence, Fig. 3.14a), whereas treatment with HF·pyridine complex turned out to be
completely inefficient for tpSil3 (5 %, Fig. 3.14b). In contrast, treatment with anhydrous
HF resulted in 73 % coverage that contained 25 out of a total of 33 lysine residues (see
Fig. 3.14c). Moreover, treatment by anhydrous HF also improved electrophoretic be-
havior of tpSil3 (~35 kDa), allowing visualization with MS-compatible Coomassie dye
(see Fig. 3.15c). At the same time, TFMS treatment reduced the apparent molecular
weight to 50 kDa (Fig. 3.15a). TFMS is known to cleave O-linked glycans; however,
O-phosphorylation is stable to this treatment [85], which was verified with a β-casein
standard (data not shown). This explains the lower mass shift for TFMS as compared
to anhydrous HF, which cleaves both O-phosphoester and O-glycosidic linkages while
preserving peptide bonds. The latter was verified by treatment of commercially available
protein standards (BSA, β-casein and Ribonuclease B). Protein losses after anhydrous
HF treatment did not exceed 40 % (data not shown). In contrast, HF·pyridine
complex demonstrated much higher protein losses in comparison to the other procedures
tested (almost full degradation after 1 hour of treatment, see Fig. 3.15b). Hence, anhydrous
HF outperformed the two alternative treatments in terms of both sequence coverage
increase and lower protein degradation, and, despite its toxicity and the requirement
of special equipment for handling, this procedure was selected for further studies.
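The coverage values compared above are ratios of covered residues to the protein length. The following minimal sketch illustrates how such a value can be computed from a list of identified peptides; the function name and the short test sequence are illustrative assumptions and do not reproduce the software used in this work.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Fraction of residues in `protein` covered by at least one peptide.

    Assumption: peptides are given as plain (unmodified) sequences and are
    located by exact substring matching; every occurrence is counted.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# First 20 residues of the tpSil3 sequence shown in Fig. 3.14,
# with two hypothetical peptides covering 5 + 7 = 12 residues.
fragment = "EGHGGDHSISMSMHSSKAEK"
print(sequence_coverage(fragment, ["EGHGG", "HSSKAEK"]))  # 0.6
```

Overlapping peptides are counted only once per residue, so the result never exceeds 1.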
3.3.3 Mapping lysine PTMs on tpSil3 using iterative search strategy
The HF-deprotection technique makes tpSil3 almost fully accessible to
proteolytic enzymes. After parallel in-gel digestion with five proteases having
complementary cleavage specificity (Asp-N, chymotrypsin, Proteinase K, Glu-C and trypsin),
the resultant peptides were subjected to LC-MS/MS analysis. The proteases
were chosen according to their cleavage specificity (refer to Table 5.3), and
each additional protease increased the protein sequence coverage by 15 % on
average. Although less suitable for identification purposes, digestion by several
proteases produces longer, complementary, overlapping peptides, thus improving
the identification of PTM sites [168]. The larger size of peptides is beneficial in this case,
because it compensates for the peptide hydrophilicity acquired from highly charged
lysine PTMs. The analysis of larger peptides gears this multi-protease approach towards
‘middle-down’ proteomics, such that confident combinatorial assignment of variable
modification sites becomes possible.
As discussed in Section 3.2, a total of 25 lysine PTMs were detected in biosilica extracts
from the three diatom species. In order to search the acquired MS/MS spectra against
all these PTMs, they need to be defined by the user prior to database searches. Fixed
modifications do not increase the complexity of the search, whereas the number of variable
modifications must be limited in order to confine the search space and to control the
rate of false positive identifications, or false discovery rate (FDR, [153]). To overcome
the drawbacks of database searches against multiple PTMs simultaneously, a number of
strategies for unrestricted PTM identification may be employed, such as de novo
sequencing [147, 176], sequence-tag [177], and second-pass searches [178]. Each of
these strategies has its own limitations and weaknesses, including sensitivity towards
the database size and the quality of MS/MS spectra [144]. It is therefore advisable
to limit the number of allowed PTMs for each query using follow-up searches [179].
This approach expands the systematic localization to hundreds of PTMs from complex
MS/MS data.
EGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMP
DEAGAKSAKMSMDTKSGKSEDAAAVDAKASKESHMSISGDMSMAKSHKAEAE
DVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKE
SSMPSSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSA
[color-coding of mapped lysines not reproduced]
(a) Lysine PTM map of tpSil3 obtained in the current study
(b) K[+28] (PTM 175); (c) K[+142] (PTM 289); (d) K[+186] (PTM 333)
(e) MS/MS spectrum of modified peptide A.EK[+28]QAIEAAVEE.D (m/z 622.82; 2+)
(f) MS/MS spectrum of modified peptide E.DVTEMSMAK[+142]AGK.D (m/z 713.37; 2+)
(g) MS/MS spectrum of modified peptide Q.AIEAAVEEDVAGPAK[+186]AA.K (m/z 600.00; 3+)
Figure 3.16 Silaffin mapping: (a) lysine PTM map of tpSil3; (b)–(d) lysine modifications; (e)–(g) MS/MS spectra of modified peptides.
This approach implies that multiple database queries are searched repeatedly using
a conventional search engine (such as Mascot) with a restricted number of variable
modifications. It has been empirically found that peptides containing more than two
ε-polyamine modifications do not produce confidently interpretable MS/MS spectra;
therefore, searches with more than two variable modifications per query could be
omitted. Digestion with non-specific or semi-specific proteases (e. g., Proteinase K or
chymotrypsin; see Table 5.3) often results in unusual and unexpected protein
cleavages [180]. Therefore, additional searches without cleavage specificity were
performed to recover peptide-spectrum matches (PSMs) that were missed by a specific
database search (for experimental details refer to Section 5.15). The performance of
this approach was evaluated by mapping pre-defined modifications onto the tpSil3
sequence: PTMs 175, 261, 275, 289, 303a, 319 and 333 (see Fig. 3.4)⁴. This number
of variable modifications (six in total) resulted in 15 consecutive searches⁵, each with a
combination of two different variable ε-lysine modifications. At the same time, each
search included methionine oxidation as a variable modification and
carboxyamidomethylated cysteine as a fixed one⁶. The overall workflow of multiple non-specific
searches with a restricted number of variable modifications per query resulted in
moderate search times (less than an hour per query) and false discovery rates (FDRs)
below 2 %, which was the acceptance criterion for further studies.
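The number of pairwise searches follows from the number of 2-combinations of the variable modifications. The sketch below enumerates the search configurations for six modifications; the placeholder labels are hypothetical, because the text does not state unambiguously which six modifications were combined.

```python
from itertools import combinations
from math import comb


def pairwise_searches(modifications: list[str]) -> list[tuple[str, str]]:
    """All unordered pairs of variable modifications, one pair per search."""
    return list(combinations(modifications, 2))


# Hypothetical placeholder labels for six variable lysine modifications.
variable_mods = ["PTM_A", "PTM_B", "PTM_C", "PTM_D", "PTM_E", "PTM_F"]
searches = pairwise_searches(variable_mods)

print(len(searches))                 # 15
print(comb(len(variable_mods), 2))   # 15, i.e. 6! / (2! * 4!)
```

Each tuple would correspond to one database query in which the two listed modifications are set as variable, in addition to methionine oxidation.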
The sequence of tpSil3 with mapped lysine PTMs is displayed in Fig. 3.16a. It
was possible to achieve 73 % cumulative sequence coverage, and all mapped
modifications were confirmed with peptide MS/MS spectra (in contrast to a previous
study [68], discussed further in Section 3.3.6). In total, 13 PTM sites were
mapped with three different kinds of lysine modifications (PTMs 175, 289, and 333, see
Fig. 3.16b–3.16d), while 11 lysines were found unmodified. These results are consistent
with the PTM profile of tpSil3 (see Fig. 3.4), where 30 % of unmodified lysines were
detected (compared with 11 out of 33 lysines mapped as unmodified, ~33 %). However, some of these
lysines could have multiple modifications, and therefore direct comparison of these
data may be biased. Unfortunately, full coverage is rarely achieved, and in the case of
tpSil3 it was not possible to map the 9 remaining lysines (marked with ? in Fig. 3.16a),
presumably due to the lack of informative fragment spectra of peptides cleaved from
uncovered regions. Moreover, lysine ε-polyamination and methylation render modified
peptides hydrophilic, thus reducing separation efficiency on RPLC beads.
4 After HF-treatment [58], phosphorylated PTMs 399 and 413 were converted into PTMs 319 and 333, respectively (Fig. A.6). Therefore, PTMs 399 and 413 were not searched in MS/MS data obtained from HF-treated samples.
5 If a set has n elements, the number of k-combinations, where the order of selection does not matter, is defined as $_nC_k = \frac{n!}{k!\,(n-k)!}$.
6 The sulfur-containing amino acids methionine and cysteine are more easily oxidized than the other amino acids. During the sample preparation cysteines are carboxyamidomethylated; the reaction is close to 100 % complete, and therefore this modification should be specified as fixed. However, in all cases methionine oxidation has to be specified as a variable modification.
Some of these fragment spectra are provided in Fig. 3.16e–3.16g, where each peptide
represents one type of lysine PTM. During this study it was observed that MS/MS
spectra of multiply charged peptides (> 3+) often contain multiply charged fragments
(> 2+), which are ignored by conventional database search engines such as Mascot.
To address this issue, deconvolution of raw peptide mass spectra was performed,
which is discussed further in Section 3.3.4.
3.3.4 Identification of modified peptides by deconvolution of raw MS/MS spectra
We focused on two types of diatom extracts rich in biosilica-associated proteins,
including silaffins: AFIM and AFSM. These extracts were separated by SDS-PAGE and
visualized with the polycationic carbocyanine dye ‘Stains All’ [86, 87]. From the gel
images it could be concluded that the T. pseudonana and C. cryptica AFSM extracts comprise
a heterogeneous set of proteins, whereas the T. oceanica extract consists of a few highly
abundant components. The entire gel slabs were excised and digested in-gel with the
same set of proteases as mentioned above, and the resulting digests were analyzed by
LC-MS/MS.
Figure 3.17 HCD MS/MS spectra of modified peptides from protein silaffin-4 from T. pseudonana (tpSil4) (identified in Asp-N digest of T. pseudonana biosilica extract). (a) Non-deconvoluted (raw) spectrum of peptide S.DASTEYESGASEAGAEVTAK[+142]AEK[+28]GSD.D (m/z 910.77; 3+), ion score = 56.9; (b) deconvoluted spectrum of the same peptide (m/z 910.77; 3+), ion score = 102.2; (c) deconvoluted spectrum of peptide (m/z 662.99; 3+).
Although classical bottom-up analysis is the optimal strategy for PTM mapping,
combining high sensitivity of detection and efficient MS/MS fragmentation of short
modified peptides, the mapping of multiple lysine modifications in silaffins is
challenging. Bottom-up proteomics relies exclusively on the high cleavage specificity of trypsin,
which cuts peptide bonds at the C-terminus of arginine and lysine residues. Such a
cleavage places the highly basic residues at the C-termini and generates peptides in
the preferred mass range (from 0.5 to 3 kDa) for effective MS/MS fragmentation [181].
Doubly-protonated peptides undergo facile fragmentation yielding sequence
information [141]. In this regard, collision-induced techniques, collision-induced
dissociation (CID) and higher-energy collisional dissociation (HCD), are most effective for
short, low-charged unmodified peptides such that highly informative and more
easily interpretable mass spectra are produced.
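The tryptic cleavage rule described above can be illustrated by a simple in-silico digest. This is a simplified sketch: it cleaves after every lysine and arginine, ignores missed cleavages and any proline exception, and the optional `blocked` argument (a hypothetical parameter introduced here) marks modified lysines that are assumed not to be cleaved.

```python
def tryptic_digest(protein: str, blocked: frozenset[int] = frozenset()) -> list[str]:
    """Cleave C-terminally of K and R.

    `blocked` holds 0-based positions of residues (e.g. modified lysines)
    after which cleavage is assumed not to occur.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "KR" and i not in blocked:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


# Short stretch taken from the tpSil3 sequence.
stretch = "AEKQAIEAAVEEDVAGPAKAAK"
print(tryptic_digest(stretch))
# ['AEK', 'QAIEAAVEEDVAGPAK', 'AAK']

# If the first lysine (position 2) is modified, the first two peptides merge.
print(tryptic_digest(stretch, blocked=frozenset({2})))
# ['AEKQAIEAAVEEDVAGPAK', 'AAK']
```

Blocking more lysines quickly produces long peptides, which is in line with the observation that heavily modified silaffins are poorly suited for tryptic mapping.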
(c) Deconvoluted MS/MS; S.DMSVSSK[+186]AQMSYIHGSG.D; Mascot ion score = 38.8
Figure 3.17 (continued from previous page) HCD MS/MS spectrum of modified peptide. (c) deconvoluted spectrum of peptide (m/z 662.99; 3+).
As mentioned above, lysine modifications completely block tryptic digestion, and
the high occupancy of lysine PTM sites (70 % to 80 % of lysines are modified,
see Fig. 3.10a) makes trypsin completely inefficient for silaffin PTM mapping. Protein
digestion with alternative proteases (e. g., Asp-N or Glu-C) leads to long and highly
charged peptides, whose MS/MS spectra are particularly difficult to interpret [182].
Additionally, the presence of multiple basic lysine residues and positively charged
polyamine modifications prevents full fragmentation upon CID or HCD and directs
the backbone bond dissociation to specific sites, which inhibits the formation of a
sufficiently diverse series of b- and y-type fragment ions. Moreover, the presence of
multiple positively charged lysine ε-polyamine modifications at positions other than the
C-terminus results in complex fragmentation spectra containing multiply charged
fragment ions, which fail to be matched by conventional database search
engines like Mascot. To overcome this issue, the raw MS/MS spectra were pre-processed,
taking advantage of the high resolution and mass accuracy of the Orbitrap.
This mass analyzer can resolve fragments with very close ∆-masses as well as
multiply charged ions at the level of isotope distribution. Consequently, pre-processing
reduces MS/MS spectra complexity in two steps: the first reduces the isotope
envelope of each fragment to one peak (deisotoping) [151], whereas the second
mathematically collates the several peaks of a multiply charged fragment into one peak
corresponding to a singly charged ion (deconvolution) [148]. As a result, each
fragment is represented in the spectrum by only one singly charged peak, thus facilitating
the peptide identification from the resulting MS/MS peptide spectra.
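The two pre-processing steps rely on simple arithmetic, which is sketched below. This is only an illustration of the principle under stated assumptions (the charge state is inferred from the spacing of two neighboring isotope peaks, and the constants are standard approximate values); it is not the algorithm of Gorshkov et al. used in this work, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276      # Da, mass of a proton
ISOTOPE_SPACING = 1.00335   # Da, approximate 13C/12C mass difference


def charge_from_isotope_spacing(mz_a: float, mz_b: float, max_charge: int = 6) -> int:
    """Infer the charge from the m/z spacing of two adjacent isotope peaks."""
    spacing = abs(mz_b - mz_a)
    if spacing == 0:
        raise ValueError("isotope peaks must have different m/z values")
    charge = round(ISOTOPE_SPACING / spacing)
    if not 1 <= charge <= max_charge:
        raise ValueError(f"implausible charge state: {charge}")
    return charge


def to_singly_charged(mz: float, charge: int) -> float:
    """Convert the m/z of a [M + zH]z+ ion to the m/z of the [M + H]+ ion."""
    return mz * charge - (charge - 1) * PROTON_MASS


# A hypothetical fragment whose isotope peaks are ~0.5 m/z units apart is doubly charged.
z = charge_from_isotope_spacing(500.0000, 500.5017)
print(z)                                       # 2
print(round(to_singly_charged(500.0, z), 6))   # 998.992724
```

Deisotoping keeps only the monoisotopic peak of each envelope; the conversion above then places every fragment on a common singly charged scale, which is what the search engine expects.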
An algorithm for deconvolution of mass spectra to singly charged fragment spectra
was implemented according to Gorshkov et al. [183]; processing details are described
in the Materials and Methods (Section 5.15). The total pre-processing procedure
for the entire spectral dataset takes less than a minute, which is significantly less than
the database search times. A typical example of deconvolution pre-processing for a
triply charged peptide bearing two different lysine modifications is given in Fig. 3.17.
The non-deconvoluted spectrum contains multiply charged fragments (Fig. 3.17a), which
are either ignored by the Mascot database search or assigned a lower ion score (56.9),
whereas the deconvoluted spectrum demonstrates an extended y-ion series with an almost
doubled ion score (102.2, Fig. 3.17b) and improved spectrum-to-sequence matching.
Deconvolution thus also allows matching of peptide spectra that
were not matched before deconvolution (an example of such an MS/MS spectrum is provided in
Fig. 3.17c).
Altogether, a total of 61 silaffin-like proteins were identified in the three diatom
species (25 for T. pseudonana, 15 for C. cryptica, 20 for T. oceanica), while 26 of them
were post-translationally modified with 5 types of lysine PTMs. It was possible to
localize in total 130 lysine PTM sites. The current analysis of the AFIM extract from
T. pseudonana revealed several novel biosilica-associated proteins with unknown func-
tions in [90, 166]. All identified proteins are summarized in Table A.2. In addition
to already known proteins, several novel silaffin-like proteins (SFLPs) were found in
C. cryptica and T. oceanica extracts. However, in some cases it was not possible to identify
the PTM site unambiguously due to the lack of a complete fragment ion series. To address
this issue, the characteristic fragments obtained from fragmentation spectra were used,
as discussed in the following section.
3.3.5 Mass spectrometric mapping of PTM sites based on characteristic fragments
A multiple protease strategy, which was applied for mapping of PTMs in the tpSil3 protein
(and discussed in Section 3.3.1), often produces peptides with sub-optimal length [168].
Specifically, peptides cleaved by proteases other than trypsin are likely to contain one
or more internal lysine residues that bear positively charged modifications, a situation
that can lead to unassignable MS/MS spectra. PTM assignment in such peptides is therefore
far more challenging, since it is not always possible to unambiguously map a
modification site to a specific lysine residue. Moreover, another problem arises from the regular
structure of ε-polyamine modifications, where different combinations of these PTMs
can correspond to the same mass shift (e. g., the mass of K[+142] + K[+28] will be the
same as for K[+85] + K[+85]).
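The ambiguity described above can be made explicit by enumerating all pairs of modifications and grouping them by their combined mass shift. The sketch below uses only the nominal mass shifts that appear in this section (+28, +42, +85, +142 and +186); the choice of this set and the restriction to two modifications per peptide are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def isobaric_pairs(shifts: list[int]) -> dict[int, list[tuple[int, int]]]:
    """Group pairs of mass shifts by their sum and keep only ambiguous sums."""
    by_total = defaultdict(list)
    for pair in combinations_with_replacement(sorted(shifts), 2):
        by_total[sum(pair)].append(pair)
    return {total: pairs for total, pairs in by_total.items() if len(pairs) > 1}


NOMINAL_SHIFTS = [28, 42, 85, 142, 186]  # nominal lysine mass shifts, Da
print(isobaric_pairs(NOMINAL_SHIFTS))
# {170: [(28, 142), (85, 85)]}
```

For such isobaric combinations the precursor mass alone cannot distinguish the alternatives, so fragment-level evidence is required.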
As demonstrated in Section 3.2.1 and Table 3.2, fragmentation of covalently modified
lysines produces a set of characteristic ions that are specific for each type of
ε-polyamine modification. The presence or absence of these reporter fragments can be
used for PTM determination in instances where the lack of b- or y-ions leads to
insufficient evidence for site-specific PTM assignment. All of the polyamine-specific reporter
fragments fall in the low m/z range of 50 to 300. In this scenario,
polyamine-modified peptides can be matched to bona fide MS/MS spectra containing a peptide
sequence tag (PST), or a series of sequence ions that are clearly identifiable. Notably,
normalized collision energies (nCE) that are usually applied for peptide fragmentation
are in the same range as for fragmentation of modified lysines (about 25 % to 35 %),
whereas Orbitrap resolution scales as $1/\sqrt{m/z}$ and therefore increases towards low masses.
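The inverse square-root dependence of the Orbitrap resolution on m/z can be illustrated numerically. The reference setting used below (30 000 at m/z 400) is a hypothetical value chosen for the example and is not taken from the acquisition parameters of this work.

```python
from math import sqrt


def orbitrap_resolution(mz: float, ref_resolution: float, ref_mz: float) -> float:
    """Resolution at `mz`, assuming R is proportional to 1 / sqrt(m/z)."""
    return ref_resolution * sqrt(ref_mz / mz)


# Hypothetical instrument setting: resolution 30 000 specified at m/z 400.
print(orbitrap_resolution(100.0, 30_000, 400.0))            # 60000.0
print(round(orbitrap_resolution(143.1543, 30_000, 400.0)))  # resolution at the reporter ion m/z
```

Under this assumption, reporter fragments in the low m/z range are measured at a higher resolution than the peptide sequence ions.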
Figure 3.18 MS/MS spectra of modified peptides that contain characteristic fragments for ε-polyamine of PTM 289. Spectra were deconvoluted prior to Mascot ion search. (a) MS/MS of peptide G.DMSMAK[+142]SHK[+28]AEAE.D (m/z 535.61; 3+); nCE 30 %; ion score = 51.7; (b) MS/MS of peptide V.DAK[+142]ASK[+28]ESHMSISG.D (m/z 539.96; 3+); nCE 30 %; ion score = 44.3; (c) characteristic ion (m/z 143.1543; 1+) from PTM 289 (K[+142]). The fragment m/z 143.1543 is labeled in each panel.
Characteristic fragments that are present in peptide spectra are displayed in Fig. 3.18.
Fragmentation of the ε-polyamine side chain of a modified lysine residue produces the
characteristic ion m/z 143.1543 (Fig. 3.18a–3.18b). The presence of this fragment indicates
that the fragmented peptides bear PTM 289, which is depicted in Fig. 3.18c. This ion was
subsequently used as a reporter ion for peptides modified with PTM 289.
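Screening a fragment spectrum for the reporter ion amounts to a mass match within a tolerance. The following sketch assumes a peak list of m/z values and a tolerance of 10 ppm; both the tolerance and the function name are illustrative assumptions and not parameters reported in this work.

```python
def has_reporter_ion(peaks_mz: list[float],
                     reporter_mz: float = 143.1543,
                     tol_ppm: float = 10.0) -> bool:
    """Return True if any peak lies within `tol_ppm` of the reporter m/z."""
    tolerance = reporter_mz * tol_ppm * 1e-6  # absolute tolerance in m/z units
    return any(abs(mz - reporter_mz) <= tolerance for mz in peaks_mz)


# Hypothetical peak lists (m/z values only).
print(has_reporter_ion([129.1022, 143.1550, 200.1000]))  # True, deviation ~5 ppm
print(has_reporter_ion([129.1022, 143.2000, 200.1000]))  # False
```

A match only indicates that a modification producing this fragment is present; as shown for PTM 333, the fragment intensity has to be considered as well.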
The characteristic fragment m/z 143.1543 can also occur upon fragmentation of ε-polyamines
with a similar structure, e. g., PTM 333. MS/MS spectra of two peptides, modified
either by PTM 289 or 333, are provided in Fig. 3.19. However, the intensity of the m/z 143.1543
fragment varies significantly in these spectra (cf. Fig. 3.19a–3.19b). Upon
fragmentation of the peptide bearing PTM 333 an intense fragment m/z 143.1543 is released,
which results from cleavage at the quaternary ammonium group (Fig. 3.19d). The
facile fragmentation of PTM 333 was explained previously in the literature by
internal proton transfer from the adjacent secondary amino group [102]. The intensity of the
m/z 143.1543 fragment can therefore be used as a distinguishing feature between PTMs 289 and 333.
Next, the sequence context of mapped PTM sites was compared in order to identify
consensus modification sequences.
3.3.6 Identification of consensus motifs harboring lysine PTMs
Modified peptides identified by MS/MS ion searches (see Section 3.3.4) were matched
by Mascot searches against a database that contained proteins from the three diatom
species and common laboratory contaminants (refer to Section 5.15 for experimental de-
tails). All the diatom proteins that are not associated with biosilica (histones, clathrins,
etc.) were subsequently filtered out. The resulting list of amino acid sequences with
mapped modification sites (provided in Table A.2) was further checked using BLAST.
It appeared that these proteins show little homology to each other or other proteins in
the NCBI non-redundant database with a clear GO assignment.
Most polyamine-modified proteins contain KXXK repeats and RXL processing sites
(Table A.2, highlighted in blue), which are also present in all silaffins characterized to
date (refer to Sections 1.3.1 and 1.3.2). These sequence features are clearly conserved
among all proteins identified in this work, such as silaffins (e. g., tpSil3 (B8BRK6) and
tpSil4 (B8C0W5) from T. pseudonana), cingulins (B8CGS1, CingulinY3 from T. pseudonana)
and novel silaffin-like proteins (K0S9A6 and K0SSD7 from T. oceanica; G11469 and G22685
from C. cryptica). A significant number (115 out of 150) of identified lysine ε-polyamine
PTM sites reside within KXXK repeats. It was hypothesized previously that KXXK
repetitive motifs may represent a target for polyamine modification [68]. In addition, they are
known to mediate silica precipitation [184] and to be involved in intracellular
targeting [71, 185]. However, the KXXK repeat also frequently occurs in non-modified proteins.
Therefore, in order to reveal any consensus sequences for polyamine modifications, the
immediate vicinity of mapped PTM sites needs to be directly compared.
Figure 3.19 MS/MS spectra of modified peptides that contain characteristic fragments for ε-polyamine of PTM 289. Spectra were deconvoluted prior to Mascot ion search. (a) MS/MS of peptide K.AEK[+42]PASSMPEMSVGAK[+142].A (m/z 535.61; 3+); nCE 30 %; ion score = 49.3; fragment m/z 143.1543 of low intensity; (b) MS/MS of peptide K.AEK[+42]PASSMPEMSVGAK[+186].A (m/z 539.96; 3+); nCE 30 %; ion score = 43.5; fragment m/z 143.1543 of high intensity; (c) characteristic ion (m/z 143.1543; 1+) from PTM 289 (K[+142]); (d) characteristic ion (m/z 143.1543; 1+) from PTM 333 (K[+186]).
PTMs are usually located in the context of a particular amino acid pattern with a
fixed length (sequence motif). We expect a true ε-polyamination motif to be shared
between distinct biosilica-associated proteins, particularly those from different diatom
phyla. Moreover, one PTM can sometimes promote or inhibit the modification of adjacent
amino acids, which is generally termed ‘PTM crosstalk’. PTMs involved in a
crosstalk typically occur in close proximity to each other [186–195]. Therefore, the
current study is based on two hypotheses: (i) lysines bearing the same PTM type
in non-homologous biosilica-associated proteins should be located in a conserved se-
quence context; (ii) if PTM crosstalk exists, two adjacent PTM sites should occur more
frequently than would be expected by chance alone.
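Hypothesis (ii) requires a baseline for how often two neighboring sites would both be modified by chance. One simple null model, assumed here for illustration and not taken from the analysis in this work, places the modifications at random among the lysines of a protein: with N lysines of which m are modified, the expected number of neighboring lysine pairs that are both modified is (N − 1) · m(m − 1) / (N(N − 1)) = m(m − 1) / N.

```python
def expected_adjacent_modified_pairs(n_lysines: int, n_modified: int) -> float:
    """Expected number of neighboring lysine pairs that are both modified,
    if `n_modified` modifications are placed at random among `n_lysines`."""
    if n_lysines < 2:
        return 0.0
    return n_modified * (n_modified - 1) / n_lysines


def observed_adjacent_modified_pairs(status: list[bool]) -> int:
    """Count neighboring lysines (in sequence order) that are both modified."""
    return sum(1 for a, b in zip(status, status[1:]) if a and b)


# Numbers quoted for tpSil3 in Section 3.3.3: 33 lysines, 13 mapped PTM sites.
print(round(expected_adjacent_modified_pairs(33, 13), 2))  # 4.73

# Hypothetical modification pattern along six consecutive lysines.
print(observed_adjacent_modified_pairs([True, True, False, True, True, True]))  # 3
```

An observed count well above the expectation would be in line with crosstalk, although a formal test would also have to account for unmapped lysines.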
As discussed in Section 1.3.3, Sumper et al. formulated a set of empirical rules based
on mapping of tpSil3 lysine PTMs, which are referred to as the ‘lysine modification
code’ [68]. According to these rules, the N-terminal lysine of a K(A/S/Q)XK motif is
modified by the PTM 289:
    ...K[+142]↑(A/S/Q)XK...    (a)
For KXXK motifs which are separated by more than five amino acid residues from
each side, the C-terminal lysine becomes dimethylated (PTM 175):
    ...KXXK ...(>5 aa)... KXXK[+28]↑ ...(>5 aa)... KXXK...    (b)
Next, if a single lysine is located close to a KXXK motif (i. e., separated by one or two
amino acids from KXXK), both lysines of the adjacent KXXK are modified by the PTM 289:
    ...KXXK[+142]↑ ...(1-2 aa)... K ...(1-2 aa)... K[+142]↑XXK...    (c)
Finally, if two KXXK motifs are separated by less than six amino acids, terminal lysine
residues in both KXXK motifs are modified by PTM 333:
    ...K[+186]↑XXK ...(<6 aa)... KXXK[+186]↑...    (d)
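Rule (a) is the most explicit of the rules listed above and can be written as a pattern search. The sketch below locates candidate lysines, i.e. the N-terminal lysine of every K(A/S/Q)XK motif, in a sequence; it only predicts sites according to the rule and makes no statement about their actual modification status. The function name is hypothetical.

```python
import re

# N-terminal K of a K(A/S/Q)XK motif; the lookahead allows overlapping motifs.
RULE_A = re.compile(r"K(?=[ASQ].K)")


def rule_a_candidate_sites(sequence: str) -> list[int]:
    """1-based positions of lysines predicted by rule (a) to carry PTM 289."""
    return [match.start() + 1 for match in RULE_A.finditer(sequence)]


# Short stretches taken from the tpSil3 sequence.
print(rule_a_candidate_sites("HSSKAEKQAIEAAV"))  # [4]
print(rule_a_candidate_sites("AKSAKMS"))         # [2]
```

Positions are relative to the supplied stretch, so they would have to be offset to obtain the numbering used for the full-length protein.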
All the above rules (a)–(d), however, were formulated based on the PTM mapping of a
single protein, silaffin-3 from T. pseudonana (refer to [68]). We therefore compared the
mapping results from Section 3.3.3 to the ones obtained by Sumper et al. (cf. Fig. 3.20a
and 3.20b). Our results were in good agreement with those reported previously; however,
our data showed differences for six lysines (out of 25), which were detected as
non-modified residues (marked with a * in Fig. 3.20b). Thus, it can be inferred from our data
that the PTM map in Fig. 3.20b is consistent with rules (a)–(c), while five
ε-polyamine-modified (PTM 289, Fig. 3.20d) and six dimethylated lysines (PTM 175, Fig. 3.20c) in
KXXK repeats comply with the rules (a) and (b), respectively. At the same time, rule (c)
holds only for one KXXK with PTM 289 (out of three covered motifs), whereas three
out of the four mapped lysine residues that conform to the context of rule (d) remain
unmodified. In contrast to the previous study [68], all the peptide identifications in
the current work were confirmed by high-resolution MS/MS spectra. It is possible that
the corresponding modified peptides were not detected in our experiments. However,
the lysine PTM profile for the tpSil3 protein (displayed in Fig. A.3), where the total content
of PTM 333 (Fig. 3.20e) corresponds to ~2–3 modified lysines (together with PTM 413,
which is converted to PTM 333 by HF-treatment), supports our PTM mapping results.
PTM 333 sites mapped to other proteins in the current study also contradict the
rule (d) (B8BYI7, B8C0W5, B8CGS1 from TP; g22685 from CC; refer to Table A.2). However,
a full validation of this rule would require near-complete PTM mapping for all proteins
of interest, which is rarely achievable in large-scale proteomic studies. On the other
hand, the rule (a) defines a specific amino acid context for ε-polyamine modification
(PTM 289), whereas rule (b) represents a relaxed sequence motif, which could be easily
validated, made more exact or even revised if necessary. Hence, we attempted to define
precise consensus motifs for all types of mapped ε-polyamine modifications using a
substantially larger proteomic dataset. Taking into account that many PTMs may also
be present at nonconsensus sites, the conservation of determined sequence motifs for
polyamine modifications was explored across the three distinct diatom species. Finally,
we investigated the interplay between different types of PTMs.
EGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMP
DEAGAKSAKMSMDTKSGKSEDAAAVDAKASKESHMSISGDMSMAKSHKAEAE
DVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKE
SSMPSSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSA
[color-coding and position markers not reproduced; the same sequence is shown in panels (a) and (b)]
(a) Lysine PTM map of tpSil3 obtained by Sumper et al. [68]
(b) Lysine PTM map of tpSil3 obtained in the current study. K[+0]: unmodified; ?K: unmapped; *K: differences
(c) K[+28] (PTM 175); (d) K[+142] (PTM 289); (e) K[+186] (PTM 333)
Figure 3.20 Comparison of two PTM maps of silaffin-3 from T. pseudonana (tpSil3): (a) obtained by Sumper et al. [68]; (b) obtained in the current study; (c)–(e) color-coding used for lysine PTMs.
In order to define consensus sequences for ε-polyamine modifications, the short
amino acid stretches flanking PTM sites need to be aligned and the frequency of each
amino acid residue needs to be evaluated (e. g., by ‘Sequence Logo’ [196]; refer to
Section 5.15). However, there are a number of pitfalls that have to be avoided during
this procedure. Firstly, the length of the sequence stretch should be reasonably short.
Therefore, the length of flanking stretches was limited to 20 amino acids (10
downstream and 10 upstream), because a wider window would increase the chance of finding
more than two modifiable amino acids in one motif, significantly complicating the
analysis. Secondly, it was not always possible to unambiguously assign PTMs to some
KXXK lysine pairs due to the lack of corresponding overlapping peptides or incomplete
fragment series. Furthermore, 25 lysine residues have ambiguous modification
status, where two different PTMs can affect the same residue (e. g., dimethylation and
trimethylation). Therefore, in case of both ambiguous assignments and lysine residues
bearing multiple modifications, we considered all possible sequence variants that were
supported by the obtained data. Finally, lysine residues in detected peptides were
considered non-modified when the corresponding unmodified peptide was detected in
the absence of the modified counterpart. The lysines situated before tryptic peptides or at
Aligned to unmodified lysine ↓ [sequence logo not reproduced]
Site             Sequence window
TP_B5YNQ3_344    VSRLRRLKDDKGDEAVEESIV
TP_B8BRK6_43     HSISMSMHSSKAEKQAIEAAV
TP_B8BRK6_65     EDVAGPAKAAKLFKPKASKAG
TP_B8BRK6_68     AGPAKAAKLFKPKASKAGSMP
TP_B8BRK6_70     PAKAAKLFKPKASKAGSMPDE
TP_B8BRK6_73     AAKLFKPKASKAGSMPDEASA
TP_B8BRK6_93     AKSAKMSMDTKSGKSEDAAAV
TP_B8BRK6_96     AKMSMDTKSGKSEDAAAVDAK
TP_B8BRK6_142    VTEMSMAKAGKDEASTEDMCM
TP_B8BRK6_164    FAKSDKEMSVKSKQGKTEMSV
TP_B8BRK6_200    AAKIFKGKSGKSGSLSMLKSE
TP_B8BRK6_208    SGKSGSLSMLKSEKASSAHSL
TP_B8BSN6_192    KEAYVELFTTKYNVRDAVPDL
TP_B8BSN6_182    YLEPLMGPLKKEAYVELFTTK
TP_B8BSN6_399    TKKSTTLAIPKSTPTISLGST
TP_B8BSN6_410    STPTISLGSTKSTATDSSLKP
TP_B8BSN6_74     TKGRNAGKIVKLVNDVVLDRQ
TP_B8BYI7_61     EEVEYIMSDGKAGKLPYGGST
TP_B8C0W5_72     GSGDEEAVDAKAEKTSTTGSA
TP_B8C0W5_243*   AGSSDMSVSSKPEKSEGSSEA
TP_B8C0W5_246*   SDMSVSSKPEKSEGSSEATTA
TP_B8CC24_430    KPKSPPKDAAKKASTAASFRS
TP_B8CGS1_200    DDYSAGADAGKSENYDEEASR
TP_B8LBG8_61     EDASRPERLLKSLSFSIELGE
TP_B8LBG8_389    PALPVEEQMDKIPGGVALLFL
TP_B8LDT2_455    LSEGIAVGYAKSSGRSSQQAV
CC_g11469_308    SAPVEKESAYKVFSKASL...
CC_g11606_152    FVKMLQMIGFKPKKVPFIPYS
CC_g11606_284    CPGDSVGLSIKGIAKDEKVEP
CC_g11606_288    SVGLSIKGIAKDEKVEPGDII
CC_g11606_302    VEPGDIIYVQKEGELKPIKSF
CC_g11606_307    IIYVQKEGELKPIKSFTAMVA
CC_g11606_310    VQKEGELKPIKSFTAMVAVQE
CC_g11606_405    RIAVMDSNRLKMLGKVTGTAT
CC_g11606_409    MDSNRLKMLGKVTGTATD...
CC_g11606_40     ERGVTIQCNTKEFFTEKYHYT
CC_g13975_273    CGYLKGDVGDKSCFEYAACYQ
CC_g13975_296    ADLGIFNVGYKSCIGRGSCEY
CC_g13975_632    RLRFLQESEYKTSAVLFEIVS
CC_g1484_156     KIGANSCIGNKNCYFLKDATI
CC_g1484_156     KIGANSCIGNKNCYFLKDATI
CC_g1484_276     EENQALIGDCKCLGDYICENN
CC_g15479_207    KSSKQDMSMGKSFDSKSDKVA
CC_g15479_603    NIETSAAEEEKLTTSEEISES
CC_g15720_277    PQECINNAVDKSYNGCVTASP
CC_g22685_354    AKAEKYSKAAKSLSMNEAIKD
CC_g22685_363    AKSLSMNEAIKDAKAEKTHSL
CC_g25187_16     NYLRCDPATVKSSDKETCNAI
CC_g25187_20     CDPATVKSSDKETCNAIKHEV
CC_g25187_27     SSDKETCNAIKHEVCGKDMSN
CC_g25187_33     CNAIKHEVCGKDMSNIDQSYC
CC_g25187_149    NYLRCDPATVKSSDKDTCNAI
CC_g25187_153    CDPATVKSSDKDTCNAIKHDV
CC_g25187_160    SSDKDTCNAIKHDVCGKDMSN
CC_g25187_173    VCGKDMSNVDKSYCECIGLYG
CC_g25187_184    SYCECIGLYGKGTANLRGIMK
CC_g3798_241     KEGYGHDGYAKEEYGHDGYDN
CC_g3798_145     YYGVVEHFGYKPSYGSSGEHS
CC_g3964_541     QIDKVAGLSGKETTAPPFAKV
CC_g3964_550     GKETTAPPFAKVYAGASATAN
CC_g3964_603     PRYIPNQVSLKGPAIAAAIGE
CC_g3964_643     KGELGLGNSVKSVDAPNNDNG
CC_g3964_685     DVFATGSNLYKQLCKDTDGEP
CC_g3964_689     TGSNLYKQLCKDTDGEPTTTP
CC_g3964_749     LGDGTFLDQDKTSVLIPNDGT
CC_g3964_798     RYQLGLGEPGKTAYPTEVDFQ
CC_g3964_816     DFQVPFFNIAKISSSGSHTVA
CC_g3964_1115    FSDGEPVTTPKAIKNIQDVKA
CC_g3964_1118    GEPVTTPKAIKNIQDVKADVE
CC_g3964_1236    NELDTVAGILKISSSGTQTVA
CC_g3964_1447    ATALYFSGDPKAVGENTDGNL
CC_g749_314      SGSSSNEYGNKYDGYAPAKGY
CC_g749_426      YNAIIQCCDDKFGPASFEDGT
CC_g749_440      ASFEDGTCLYKDICETVPPSP
CC_g749_608      SDGASSGESSKGEGYSGYSQK
CC_g7979_688     SPSPTTCEERKWYALSTGDML
CC_g7979_776     VTPSPTVCEDKVFFFDGDVCS
CC_g7979_302     KGTSFNVSGSKSDKGASFNVS
CC_g8502_533     NNYKGLFGDYKRVTGTTLQKQ
CC_g8502_99      YVSLTACCNAKFESYARCDFT
CC_g8502_62      QAFTANCGPNKPCADGLCCSQ
TO_K0R7E4_41     VRAGDRCNYPKYDNCSVGPSS
TO_K0RIC9_261    INVIGEPVDEKGPIFAKGKEK
TO_K0RIC9_267    PVDEKGPIFAKGKEKFAPLHR
TO_K0RIC9_287    RSAPTFTEQGKSQEILVTGIK
TO_K0RIC9_297    KSQEILVTGIKVVDLLAPYAK
TO_K0RIC9_307    KVVDLLAPYAKGGKIGLFGGA
TO_K0RIC9_321    IGLFGGAGVGKTVVIMELINN
TO_K0RIC9_465    GLQERITSTAKGSITSVQAVY
TO_K0RIC9_594    VAEVFTGTAGKFVSLADTIKG
TO_K0RIC9_603    GKFVSLADTIKGFEEIINGDY
TO_K0S7V0_379    SKGTGYGQSDKWQDYDGR...
TO_K0S9A6_121    EAKSAKVAEAKPVKEAAAKSA
TO_K0SQ58_48     VLAPMPGNTLKAGEDERELGS
TO_K0SSD7_156    QSKAPEDYTAKITSEAAMQLN
TO_K0SUG8_1365   MRGEGFDFLSKDSKASLFPVA
TO_K0SUG8_2212   IVERLNRYLNKGLTIMTNERE
TO_K0T463_183    PDNGWESPHDKPYEGIVYGGS
Conservation: not conserved on either side of the central lysine.
(a) Non-modified lysine K[+0]
PTM 175 (Dimethylation)↓
logo 24
LMRTVYDINCEPAKGS
HLRIPYEDNMAKSG
EHINQTVYDFLCMRAGKS
DIPVEHLQAKTMNYGS
ELNQRCHMPKADYGSV
CNRHVAFILSTEKMYDG
FCKLHPVYDSGAHINPRVLSYAFDGKCFLKQVPTYDNGSA
IMRGLPQVDHTKYSNAEKLY
CKQDPTEGVAS
CFHIKLMQRVNTYEGDPSA
ILQRDFVHKTAMENPSYG
FHLNQTYIKVMPADSEG
IRTHLNPQMVYEKADSG
QCLVDEMTGIKANPYS
EIMQRTFLHPVYDNGKSA
FLQVPEMRACDYSGK
EFLNPRTCMIVYDKGSA
CHIPLMNQTVFRYEGADKS
24
TP_B8BRK6_46 SMSMHSSKAEKQAIEAAVEEDTP_B8BRK6_109 DAAAVDAKASKESHMSISGDMTP_B8BRK6_126 SGDMSMAKSHKAEAEDVTEMSTP_B8BRK6_159 DMCMPFAKSDKEMSVKSKQGKTP_B8BRK6_211 SGSLSMLKSEKASSAHSLSMPTP_B8BRK6_225 AHSLSMPKAEKVHSMSA....TP_B8C0W5_148 AGAEVTAKAEKGSDDEGHDAKTP_B8C0W5_341 SMSHYTHGYEKSIFG......TP_B8CC24_431 PKSPPKDAAKKASTAASFRSNTP_B8LBG8_106 KSGKADAKAHKVDEEDLALASTP_B8LDT2_67 RNFYRDDDTRKCSNEATGGIYCC_g11469_150 SLRTVESKAEKLPGGSMSPVACC_g11469_191 SMRTVDAKAQKQQPGSMPPAYCC_g11469_214 SMRTVEAKAEKTPPDGGSMRLCC_g11606_154 KMLQMIGFKPKKVPFIPYSGFCC_g11606_291 LSIKGIAKDEKVEPGDIIYVQCC_g13975_233 TIGDGSCIGYKACYKAQDATICC_g13975_237 GSCIGYKACYKAQDATIGDGSCC_g1484_162 CIGNKNCYFLKDATIGDRSCLCC_g1484_177 GDRSCLYDSIKGAQNSYGYACCC_g1484_193 YGYACAYLQGKVGNDSCHEYACC_g1484_213 AACYQYGDDNKTFNIGNNACQCC_g15479_617 ERDNSFSFSMKTKHALKHRLFCC_g15720_111 IGQNACSSVYKTTVGQGSCNGCC_g22685_133 GELSMMAKVAKEPAMSVGSKACC_g22685_171 PEMSVGAKAEKPAMSVEAKAECC_g22685_171 PEMSVGAKAEKPAMSVEAKAECC_g22685_217 ADASAGAKSEKPASSMPAMSVCC_g22685_244 PAMSVEAKAEKPAMSVEVDAKCC_g22685_266 EKVMSVGKAKKDELSMAKVAKCC_g22685_288 EPSMSISKAAKDEEDESSGSACC_g22685_303 ESSGSAGKTHKVDSQSMPFGGCC_g25187_85 KDKQVLVDLNKDNGGGGGGDGCC_g25187_98 GGGGGGDGGGKSNGGGNNKSDCC_g25187_106 GGKSNGGGNNKSDGGGNNKSDCC_g25187_114 NNKSDGGGNNKSDGGGNNKSDCC_g25187_122 NNKSDGGGNNKSDGGGNKSDGCC_g25187_129 GNNKSDGGGNKSDGGNDNGKNCC_g3798_231 GDGYGHDGYDKEGYGHDGYAKCC_g749_322 GNKYDGYAPAKGYRLGSASFRCC_g8502_526 SAKSDGSNNYKGLFGDYKRVTTO_K0RN71_304 KYGGGKKRKQKSAEPDIDDDETO_K0RU48_146 CGSNAVASATKCRNPQLSCDRTO_K0RWP8_2756 SSATLFVDALKQVVKLCSCPDTO_K0S1R3_110 AGFNEDPPAVKCRNPRPLCDFTO_K0S7V0_276 CGKSGKAKGSKGGYGGYDYGHTO_K0S7V0_310* SKGGYGGDDAKSSKGGYGGYDTO_K0S7V0_313* GYGGDDAKSSKGGYGGYDAKSTO_K0S9A6_049* AAEEDHHGDAKAAKVPAAKSVTO_K0S9A6_052* EDHHGDAKAAKVPAAKSVKAETO_K0S9A6_63 VPAAKSVKAEKAPEEAAFAKSTO_K0S9A6_124 SAKVAEAKPVKEAAAKSAKVATO_K0S9A6_170* SAASSTSVAAKSTKTNPEMYMTO_K0S9A6_173* SSTSVAAKSTKTNPEMYMGIETO_K0S9A6_279 IAKSHKSKTTKEEMEESPGYRTO_K0SQ58_128 KKSGYYPKSDKSYGDYTYSKSTO_K0SQ58_148 
SSKSYRDLQSKAPEDYTAKITTO_K0SQ58_68* SKSGYYLFGSKKSYGSKKYGSTO_K0SQ58_69* KSGYYLFGSKKSYGSKKYGSKTO_K0SSD7_111 PKKSGYYHYPKKSGHYPKKSGTO_K0SSD7_112 KKSGYYHYPKKSGHYPKKSGYTO_K0T322_57 MKSGKDAKAEKYTTPEYQGKAConserv. •••••••••••••••••••••︸︷︷︸
AKA↑
PTM 289 (14 out of 62 sequences)
(b) dimethylation site+28
K
+28
K – dimethylation (PTM 175)+42
K – trimethylation (PTM 189)+0
K – non-modified lysine
K – unmapped lysines
* – ambiguous PTM sites
PTM 189 (Trimethylation)↓
logo 24
WYCLNVAEIKRGTPS
INQTVLPYADEFSMKG
ILQTVCEYAFKNMRGS
IADLMCNPVTGS
EHNFTCKMYDGPASV
CIPRTVFLMNYKSADEG
FHKLNPVCDGTYSACHNPTVLRYADGSKFGKLPQRCDENTYVSACGHNPTISKQDLYAEKHI
KMQYLNDAEVSTPG
FILYQVDTGEPSA
CDFHLQENMTYKAGPVS
EHNPQRTIMADFYSG
EFKQNTYDMAPVGS
CDIQALTGKPYMSVE
HLRTYIMDEPVFGKAS
LNQTVYCDEFPRKMAGS
DFHQCETNRVYGKAS
MPCKLTDNGYEVSA
24
TP_B5YNQ3_341 RNGVSRLRRLKDDKGDEAVEETP_B8BSN6_709 LFGGGNASSNKSVSFTPKATSTP_B8CC24_431 PKSPPKDAAKKASTAASFRSNCC_g11469_128 RSNPTFTVLEKVPSMPLAADSCC_g11469_150 SLRTVESKAEKLPGGSMSPVACC_g11469_175 SMRTVEAKAEKTASAGSMRTVCC_g11469_191 SMRTVDAKAQKQQPGSMPPAYCC_g11469_214 SMRTVEAKAEKTPPDGGSMRLCC_g11469_235 AESTPAAKAEKTPADAGSMRTCC_g11469_252 SMRTVDAKAEKLSPGSMPAAVCC_g11469_273 AGETPAPKAEKTPADGASMRSCC_g11469_290 SMRSVDTKAKKHTPGGSMSAPCC_g11469_303 PGGSMSAPVEKESAYKVFSKACC_g11606_359 WKMGKKTGGQKVENPPELSQYCC_g13975_325 NSCNEFYACYKNYGTVSYNSCCC_g13975_237 GSCIGYKACYKAQDATIGDGSCC_g13975_254 GDGSCTGDSIKGVTYYGFSCGCC_g13975_267 TYYGFSCGYLKGDVGDKSCFECC_g1484_162 CIGNKNCYFLKDATIGDRSCLCC_g1484_177 GDRSCLYDSIKGAQNSYGYACCC_g1484_193 YGYACAYLQGKVGNDSCHEYACC_g15479_429 TTFSTDSKADKSPVFSMDAKACC_g15479_483 TSFSMETKADKSPVFSMDTKACC_g15479_517 TLSMPAAKTTKEEVISLSMGYCC_g15720_69 CGSCNGFRACKNAYYSTIGEVCC_g15720_111 IGQNACSSVYKTTVGQGSCNGCC_g15720_197 TGACYVYLEYKGIYTFTVGNNCC_g15720_231 IMIGDNSCNAKEACYSVEANVCC_g22685_81 LFKPAPAKADKGGSMPEVEADCC_g22685_133 GELSMMAKVAKEPAMSVGSKACC_g22685_155 PAMSVGSKAEKPASSMPEMSVCC_g22685_171 PEMSVGAKAEKPAMSVEAKAECC_g22685_171 PEMSVGAKAEKPAMSVEAKAECC_g22685_217 ADASAGAKSEKPASSMPAMSVCC_g22685_244 PAMSVEAKAEKPAMSVEVDAKCC_g22685_276 KDELSMAKVAKMEPSMSISKACC_g22685_288 EPSMSISKAAKDEEDESSGSACC_g22685_303 ESSGSAGKTHKVDSQSMPFGGCC_g22685_348 VFSLHDAKAEKYSKAAKSLSMCC_g3964_166 RKSGDSNSALKISGRGKKQSNCC_g7979_107 NVNVSGSKSDKGTGINVEGGACC_g7979_305 SFNVSGSKSDKGASFNVSGSKCC_g8502_526 SAKSDGSNNYKGLFGDYKRVTTO_K0R8C7_118 IQTSAEDTSLKGFSSSQAKHATO_K0S7V0_273 KGGCGKSGKAKGSKGGYGGYDTO_K0S7V0_310* SKGGYGGDDAKSSKGGYGGYDTO_K0S7V0_313* GYGGDDAKSSKGGYGGYDAKSTO_K0S9A6_63 VPAAKSVKAEKAPEEAAFAKSTO_K0SAX6_494 SKVDAKASEQKPEAAVETKVETO_K0SSD7_111 PKKSGYYHYPKKSGHYPKKSGTO_K0SSD7_112 KKSGYYHYPKKSGHYPKKSGYTO_K0SUG8_1368 EGFDFLSKDSKASLFPVAFGSConserv. •••••••••••••••••••••︸︷︷︸
AKA↑
PTM 289 and 333 (13 out of 52 sequences)
+
(c) trimethylation site+42
K
+142
K – ε-polyamine (PTM 289)+186
K – ε-polyamine (PTM 333)+85
K – ε-polyamine (PTM 232)
Figure 3.21 Graphical representations of the local protein contexts of modified lysines ±10 residues. Seefull description on p. 91.
3.3 ptm localization and discovery of consensus motifs 91
[Figure 3.21, panels (d)–(f): sequence logos and aligned sequence stretches.]
(d) m/z 289 (PTM 289, +142K, 2 propylamine units): (A/S)K(A/S)EK consensus.
(e) m/z 333 (PTM 333, +186K, 2 propylamine units): SKA consensus.
(f) m/z 232 (PTM 232, +85K, 1 propylamine unit): not significant (7 sequences).
Legend: +0K, non-modified lysine; +28K, dimethylation (PTM 175); +42K, trimethylation (PTM 189); +142K, ε-polyamine (PTM 289); +186K, ε-polyamine (PTM 333); +85K, ε-polyamine (PTM 232); K, unmapped lysines; *, ambiguous PTM sites.
Figure 3.21 Graphical representations of the local protein contexts of modified lysines ±10 residues (continued from p. 90). Sequence logo plots represent relative amino acid frequencies for ±10 amino acids from the lysine PTM site. The total height of the stack of letters at each position shows the sequence conservation, while the relative height of each letter shows the relative abundance of the corresponding amino acid [196]. Positively and negatively charged residues are shown in blue and red respectively, uncharged residues are green, hydrophobic residues are black, and S/T/Y residues are highlighted in orange.
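The relation between stack height and letter height described in the caption can be illustrated with a short calculation. The sketch below assumes the common Shannon formulation for a 20-letter amino acid alphabet and omits any small-sample correction; the plotting settings actually used for the figures are not given in this section, and the function name `column_information` is hypothetical.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=20):
    """Return (stack_height_bits, {residue: letter_height}) for one alignment column.

    Assumption: stack height = log2(alphabet_size) - Shannon entropy of the column,
    and each letter takes a share of the stack proportional to its frequency.
    """
    counts = Counter(column)
    n = sum(counts.values())
    freqs = {aa: c / n for aa, c in counts.items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    total = math.log2(alphabet_size) - entropy
    heights = {aa: f * total for aa, f in freqs.items()}
    return total, heights

# A fully conserved column (e.g. the aligned lysine) reaches the maximum height.
conserved_total, conserved_heights = column_information("KKKKKKKK")
# A column split evenly between two residues loses exactly one bit.
mixed_total, mixed_heights = column_information("AASSAASS")
```

A fully conserved position therefore gives the tallest possible stack, while a position split between alanine and serine gives a shorter stack with two letters of equal height.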
92 results and discussion
their C-termini were also mapped as non-modified lysines, because trypsin does not
cleave after modified lysine residues.
With this in mind, the local contexts of lysine PTM sites were investigated in all
three diatom species. Prior to the alignment of modified residues, however, the
conservation of amino acids surrounding non-modified lysines was checked. As displayed
in Fig. 3.21a, no sequence conservation was observed in short sequence stretches with
non-modified lysine residues. The local contexts of methylated lysines (ε-N,N-di- and
ε-N,N,N-trimethylation, also denoted as PTMs 175 and 189 respectively, see Fig. 3.21b–
3.21c), on the other hand, demonstrate the prevalence of C-terminal lysines in KXXK
motifs as the methylation target. Of the total 93 mapped methylated residues, 21 bear
either di- or trimethylation in C. cryptica and T. oceanica (but not in T. pseudonana),
which is consistent with the relative abundance of both ε-methylated lysines in total
hydrolysates (see Fig. 3.10b in Section 3.2.3). The -3 position from the methylation
site contains a markedly conserved lysine residue that is often flanked by alanine
residues (AKA). Lysines at the -3 position, as well as the N-terminal lysine in KXXK, are
often modified by an ε-polyamine chain (~25 % of aligned sequence stretches). This
conclusion was further corroborated by the alignment of ε-polyamine-modified lysines,
which are shown in Fig. 3.21d–3.21f. PTM 333 (ε-polyamine with two propylamine
units and a quaternary amine) is often present within an SKA consensus site (4 out of
8 in Fig. 3.21e) and affects the N-terminal lysine in KXXK. However, the small number of
sequences (8 mapped sites altogether) does not allow us to draw strong conclusions. A
similar situation is observed for the T. oceanica-specific PTM 232 (ε-polyamine with one
propylamine unit, 6 mapped sites). Amino acid residues that surround ε-polyamine
modification sites (PTM 289) display some degree of conservation ((A/S)K(A/S)EK, Fig. 3.21d).
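The ±10 residue windows and the check for a lysine at the -3 position can be sketched as follows. This is a minimal illustration, not the pipeline used in the thesis: the helper names `site_window` and `fraction_with` and the toy protein are hypothetical, and the four windows are copied from the dimethylation alignment in Fig. 3.21b only as sample input.

```python
FLANK = 10  # residues kept on each side of the PTM site

def site_window(protein, site, flank=FLANK, pad="."):
    """Return the site residue with `flank` residues on each side.

    `site` is a 1-based position; positions beyond the protein ends are
    filled with `pad` so that every window has the same length.
    """
    i = site - 1
    left = protein[max(0, i - flank):i]
    right = protein[i + 1:i + 1 + flank]
    return left.rjust(flank, pad) + protein[i] + right.ljust(flank, pad)

def fraction_with(windows, offset, residue="K", flank=FLANK):
    """Fraction of windows that carry `residue` at `offset` from the site."""
    hits = sum(1 for w in windows if w[flank + offset] == residue)
    return hits / len(windows)

# Four 21-residue stretches taken from the dimethylation alignment (Fig. 3.21b).
dimethyl_windows = [
    "SMSMHSSKAEKQAIEAAVEED",
    "DAAAVDAKASKESHMSISGDM",
    "SGDMSMAKSHKAEAEDVTEMS",
    "RNFYRDDDTRKCSNEATGGIY",
]

# Hypothetical short protein, used only to show the padding behaviour.
toy_window = site_window("MKAEKT", site=5, flank=3)

# Share of the sample windows with a lysine three residues before the site.
minus3_lysine = fraction_with(dimethyl_windows, offset=-3)
```

In this small sample three of the four windows carry a lysine at -3, which is the KXXK arrangement discussed above; the fraction for the full alignment would have to be computed from all mapped sites.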
Next, we wanted to test whether the identified motifs were conserved across all
the species, or whether different organisms displayed differences in the amino acid
context. For this purpose, sequences with mapped PTM sites were grouped by the species
from which they derive. Sequence logos resulting from these alignments are shown in
Fig. 3.22. These results clearly indicate that both methylated and polyamine-modified
lysines are found in KXXK boxes at the C- and N-terminal positions respectively.
However, the sequence context of the PTM 289 site in T. oceanica differed from the
overall picture of polyamine-modified lysines, which could be an effect of the small
sample size of aligned sequences (seven stretches in total, see Fig. 3.22c). Nevertheless,
88 % of modified lysines mapped in this study were embedded into defined KXXK motifs,
with biases for certain lysine-flanking amino acid residues (putative motifs are shown
below each sequence logo in Fig. 3.22a–3.22c). In order to further investigate the
association between PTMs in KXXK boxes, short sequence stretches were aligned by the
entire lysine pairs.
[Figure 3.22: sequence logos for each species and PTM site; consensus and number of aligned sequences per panel are listed below.]
(a) Alignments for TP proteins: unmodified lysine (+0K), not conserved (47 sequences); dimethylation site (+28K), AK(A/S)EKA (11 sequences); trimethylation site (+42K), not significant (3 sequences); PTM 289 site (+142K), (A/S)K(A/S)EKS (14 sequences); PTM 333 site (+186K), SKA(S/G)K(A/V)XK (8 sequences).
(b) Alignments for CC proteins: unmodified lysine (+0K), not conserved (55 sequences); dimethylation site (+28K), (G/A)KAEKX (30 sequences); trimethylation site (+42K), (A/S)KA(E/D)KX (41 sequences); PTM 289 site (+142K), AKAEKX (16 sequences); PTM 333 site (+186K), not significant (1 sequence).
(c) Alignments for TO proteins: unmodified lysine (+0K), not conserved (29 sequences); dimethylation site (+28K), AKAXKX (20 sequences); trimethylation site (+42K), XKXXKGS (9 sequences); PTM 289 site (+142K), (A/S)K(S/D)(A/V)K(A/V) (7 sequences); PTM 232 site (+85K), SKSEKX (9 sequences).
Figure 3.22 Graphical representations of the local protein contexts, aligned by PTM sites separately for each diatom species (TP, T. pseudonana; CC, C. cryptica; TO, T. oceanica). The modified lysine residue is marked with an arrow. The consensus sequence (if any) and the number of aligned sequences are provided below each sequence logo.
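The species-wise grouping and the check for KXXK membership described above can be sketched as follows. This is an illustrative outline under stated assumptions: the species is read from the identifier prefix (TP, CC, TO) as used in the figures, the function name `kxxk_roles` is hypothetical, and the four records are sample windows taken from Fig. 3.21, not the full data set.

```python
from collections import defaultdict

FLANK = 10  # the modified lysine sits at index FLANK of each 21-residue window

def kxxk_roles(window, flank=FLANK):
    """Return the role(s) of the central lysine within KXXK pairs, if any."""
    roles = []
    if window[flank + 3] == "K":
        roles.append("N-terminal")  # central lysine is the first K of KXXK
    if window[flank - 3] == "K":
        roles.append("C-terminal")  # central lysine is the second K of KXXK
    return roles

# (identifier, window) pairs; identifiers follow the SPECIES_PROTEIN_POSITION
# pattern seen in the figures.
records = [
    ("TP_B8BRK6_106", "KSEDAAAVDAKASKESHMSIS"),
    ("TP_B8BRK6_46", "SMSMHSSKAEKQAIEAAVEED"),
    ("CC_g11469_172", "DSGSMRTVEAKAEKTASAGSM"),
    ("TO_K0RCW9_108", "VRPSAPGYEDKPEERRGGSPE"),
]

by_species = defaultdict(list)
for identifier, window in records:
    species = identifier.split("_", 1)[0]
    by_species[species].append(kxxk_roles(window))

# Fraction of sites per species whose lysine is part of at least one KXXK pair.
in_kxxk = {
    species: sum(1 for roles in role_lists if roles) / len(role_lists)
    for species, role_lists in by_species.items()
}
```

Returning a list of roles rather than a single label allows for lysines that sit in the middle of a KXXKXXK run and therefore belong to two overlapping pairs.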
Specific properties of lysine methylation and ε-polyamination were further
investigated by studying their crosstalk. As demonstrated in Fig. 3.21 and 3.22, 114 of
the 130 modified lysines mapped in this study resided in KXXK repeats. However, it is
clear that these PTMs are not strictly dependent on each other, because there are
counter-examples for both cases, in which either ε-polyamination or methylation occurs
in KXXK alone (five and eight cases respectively, Fig. 3.21). We therefore examined the
statistical evidence for an association between ε-polyamines and methylation present in
KXXK motifs. To this end, all sequences containing this motif were extracted and aligned
by the N-terminal lysines. The analyzed sequences are presented in both graphical (as
sequence logos) and text form in Fig. 3.23. First, we counted the number of either
ε-polyaminated or methylated lysines at position I or II of the KXXK motif, and analyzed
their distribution for non-random modification patterns using Fisher’s exact test
(Table A.3). Here we determined a major bias towards methylation at the C-terminal
lysine of the KXXK block (with a prevalence of dimethyllysine marks in T. pseudonana and
trimethyllysine in C. cryptica), whereas polyamination occurred mostly at the N-terminal
lysine of KXXK (PTMs 289 and 333). This correlation is statistically significant for
T. pseudonana and C. cryptica (Tables A.3b and A.3d), while the sample size is too small
to draw a conclusion about crosstalk in T. oceanica (Fisher’s exact test, Table A.3f).
Lysines modified with ε-polyamines with two propylamine units are flanked by either
alanine or serine residues, (A/S)K(A/S), in T. pseudonana and C. cryptica (see Fig. 3.22a
and 3.22b), whereas the small number of sequences for T. oceanica is not sufficient to
properly define a consensus sequence (Fig. 3.22c):
(A/S)(+142K/+186K)(A/S)E(+28K) (T. pseudonana)
(A/S)(+142K/+186K)(A/S)E(+28K/+42K) (C. cryptica)
A(+142K/+85K)XE(+28K/+42K) (T. oceanica)
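The Fisher's exact test applied to the distribution of PTMs over the two lysine positions can be sketched as follows. This is a self-contained two-sided test on a 2×2 table, written only for illustration: the counts are hypothetical (the values of Table A.3 are not reproduced in this section), the function name is an assumption, and the software actually used for the test is not stated here.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = lysine position in KXXK (I, II),
# columns = PTM class (epsilon-polyamine, methylation).
hypothetical_counts = [[15, 1], [2, 8]]
p_skewed = fisher_exact_two_sided(hypothetical_counts)

# A perfectly balanced table shows no association.
p_balanced = fisher_exact_two_sided([[5, 5], [5, 5]])
```

An exact test is a natural choice for tables of this kind because several cells hold only a handful of observations, where a chi-squared approximation would be unreliable.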
Therefore, the comparison of the three diatom species provided in Fig. 3.23 revealed
potential PTM crosstalk between methylation and ε-polyamines in a large number of
polyamine-modified proteins (e. g., B8BRK6 and B8C0W5 from T. pseudonana, G11469 and
G22685 from C. cryptica). The defined consensus site (A/S)K(A/S) for ε-polyamines in
T. pseudonana and C. cryptica demonstrates that biosilica-associated proteins are
modified in a similar way in these closely related species. The discovered interplay
between ε-polyaminated and methylated lysines may indicate the presence of recognition
domains in the corresponding PTM enzymes, similar to Tudor domains for histone
methylases [100], which could facilitate binding of silaffins already possessing
methylated lysine residues. However, this potential crosstalk remains to be further
investigated, and to demonstrate its biological relevance it is necessary to examine in
vivo the effect of methylation on lysine ε-polyamination (or vice versa).
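Candidate sites matching the T. pseudonana consensus given above, (A/S)K(A/S)EK, can be located in a protein sequence with a regular expression. The sketch below is illustrative only: a match is a sequence-level candidate and says nothing about whether the lysines are actually modified, and the function name `find_consensus` is hypothetical. The test sequences are windows taken from Fig. 3.21.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
CONSENSUS = re.compile(r"(?=([AS]K[AS]EK))")

def find_consensus(sequence):
    """Return (start, matched_text) for each occurrence, using 0-based starts."""
    return [(m.start(), m.group(1)) for m in CONSENSUS.finditer(sequence)]

hits_tp = find_consensus("SMSMHSSKAEKQAIEAAVEED")    # window from TP_B8BRK6
hits_cc = find_consensus("DSGSMRTVEAKAEKTASAGSM")    # window from CC_g11469
hits_none = find_consensus("RNFYRDDDTRKCSNEATGGIY")  # window from TP_B8LDT2
```

Within each match, the lysine at offset 1 is the candidate ε-polyamine acceptor and the lysine at offset 4 is the candidate methylation site, following the position preferences described in this section.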
[Figure 3.23: sequence logos and aligned sequence stretches for each species; panel labels are listed below.]
(a) T. pseudonana: PTM 289 (15 out of 26 sequences); dimethylation (8 out of 26 sequences); motif (A/S)(+142K/+186K)(A/S)E(+28K).
(b) C. cryptica: PTM site m/z 189 (trimethylation); di-/trimethylation (36 out of 41 sequences); motif (A/S)(+142K/+186K)(A/S)E(+28K/+42K).
(c) T. oceanica: PTM 289 and 232 (4 and 3 out of 19 sequences); dimethylation (12 out of 19 sequences); motif A(+142K/+85K)XE(+28K/+42K).
Legend: +142K, ε-polyamine (PTM 289); +186K, ε-polyamine (PTM 333); +85K, ε-polyamine (PTM 232); +28K, dimethylation (PTM 175); +42K, trimethylation (PTM 189); +0K, non-modified lysine; K, unmapped lysines; *, ambiguous PTM sites.
Figure 3.23 Graphical representations of the local protein contexts of modified lysines situated in KXXK motifs. Sequence contexts are aligned by the N-terminal lysine in KXXK ±10 residues. Sequence logos are plotted in the same way as in previous figures.
4 CONCLUSIONS AND OUTLOOK
In this thesis I investigated the profile and site-specificity of lysine ε-polyamine PTMs
in diatom biosilica-associated proteins. The motivation for these experiments stems
from the significance of polyamine structures in the biosilicification process. The
silica precipitation activity reported for polyamines in the literature [31, 61–67, 83]
suggests that they are involved in the species-specific patterning of diatom biosilica.
In this context, lysine ε-polyamine modifications can play an important role in the
regulation of biosilica morphogenesis. Therefore, the characterization of the PTM
consensus sites is important for understanding the link between biosilica-associated
proteins and the biosilica-forming machinery.
Here we present an integrated analytical strategy for the systematic analysis of
lysine polyamine modifications. The employed approach relies on the profiling of
lysine PTMs in biosilica hydrolysates prior to the site-specific identification of lysine
ε-polyamines in biosilica-associated proteins. It starts with exhaustive protein
hydrolysis in 6 M HCl, followed by AQC-derivatization of polyamine-modified lysines
and their identification by LC-MS/MS. In this way, we could distinguish structural
isomers of polyamine moieties and quantify them with a limited set of internal standards
(Sections 3.1.1–3.1.3).
Using this approach we have catalogued lysine polyamine modifications in proteins
associated with silicified cell walls, which were isolated from three closely related
diatom species: Thalassiosira pseudonana, T. oceanica and Cyclotella cryptica (Section 3.2.1).
High-resolution MS/MS analysis of these modifications revealed characteristic
fragments (Table 3.2) that were subsequently used as reporter ions for modified peptides.
Altogether, we identified 25 polyamine modifications (Section 3.2.3), which not only
confirmed seven previously known PTMs, but also revealed 18 novel ones, including
three acid-resistant phosphoester-containing polyamines (collectively denoted as
phosphopolyamines in Section 3.2.2). We also observed that the pattern of polyamine
modifications reflects the phylogenetic proximity of the three species (Section 3.2.5):
the two closely related diatoms (T. pseudonana and C. cryptica) share a conserved set
of polyamine modifications, which differs substantially from that of the
phylogenetically more distant diatom (T. oceanica).
The detected structures represent an unusual class of protein post-translational
modification, which appears to be unique to biomineralizing organisms. These
modifications occur at the lysine side-chains, where propyleneimine (or aminopropyl)
units are linearly linked to ε-amines (see Fig. 4.1c–4.1e). The polyamine chains
characterized in all three (centric) diatom species consist of 1–2 aminopropyl units,
while longer ε-polyamine structures were detected previously in the pennate diatom
C. fusiformis [64–66]. Lysine residues were detected in three methylation states:
mono-, di- and trimethylation, whereas each N-atom in ε-polyamine structures can also
be methylated. Additionally, the lysine residue can be converted to δ-hydroxylysine,
whose hydroxyl group can be phosphorylated (3 out of the total of 25 PTM structures).
As a result, lysine ε-polyamine modifications introduce into biosilica-associated
proteins the positive charges of protonated amino groups along with negatively charged
phosphate residues. The zwitterionic character of these molecules is likely to influence
their chemical and biological properties; however, the mechanistic role of these PTMs
remains elusive. Besides diatoms, silicified sponge spicules [91, 92] and calcifying
coccolithophores [44] also have similar polyamine structures, although
phosphopolyamines seem to be diatom-specific. Apparently, diatoms have a highly complex
PTM machinery, whose site-specificity was investigated in this work by mapping
polyamine modifications to protein sequences.
The profiled lysine modifications were localized in biosilica-associated proteins by
GeLC-MS/MS using the multiple protease digestion approach (Section 3.3.1) after
selective removal of O-linked glycans and phosphorylation by HF-treatment (Section 3.3.2).
For the identification of polyamine-modified peptides an iterative search strategy
(Section 3.3.3) and deconvolution of raw peptide MS/MS spectra (Section 3.3.4) were
employed, whereas polyamine-specific fragments were used for the validation of PTM
assignments (Section 3.3.5). We have identified 150 polyamine-acceptor lysines, which
can be modified by 5 types of PTM marks displayed in Fig. 4.1. Two of them represent
di- and trimethylation (Fig. 4.1a and 4.1b), while the others are ε-polyamines with
either one (Fig. 4.1c) or two (Fig. 4.1d and 4.1e) aminopropyl units and a different
degree of N-methylation. PTM 333 (Fig. 4.1e) contains a quaternary amino group and
occurs in T. pseudonana and C. cryptica only, while PTM 232 (Fig. 4.1c) is specific for
T. oceanica. These modifications were mapped to 25 biosilica-associated proteins
(summarized in Table A.2) from the three diatom species that share no sequence homology
and are also not homologous to other known proteins. In this way, we have substantially
extended the catalogue of lysine ε-polyamine sites in biosilica-associated proteins and
provided a resource for future studies of site-specificity and functional association of
PTM machinery in diatom biosilica.
Figure 4.1 Structure of mapped PTMs. (a) PTM 175 (+28): K(ε-dimethylation);
(b) PTM 189 (+42): K(ε-trimethylation); (c) PTM 232 (+85): K(ε-polyamine with one
aminopropyl unit); (d) PTM 289 (+142): K(ε-polyamine with two aminopropyl units);
(e) PTM 333 (+186): K(ε-polyamine with two aminopropyl units and quaternary amine).
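The PTM labels in Fig. 4.1 pair a nominal m/z with a nominal mass shift. The following minimal sketch checks one possible reading of these labels, namely that each label equals the nominal m/z of protonated, unmodified lysine (147) plus the shift. This reading is an assumption made here for illustration; the names `LYSINE_NOMINAL_MZ`, `PTM_SHIFTS` and `ptm_label` are hypothetical.

```python
# Nominal m/z of protonated, unmodified lysine ([M+H]+ of C6H14N2O2).
LYSINE_NOMINAL_MZ = 147

# Nominal mass shifts (Da) of the five mapped PTMs, as read from Fig. 4.1.
PTM_SHIFTS = {
    "dimethylation": 28,
    "trimethylation": 42,
    "polyamine, one aminopropyl unit": 85,
    "polyamine, two aminopropyl units": 142,
    "polyamine, two aminopropyl units, quaternary amine": 186,
}


def ptm_label(shift: int) -> int:
    """Return the nominal label, assuming label = 147 + shift (an assumption)."""
    return LYSINE_NOMINAL_MZ + shift


labels = {name: ptm_label(shift) for name, shift in PTM_SHIFTS.items()}
for name, label in labels.items():
    print(f"PTM {label}: {name}")
```

Under this assumption the computed labels coincide with the five labels used in the figure.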
Since a primary goal of this study was to reveal consensus sequences for lysine poly-
amine modifications, the conservation of amino acids surrounding the PTM sites was
tested. Polyamine modifications occurred at several consensus motifs, although the full
sequences of the modified proteins were not conserved. In total, we defined two consensus
motifs for ε-polyamines and methylation common to T. pseudonana and C. cryptica, while the
assignments in T. oceanica were inconclusive because of the small sample size
(Section 3.3.6). Out of the total of 25 polyamine-modified proteins, 21 contained multiple
conserved KXXK repeats, and 88 % of mapped PTMs resided in this sequence motif. Methylation
commonly occurred at the C-terminal lysine of KXXK, while ε-polyamine modifications resided
preferably at the N-terminal lysine. In addition, given the proximity of PTMs in KXXK, we
hypothesized that crosstalk between
different modifications may be an important mechanism of the biosilica PTM machin-
ery. The association between ε-polyamines and methylation in KXXK repeats was statis-
tically significant for T. pseudonana and C. cryptica and not significant for T. oceanica.
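To illustrate how KXXK repeats can be located in a protein sequence, the sketch below scans a sequence for every occurrence of the motif, including overlapping ones (e.g. KXXKXXK). It is not the procedure used in this work; the function name `find_kxxk` and the example peptide are hypothetical.

```python
import re

# K, any two residues, K; the lookahead allows overlapping motifs (e.g. KXXKXXK).
KXXK = re.compile(r"(?=(K..K))")


def find_kxxk(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the N-terminal lysine, motif) for every KXXK."""
    return [(m.start() + 1, m.group(1)) for m in KXXK.finditer(sequence)]


# Hypothetical peptide used only for illustration.
example = "SAKMSKSGKAEA"
hits = find_kxxk(example)
for position, motif in hits:
    print(position, motif)
```

Reporting the position of the N-terminal lysine makes it straightforward to compare motif positions with mapped modification sites.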
However, it was possible to map only 5 out of the total of 25 modified lysines detected.
Therefore, as a perspective for future work, we would like to improve the identification
of lysine modification sites by the all-ion fragmentation (AIF) technique, using
the catalogued polyamine-specific fragments for peptide-independent identification of
ε-polyamine PTMs. Furthermore, newly emerging alternative proteases [180, 197–199]
can increase the proteome coverage and improve the identification of PTMs
through the analysis of longer peptides, an approach referred to as ‘middle-down’
proteomics, which in turn enables a prospective characterization of PTM proteoforms [200,
201]. Our results also open new perspectives for protein functional studies. To gain
insight into the biosilica post-translational modification machinery, the
prospective polyamine biosynthetic enzymes should be investigated [99]. For this purpose,
the synthesis of the initial substrate for the transfer of aminopropyl groups is required.
Several polyamine synthases have already been cloned and tested for their possible
activity [97, 98], while many more remain to be discovered and characterized. Finally,
the PTM crosstalk between polyamination and methylation needs to be validated in vivo
via site-directed mutagenesis in order to provide direct mass spectrometric evidence
for it.
5 MATERIALS AND METHODS
Contents
5.1 Synthesis of polyamine standards
5.2 Isolation of biosilica-associated proteins
5.3 Expression of tpSil3 from synthetic gene
5.4 HCl hydrolysis
5.5 AQC-derivatization of amino acids and polyamines
5.6 LC-MS/MS analysis of QAC-derivatives
5.7 Amino acid measurement using UV-detection
5.8 Direct infusion MS/MS analysis
5.9 Acetylation of phosphopolyamines
5.10 31P NMR measurements
5.11 Deglycosylation with TFMS
5.12 Treatment with HF·pyridine soluble complex
5.13 Anhydrous HF-treatment
5.14 Protein analysis by GeLC-MS/MS
5.15 Proteomics data processing
Table 5.1 Chemicals and reagents. Unless otherwise noted, reagents were purchased commercially
from Sigma-Aldrich Co. (Milford, MA, USA) at the highest purity available.
(a) chemicals and reagents
Pierce™ Amino Acid Standard H, 2.5 µmol/mL in 0.1 N HCl
Thermo Scientific Pierce (Waltham, MA, USA)
6 M HCl, sequencing grade
Acetonitrile (ACN) and water (H2O), HPLC grade (freshly purchased)
Pierce BSA standard ampules, 2.0 mg/mL
AccQ·Tag™ Ultra Derivatization Kit, Waters Corporation (Milford, MA, USA)
Trifluoroacetic acid (TFA), for protein sequencing, and formic acid (FA), 98–100 %,
Merck Millipore (Darmstadt, Germany)
Coomassie Brilliant Blue R 250
Stains for electrophoresis
SERVA (Heidelberg, Germany)
‘Stains All’
Sigma-Aldrich Co. (Schnelldorf, Germany)
β-casein (bovine), >98 %, SDS-PAGE
L-lysine
all >98 %
δ-hydroxy-L-lysine
ε-N-monomethyl-L-lysine
ε-N,N-dimethyl-L-lysine
ε-N,N,N-trimethyl-L-lysine
L-arginine
L-proline
Spermidine
Acetic anhydride
Triethylamine (TEA)
Ammonium fluoride (NH4F)
HF·pyridine soluble complex (Olah’s reagent) (pyridine ~30 %, HF ~70 %)
GlycoProfile™ IV Chemical Deglycosylation Kit
Hydrogen fluoride (HF), anhydrous, 3.5, GHC Gerling (Hamburg, Germany)
Trypsin, MS grade, Promega (Madison, WI, USA)
Endoproteinase Asp-N, sequencing grade
Roche Diagnostics GmbH (Mannheim, Germany)
Proteinase K, PCR grade
Chymotrypsin, sequencing grade
Glu-C (V8 protease), MS grade
Ornithine δ-polyamine derivative and lysine ε-polyamine derivative, analytical standards,
synthesized in Armin Geyer lab (Philipps-Universität, Marburg, Germany)
Table 5.1 Materials and instrumentation (continued).
(b) materials
HyperSil Gold Kappa column, 0.5 mm i.d.×150 mm, 3 µm
Thermo Fisher Scientific (Rockford, IL, USA)
Acclaim™ PepMap100 C18, 75 µm i.d.×15 cm, 3 µm, 100 Å
Acclaim™ PepMap100 C18 nanoViper, 75 µm i.d.×2 cm, 3 µm, 100 Å
OPTI-PAK, 1 µL, C18, Dichrom GmbH (Marl, Germany)
Acid-washed glass hydrolysis tubes, 5 mL, Wheaton (Millville, NJ, USA)
LoBind tubes, 1.5 mL, and digital readout ThermoMixer C, Eppendorf (Hamburg, Germany)
SDS-PAGE pre-cast gradient gels
Glycine (4–20 %), anamed Elektrophorese GmbH (Groß-Bieberau, Germany)
Glycine (4–20 %) and Tricine (10–20 %), Bio-Rad Laboratories (Richmond, CA, USA)
Universal indicator paper, Merck (Darmstadt, Germany)
(c) equipment and instrumentation
Mass spectrometers: LTQ Orbitrap™ Velos, Q Exactive™ and Q Exactive™ HF,
Thermo Fisher Scientific (Bremen, Germany)
HPLCs: Agilent 1200 LC system, Agilent Technologies (Santa Clara, CA, USA);
Eksigent NanoLC™ 2D, Eksigent (Dublin, CA, USA)
Vacuum concentrator RVC 2-25 CDplus, Martin Christ GmbH (Osterode am Harz, Germany)
5.1 synthesis of polyamine standards
Two internal standards, oligo-propylenediamine-substituted lysine and
ornithine derivatives (Fig. 5.1a and 5.1b), were synthesized by Marina Abacilar in the
Armin Geyer laboratory (Philipps-Universität, Marburg, Germany). Alkylation via the
Mitsunobu reaction [202] was the key step for the modification of the side chains
of the amino acids (ornithine and lysine). A detailed scheme of the synthesis is provided in
Fig. 5.1c. Isolated compounds were purified by RPLC and characterized by 1H- and
13C-NMR, HPLC and HRMS. The analytical data for the corresponding synthetic standards are
provided in Fig. 3.6 (Suppl. Material A.1).
Figure 5.1 Chemical structures and synthesis of oligo-propylenediamine-substituted ε-lysine
and δ-ornithine derivatives. The synthesized molecules are either lysine or ornithine with
the addition of two aminopropyl units with a dimethylated N-terminus. (a) Ornithine
δ-polyamine derivative (chemical formula: C13H30N4O2; monoisotopic mass: 274.2369).
(b) Lysine ε-polyamine derivative (chemical formula: C14H32N4O2; monoisotopic mass:
288.2525). (c) Synthesis of the deprotected derivatives (a) and (b) from Dde-(L)Orn-OtBu
(1, n = 3) and Dde-(L)Lys-OtBu (2, n = 4) via intermediates 3 and 4 (step i) and 5 and 6
(step ii) to products 7 (n = 3) and 8 (n = 4) (steps iii-vi): (i) 3-bromo-1-propanol,
K2CO3, TBAI, 60 °C, DMF, 12 h; (ii) PPh3, DIAD,
N-(3-(dimethylamino)propyl)-2-nitrobenzenesulfonamide, dry THF, 3 d; (iii) 2 % hydrazine
in DCM; (iv) CTC-Resin, DIPEA, 12 h; (v) 2-mercaptoethanol, DBU, 3×30 min;
(vi) TFA/H2O/Et3SiH.
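The monoisotopic masses quoted for the two synthetic standards can be reproduced from their chemical formulae. The sketch below is an illustrative check, not part of the original workflow; the helper `monoisotopic_mass` is hypothetical and handles only simple formulae without brackets.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C13H30N4O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


ornithine_std = monoisotopic_mass("C13H30N4O2")
lysine_std = monoisotopic_mass("C14H32N4O2")
print(f"{ornithine_std:.4f} {lysine_std:.4f}")
```

The two derivatives differ by one CH2 group, which corresponds to the mass difference of about 14.0157 Da between the quoted values.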
Purification by RPLC was performed with a Thermo Scientific Dionex UltiMate
3000 semi-preparative system, including an HPG-3200BX pump, an ERC Series-300 solvent
degasser, an MWD-3000 detector and an AFC-3000 fraction collector. A Macherey-Nagel
VP Nucleodur C18 Gravity column (5 µm, 125×2.1 mm) was used. Eluents in
both systems: A: H2O + 0.1 % TFA, B: MeCN + 0.085 % TFA. Afterwards, the synthe-
sized compounds were lyophilized with a Christ Alpha 2-4 LDplus.
The analytical HPLC chromatograms were recorded with a Thermo Scientific Dionex
UltiMate 3000 system, including an LPG-3400SD pump, a WPS-3000SL autosampler, a
TCC-3000SD column compartment and a DAD-3000 detector. An ACE UltraCore 2.5
Super-C18 column (150×2.1 mm) was used as the stationary phase.
High-resolution ESI mass spectra of the synthesized compounds (shown in Fig. 3.6) were
recorded in the positive ion mode with a Q Exactive mass spectrometer.
1H- and 13C-NMR spectroscopy (TOCSY, HSQC) was performed on a Bruker AV-300
or AV-500/HD-500 spectrometer. Chemical shifts are reported in ppm and are
referenced to the residual solvent peak (DMSO-d6 at 2.5 ppm). Multiplicities are
indicated by s (singlet), d (doublet), t (triplet), bs (broad singlet), m (multiplet) and pq
(pseudo quartet). Coupling constants (J) are reported in Hertz [Hz].
5.2 isolation of biosilica-associated proteins
Thalassiosira pseudonana (strain CCMP 1335), T. oceanica (strain CCMP 1005) and Cyclotella
cryptica (strain CCMP 332) were grown in an enriched artificial seawater (ESAW) medium
according to the North East Pacific Culture Collection protocol (Canadian Center for
the Culture of Microorganisms ESAW Recipe [203]) at 18 °C under constant light at
5000–10 000 lx.
Isolation of diatom cell walls was performed by Christoph Heintze in the Nils Kröger
laboratory (B-CUBE, Dresden, Germany) as described previously [57]. Briefly, cells
were boiled twice in 2 % SDS / 100 mM EDTA to remove intracellular components and
membranes. Cell walls were pelleted by centrifugation (10 min, 3200×g), extracted
with acetone and washed extensively with H2O. Milli-Q (Merck, Darmstadt, Germany)
purified H2O (resistivity: 18.2 MΩ cm) was used throughout this procedure. The purified
cell walls (biosilica) were lyophilized and stored at −20 °C until further use.
Ammonium fluoride (NH4F) extraction of purified cell walls, yielding the ammonium fluoride
soluble material (AFSM), was performed as described by Kröger et al. [66]. Purified cell
walls were resuspended in 10 M ammonium fluoride and the suspension was acidified to
pH 4.5 by adding HCl. After 30 min at RT, the suspension was centrifuged (10 min,
3200×g) and the supernatant was subjected to dialysis against 10 mM ammonium acetate
(Spectra/Por dialysis tubing, 500 Da molecular weight cut-off). The dialysate was
centrifuged for 15 min at 3200×g and the supernatant was lyophilized and kept at
−20 °C.
For extraction of ammonium fluoride insoluble material, biosilica was isolated according
to Scheffel et al. [71] and treated with 0.1 mg/ml chitinase from Streptomyces
griseus ( 0.2 U/mg) from Sigma-Aldrich Co. (Schnelldorf, Germany) in chitinase
buffer¹ (50 mM potassium phosphate pH 6.0, 0.05 % (w/v) sodium azide, 1 mM PMSF)
at 37 °C in a shaker incubator for 2 d. The progress of chitin degradation was monitored
by Calcofluor White staining as described previously [163]. The chitinase-treated
biosilica was washed once with 1 % (w/v) SDS followed by 5× washing with H2O
by repeated centrifugation-resuspension cycles. The final pellet (i.e., chitin-free
biosilica) was resuspended in H2O and freeze-dried. The dry material was resuspended in
150 ml of 10 M NH4F and adjusted to pH 4.5 by the addition of HCl. The suspension was
incubated at room temperature for 30 min and centrifuged at 3200×g for 30 min. The
pellet was washed with H2O by resuspension-centrifugation (3200×g, 30 min). Residual
chitin was removed by a second chitinase treatment as described above, followed
by washing once with 1 % (w/v) SDS and 5× washing with H2O. The resulting
NH4F-insoluble organic matrix material (AFIM) was freeze-dried and stored at −20 °C until
use.
The isolation of silaffin-3 from T. pseudonana (tpSil3), which was used in Sections 3.1.3
and 3.3.3, was performed according to Poulsen and Kröger [67]. The dialysate of the AFSM
extract was loaded onto a HighS cation exchange column (Bio-Rad Laboratories, Richmond,
CA, USA) equilibrated in 50 mM ammonium acetate. After washing the column with
50 mM ammonium acetate and 0.5 M ammonia, LCPAs were eluted with 2 M NaCl in
pH 10.0 buffer (100 mM ammonia, 50 mM ammonium acetate). Next, the flow-through
from the HighS cation exchange column and the 50 mM ammonium acetate wash were
pooled, concentrated by lyophilization, and then subjected to fractionation using a
Superdex200 HiLoad 16/60 column (Amersham Biosciences, Little Chalfont, UK)
with a running buffer of 500 mM NaCl and 50 mM ammonium acetate at a flow rate of
1.0 ml/min. Fractions were analyzed by Tricine-SDS-PAGE [204] and staining with ‘Stains
All’ [86, 87]. Fractions eluting between 45 and 60 min (containing tpSil1/2 and tpSil3)
were combined, concentrated by ultrafiltration (molecular weight cut-off 10 kDa) and
loaded onto a Mono Q HR-5/5 column (Amersham Biosciences, Little Chalfont, UK)
equilibrated with 50 mM Tris-HCl, pH 6.4. Elution was performed at a flow rate of
0.5 ml/min by linearly increasing the NaCl concentration to 2 M in 1 h. Fractions
containing tpSil3, which eluted between 22.5 and 28.5 min, were pooled, exhaustively dialyzed
(molecular weight cut-off 7 kDa) against 10 mM ammonium acetate, and lyophilized.
The dry residue was dissolved in water and stored frozen at −20 °C until use.
¹ Note: the chitinase solution was filtered through a polyethersulfone syringe filter with 0.2 µm pore size.
5.3 expression of tpsil3 from synthetic gene
tpSil3 was expressed from a synthetic gene according to Kumar et al. [171]. The
database sequence of tpSil3 (without signal peptide) was concatenated in silico into
a single protein sequence, flanked with tryptic peptide sequences picked out from
PhosB (at the N-terminal side) and BSA (at the C-terminal side) and with two
affinity tags (Twin-strep-tag and His-tag) together with a 3C cleavage site (the sequence is
depicted in Fig. 5.2).
MGSAWSHPQFEKGGGSGGGSGGSAWSHPQFEKLEVLFQGPAAAKVFADYEEYVKDFYELEPHKVAAAFPGDVDRGLAGVENV
TELKEGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMPDEAGAKSAKMSMDTKSGKSEDAAAVD
AKASKESHMSISGDMSMAKSHKAEAEDVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKESSMP
SSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSAHLVDEPQNLIKYLYEIARQTALVELLKLGEYGFQNAL
IVRDAFLGSFLYEYSRLVNELTEFAKGSGHHHHHH
Figure 5.2 Sequence design of tpSil3 expressed from a synthetic gene. The sequence of tpSil3
(without signal peptide) is concatenated in silico into a single sequence, flanked with peptide
sequences from PhosB (at the N-terminal side) and bovine serum albumin (BSA) (at the
C-terminal side) proteins, with two affinity tags (Twin-strep-tag and His-tag) together with a
3C cleavage site.
This synthetic gene was obtained from GeneArt (Regensburg, Germany) and subcloned
into a pET expression vector, which was transformed into an E. coli strain that
was a dual auxotroph for arginine and lysine (ΔArgΔLys BL21 (DE3) T1 pRARE). Fresh
transformants were inoculated into the synthetic medium MDAG-135 [205] supplemented
with antibiotics (100 µg/ml ampicillin and 15 µg/ml chloramphenicol) and incubated
overnight at 37 °C on a shaking platform. After this overnight incubation, the starting
culture was diluted 1000× with MDAG-135 medium supplemented with the
same antibiotics and incubated at 37 °C until OD600 = 0.5. Cells in MDAG-135 medium
were induced with 0.2 mM IPTG. At 4 h to 6 h post induction, cells were pelleted,
resuspended in 2× PBS, aliquoted, snap-frozen in liquid nitrogen and stored at −80 °C
until analysed. Prior to analysis, frozen aliquots were thawed and cells were lysed in an
equal volume of 2× Laemmli buffer by incubating at 80 °C for 15 min. The samples were
spun down and the supernatant was subjected to SDS-PAGE. Protein bands were visualized
by Coomassie staining, and full-length expression of the corresponding synthetic gene
was validated by in-gel digestion with trypsin or Asp-N and LC-MS/MS of the recovered
peptides, as described in Section 5.14.
5.4 hcl hydrolysis
Hydrolysis of biosilica extracts (AFSM and AFIM) and of the protein tpSil3 was performed
with 6 M HCl in vacuo for 17 h at 110 °C in acid-washed clear-glass tubes. The HCl
hydrolysates were evaporated to dryness in a vacuum centrifuge at 40 °C and dissolved
in water for further analysis.
5.5 aqc-derivatization of amino acids and polyamines
Derivatization of acidic hydrolysates and standards with the AccQ·Tag Ultra Kit was carried
out according to the protocol provided by the manufacturer [206]. 20 µL of either
standard solution or biosilica hydrolysate was mixed with 140 µL of AccQ·Tag Ultra borate
buffer (pH ~9.0) and 40 µL of AccQ·Tag Ultra reagent that had previously been dissolved
in 1.0 mL of AccQ·Tag Ultra reagent diluent (the concentration of reconstituted AQC
is approximately 10 mM in acetonitrile). The solution was left for 1 min at RT and then,
to ensure complete derivatization and decomposition of unreacted reagent, the reaction
was allowed to proceed for 10 min at 55 °C. The AQC-derivatization mixture was diluted 10×
and immediately subjected to LC-MS/MS analysis.
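The volumes given above fix the dilution of the sample and the reagent concentration in the reaction mixture. The short calculation below is an illustrative sketch of this arithmetic only; the variable names are hypothetical, and the reagent stock concentration is the approximate value stated in the text.

```python
# Volumes (µL) taken from the derivatization protocol described above.
SAMPLE_UL = 20.0
BUFFER_UL = 140.0
REAGENT_UL = 40.0
REAGENT_STOCK_MM = 10.0   # approximate AQC concentration after reconstitution (mM)
POST_DILUTION = 10.0      # 10x dilution before LC-MS/MS

total_ul = SAMPLE_UL + BUFFER_UL + REAGENT_UL          # reaction volume
reaction_dilution = total_ul / SAMPLE_UL               # sample dilution in the reaction
reagent_mm = REAGENT_STOCK_MM * REAGENT_UL / total_ul  # AQC concentration in the reaction
overall_dilution = reaction_dilution * POST_DILUTION   # sample dilution at injection

print(total_ul, reaction_dilution, reagent_mm, overall_dilution)
```

With these numbers the sample is diluted 10-fold in the 200 µL reaction and 100-fold overall at the time of injection, while the reagent is present at roughly 2 mM during derivatization.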
5.6 lc-ms/ms analysis of qac-derivatives
Liquid chromatographic analysis of QAC-derivatives was performed on an Agilent 1200
LC system, equipped with a binary solvent manager, an autosampler, a column heater
and a DAD (set at 280 nm), and interfaced to a Q Exactive mass spectrometer. The
separation column was a HyperSil Gold Kappa column (0.5 mm i.d.×150 mm) packed
with 3 µm particles. An OPTI-PAK column (1 µL C18) was used as the trap column.
10 µL of the sample was loaded with solvent A at 0.020 µL/min. After loading, the
trap column was switched online to the separation column, and the mobile phase flow
rate was maintained at 20 µL/min. Eluent A was 0.1 %
FA in water, and eluent B was 0.1 % FA in neat acetonitrile. The column heater was set at
40 °C. The separation of QAC-derivatives was performed with a 60 min gradient, which is
provided in Table 5.2.
Table 5.2 HPLC gradient used for the analysis of QAC-derivatives.
Time (min)   0    10   20   50   55   57   60
B (%)        0    0    10   95   95   0    0
A: 0.1 % FA; B: 100 % ACN in 0.1 % FA
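The gradient in Table 5.2 is defined by break points only. As an illustration, the sketch below returns the proportion of eluent B at an arbitrary time, assuming linear ramps between consecutive break points (the usual behaviour of a binary gradient pump, but an assumption here). The function name `percent_b` is hypothetical.

```python
from bisect import bisect_right

# Gradient break points from Table 5.2: (time in min, % eluent B).
GRADIENT = [(0, 0), (10, 0), (20, 10), (50, 95), (55, 95), (57, 0), (60, 0)]


def percent_b(t_min: float) -> float:
    """Linearly interpolate % B at time t_min (clamped to the programme range)."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return float(GRADIENT[0][1])
    if t_min >= times[-1]:
        return float(GRADIENT[-1][1])
    i = bisect_right(times, t_min)  # index of the first break point after t_min
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (5, 15, 35, 56):
    print(t, percent_b(t))
```

The same function can be reused for any other gradient by replacing the list of break points.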
The LC was connected to the Q Exactive mass spectrometer under the control of Xcalibur
4.0 software (Thermo Fisher Scientific). Survey scans were acquired within the
range of m/z 140–700 at a resolution of 70 000 FWHM at m/z 200, with a target
value of 3 × 10⁶ ions and a maximal injection time of 100 ms. Each survey scan was
followed by MS/MS fragmentation targeted to the inclusion list derived from Tables 3.1
and A.1. Isolation of precursors was performed with a window of m/z = 2 at 5 ppm.
Spectra were acquired in one microscan under the stepped normalized collision energy
of 25, 30 and 35 % with a target value of 1 × 10⁵ ions and a maximal injection time of
100 ms. Resolution for HCD spectra was set to 70 000 at m/z 200, whereas the
first mass was fixed to m/z 80. Three replicate LC-MS/MS runs for each sample were
performed and saved as .RAW files (Thermo).
5.7 amino acid measurement using uv-detection
The amino acid analysis (AAA) with single-wavelength UV detection was performed at the
Functional Genomics Center Zürich (FGCZ, ETH Zürich, Switzerland) [207]. The
tpSil3 sample was hydrolyzed with HCl and derivatized with AQC reagent as described
above (Sections 5.4 and 5.5). Derivatives were separated with high resolution using UPLC
columns (1.7 µm particles). Amino acid concentrations were determined using the MassTrak
Amino Acid Analysis Solution (Waters Corporation, Milford, USA) with UV detection
at 280 nm. The results were distributed as .pdf and .txt files containing the
chromatogram and a tabular summary of the integration results.
5.8 direct infusion ms/ms analysis
Total biosilica hydrolysates and analytical standards were diluted with a mixture of
acetonitrile/water/FA (50/45/5, v/v/v). Dilutions of the hydrolysates and stock solutions
were selected individually for each experiment. Prior to the analysis, samples were
loaded into a 96-well plate (Eppendorf, Hamburg, Germany). Mass spectrometric analysis
was carried out in the positive ion mode using either a Q Exactive or a Q Exactive HF
mass spectrometer. Instruments were equipped with the robotic nanoflow ion source
TriVersa NanoMate (Advion BioSciences, Ithaca, NY, USA) using chips with spraying
nozzles with a diameter of 4.1 µm and controlled by Chipsoft 8.3.3 software. The
ionization voltage and gas back pressure were set to 2.00 kV and 0.80 psi, respectively.
Under these settings, 8 µL of the analyte could be electrosprayed for more than 50 min.
The temperature of the ion transfer capillary was 275 °C and the S-lens level was 65 %.
The samples were sprayed for 10 min. The FT MS mass resolution at m/z 400 was 140 000
(FWHM); the AGC target value was 3 × 10⁶ and the maximum injection time was 25 ms. One
FT MS spectrum was acquired within 3.52 s. The total acquisition time for all FT MS
spectra was 10 min.
5.9 acetylation of phosphopolyamines
The hydrolyzed biosilica extract from T. pseudonana was derivatized with acetic anhydride.
Briefly, the biosilica hydrolysate was evaporated to dryness and dissolved in 80 µl
of 30 mM triethylamine, resulting in a 200× dilution relative to the initial extract
volume. The mixture was derivatized by the addition of 160 µl of acetic
anhydride/isopropanol = 1/7 (v/v) (the total volume can be changed if the proportions are
maintained). The solutions were mixed thoroughly and incubated for at least 2 h at RT. The
derivatization reagents were removed by vacuum centrifugation at 40 °C. The dried
sample was reconstituted in 50 µl of 50 % acetonitrile containing 5 % FA for direct
infusion MS/MS analysis.
5.10 31p nmr measurements
31P-NMR was performed by Marcus Rauche in the Eike Brunner laboratory (Technische
Universität Dresden, Germany). All experiments were carried out at 300 K using an
Ascend™ 500 spectrometer from Bruker Daltonik GmbH (Bremen, Germany).
31P-NMR spectra were recorded at a resonance frequency of 202.45 MHz using a 5 mm
BBO prodigy cryo probe (cooled with nitrogen to increase the sensitivity). A pulse
length of 11.63 µs at 61 W and a relaxation delay of 2 s were used. For 1H decoupling
of the 31P-NMR spectra, WALTZ-16 was used. The samples were spun at 20 Hz. The chemical
shifts were referenced relative to H3PO4 for 31P-NMR. The samples were dissolved in
600 µl of 99.9 % D2O (Sigma-Aldrich Co., Schnelldorf, Germany); after centrifugation,
the supernatant was adjusted with 0.1 M HCl to pH ~5.0 (tested with universal indicator
paper) and transferred to a 5 mm NMR tube.
5.11 deglycosylation with tfms
Deglycosylation with the GlycoProfile IV Kit was performed according to the manufacturer’s
instructions [208]. A dry pellet of tpSil3 was dissolved in 150 µl of
trifluoromethanesulfonic acid (TFMS) and incubated for 30 min or 2 h at 0 °C. Subsequently,
the solution was neutralized by dropwise addition of 60 % pyridine (in ethanol), which
caused the formation of a fine precipitate after 150 µl of the pyridine solution had been
added. The precipitate was completely dissolved by adding 20 µl of water, and
neutralization was quickly completed by adding 150 µl of the pyridine solution². The
neutralization of the reaction mixture was monitored by the addition of 4 µl of Bromophenol
Blue Solution as an indicator dye until the pH was ~6.0. The neutral solution was mixed with
a sample buffer for SDS-PAGE and subjected to GeLC-MS/MS.
5.12 treatment with hf ·pyridine soluble complex
Chemical dephosphorylation/deglycosylation with anhydrous HF·pyridine soluble
complex was performed as described previously [174]. A dry pellet of tpSil3 was
dissolved in 50 µl of anhydrous HF·pyridine soluble complex and incubated at 0 °C for
30 min, 1 h, 2 h or 3 h. At each time point, the reaction mixture was neutralized with a
sample buffer for SDS-PAGE and subjected to GeLC-MS/MS.
² Note: the entire process of neutralization should be carried out quickly, keeping the reaction mixture cold at all stages to minimize protein degradation.
5.13 anhydrous hf-treatment
Biosilica extracts or tpSil3 were dried and dissolved in liquid HF [58, 59]. After 30 min
at 0 °C, HF was evaporated, and any remaining material was dissolved in water. The
dephosphorylation/deglycosylation efficiency was demonstrated by shifts to lower mass
in HF-treated proteins, evaluated by SDS-PAGE. After HF-treatment, samples were
analyzed by GeLC-MS/MS.
5.14 protein analysis by gelc-ms/ms
In-gel digestion was performed according to Shevchenko et al. [110, 111]. To visualize
protein lanes, gels were fixed, rinsed with water and successively stained with ‘Stains
All’ [86, 87] and Coomassie Brilliant Blue R 250. After destaining, the entire gel lanes
covering the mass range of 10–250 kDa were excised and cut into small pieces (~1 mm³).
The gel pieces were then transferred into 1.5 mL tubes and completely destained with
acetonitrile/water. Destained gel pieces were reduced with 10 mM DTT (for 45 min at
56 °C) and alkylated with 55 mM IAA (for 30 min in the dark at RT). The reduced and
alkylated gel pieces were washed with water/acetonitrile and then shrunk with
acetonitrile. The ice-cold protease solution (50 ng µl⁻¹) was added
to cover the shrunk gel pieces, which were incubated on ice for 2 h. The swollen gel pieces
were then covered with 10 mM NH4HCO3 and incubated at 37 °C for 12–18 h. Cleavage
specificity for the proteases³ used or discussed in this thesis is provided in Table 5.3.
The resulting peptides were extracted with water/acetonitrile/FA (49/50/1, v/v/v), dried
in a vacuum centrifuge and stored at −20 °C until use.
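The cleavage rules in Table 5.3 can be applied in silico to predict the peptides expected from a given sequence. The sketch below implements only the trypsin rule, [RK].〈P〉 (cleavage C-terminal to R or K unless followed by P), with an optional number of missed cleavages. It is an illustration rather than the way the search engine generates peptides; the function name `digest_trypsin` and the example sequence are hypothetical.

```python
def digest_trypsin(sequence: str, missed_cleavages: int = 0) -> list[str]:
    """Cleave C-terminal to R/K unless the next residue is P ([RK].<P>)."""
    # Cleavage sites are expressed as the index at which the next peptide starts.
    sites = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "RK" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(bounds):
                peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


# Hypothetical sequence, for illustration only.
peptides = digest_trypsin("AAKPSKMSRGG")
print(peptides)
```

Other proteases from the table could be handled analogously by changing the residue set and the side of the cleavage point.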
The resulting peptides were recovered in 5 % and 2.6 µL was injected using an AS-2
autosampler into a direct-pumping nanoflow liquid chromatography system (Eksigent
NanoLC 2D), which eliminates the limitations imposed by flow splitting. The NanoLC
was equipped with a 300 µm i.d.×5 mm trap and a 75 µm i.d.×15 cm separation column
packed with Acclaim PepMap100 C18 3 µm particles. Multiple lysine PTMs increase the
hydrophilicity of modified peptides, complicating RPLC separation. Therefore, LC
gradients were adjusted as follows: solvent A was 0.1 % FA in water; solvent B was
60 % acetonitrile in water containing 0.1 % FA. The samples were loaded for 8.5 min
with solvent A at 2 µL/min. After loading, the trap column was switched online to the
separation column, and the flow rate was set to 300 nL/min. The peptides were fractionated
using the 175 min program shown in Table 5.4.
³ Note: non-specific proteases, when allowed to work for a long time, can produce a large number of short
peptides, decreasing reproducibility and complicating further analysis. Therefore, for non-specific
proteases (e.g., Proteinase K), shorter incubation times were used (~4–6 h).
Table 5.3 Cleavage specificity of the proteases used (or discussed) in this thesis. For a review,
see [109, 180, 199]. ‘[]’, cleavage activators; ‘〈〉’, cleavage preventors; ‘.’, cleavage point; ‘Ψ’,
aliphatic, aromatic, or hydrophobic amino acids.
Protease; optimal pH; cleavage specificity; specificity used for Mascot searches
Trypsin; 7.5; [RK].〈P〉; C-terminal to an arginine or a lysine (if not followed by a proline)
Glu-C (V8); 8.0; [E].〈P〉 and slower at [D].〈P〉; C-terminal to a glutamic acid and slower to an aspartic acid (if not followed by a proline)
Asp-N; 4.0–9.0; .[D] and less specific at .[E]; N-terminal to an aspartic and a glutamic acid (with less specificity)
Chymotrypsin; 7.8–8.0; [FWY].〈P〉 and [LMADE].〈P〉 at slower rate; semi-specific
Proteinase K; 8.0; Ψ.; non-specific
OmpT [197]; 6.0–6.5; [KR].[KR]; cleaves within dibasic combinations of Arg and Lys
LisargiNase [198]; 7.5; .[RK]; N-terminal to an arginine or a lysine
Table 5.4 HPLC gradient used for the analysis of peptides.
Time (min)   0    25   145  150  155  175
B (%)        0    0    55   55   0    0
A: 0.1 % FA; B: 60 % ACN in 0.1 % FA
The nanoLC was connected to the LTQ Orbitrap Velos hybrid mass spectrometer un-
der the control of Xcalibur 4.0 software (Thermo Fisher Scientific). The DDA cycle
consisted of a survey scan acquired in µs within the range of m/z 350–1600 performed
under the target mass resolution of 60 000 FWHM in the Orbitrap analyzer. The automatic
gain control (AGC) target ion count was set to 1 × 10⁶ for FT MS scans with a maximal
fill time of 500 ms; the precursor ion isolation width was 3 Da; spectra were recorded in
centroid mode. The data-dependent acquisition (DDA) cycle consisted of acquiring an FT MS
survey spectrum followed by 8 MS/MS spectra with a fragmentation threshold of 4000
ion counts; singly charged precursor ions were excluded. Four CID and four HCD
fragment spectra were acquired in one microscan under the normalized collision energy
(nCE) of 35 % and a target value of 1 × 10⁴ ions (ion selection threshold 400 counts;
precursor ion isolation width m/z = 3). Activation parameter q = 0.25 and activation
time of 30 ms were applied. Fragmented precursors were dynamically excluded for
30 s. Two replicate LC-MS/MS runs for each sample were performed and saved as
.RAW files (Thermo). Lock mass was set to the singly charged ion of
dodecamethylcyclohexasiloxane ((Si(CH3)2O)6; m/z = 445.120025).
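The lock mass value can be verified from the elemental composition of dodecamethylcyclohexasiloxane, C12H36O6Si6, and the same arithmetic gives the mass deviation in ppm that underlies the tolerances used in this chapter. The sketch below is an illustrative check only; the function names `lock_mass_mz` and `ppm_error` are hypothetical.

```python
# Monoisotopic masses (Da); the proton mass is used for the [M+H]+ ion.
MASS = {
    "C": 12.0,
    "H": 1.00782503223,
    "O": 15.99491461957,
    "Si": 27.97692653465,
}
PROTON = 1.007276466


def lock_mass_mz() -> float:
    """m/z of [M+H]+ for dodecamethylcyclohexasiloxane, (Si(CH3)2O)6 = C12H36O6Si6."""
    neutral = 12 * MASS["C"] + 36 * MASS["H"] + 6 * MASS["O"] + 6 * MASS["Si"]
    return neutral + PROTON


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical = lock_mass_mz()
print(f"{theoretical:.6f}")
print(f"{ppm_error(445.120025, theoretical):.4f} ppm")
```

The helper `ppm_error` can equally be applied to precursor masses when judging whether a match falls within a given ppm tolerance.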
5.15 proteomics data processing
Data processing was performed using Proteome Discoverer 2.1 (Thermo Fisher Scientific,
Bremen, Germany). Beforehand, all MS2 spectra were processed using a custom-built
deconvolution node developed by Vladimir Gorshkov [183] to produce deisotoped
MS/MS spectra consisting only of singly charged fragments. Briefly, each isotopic
cluster is converted to one singly charged peak with the m/z value calculated according
to the formula (*), where $(m/z)_{1+}$ represents the mass-to-charge ratio of the
deconvoluted singly charged peak, $(m/z)_{z+}$ is the mass-to-charge ratio of the fragment
at charge state $z$, $z$ is the charge state of the fragment and $m_{\mathrm{H}^+}$ is the
mass of the proton:
$$(m/z)_{1+} = (m/z)_{z+} \times z - (z - 1) \times m_{\mathrm{H}^+} \qquad (*)$$
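Formula (*) translates directly into code. The following minimal sketch is not the deconvolution node itself, only the charge-state conversion it describes; the function name `to_singly_charged` is hypothetical.

```python
PROTON_MASS = 1.007276466  # Da


def to_singly_charged(mz: float, z: int) -> float:
    """Convert the m/z of a z+ fragment to the m/z of the same fragment at 1+ (formula (*))."""
    if z < 1:
        raise ValueError("charge state must be a positive integer")
    return mz * z - (z - 1) * PROTON_MASS


# A doubly charged fragment at m/z 500.5 corresponds to a 1+ peak near m/z 999.99.
print(f"{to_singly_charged(500.5, 2):.6f}")
```

For z = 1 the function returns its input unchanged, as expected for a peak that is already singly charged.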
Next, fragment masses were grouped with a Distance tolerance that was set to
5 ppm⁴. For the grouped peaks, masses were averaged and intensities were summed.
All multiply charged peaks were deconvoluted to the singly charged state, and
all peaks that could not be assigned to any charge state according to the isotopic
pattern were transferred to the deconvoluted spectra with charge state 1+. The parameter
Isotope peak difference / N was set to one.
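The grouping step can be sketched as follows: peaks are sorted by m/z, neighbouring peaks closer than the tolerance are merged, masses are averaged and intensities summed. This is an illustrative reimplementation under stated assumptions (unweighted averaging, comparison of each peak with its immediate neighbour), not the code of the deconvolution node; the function name `group_peaks` is hypothetical.

```python
def group_peaks(peaks, tol_ppm=5.0):
    """Merge neighbouring peaks whose m/z differ by less than tol_ppm.

    peaks: iterable of (mz, intensity) tuples.
    Returns a list of (mean m/z, summed intensity) tuples.
    """
    groups = []
    current = []
    for mz, intensity in sorted(peaks):
        # Start a new group when the gap to the previous peak reaches the tolerance.
        if current and (mz - current[-1][0]) / current[-1][0] * 1e6 >= tol_ppm:
            groups.append(current)
            current = []
        current.append((mz, intensity))
    if current:
        groups.append(current)
    return [
        (sum(m for m, _ in g) / len(g), sum(i for _, i in g))
        for g in groups
    ]


# Hypothetical peak list: the first two peaks are 2 ppm apart, the third is 18 ppm away.
merged = group_peaks([(500.0000, 10.0), (500.0010, 20.0), (500.0100, 5.0)])
print(merged)
```

Because each peak is compared only with its immediate neighbour, a chain of closely spaced peaks ends up in one group even if its first and last members differ by more than the tolerance.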
The Mascot 2.2.06 database search engine (Matrix Science, London, UK) [209] was
used for peptide identifications against a custom-made database (80 096 sequence
entries) containing protein sequences from the three diatom species (T. pseudonana [13],
T. oceanica [14] and C. cryptica [15]), concatenated with sequences of common
laboratory contaminants (proteases, keratins, etc.). Deconvoluted
fragment spectra were sorted into CID and HCD spectra, which were processed independently
with the parameters listed in Table 5.5.
4 i. e., a moving window is used to check each two neighbouring masses; m/z values with absolute differences of less than 5 ppm are considered to belong to the same m/z peak.
Table 5.5 Mascot search parameters. Where only one value is given, it applies to both CID and HCD MS/MS searches.

Parameter                                  CID MS/MS                       HCD MS/MS
Precursor tolerance                        10 ppm
Fragment tolerance                         0.6 Da                          60 mmu
Max missed cleavages                       3
Enzyme                                     See Table 5.3
Fixed modifications                        Carbamidomethyl (C)
Variable modifications                     Methionine oxidation and 2 ε-polyamine PTMs (see Section 3.3.3)
Precursor type (mass)                      Monoisotopic
Peptide charge                             2+ and 3+
Instrument                                 ESI-TRAP instrument settings    HCD instrument settings
  1+ fragments                             yes
  2+ fragments if precursor 2+ or higher   yes
  immonium ions                            no                              yes
  a-series                                 no                              yes
  b-series                                 yes
  y-series                                 yes
  y or y++ must be significant             no
  Max mass for internal fragments          700                             1500
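The tolerances in Table 5.5 are given in three different units (ppm, Da and mmu). The short sketch below shows how they translate into absolute mass windows; the helper names are hypothetical and the example m/z value is chosen only for illustration.

```python
# Sketch: convert the tolerance units used in Table 5.5 into Da.
# Helper names are illustrative assumptions, not part of Mascot.

def ppm_to_da(mz: float, ppm: float) -> float:
    """Absolute width in Da of a relative tolerance in ppm at a given m/z."""
    return mz * ppm * 1e-6


def mmu_to_da(mmu: float) -> float:
    """1 mmu (milli mass unit) equals 0.001 Da."""
    return mmu * 1e-3


def within_tolerance(observed: float, theoretical: float, tol_da: float) -> bool:
    """True if the observed mass matches the theoretical one within tol_da."""
    return abs(observed - theoretical) <= tol_da


if __name__ == "__main__":
    # 10 ppm precursor tolerance at an example m/z of 800
    print(ppm_to_da(800.0, 10.0))
    # 60 mmu HCD fragment tolerance expressed in Da
    print(mmu_to_da(60.0))
```

Expressed this way, the HCD fragment tolerance (60 mmu) is one tenth of the CID fragment tolerance (0.6 Da).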
Scaffold 4.8.7 (Proteome Software, Inc., Portland, OR, USA) was used to validate peptide-spectrum matches (PSMs) and protein identifications from Mascot searches [210, 211]. Proteins containing shared peptides were grouped according to the principle of parsimony. Peptide identifications were accepted if they could be established at greater than 99.0 % probability as specified by the Peptide Prophet algorithm [210]. Peptide and protein identities were accepted if they displayed a false discovery rate (FDR) ≤ 2 % based on the Scaffold Local FDR algorithm, with at least two unique peptides and a precursor ion mass accuracy ≤ 10 ppm. Protein identifications were accepted if they could be established at greater than 95.0 % probability [211] and were detected in at least one biological replicate. Proteomics data were deposited to the ProteomeXchange Consortium via the PRIDE [212] partner repository.
Sequence logos [196] for the identification of PTM consensus sites were produced with the TeXshade package [213]. Briefly, amino acid sequences were restricted to 20 amino acids (10 downstream and 10 upstream) and aligned by lysine PTM sites; details are described in the Results and Discussion (Section 3.3.6).
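The windowing step described above can be sketched as follows. This is an illustration only: the function name `lysine_windows`, the 1-based site numbering and the `-` padding character for sites near a terminus are assumptions, and the toy sequence and site positions are hypothetical rather than reported modification sites.

```python
# Sketch: cut fixed-width sequence windows centred on modified lysines so that
# all windows are aligned by the PTM site (input for a sequence logo).
# Assumptions: 1-based site positions, '-' padding at the protein termini.
from typing import Iterable, List


def lysine_windows(sequence: str, sites: Iterable[int],
                   flank: int = 10, pad: str = "-") -> List[str]:
    windows = []
    for pos in sites:
        idx = pos - 1  # convert to 0-based index
        if sequence[idx] != "K":
            raise ValueError(f"position {pos} is not a lysine")
        upstream = sequence[max(0, idx - flank):idx]
        downstream = sequence[idx + 1:idx + 1 + flank]
        # Pad so that the lysine always sits in the central column.
        windows.append(upstream.rjust(flank, pad) + "K" + downstream.ljust(flank, pad))
    return windows


if __name__ == "__main__":
    toy_sequence = "GSKSGKGSYG"  # hypothetical sequence
    for window in lysine_windows(toy_sequence, [3, 6], flank=3):
        print(window)
```

With `flank=10`, each window contains the central lysine plus 10 residues on either side, matching the 20 flanking residues stated in the text.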
118 supplemental material
[Figure A.1: reaction scheme with panels (a)–(f). Species shown: AQC, NHS, AMQ, 6-quinolinyl carbamic acid, 6-quinolinyl isocyanate, N,N′-bis(6-quinolinyl)urea, CO2, H2O, OH−.]
Figure A.1 Reactions of 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), which might occur in buffered aqueous solutions and/or during storage: (a) hydrolysis of AQC (catalyzed by acids or bases) results in N-hydroxysuccinimide (NHS) and 6-quinolinyl carbamic acid, which is unstable and spontaneously breaks down (b) to carbon dioxide and 6-aminoquinoline (AMQ); (c) alkaline hydrolysis of AQC eliminates NHS and gives 6-quinolinyl isocyanate; (d) acid- or base-catalyzed addition of water to the carbon-nitrogen double bond gives an N-substituted carbamic acid; (e) in the absence of a basic catalyst, the disubstituted urea N,N′-bis(6-quinolinyl)urea can be obtained by nucleophilic addition of AMQ; (f) the primary amine AMQ forms N,N′-bis(6-quinolinyl)urea by a nucleophilic substitution reaction [129].
Table A.1 Calculated N×QAC-derivatization groups for lysine ε-polyamine modifications from Fig. 3.1 (p. 39). Propylamine units (PA0, PA1, PA2); N-methyl groups (Me1–Me7); δ-hydroxylation of lysine (Hydroxy); phosphorylation of side hydroxyl (Phospho).
Backbone Me0 Me1 Me2 Me3 Me4 Me5 Me6 Me7 ...
Lys-PA0 2 2 — 1 — 1 — — — — — — — ...
Lys-PA1 3 3 2 2 — 2 1 1 — 1 — — — ...
Lys-PA1-PA2 4 4 3 3 2 3 2 2 1 2 1 1 1 ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Hydroxy-Lys-PA0 2 2 — 1 — 1 — — — — — — — ...
Hydroxy-Lys-PA1 3 3 2 2 — 2 1 1 — 1 — — — ...
Hydroxy-Lys-PA1-PA2 4 4 3 3 2 3 2 2 1 2 1 1 1 ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
Phospho-Hydroxy-Lys-PA0 2 2 — 1 — 1 — — — — — — — ...
Phospho-Hydroxy-Lys-PA1 3 3 2 2 — 2 1 1 — 1 — — — ...
Phospho-Hydroxy-Lys-PA1-PA2 4 4 3 3 2 3 2 2 1 2 1 1 1 ...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
[Figure A.2: log-log calibration plot. X axis: amount loaded on-column, log scale, pmol (0.001 to 100). Y axis: instrument response, log scale, a.u. (1.00E+03 to 1.00E+11). Series: Arg+QAC1, His+QAC1, Ser+QAC1, Gly+QAC1, Asp+QAC1, Glu+QAC1, Thr+QAC1, Ala+QAC1, Pro+QAC1, Lys+QAC2, Val+QAC1, Leu+QAC1, Ile+QAC1, Phe+QAC1, Tyr+QAC1, Met+QAC1.]
Figure A.2 Calibration curves of QAC-derivatized amino acids. The dynamic range and linearity for QAC-derivatives were obtained from calibration curves that were built using solutions of a standard physiological amino acid mix (Pierce Amino Acid Standard H, Table 5.1). Y – arbitrary abundance units, X – amount loaded on-column, pmol; both on logarithmic scales.
[Figure A.3: bar chart. Y axis: number of amino acid residues (0 to 60). X axis categories: amino acids (Arg, His, Ser, Gly, Asx, Glx, Thr, Ala, Pro, Val, Leu, Ile, Phe, Tyr, Met, Lys) and lysine PTMs (175, 261, 275, 289, 303, 319, 333, 399, 413). Series: Experimental, Theoretical. In-figure text: mapped modifications; amino acid analysis (AAA) and lysine PTMs profile of silaffin-3 from T. pseudonana (tpSil3) by mass spectrometry (MS)- and ultraviolet (UV)-detection. Structure panels: (a) PTM 175 (3×QAC); (b) PTM 261 (3×QAC); (c) PTM 275 (4×QAC); (d) PTM 289 (3×QAC); (e) PTM 303a (2×QAC); (f) PTM 319 (3×QAC); (g) PTM 333 (2×QAC); (h) PTM 399 (3×QAC); (i) PTM 413 (2×QAC).]
Figure A.3 Number of amino acid residues. Comparison of the obtained amino acid content of silaffin-3 from T. pseudonana (tpSil3) with the theoretical one (cf. Fig. 3.4). Numbers on top of each bar represent calculated amino acid residues. Only 25 % of unmodified lysines were detected, whereas ~75 % of the total lysine content is modified with different ε-modifications, displayed in (a)–(i). Asx, aspartic acid or asparagine; Glx, glutamic acid or glutamine; QAC, derivatization group.
a.1 analytical data for synthetic standards
(a) (S)-2-amino-5-((3-((3-(dimethylamino)propyl)amino)propyl)amino)pentanoic acid
1H-NMR (500 MHz, DMSO-d6): δ (ppm) = 1.61-1.89 (m, 4 H, γ H, β H2), 1.89-2.03 (m, 4 H, HNCH2CH2CH2NH), 2.79 (d, 1J = 3.2 Hz, 6 H, N(CH3)2), 2.89-3.05 (m, 8 H, δ H2, CH2NH), 3.06-3.17 (m, 2 H, CH2N(CH3)2), 3.95 (m, 1 H, α H), 8.31 (bs, 3 H, α NH3+), 8.68 (bs, 2 H, 2-NH2+), 8.79 (bs, 2 H, 2-NH2+), 9.78 (bs, 1 H, 3-HN+(CH3)2) (300 K).
13C-NMR (75 MHz, DMSO-d6): δ (ppm) = 21.2 (HNCH2CH2CH2NH), 21.9 (C γ), 22.8 (HNCH2CH2CH2NH), 27.4 (C β), 42.6 (N(CH3)2), 44.4 (CH2NHCH2), 46.7 (C δ), 51.8 (C α), 54.1 (CH2N(CH3)2) (300 K).
(b) (S)-2-amino-6-((3-((3-(dimethylamino)propyl)amino)propyl)amino)hexanoic acid
1H-NMR (500 MHz, DMSO-d6): δ (ppm) = 1.32-1.42 (m, 1 H, γ H), 1.42-1.52 (m, 1 H, γ H), 1.55-1.66 (m, 2 H, δ H2), 1.68-1.87 (m, 2 H, β H2), 1.87-2.04 (m, 4 H, HNCH2CH2CH2NH), 2.80 (d, 1J = 4.4 Hz, 6 H, N(CH3)2), 2.84-2.93 (m, 2 H, ε H2), 2.93-3.048 (m, 6 H, CH2NH), 3.06-3.18 (m, 2 H, CH2N(CH3)2), 3.98 (pq, 3J = 5.7 Hz, 1 H, α H), 8.26 (bs, 3 H, α NH3+), 8.65 (bs, 2 H, 2-NH2+), 8.81 (bs, 2 H, 2-NH2+), 9.78 (bs, 1 H, 3-HN+(CH3)2) (300 K).
13C-NMR (75 MHz, DMSO-d6): δ (ppm) = 21.8 (C γ), 21.2, 22.9 (HNCH2CH2CH2NH), 25.5 (C δ), 29.9 (C β), 42.6 (N(CH3)2), 44.3 (CH2NHCH2), 46.8 (C ε), 54.0 (CH2N(CH3)2), 52.2 (C α) (300 K).
Chemical shifts are reported in ppm and are referenced to the residual solvent peak (DMSO-d6 at 2.5 ppm). Multiplicities are indicated by s (singlet), d (doublet), t (triplet), bs (broad singlet), m (multiplet) and pq (pseudo quartet). Coupling constants (J) are reported in Hertz [Hz].
(a) Q9SE35 Silaffin-1 from C. fusiformis (cfSil1)MKLTAIFPLLFTAVGYCAAQSIADLAAANLSTEDSKSAQLISADSSDDASDSSVESVDAASSDVSGSSVESVDVSGSSLESVDVSGSSLESVDDSSEDSEEEELRILSSKKSGSYYSYGTKKSGSYSGYSTKKSASRRILSSKKSGSYSGYSTKKSGSRRILSSKKSGSYSGSKGSKRRILSSKKSGSYSGSKGSKRRNLSSKKSGSYSGSKGSKRRILSSKKSGSYSGSKGSKRRNLSSKKSGSYSGSKGSKRRILSGGLRGSM
(b) Q5Y2C2 Silaffin-1 from T. pseudonana (tpSil1)MKVTTSIITLLFASCGAADVQRVLEDVTEPAVTTPAATPAPITPEPATPAPTICEGRNFYYDEETRKCSNEATGGIYGTLIDCCVAISGSVSCPYVDICNTLQPSPSPETNEPSAKPITAAPISSAPVSAAPVTSAPVAAPVETTSMTGPTTIVASIVSTNAPSLTNAPSSSLEAVVTRIPVETTNTASPTTTAASIVSTNAPSSSPEAVVTPRPTFRPSPEGTESNTSPASIASDVMFGPPKTSTPTSTPTSSSHPSSSEPTLSPSVSKEPTGYPTSSPSHSPTKSPSKSPSSSPTTSPSASPTETPTETPTESPTESPTESPTLSPTESPTLSPTESPSLSPTLSTTWSPTGYPTLAPSPSPSISSAPSVSSAPSSPPSISSAPSVSSAPSKNFGFLPGLTEMPTISPTEDHYFFGKSHKSHKSHKSKATKTLKVSKSGKSAKSSKSSGRRPLFGVSQLSEGIAVGYAKSSGRSSQQAVGSWMPVAAACILGALSFFLN
(c) Q5Y2C1 Silaffin-2 from T. pseudonana (tpSil2)MKVTTSIITLLFASCGAADVQRVLEDVTEPAVTTPAATPAPITPEPATPAPTICEGRNFYRDDDTGKCSNEATGGIYGTLIECCVAISGSDSCPYVDICNTLQPSPSPETNEPSAKPITAAPISSAPVSAAPVTSAPVAAPVETTSMTGPTTIVASIVSTNAPSSTNAPSSSLEAVVTRIPVETTNTASPTTTAASIVSTNAPSSSPEAVVTPRPTFRPSPKGTESNTFPASIASDVMFDPARSEPTFTPTSSSQPSSSEPTLSPSVSKEPTRYPTSSPSHSPTKSPSKSPSSSPTTSPSASPTETPTETPTESPTELPTLSPTEFPSLSPTLSPTWSPTGYPTLAPSPSPSISSAPSVSSAPSSSPSISSAPSVSSAPSKNFGFLPGRNEMPTISPTEDHYFFGKSHKSHKSKATKTLKVSKSGKSSKSSKSSGSRPLFGVSQLSEGIAAGYAKSSGRSSQQAVGSWMPVAAACILGALSFFLN
(d) B8BRK6 silaffin-3 from T. pseudonana (tpSil3)MKTSAIALLAVLATTAATEPRRLRTLEGHGGDHSISMSMHSSKAEKQAIEAAVEEDVAGPAKAAKLFKPKASKAGSMPDEAGAKSAKMSMDTKSGKSEDAAAVDAKASKESHMSISGDMSMAKSHKAEAEDVTEMSMAKAGKDEASTEDMCMPFAKSDKEMSVKSKQGKTEMSVADAKASKESSMPSSKAAKIFKGKSGKSGSLSMLKSEKASSAHSLSMPKAEKVHSMSA
(e) B8C0W5 silaffin-4 from T. pseudonana (tpSil4)MKIIFPALAIIALVNGQQQVHRLRNDVIEHRVSSSASVATSTLFGRKGGRELSADRSEGSGGSGDEEAVDAKAEKTSTTGSAKAGKSAENEAATETSSKAAKLFKPKSSKGGASDASTEYESGASDASTEYESGASEAGAEVTAKAEKGSDDEGHDAKADKGTGSGKSGKSMSMHAKSGKGEAGSDMSVSSKAQMSYIHGSGDEGSDEATTSDASKATKVFKSSGKSGKGEAAGSSDMSVSSKPEKSEGSSEATTADASKATKVYKSDASTESKSAKHSASMPFGKSSKESDAKAHKGEMSVHQSKAFKGKSSKAMSVSSKAMSVSSKAASMSHYTHGYEKSIFG
(f ) B8C322 CingulinW1 from T. pseudonana (tpCinW1)MKIGYSLALLAVASASAQNTGLRGSDAEVELFNRKLSDWGDDGWNDDGWNDDGWGGSGSSSKSSKSGSSGSSGKSGKSGSSGKSGKSGSGDSWSDDGWSGSSGKSGKGDYGGSSGKSGKGGYGGHWVWEGSDDSTSWGSDDSYSSGKSGKGSKGSSKSSKGSGKSSKGSGKSSKGSDSSDDGEWGSGGWGSGGWGGGSSGKSGKGSYGGWASSDDGSWGGGSSGKSGKGSYGSSGKSGKGSYGGWAPSDDGWDGDGWYGGDSSGKSGKGSSGGSGKSGKGSYDGGWGSDDGTSWGSDDSYSSGKSGKGSKGSSKSSKGSSSKSSKGSSSKSGKGSGKSSKGSSDSSSSWDGHGGWSDSWGGDYSGKSGKGSSGKSGKGSSGGSWGSGSSGKSGKGSSGGSGDSSYGGWDGDSYREYGGF
(g) B8CDQ9 CingulinW2 from T. pseudonana (tpCinW2)MKLALFLTIPTLIAAQQSSVRGVATTSSRQLDEWGDDAWGSSDSGSSGKSGKSGGSASSGDGWETDGWGGDYSSSKSGKSGSGKSGKGSSGPHGHWVYIEDDSSDGSGKSGKGSSSKGSKGSSKSSKGSSSDDSTDDSWDGGWGGHGGWNGDNSGKSGKGSYGSGKSGKGSSYPSSHWGPSHWGSDDDDSSSSKSSKGSSESSSKSSKGSSDSSSKSSKGSSSSEDEGHWEWEGGYGSGKSGKGSYSGSSGKSGKSGSGDSWVGDYGSSGKSGKGSYGGDSWGGNYNGWGGHYDVDVDDDDSSSSKSSKGSSKSSKGSSEDSSKSSKGSSSKSSKGSSSSEDEGHWVWEGSYGSGKSGKGSYSGSSGKSGKSGSGDEGWYSGW
(h) B8CEX1 CingulinW3 from T. pseudonana (tpCinW3)MKAALILALAAGASAEITDQFERELGKSGKGSYGDWGGNYNGWGGNYWGDSSSDSSSKSSKGSSKSGKGSSKSGKGSSKSSKGSSKSSKGSSSSSDWSDDGWHWVSGWGSSYDGKSGKGSYGGDSGKSGKGSYDGGWGSYGSGKSGKGSYGGWSDGSGKSGKGSYGGWSDGSGKSGKGSYDGGWGSYGSGKSGKGSYGGWSDGSGKSGKGSYGGWSDGSDGGWGSSSEYEGWYSGHGGWGSDDDDSWGSSSSSKSSKGSSKSGKGSSKSSKGSSKSSKGSSKSSKGSSKSSKGSSKSSKGSSDGNWVWVSGWGSDDDHWGGGSGKSGKGSSGGGWSDDGWGAGSSSKSSKSGSGDDGWGGSDGHIVESNNNWVGSGDGGDSWGTDGWTNDGHDDKWSGDSWADDGHVSGSGKSGKSGSGGSGDGWGGSDGSSKSSKSGSGGSGDAWGGSDGSSKSGKSGSGGSGDAWGGSDGSSSKSGKSGSGGSGDSWGDDAWGGSDGSSSKSGKSGSGGSGDSWGDDAWGGSGGSSSKSGKSGSGGSSDSWGSSSKSGKSGSGGSSKSGKSGAGADGWEADGYEQDSAISKASTEMSFSTEASSSNRRRIVVALGAAAGGAVLLL
(i) B8CGN3 CingulinY1 from T. pseudonana (tpCinY1)MKSIIALSTIALASAGTNKTLAPTPFPGRPTPIPTPVNTYIVTEQTPAPTPGDVITPAPTICEEKIFFFDGGMCTNMFEVADGSSYNTLIQCCNANFGSFAMCVYEDMCVDVKPTRRPTTRPPTDMSYNYGIVDCFGKSGKSGSGCGKSGKGSKSSKSGGGYGYGDNYVDDYTPSTNDYSHSTNDYTPSTNDYEYGYGHGSSGKSGKGSKSSKGGKSSKSSGKSSKSSGKSGKGSSSSGKSGKGSDGHYTGDGYRYDDDAYYRKLSEGQAGGLRRTRKMP
(j) B8CGQ5, CingulinY2 from T. pseudonana (tpCinY2)MKLIIALSAITLASAGTNKTLPPTPFPGRPTPNPTMVNTIGTPGPSFIVTEQTPAPTPGDVLTPQPTPLPTLGGVPTTKMPTEMSYGYGYGDYGIVDCFGKSGKSGSGCGKSGKGSKSSGKSGKSGGGGGGGYGYGDNYADDYTPSTDDYEYGYGHGGSSGKSGKGSSGKSGKSSSKSSKGSGKSSKSSGKSSKSSGKSGKGGSRDDGHGYGGYGGYEGYGGYEGYQYGGDEYVRRNRRLGASHNNRI
(k) B8CGS1, CingulinY3 from T. pseudonana (tpCinY3)MKFSASILLLTVATASAGTNKTLAPTPFPGRPTPPGAGTPFPTENTPAPSPAFGTKPPTPVSESVASLLIVSWFVLGSMWPLNGRMNVSLTVHTVDRWRADTTLKDGTAEQRDRCSSYEPPQYSYEPPTTGCSKAGKGGKSGSMDYLIDCIDLSSKSGKSGSGYGPSSSKGGKSGSSSAGYGDDYTATTDDYSAGADAGKSENYDEEASRDDGHYGASSKGGKSGSAGYGDEGYGSSAGSSKGGKSEADGYGDESYGDSGDSKAGKAEAGYGDDYGASAKSGKGSDGYGSSSKSGKAGSAKSGKGEGYHMFHDKSGKGGKGSSSGGEGYGYGYDEAHDYGYGRRTRGLRASQ
Figure A.4 Sequences of biosilica-associated proteins. Colour codes: KXXK, tetrapeptide motif; RXL, N-terminal processing site; N, asparagine residues (putative N-linked glycosylation site); S/T, hydroxy-amino acid residues (putative O-linked glycosylation and phosphorylation site).
a.2 extracted-ion-chromatograms of qac-derivatives
[Figure A.5: six stacked extracted-ion chromatogram panels. X axis: time (min), 0 to 35. Y axis: relative abundance, 0 to 100. Panels: PTM 319 (3×QAC), PTM 333 (2×QAC), PTM 347 (2×QAC), PTM 399 (3×QAC), PTM 413 (2×QAC), PTM 427 (2×QAC).]
Figure A.5 Extracted-ion chromatograms (XICs) (3 ppm accuracy) of phosphopolyamine modifications detected in biosilica extracts from T. pseudonana and C. cryptica. For clarity of presentation, the structures are annotated with nominal m/z values of their underivatized molecular ions. Phosphate is highlighted in red; the QAC derivatization group is faded out.
[Figure A.6: bar chart (rotated page) for T. pseudonana. Y axis: mol. %, 0 to 30 %. X axis: nominal m/z of PTM (N×QAC-groups), each measured before (–) and after (+) HF-treatment; n.d., not detected. Categories: PTM 161 (2×QAC), PTM 175 (1×QAC), PTM 189 (1×QAC), PTM 204 (3×QAC), PTM 261 (4×QAC), PTM 275 (4×QAC), PTM 289 (3×QAC), PTM 303 (2×QAC), PTM 303 (3×QAC), PTM 163 (2×QAC), PTM 205 (1×QAC), PTM 319 (3×QAC), PTM 333 (2×QAC), PTM 347 (1×QAC), PTM 399 (3×QAC), PTM 413 (2×QAC), PTM 427 (1×QAC). Groups: (b) ε-methylated; (c) ε-polyamines; (d) δ-hydroxy-polyamines; (e) phosphorylated. Structures on top (non-phosphorylated and phosphorylated): PTM 319 (3×QAC), PTM 333 (2×QAC), PTM 347 (2×QAC), PTM 399 (3×QAC), PTM 413 (2×QAC), PTM 427 (2×QAC).]
Figure A.6 Structure and content of lysine post-translational modifications (PTMs) in hydrolysates of AFSM extracts from T. pseudonana before (–) and after (+) HF-treatment. Error bars for 2 replicates. Chemical structures of detected phosphorylated modifications and their non-phosphorylated counterparts are depicted on top. Phosphorylated structures were completely converted to non-phosphorylated ones by HF-treatment. PTMs are annotated with nominal m/z values of singly protonated molecular ions (with the respective number of QAC-groups in brackets). See also Fig. 3.10 and Table 3.2.
[Figure A.7: 21 stacked extracted-ion chromatogram panels. X axis: retention time (min), 0 to 60. Y axis: relative abundance (%), 0 to 100. Panels: m/z=147 (2×QAC), m/z=161 (2×QAC), m/z=175 (1×QAC), m/z=189 (1×QAC), m/z=205 (1×QAC), m/z=232 (2×QAC), m/z=275 (4×QAC), m/z=275-orn (internal standard, 3×QAC), m/z=289 (3×QAC), m/z=303a (2×QAC), m/z=303b (3×QAC), m/z=317a (1×QAC), m/z=317b (2×QAC), m/z=319 (3×QAC), m/z=331a (1×QAC), m/z=331b (2×QAC), m/z=333 (2×QAC), m/z=347 (2×QAC), m/z=399 (3×QAC), m/z=413 (2×QAC), m/z=427 (2×QAC). Retention times labelled on the traces (min, in order of appearance): 23.82, 24.50, 16.97, 22.66, 29.65, 26.50, 21.11, 27.18, 6.01, 20.90, 26.06, 5.52, 18.19, 19.75, 18.12, 23.84, 19.18, 11.23, 20.61, 16.79, 26.11.]
Figure A.7 Extracted-ion chromatograms (XICs) (3 ppm accuracy) of QAC-derivatized ε-polyamine modifications detected in biosilica extracts from the three diatom species. QAC, derivatization group (faded out).
[Figure A.8: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 275-orn (m/z 275; 1+). Panel (b): Fragment spectrum of 3×QAC-derivatized lysine modification 275 (m/z 393; 2+).]
Figure A.8 HCD tandem mass spectrometry (MS/MS) spectra of the synthetic ornithine derivative PTM 275-orn (internal standard). (a) Spectrum of the underivatized molecule (m/z 275.2442; 1+), nCE 30 %; (b) spectrum of the 3×QAC-derivatized molecule (m/z 393.1977; 2+), nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
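The accurate masses quoted in this caption can be cross-checked from elemental compositions. The sketch below is a minimal illustration under two assumptions that are not stated explicitly in the text: each QAC group adds C10H6N2O to the neutral molecule (inferred from the QAC fragment annotated as C10H7ON2 in the spectra), and the neutral 275-orn standard is C13H30N4O2 (the annotated ion composition C13H31O2N4 minus one hydrogen). All function names are hypothetical.

```python
# Sketch: monoisotopic m/z of QAC-derivatized ions and delta mass in mmu.
# Assumptions: QAC adds C10H6N2O per group; ions are protonated [M + zH]z+.
import re

MONOISOTOPIC = {  # monoisotopic atomic masses in Da
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
}
PROTON = 1.00727646688  # Da
QAC = "C10H6N2O"        # assumed mass increment of one QAC group


def mono_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C13H30N4O2'."""
    return sum(
        MONOISOTOPIC[element] * (int(count) if count else 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


def mz_derivatized(neutral_formula: str, n_qac: int, z: int) -> float:
    """m/z of the neutral molecule carrying n_qac QAC groups and z protons."""
    mass = mono_mass(neutral_formula) + n_qac * mono_mass(QAC)
    return (mass + z * PROTON) / z


def delta_mmu(observed: float, theoretical: float) -> float:
    """Mass error in milli mass units (1 mmu = 0.001 Da)."""
    return (observed - theoretical) * 1e3


if __name__ == "__main__":
    print(round(mz_derivatized("C13H30N4O2", 0, 1), 4))  # underivatized, 1+
    print(round(mz_derivatized("C13H30N4O2", 3, 2), 4))  # 3×QAC, 2+
```

Under these assumptions the two computed values agree with the m/z 275.2442 (1+) and m/z 393.1977 (2+) given in the caption to within 0.1 mmu.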
[Figure A.9: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 161 (m/z 161; 1+). Panel (b): Fragment spectrum of 2×QAC-derivatized lysine modification 161 (m/z 251; 2+).]
Figure A.9 HCD MS/MS spectra of lysine derivative 161. (a) Spectrum of the underivatized molecule (m/z 161; 1+), nCE 30 %; (b) spectrum of the 2×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.10: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 175 (m/z 175; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 175 (m/z 251; 2+).]
Figure A.10 HCD MS/MS spectra of lysine derivative 175. (a) Spectrum of the underivatized molecule (m/z 175; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.11: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 189 (m/z 189; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 189 (m/z 251; 2+).]
Figure A.11 HCD MS/MS spectra of lysine derivative 189. (a) Spectrum of the underivatized molecule (m/z 189; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.12: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 232 (m/z 232.2020; 1+). Panel (b): Fragment spectrum of 2×QAC-derivatized lysine modification 232 (m/z 286.65; 2+).]
Figure A.12 HCD MS/MS spectra of lysine derivative 232. (a) Spectrum of the underivatized molecule (m/z 232.2020; 1+), nCE 35 %; (b) spectrum of the 2×QAC-derivatized molecule (m/z 286.65; 2+), nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.13: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 275 (m/z 275; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 275 (m/z 251; 2+).]
Figure A.13 HCD MS/MS spectra of lysine derivative 275. (a) Spectrum of the underivatized molecule (m/z 275; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.14: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 289 (m/z 289; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 289 (m/z 251; 2+).]
Figure A.14 HCD MS/MS spectra of lysine derivative 289. (a) Spectrum of the underivatized molecule (m/z 289; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.15: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 317a (m/z 317; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 317a (m/z 251; 2+).]
Figure A.15 HCD MS/MS spectra of lysine derivative 317a. (a) Spectrum of the underivatized molecule (m/z 317; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.16: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 317b (m/z 317; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 317b (m/z 251; 2+).]
Figure A.16 HCD MS/MS spectra of lysine derivative 317b. (a) Spectrum of the underivatized molecule (m/z 317; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.17: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 331a (m/z 331; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 331a (m/z 251; 2+).]
Figure A.17 HCD MS/MS spectra of lysine derivative 331a. (a) Spectrum of the underivatized molecule (m/z 331; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.18: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 331b (m/z 331; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 331b (m/z 251; 2+).]
Figure A.18 HCD MS/MS spectra of lysine derivative 331b. (a) Spectrum of the underivatized molecule (m/z 331; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
[Figure A.19: two HCD MS/MS spectrum panels with annotated fragment peaks. X axis: m/z. Y axis: relative abundance (0 to 100). Panel (a): Fragment spectrum of underivatized lysine modification 205 (m/z 205; 1+). Panel (b): Fragment spectrum of 1×QAC-derivatized lysine modification 205 (m/z 251; 2+).]
Figure A.19 HCD MS/MS spectra of lysine derivative 205. (a) Spectrum of the underivatized molecule (m/z 205; 1+), nCE 30 %; (b) spectrum of the 1×QAC-derivatized molecule, nCE 30 %. Fragment peaks are annotated with the accurate mass, the corresponding calculated chemical composition (CHNOP) and the delta mass (in mmu).
138 supplemental material
80 100 120 140 160 180 200 220 240 260 280 300 320 340
m/z
0
5
10
15
20
25
30
35
40
45
50
55
60
65
70
75
80
85
90
95
100
Rela
tive A
bundance
129.1387
C 7 H17 N2
0.1020 mmu
98.0969
C 6 H12 N
0.4771 mmu
319.2700
C 15 H35 O 3 N4
-0.3383 mmu
86.0970
C 5 H12 N
0.6054 mmu
+
129.1386
(a) Fragment spectrum of underivatized lysine modification 319 (m/z 319; 1+)
[Spectrum plot: x-axis m/z (100 to 500), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
129.1388, C7 H17 N2, 0.2146 mmu
171.0554, C10 H7 O N2, 0.0695 mmu
191.1391, C8 H19 O3 N2, 0.0954 mmu
268.1444, C16 H18 O N3, -0.0296 mmu
299.1866, C17 H23 O N4, 0.0076 mmu
361.1870, C18 H25 O4 N4, -0.0200 mmu
489.3183, C25 H41 O4 N6, -0.0514 mmu
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 319 (m/z 251; 2+)
Figure A.20 HCD MS/MS spectra of lysine derivative 319. (a) spectrum of underivatized molecule (m/z 319; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
[Spectrum plot: x-axis m/z (80 to 340), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0970, C5 H12 N, 0.6080 mmu
98.0969, C6 H12 N, 0.4637 mmu
115.1232, C6 H15 N2, 0.2532 mmu
143.1542, C8 H19 N2, -0.0280 mmu
188.2120, C10 H26 N3, -0.1125 mmu
333.2855, C16 H37 O3 N4, -0.5031 mmu
Additional unannotated peak labels: 143.1543, 188.2121
(a) Fragment spectrum of underivatized lysine modification 333 (m/z 333; 1+)
[Spectrum plot: x-axis m/z (100 to 500), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
98.0970, C6 H12 N, 0.5742 mmu
143.1544, C8 H19 N2, 0.1438 mmu
171.0554, C10 H7 O N2, 0.0847 mmu
191.1391, C8 H19 O3 N2, 0.0802 mmu
268.1444, C16 H18 O N3, 0.0010 mmu
333.2860, C16 H37 O3 N4, -0.0373 mmu
503.3343, C26 H43 O4 N6, 0.2287 mmu
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 333 (m/z 251; 2+)
Figure A.21 HCD MS/MS spectra of lysine derivative 333. (a) spectrum of underivatized molecule (m/z 333; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
[Spectrum plot: x-axis m/z (80 to 400), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
98.0971, C6 H12 N, 0.6326 mmu
129.1389, C7 H17 N2, 0.3088 mmu
157.1702, C9 H21 N2, 0.2383 mmu
219.0572, C8 H15 O2 N P2, -0.0562 mmu
251.2702, C18 H35, -3.1556 mmu
303.0934, C10 H15 O7 N4, -0.0772 mmu
347.3019, C17 H39 O3 N4, 0.2623 mmu
Additional unannotated peak label: 157.1699
(a) Fragment spectrum of underivatized lysine modification 347 (m/z 347; 1+)
[Spectrum plot: x-axis m/z (80 to 420), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
130.0865, C6 H12 O2 N, 0.2622 mmu
171.0554, C10 H7 O N2, 0.1305 mmu
191.1391, C8 H19 O3 N2, 0.1259 mmu
254.0924, C14 H12 O2 N3, 0.0193 mmu
268.1445, C16 H18 O N3, 0.0925 mmu
332.1241, C16 H18 O5 N3, 0.0179 mmu
386.4532, C23 H54 N4, 18.9479 mmu
Additional unannotated peak label: 229.6420
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 347 (m/z 251; 2+)
Figure A.22 HCD MS/MS spectra of lysine derivative 347. (a) spectrum of underivatized molecule (m/z 347; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
[Spectrum plot: x-axis m/z (100 to 400), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0968, C5 H12 N, 0.3916 mmu
98.0967, C6 H12 N, 0.2241 mmu
129.1384, C7 H17 N2, -0.2371 mmu
143.1539, C8 H19 N2, -0.3740 mmu
174.1960, C9 H24 N3, -0.4372 mmu
303.2744, C15 H35 O2 N4, -1.0685 mmu
319.2693, C15 H35 O3 N4, -1.1088 mmu
399.2354, C17 H31 O5 N6, 0.3553 mmu
Additional unannotated peak label: 129.1386
(a) Fragment spectrum of underivatized lysine modification 399 (m/z 399; 1+)
[Spectrum plot: x-axis m/z (100 to 600), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
98.0970, C6 H12 N, 0.5818 mmu
129.1389, C7 H17 N2, 0.2451 mmu
171.0554, C10 H7 O N2, 0.0847 mmu
220.0969, C12 H14 O3 N, 0.0887 mmu
299.1866, C17 H23 O N4, 0.0076 mmu
399.2368, C15 H36 O6 N4 P, 0.0881 mmu
436.1505, C23 H22 O6 N3, 0.1398 mmu
569.2851, C15 H41 O13 N10, 0.1876 mmu
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 399 (m/z 251; 2+)
Figure A.23 HCD MS/MS spectra of lysine derivative 399. (a) spectrum of underivatized molecule (m/z 399; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
[Spectrum plot: x-axis m/z (100 to 500), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0970, C5 H12 N, 0.6105 mmu
143.1543, C8 H19 N2, -0.0074 mmu
188.2120, C10 H26 N3, -0.0887 mmu
223.1204, C12 H17 O3 N, 0.1374 mmu
333.2856, C16 H37 O3 N4, -0.4587 mmu
413.2519, C19 H29 O N10, -0.1028 mmu
Additional unannotated peak label: 143.1543
(a) Fragment spectrum of underivatized lysine modification 413 (m/z 413; 1+)
[Spectrum plot: x-axis m/z (100 to 450), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0971, C5 H12 N, 0.6962 mmu
98.0970, C6 H12 N, 0.5665 mmu
143.1544, C8 H19 N2, 0.1285 mmu
171.0554, C10 H7 O N2, 0.0695 mmu
188.2122, C10 H26 N3, 0.1101 mmu
268.1444, C16 H18 O N3, -0.0296 mmu
413.2524, C16 H38 O6 N4 P, 0.0630 mmu
439.2317, C17 H36 O7 N4 P, 0.1076 mmu
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 413 (m/z 251; 2+)
Figure A.24 HCD MS/MS spectra of lysine derivative 413. (a) spectrum of underivatized molecule (m/z 413; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
[Spectrum plot: x-axis m/z (100 to 450), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0968, C6 H14, -12.1871 mmu
116.0706, C3 H8 O N4, 1.3023 mmu
143.1539, C7 H17 N3, 12.1900 mmu
157.1695, C8 H19 N3, 12.1536 mmu
202.2271, C13 H30 O, -1.9861 mmu
279.7980, C2 H O14 P, -111.8351 mmu
347.3004, C15 H37 O2 N7, 0.0821 mmu
427.2665, C15 H38 O5 N7 P, -0.1963 mmu
Additional unannotated peak label: 157.1699
(a) Fragment spectrum of underivatized lysine modification 427 (m/z 427; 1+)
[Spectrum plot: x-axis m/z (100 to 450), y-axis relative abundance (0 to 100). Annotated peaks as m/z, composition, delta mass:]
86.0972, C5 H12 N, 0.7267 mmu
143.1544, C9 H21 N, -12.4170 mmu
171.0554, C6 H10 O N3 P, -0.1801 mmu
242.1289, C10 H19 O N4 P, -0.2235 mmu
311.1367, C10 H18 O N9 P, 0.0135 mmu
427.2684, C22 H33 O2 N7, -0.6227 mmu
(b) Fragment spectrum of 1×QAC-derivatized lysine modification 427 (m/z 251; 2+)
Figure A.25 HCD MS/MS spectra of lysine derivative 427. (a) spectrum of underivatized molecule (m/z 427; 1+), nCE to 30 %; (b) spectrum of 1×QAC-derivatized molecule, nCE to 30 %. Fragment peaks are annotated with an accurate mass, corresponding calculated chemical composition (CHNOP) and delta mass (in mmu).
Table A.2 Sequences of identified proteins bearing lysine ε-polyamine PTMs. The legend for the color-coding: COVERAGE, peptide coverage; KXXK and RXL, sequence motifs (where X stands for any amino acid); K, modified lysine residue; M, oxidized methionine residue; TP, T. pseudonana; CC, C. cryptica; TO, T. oceanica; AFSM, ammonium fluoride soluble material; AFIM, ammonium fluoride insoluble material.

 #   protein   species   fraction
 1   B5YLH4    TP        AFSM
 2   B5YLI3    TP        AFSM, AFIM
 3   B5YLL4    TP        AFSM
 4   B5YLX5    TP        AFSM, AFIM
 5   B5YNQ3    TP        AFSM
 6   B8BRK6    TP        AFSM
 7   B8BSN6    TP        AFSM
 8   B8BV44    TP        AFSM
 9   B8BYI7    TP        AFSM
10   B8C0W5    TP        AFSM
11   B8C2P6    TP        AFSM
12   B8C2P7    TP        AFSM
13   B8C406    TP        AFSM
14   B8C8R2    TP        AFSM
15   B8C9R4    TP        AFSM
16   B8CBB3    TP        AFSM
17   B8CC24    TP        AFSM
18   B8CE63    TP        AFSM
19   B8CG95    TP        AFSM
20   B8CGN3    TP        AFIM
21   B8CGQ5    TP        AFIM
22   B8CGS1    TP        AFSM, AFIM
23   B8LBG8    TP        AFSM
24   B8LBU6    TP        AFSM
25   B8LDT2    TP        AFSM
26   g11469    CC        AFSM, AFIM
27   g11606    CC        AFIM
28   g13975    CC        AFSM, AFIM
29   g1484     CC        AFSM, AFIM
30   g15479    CC        AFSM, AFIM
31   g15720    CC        AFSM
32   g22685    CC        AFSM
33   g25187    CC        AFSM, AFIM
34   g2543     CC        AFIM
35   g3798     CC        AFIM
36   g3964     CC        AFIM
37   g749      CC        AFIM
38   g7979     CC        AFIM
39   g8502     CC        AFSM, AFIM
40   g9515     CC        AFSM, AFIM
41   K0R7E4    TO        AFSM
42   K0R8C7    TO        AFSM
43   K0RCW9    TO        AFIM
44   K0RHV4    TO        AFSM
45   K0RIC9    TO        AFIM
46   K0RN71    TO        AFSM
47   K0RU48    TO        AFSM, AFIM
48   K0RWT0    TO        AFSM
49   K0S1R3    TO        AFSM
50   K0S7V0    TO        AFSM
51   K0S9A6    TO        AFSM
52   K0SAX6    TO        AFSM
53   K0SQ58    TO        AFSM
54   K0SSD7    TO        AFSM
55   K0T322    TO        AFIM
56   K0T463    TO        AFIM
57   K0T7A1    TO        AFSM
58   K0TBU5    TO        AFSM
59   K0TCJ2    TO        AFSM
60   K0RNL4    TO        AFSM, AFIM
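The KXXK and RXL motifs named in the legend of Table A.2 can be located in a protein sequence with a short pattern search. The sketch below is illustrative only: the helper name `find_motif` and the toy sequence are hypothetical and are not taken from the identified proteins. A lookahead is used so that overlapping occurrences (for example two KXXK motifs sharing a lysine) are all reported.

```python
import re


def find_motif(sequence, pattern):
    """Return (1-based start position, matched residues) for every occurrence,
    including overlapping ones. `pattern` is a regular expression in which
    '.' stands for any amino acid (the X of KXXK or RXL)."""
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


KXXK = "K..K"
RXL = "R.L"

if __name__ == "__main__":
    toy_sequence = "AKSGKAAKRSL"  # hypothetical sequence, for illustration only
    print(find_motif(toy_sequence, KXXK))  # two overlapping KXXK motifs
    print(find_motif(toy_sequence, RXL))
```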
TP        ..IK↑XXK..   ..KXXIIK↑..   Total
PAK↑          11            6          17
MeK↑           1            8           9
Total         12           14          26
(a) P = 0.0145

TP             ..KXXUnK↑..   ..KXXMeK↑..   Total
..UnK↑XXK..         3             2           5
..PAK↑XXK..         3             5           8
Total               6             7          13
(b) P = 0.5921

CC        ..IK↑XXK..   ..KXXIIK↑..   Total
PAK↑          11            0          11
MeK↑           0           24          24
Total         11           24          35
(c) P < 0.0001

CC             ..KXXUnK↑..   ..KXXMeK↑..   Total
..UnK↑XXK..         2             2           4
..PAK↑XXK..         0            12          12
Total               2            14          16
(d) P = 0.0499

TO        ..IK↑XXK..   ..KXXIIK↑..   Total
PAK↑           7            4          11
MeK↑           4           10          14
Total         11           14          25
(e) P = 0.1160

TO             ..KXXUnK↑..   ..KXXMeK↑..   Total
..UnK↑XXK..         0             2           2
..PAK↑XXK..         0             3           3
Total               0             5           5
(f) P = 1.0000

All       ..IK↑XXK..   ..KXXIIK↑..   Total
PAK↑          29           10          39
MeK↑           5           42          47
Total         34           52          86
(g) P < 0.0001

All            ..KXXUnK↑..   ..KXXMeK↑..   Total
..UnK↑XXK..         5             6          11
..PAK↑XXK..         3            20          23
Total               8            26          34
(h) P = 0.0789
Table A.3 Contingency tables for Fisher’s exact test (data taken from Fig. 3.23). Tables A.3a, A.3c, A.3e and A.3g, non-random modification patterns in KXXK; Tables A.3b, A.3d and A.3h, analysis of interaction between ε-polyamination and methylation (crosstalk). The association is considered to be statistically significant when two-tailed P-value is less than 0.05. I, N-terminal lysine in KXXK; II, C-terminal lysine in KXXK; PA, ε-polyaminated lysine (PTMs 232, 289 or 333); Me, di- or trimethylated lysine (PTMs 175 or 189); Un, unmodified lysine.
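The two-tailed P-values in Table A.3 can be checked from the four cell counts of each 2×2 table. The sketch below assumes the common definition of the two-tailed Fisher's exact test (summing the hypergeometric probabilities of all tables with the same margins that are no more probable than the observed one); the thesis does not state which software or convention was used, and the function name is illustrative. Integer binomial coefficients are compared directly so that ties are not affected by floating-point rounding.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of a table whose
        # top-left cell equals x, with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    as_extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return as_extreme / comb(n, col1)


if __name__ == "__main__":
    # Table A.3a (TP): rows PAK, MeK; columns N-terminal, C-terminal lysine.
    print(round(fisher_exact_two_tailed(11, 6, 1, 8), 4))   # 0.0145
    # Table A.3h (All): crosstalk analysis.
    print(round(fisher_exact_two_tailed(5, 6, 3, 20), 4))   # 0.0789
```

Under this definition the results for Tables A.3a and A.3h reproduce the P-values printed under those tables.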
B I B L I O G R A P H Y
[1] Frank E. Round, Richard M. Crawford, and David G. Mann. Diatoms: Biology and Morphology of the Genera. Cambridge University Press, 1990.
[2] C. B. Field. Primary Production of the Biosphere: Integrating Terrestrial and Oceanic Components. Science 281.5374 (1998), pp. 237–240.
[3] Virginia E. Armbrust. The life of diatoms in the world's oceans. Nature 459.7244 (2009), pp. 185–192.
[4] Shruti Malviya et al. Insights into global diatom distribution and diversity in the world’s ocean. Proceedings of the National Academy of Sciences 113.11 (2016), E1516–E1525.
[5] Jaap S. Sinninghe Damsté et al. The Rise of the Rhizosolenid Diatoms. Science 304.5670 (2004), pp. 584–587.
[6] Maxime C. Bridoux, Vadim V. Annenkov, Richard G. Keil, and Anitra E. Ingalls. Widespread distribution and molecular diversity of diatom frustule bound aliphatic long chain polyamines (LCPAs) in marine sediments. Organic Geochemistry 48 (2012), pp. 9–20.
[7] Sunil Kumar Shukla and Rahul Mohan. The Contribution of Diatoms to Worldwide Crude Oil Deposits. Cellular Origin, Life in Extreme Habitats and Astrobiology. Springer Netherlands, 2012, pp. 355–382.
[8] Anne-Sophie Benoiston, Federico M. Ibarbalz, Lucie Bittner, Lionel Guidi, Oliver Jahn, Stephanie Dutkiewicz, and Chris Bowler. The evolution of diatoms and their biogeochemical functions. Philosophical Transactions of the Royal Society B: Biological Sciences 372.1728 (2017).
[9] Wiebe H.C.F. Kooistra and Linda K. Medlin. Evolution of the Diatoms (Bacillariophyta). Molecular Phylogenetics and Evolution 6.3 (1996), pp. 391–407.
[10] Wiebe H.C.F. Kooistra, Mario De Stefano, David G. Mann, and K. Medlin. The Phylogeny of the Diatoms. Silicon Biomineralization. Springer Berlin Heidelberg, 2003, pp. 59–97.
[11] D. G. Mann and S. J. M. Droop. Biodiversity, biogeography and conservation of diatoms. Hydrobiologia 336.1 (1996), pp. 19–32.
[12] A. Montsant, K. Jabbari, U. Maheswari, and C. Bowler. Comparative genomics of the pennate diatom Phaeodactylum tricornutum. Plant Physiology 137.2 (2005), pp. 500–513.
[13] E. Virginia Armbrust et al. The Genome of the Diatom Thalassiosira Pseudonana: Ecology, Evolution, and Metabolism. Science 306.5693 (2004), pp. 79–86.
[14] Markus Lommer et al. Genome and low-iron response of an oceanic diatom adapted to chronic iron limitation. Genome Biology 13.7 (2012), R66.
[15] Jesse C. Traller et al. Genome and methylome of the oleaginous diatom Cyclotella cryptica reveal genetic flexibility toward a high lipid phenotype. Biotechnology for Biofuels 9.1 (2016), p. 258.
[16] Chris Bowler et al. The Phaeodactylum genome reveals the evolutionary history of diatom genomes. Nature 456 (2008), pp. 239–244.
[17] Andrew E Allen, Assaf Vardi, and Chris Bowler. An ecological and evolutionary context for integrated nitrogen metabolism and related signaling pathways in marine diatoms. Current Opinion in Plant Biology 9.3 (2006), pp. 264–273.
[18] Andrew E. Allen et al. Evolution and metabolic significance of the urea cycle in photosynthetic diatoms. Nature 473 (2011), pp. 203–207.
[19] © Christina Brodie. Geometry and Pattern in Nature 1: Exploring the shapes of diatom frustules with Johan Gielis’ Superformula. 2004. url: http://www.microscopy-uk.org.uk/mag/indexmag.html?http://www.microscopy-uk.org.uk/mag/artapr04/cbdiatom2.html (accessed on 06/16/2018).
[20] Christian E. Hamm, Rudolf Merkel, Olaf Springer, Piotr Jurkojc, Christian Maier, Kathrin Prechtel, and Victor Smetacek. Architecture and material properties of diatom shells provide effective mechanical protection. Nature 421.6925 (2003), pp. 841–843.
[21] Christian E. Hamm. The Evolution of Advanced Mechanical Defenses and Potential Technological Applications of Diatom Shells. Journal of Nanoscience and Nanotechnology 5.1 (2005), pp. 108–119.
[22] Zachary H. Aitken, Shi Luo, Stephanie N. Reynolds, Christian Thaulow, and Julia R. Greer. Microstructure provides insights into evolutionary design and resilience of Coscinodiscus sp. frustule. Proceedings of the National Academy of Sciences 113.8 (2016), pp. 2017–2022.
[23] John A. Raven. The Transport and Function of Silicon In Plants. Biological Reviews 58.2 (1983), pp. 179–207.
[24] Katharine R. Hendry, Alan O. Marron, Flora Vincent, Daniel J. Conley, Marion Gehlen, Federico M. Ibarbalz, Bernard Quéguiner, and Chris Bowler. Competition between Silicifiers and Non-silicifiers in the Past and Present Ocean and Its Evolutionary Impacts. Frontiers in Marine Science 5 (2018).
[25] Jane Bradbury. Nature's Nanotechnologists: Unveiling the Secrets of Diatoms. PLoS Biology 2.10 (2004), pp. 1512–1515.
[26] Michael Gross. The mysteries of the diatoms. Current Biology 22.15 (2012), pp. 581–585.
[27] E. Epstein. The anomaly of silicon in plant biology. Proceedings of the National Academy of Sciences 91.1 (1994), pp. 11–17.
[28] Emanuel Epstein. SILICON. Annual Review of Plant Physiology and Plant Molecular Biology 50.1 (1999), pp. 641–664.
[29] Richard Gordon and Ryan W. Drum. The Chemical Basis of Diatom Morphogenesis. Mechanical Engineering of the Cytoskeleton in Developmental Biology. Ed. by Richard Gordon. Vol. 150. International Review of Cytology Supplement C. Academic Press, 1994, pp. 243–372.
[30] Cheryl Wong Po Foo, Jia Huang, and David L. Kaplan. Lessons from seashells: silica mineralization via protein templating. Trends in Biotechnology 22.11 (2004), pp. 577–585.
[31] Manfred Sumper and Eike Brunner. Learning from Diatoms: Nature’s Tools for the Production of Nanostructured Silica. Advanced Functional Materials 16.1 (2006), pp. 17–26.
[32] Nils Kröger and Nicole Poulsen. Biochemistry and Molecular Genetics of Silica Biomineralization in Diatoms. Handbook of Biomineralization. Wiley-VCH Verlag GmbH, 2008. Chap. 3, pp. 43–58.
[33] Nils Kröger and Nicole Poulsen. Diatoms—From Cell Wall Biogenesis to Nanotechnology. Annual Review of Genetics 42.1 (2008), pp. 83–107.
[34] Nils Kröger and Kenneth H. Sandhage. From Diatom Biomolecules to Bioinspired Syntheses of Silica- and Titania-Based Materials. MRS Bulletin 35.2 (2010), pp. 122–126.
[35] Richard Gordon, Dusan Losic, Mary Ann Tiffany, Stephen S. Nagy, and Frithjof A. S. Sterrenburg. The Glass Menagerie: diatoms for novel applications in nanotechnology. Trends in Biotechnology 27.2 (XXXX), pp. 116–127.
[36] Gloria Bueno, Oscar Deniz, Anibal Pedraza, Jesús Ruiz-Santaquiteria, Jesús Salido, Gabriel Cristóbal, María Borrego-Ramos, and Saúl Blanco. Automated Diatom Classification (Part A): Handcrafted Feature Approaches. Applied Sciences 7.12 (2017), p. 753.
[37] Ernst Heinrich Philipp August Haeckel. Kunstformen der Natur. Leipzig und Wien: Verlag des Bibliographischen Instituts, 1904.
[38] Victor A. Chepurnov, David G. Mann, Koen Sabbe, and Wim Vyverman. Experimental Studies on Sexual Reproduction in Diatoms. International Review of Cytology. Elsevier, 2004, pp. 91–154.
[39] P. Treguer, D. M. Nelson, A. J. Van Bennekom, D. J. DeMaster, A. Leynaert, and B. Queguiner. The Silica Balance in the World Ocean: A Reestimate. Science 268.5209 (1995), pp. 375–379.
[40] Mark Hildebrand, Benjamin E. Volcani, Walter Gassmann, and Julian I. Schroeder. A gene family of silicon transporters. Nature 385 (1997), pp. 688–689.
[41] Alan O. Marron, Sarah Ratcliffe, Glen L. Wheeler, Raymond E. Goldstein, Nicole King, Fabrice Not, Colomban de Vargas, and Daniel J. Richter. The Evolution of Silicon Transport in Eukaryotes. Molecular Biology and Evolution 33.12 (2016), pp. 3226–3248.
[42] Colleen A. Durkin, Julie A. Koester, Sara J. Bender, and E. Virginia Armbrust. The evolution of silicon transporters in diatoms. Journal of Phycology 52.5 (2016), pp. 716–731.
[43] Michael J. Knight, Laura Senior, Bethany Nancolas, Sarah Ratcliffe, and Paul Curnow. Direct evidence of the molecular basis for biological silicon transport. Nature Communications 7 (2016), pp. 1–11.
[44] Grazyna M. Durak, Alison R. Taylor, Charlotte E. Walker, Ian Probert, Colomban de Vargas, Stephane Audic, Declan Schroeder, Colin Brownlee, and Glen L. Wheeler. A role for diatom-like silicon transporters in calcifying coccolithophores. Nature Communications 7 (2016), p. 10543.
[45] Tracy L. Simpson and Benjamin E. Volcani. Silicon and Siliceous Structures in Biological Systems. Springer New York, 1981.
[46] Daniel Otzen. The Role of Proteins in Biosilicification. Scientifica 2012.867562 (2012), p. 22.
[47] Carolin C. Lechner and Christian F. W. Becker. A sequence-function analysis of the silica precipitating silaffin R5 peptide. Journal of Peptide Science 20.2 (2014), pp. 152–158.
[48] H. Ehrlich and A. Witkowski. Biomineralization in Diatoms: The Organic Templates. Biologically-Inspired Systems. Springer Netherlands, 2015, pp. 39–58.
[49] Mark Hildebrand and Sarah J.L. Lerch. Diatom silica biomineralization: Parallel development of approaches and understanding. Seminars in Cell & Developmental Biology 46.Supplement C (2015), pp. 27–35.
[50] Mark Hildebrand, Sarah J. L. Lerch, and Roshan P. Shrestha. Understanding Diatom Cell Wall Silicification—Moving Forward. Frontiers in Marine Science 5 (2018).
[51] Tadashi Nakajima and Benjamin E. Volcani. 3,4-Dihydroxyproline: A New Amino Acid in Diatom Cell Walls. Science 164.3886 (1969), pp. 1400–1401.
[52] Tadashi Nakajima and Benjamin E. Volcani. ε-N-trimethyl-L-δ-hydroxylysine phosphate and its non-phosphorylated compound in diatom cell walls. Biochemical and Biophysical Research Communications 39.1 (1970), pp. 28–33.
[53] Nils Kröger, Christian Bergsdorf, and Manfred Sumper. A new calcium binding glycoprotein family constitutes a major diatom cell wall component. The EMBO Journal 13.19 (1994), pp. 4676–4683.
[54] Nils Kröger, Christian Bergsdorf, and Manfred Sumper. Frustulins: Domain Conservation in a Protein Family Associated with Diatom Cell Walls. European Journal of Biochemistry 239.2 (1996), pp. 259–264.
[55] Nils Kröger and Richard Wetherbee. Pleuralins are Involved in Theca Differentiation in the Diatom Cylindrotheca fusiformis. Protist 151.3 (2000), pp. 263–273.
[56] Willem H. van de Poll, Engel G. Vrieling, and Winfried W. C. Gieskes. Location and Expression of Frustulins in the Pennate Diatoms Cylindrotheca fusiformis, Navicula pelliculosa, and Navicula salinarum (Bacillariophyceae). Journal of Phycology 35.5 (1999), pp. 1044–1053.
[57] Nils Kröger, Rainer Deutzmann, Christian Bergsdorf, and Manfred Sumper. Species-specific Polyamines from Diatoms Control Silica Morphology. Proceedings of the National Academy of Sciences 97.26 (2000), pp. 14133–14138.
[58] Andrew J. Mort and Derek T.A. Lamport. Anhydrous hydrogen fluoride deglycosylates glycoproteins. Analytical Biochemistry 82.2 (1977), pp. 289–309.
[59] A. J. Mort, P. Komalavilas, G. L. Rorrer, and D. T. A. Lamport. Anhydrous Hydrogen Fluoride and Cell-Wall Analysis. Plant Fibers. Ed. by Hans-Ferdinand Linskens and John F. Jackson. Berlin, Heidelberg: Springer Berlin Heidelberg, 1989, pp. 37–69.
[60] Manfred Sumper, Eike Brunner, and Gerhard Lehmann. Biomineralization in diatoms: Characterization of novel polyamines associated with silica. FEBS Letters 579.17 (2005), pp. 3765–3769.
[61] Manfred Sumper and Gerhard Lehmann. Silica Pattern Formation in Diatoms: Species-Specific Polyamine Biosynthesis. ChemBioChem 7.9 (2006), pp. 1419–1427.
[62] Manfred Sumper and Nils Kröger. Silica formation in diatoms: the function of long-chain polyamines and silaffins. J. Mater. Chem. 14 (14 2004), pp. 2059–2065.
[63] Manfred Sumper. Biomimetic patterning of silica by long-chain polyamines. Angewandte Chemie International Edition 43.17 (2004), pp. 2251–2254.
[64] Nils Kröger, Rainer Deutzmann, and Manfred Sumper. Polycationic Peptides from Diatom Biosilica That Direct Silica Nanosphere Formation. Science 286.5442 (1999), pp. 1129–1132.
[65] Nils Kröger, Rainer Deutzmann, and Manfred Sumper. Silica-precipitating Peptides from Diatoms: the chemical structure of silaffin-1A from Cylindrotheca fusiformis. Journal of Biological Chemistry 276.28 (2001), pp. 26066–26070.
[66] Nils Kröger, Sonja Lorenz, Eike Brunner, and Manfred Sumper. Self-Assembly of Highly Phosphorylated Silaffins and Their Function in Biosilica Morphogenesis. Science 298.5593 (2002), pp. 584–586.
[67] Nicole Poulsen and Nils Kröger. Silica Morphogenesis by Alternative Processing of Silaffins in the Diatom Thalassiosira pseudonana. Journal of Biological Chemistry 279.41 (2004), pp. 42993–42999.
[68] Manfred Sumper, Robert Hett, Gerhard Lehmann, and Stephan Wenzl. A Code for Lysine Modifications of a Silica Biomineralizing Silaffin Protein. Angewandte Chemie 119.44 (2007), pp. 8557–8560.
[69] Manfred Sumper and Eike Brunner. Silica Biomineralisation in Diatoms: The Model Organism Thalassiosira pseudonana. ChemBioChem 9.8 (2008), pp. 1187–1194.
[70] H. Nielsen, J. Engelbrecht, S. Brunak, and G. von Heijne. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Engineering Design and Selection 10.1 (1997), pp. 1–6.
[71] André Scheffel, Nicole Poulsen, Samuel Shian, and Nils Kröger. Nanopatterned protein microrings from a diatom that direct silica morphogenesis. Proceedings of the National Academy of Sciences 108.8 (2011), pp. 3175–3180.
[72] Stephan Wenzl, Robert Hett, Patrick Richthammer, and Manfred Sumper. Silacidins: Highly Acidic Phosphopeptides from Diatom Shells Assist in Silica Precipitation In Vitro. Angewandte Chemie International Edition 47.9 (2008), pp. 1729–1732.
[73] Patrick Richthammer, Mandy Börmel, Eike Brunner, and Karl-Heinz van Pée. Biomineralization in Diatoms: The Role of Silacidins. ChemBioChem 12.9 (2011), pp. 1362–1366.
[74] Amy R Kirkham et al. A role for the cell-wall protein silacidin in cell size of the diatom Thalassiosira pseudonana. The ISME Journal 11.11 (2017), pp. 2452–2464.
[75] Christian Zerfaß, Garry W. Buchko, Wendy J. Shaw, Stephan Hobe, and Harald Paulsen. Secondary structure and dynamics study of the intrinsically disordered silica-mineralizing peptide P5 S3 during silicic acid condensation and silica decondensation. Proteins: Structure, Function, and Bioinformatics 85.11 (2017), pp. 2111–2126.
[76] Manfred Sumper. A Phase Separation Model for the Nanopatterning of Diatom Biosilica. Science 295.5564 (2002), pp. 2430–2433.
[77] Nils Kröger and Manfred Sumper. The Biochemistry of Silica Formation in Diatoms. Ed. by Edmund Bäuerlein. 2nd edition. Wiley-VCH, Weinheim, 2004. Chap. 9, pp. 137–158.
[78] Manfred Sumper, Sonja Lorenz, and Eike Brunner. Biomimetic Control of Size in the Polyamine-Directed Formation of Silica Nanospheres. Angewandte Chemie International Edition 42.42 (2003), pp. 5192–5195.
[79] Ruedi Aebersold and Matthias Mann. Mass spectrometry-based proteomics. Nature 422.6928 (2003), pp. 198–207.
[80] Matthias Mann and Ole N. Jensen. Proteomic analysis of post-translational modifications. Nature Biotechnology 21.3 (2003), pp. 255–261.
[81] Ole N. Jensen. Interpreting the protein language using proteomics. Nature Reviews Molecular Cell Biology 7 (2006), pp. 391–403.
[82] Yingming Zhao and Ole N. Jensen. Modification-specific proteomics: Strategies for characterization of post-translational modifications using enrichment techniques. PROTEOMICS 9.20 (2009), pp. 4632–4641.
[83] Nicole Poulsen, Manfred Sumper, and Nils Kröger. Biosilica formation in diatoms: Characterization of native silaffin-2 and its role in silica morphogenesis. Proceedings of the National Academy of Sciences 100.21 (2003), pp. 12075–12080.
[84] Albert S.B. Edge, Connie R. Faltynek, Liselotte Hof, Leo E. Reichert, and Peter Weber. Deglycosylation of glycoproteins by trifluoromethanesulfonic acid. Analytical Biochemistry 118.1 (1981), pp. 131–137.
[85] Albert S. B. Edge. Deglycosylation of glycoproteins with trifluoromethanesulphonic acid: elucidation of molecular structure and function. Biochemical Journal 376.2 (2003), pp. 339–350.
[86] Kevin P. Campbell, David H. MacLennan, and Annelise O. Jorgensen. Staining of the Ca2+-binding proteins, calsequestrin, calmodulin, troponin C, and S-100, with the cationic carbocyanine dye “Stains-all.” Journal of Biological Chemistry 258.18 (1983), pp. 11267–73.
[87] Jody M. Myers, Arthur Veis, Boris Sabsay, and A.P. Wheeler. A Method for Enhancing the Sensitivity and Stability of Stains-All for Phosphoproteins Separated in Sodium Dodecyl Sulfate-Polyacrylamide Gels. Analytical Biochemistry 240.2 (1996), pp. 300–302.
[88] Vonda Sheppard, Nicole Poulsen, and Nils Kröger. Characterization of an Endoplasmic Reticulum-associated Silaffin Kinase from the Diatom Thalassiosira pseudonana. Journal of Biological Chemistry 285.2 (2010), pp. 1166–1176.
[89] John R. Griffiths and Richard D. Unwin, eds. Analysis of Protein Post-Translational Modifications by Mass Spectrometry. John Wiley & Sons, Inc., 2016.
[90] Alexander Kotzsch, Damian Pawolski, Alexander Milentyev, Anna Shevchenko, André Scheffel, Nicole Poulsen, Andrej Shevchenko, and Nils Kröger. Biochemical Composition and Assembly of Biosilica-associated Insoluble Organic Matrices from the Diatom Thalassiosira pseudonana. Journal of Biological Chemistry 291.10 (2016), pp. 4982–4997.
[91] Satoko Matsunaga, Ryuichi Sakai, Mitsuru Jimbo, and Hisao Kamiya. Long-Chain Polyamines (LCPAs) from Marine Sponge: Possible Implication in Spicule Formation. ChemBioChem 8.14 (2007), pp. 1729–1735.
[92] Satoko Matsunaga, Mitsuru Jimbo, Martin B. Gill, L. Leanne Lash-Van Wyhe, Michio Murata, Ken'ichi Nonomura, Geoffrey T. Swanson, and Ryuichi Sakai. Isolation, Amino Acid Sequence and Biological Activities of Novel Long-Chain Polyamine-Associated Peptide Toxins from the Sponge Axinyssa aculeata. ChemBioChem 12.14 (2011), pp. 2191–2200.
[93] Myung Hee Park. The Post-Translational Synthesis of a Polyamine-Derived Amino Acid, Hypusine, in the Eukaryotic Translation Initiation Factor 5A (eIF5A). The Journal of Biochemistry 139.2 (2006), pp. 161–169.
[94] E. C. Wolff, K. R. Kang, Y. S. Kim, and M. H. Park. Posttranslational synthesis of hypusine: evolutionary progression and specificity of the hypusine modification. Amino Acids 33.2 (2007), pp. 341–350.
[95] Anthony E. Pegg and Jr. Robert A. Casero, eds. Polyamines. Humana Press, 2011.
[96] Tomonobu Kusano and Hideyuki Suzuki, eds. Polyamines. Springer Japan, 2015.
[97] Jürgen M. Knott, Piero Römer, and Manfred Sumper. Putative spermine synthases from Thalassiosira pseudonana and Arabidopsis thaliana synthesize thermospermine rather than spermine. FEBS Letters 581.16 (2007), pp. 3081–3086.
[98] Piero Römer, A. Faltermeier, V. Mertins, T. Gedrange, R. Mai, and P. Proff. Investigations about N-aminopropyl transferases probably involved in biomineralization. J. Physiol. Pharmacol. 59 Suppl 5 (2008), pp. 27–37.
[99] Anthony J. Michael. Molecular machines encoded by bacterially-derived multi-domain gene fusions that potentially synthesize, N-methylate and transfer long chain polyamines in diatoms. FEBS Letters 585.17 (2011), pp. 2627–2634.
[100] Paul Lasko. Tudor Domain. Current Biology 20.16 (XXXX), R666–R667.
[101] Carolin C. Lechner and Christian F. W. Becker.
Silaffins in Silica Biomineralization and Biomimetic
Silica Precipitation. Marine Drugs 13.8 (2015),
pp. 5297–5333.
[102] Stephan Wenzl, Rainer Deutzmann, Robert Hett,
Eduard Hochmuth, and Manfred Sumper. Quater-
nary Ammonium Groups in Silica-Associated Pro-
teins. Angewandte Chemie International Edition 43.44
(2004), pp. 5933–5936.
[103] Luciano G. Frigeri, Timothy R. Radabaugh, Paul
A. Haynes, and Mark Hildebrand. Identification of
Proteins from a Cell Wall Fraction of the Diatom Tha-
lassiosira pseudonana : Insights into Silica Structure
Formation. Molecular & Cellular Proteomics 5.1 (2006),
pp. 182–193.
[104] Thomas Mock et al. Whole-genome expression profil-
ing of the marine diatom Thalassiosira pseudonana
identifies genes involved in silicon bioprocesses. Pro-
ceedings of the National Academy of Sciences 105.5 (2008),
pp. 1579–1584.
[105] Ziyad Tariq Muhseen, Qian Xiong, Zhuo Chen,
and Feng Ge. Proteomics studies on stress responses
in diatoms. PROTEOMICS 15.23-24 (2015), pp. 3943–
3953.
[106] Tore Brembu, Matilde Skogen Chauton, Per Winge,
Atle M. Bones, and Olav Vadstein. Dynamic re-
sponses to silicon in Thalasiossira pseudonana -
Identification, characterisation and classification of
signature genes and their corresponding protein mo-
tifs. Scientific Reports 7.1 (2017), p. 4865.
[107] Johan Stenflo, Per Fernlund, William Egan, and
Peter Roepstorff. Vitamin K Dependent Modifica-
tions of Glutamic Acid Residues in Prothrombin. Pro-
ceedings of the National Academy of Sciences 71.7 (1974),
pp. 2730–2733.
[108] Annie Moradian, Anastasia Kalli, Michael J.
Sweredoski, and Sonja Hess. The top-down, middle-
down, and bottom-up mass spectrometry approaches
for characterization of histone variants and their
post-translational modifications. PROTEOMICS 14.4-
5 (2013), pp. 489–497.
[109] Yaoyang Zhang, Bryan R. Fonslow, Bing Shan,
Moon-Chang Baek, and John R. Yates. Protein Anal-
ysis by Shotgun/Bottom-up Proteomics. Chemical Re-
views 113.4 (2013), pp. 2343–2394.
[110] Andrej Shevchenko, Matthias Wilm, Ole Vorm,
and Matthias Mann. Mass Spectrometric Sequenc-
ing of Proteins from Silver-Stained Polyacrylamide
Gels. Analytical Chemistry 68.5 (1996), pp. 850–858.
[111] Andrej Shevchenko, Henrik Tomas, Jan Havliš, Jes-
per V. Olsen, and Matthias Mann. In-gel digestion
for mass spectrometric characterization of proteins
and proteomes. Nat. Protocols 1.6 (2007), pp. 2856–
2860.
[112] B. T. Chait. Mass Spectrometry: Bottom-Up or Top-
Down? Science 314.5796 (2006), pp. 65–66.
[113] Michael Fountoulakis and Hans-Werner Lahm.
Hydrolysis and amino acid composition analysis
of proteins. Journal of Chromatography A 826.2 (1998),
pp. 109–134.
[114] Shane M. Rutherfurd and G. Sarwar Gilani.
Amino Acid Analysis. Current Protocols in Protein Sci-
ence. John Wiley & Sons, Inc., 2001.
[115] Merja R. Häkkinen, Tuomo A. Keinänen, Jouko
Vepsäläinen, Alex R. Khomutov, Leena Alhonen,
Juhani Jänne, and Seppo Auriola. Analysis of
underivatized polyamines by reversed phase liq-
uid chromatography with electrospray tandem mass
spectrometry. Journal of Pharmaceutical and Biomedical
Analysis 45.4 (2007), pp. 625–634.
[116] Gottfried J. Feistner. Profiling of basic amino acids
and polyamines in microbial culture supernatants by
electrospray mass spectrometry. Biological Mass Spec-
trometry 23.12 (1994), pp. 784–792.
[117] P. Fürst, L. Pollack, T.A. Graser, H. Godel, and P.
Stehle. Appraisal of four pre-column derivatization
methods for the high-performance liquid chromato-
graphic determination of free amino acids in biolog-
ical materials. Journal of Chromatography A 499 (1990),
pp. 557–569.
[118] Durk Fekkes. State-of-the-art of high-performance
liquid chromatographic analysis of amino acids in
physiological samples. Journal of Chromatography B:
Biomedical Sciences and Applications 682.1 (1996), pp. 3–
22.
[119] G. McClung and W. T. Frankenberger. Comparison
of Reverse-Phase High-Performance Liquid Chro-
matographic Methods for Precolumn-Derivatized
Amino Acids. Journal of Liquid Chromatography 11.3
(1988), pp. 613–646.
[120] Karin Gartenmann and Sunil Kochhar. Short-
Chain Peptide Analysis by High-Performance Liq-
uid Chromatography Coupled to Electrospray Ion-
ization Mass Spectrometer after Derivatization with
9-Fluorenylmethyl Chloroformate. Journal of Agricul-
tural and Food Chemistry 47.12 (1999), pp. 5068–5071.
[121] Hans M.H. van Eijk, Dennis R. Rooyakkers, Peter B.
Soeters, and Nicolaas E.P. Deutz. Determination of
Amino Acid Isotope Enrichment Using Liquid Chro-
matography–Mass Spectrometry. Analytical Biochem-
istry 271.1 (1999), pp. 8–17.
[122] Steven A. Cohen. Amino Acid Analysis Using
Precolumn Derivatization with 6-Aminoquinolyl-N-
Hydroxysuccinimidyl Carbamate. Amino Acid Anal-
ysis Protocols. Ed. by Catherine Cooper, Nicolle
Packer, and Keith Williams. Totowa, NJ: Humana
Press, 2000, pp. 39–47.
[123] Y Mengerink, D Kutlán, F Tóth, A Csámpai,
and I Molnár-Perl. Advances in the evaluation
of the stability and characteristics of the amino
acid and amine derivatives obtained with the o-
phthaldialdehyde/3-mercaptopropionic acid and o-
phthaldialdehyde/N-acetyl-l-cysteine reagents. Jour-
nal of Chromatography A 949.1-2 (2002), pp. 99–124.
[124] Roland J.W. Meesters, Robert R. Wolfe,
and Nicolaas E.P. Deutz. Application of liq-
uid chromatography-tandem mass spectrometry
(LC–MS/MS) for the analysis of stable isotope en-
richments of phenylalanine and tyrosine. Journal of
Chromatography B 877.1-2 (2009), pp. 43–49.
[125] © IUPAC. RECOMMENDATIONS. 2018. url:
https://iupac.org/what-we-do/recommendations/ (accessed on 02/13/2018).
[126] Thomas Weiss, Günther Bernhardt, Armin
Buschauer, Karl-Walter Jauch, and Hubert
Zirngibl. High-Resolution Reversed-Phase High-
Performance Liquid Chromatography Analysis of
Polyamines and Their Monoacetyl Conjugates by
Fluorescence Detection after Derivatization with N-
Hydroxysuccinimidyl 6-Quinolinyl Carbamate. Ana-
lytical Biochemistry 247.2 (1997), pp. 294–304.
[127] Steven A. Cohen and Dennis P. Michaud. Syn-
thesis of a Fluorescent Derivatizing Reagent, 6-
Aminoquinolyl-N-Hydroxysuccinimidyl Carbamate,
and Its Application for the Analysis of Hy-
drolysate Amino Acids via High-Performance Liquid
Chromatography. Analytical Biochemistry 211.2 (1993),
pp. 279–287.
[128] Ji Liu Hong. Determination of amino acids by
precolumn derivatization with 6-aminoquinolyl-N-
hydroxysuccinimidyl carbamate and high perfor-
mance liquid chromatography with ultraviolet detec-
tion. Journal of Chromatography A 670.1-2 (1994), pp. 59–
66.
[129] Thomas S. Weiss. HPLC of Biogenic Amines as 6-
Aminoquinolyl-N-hydroxysuccinimidyl Derivatives.
Journal of Chromatography Library 70 (2005), pp. 502–523.
[130] Jenny M. Armenta, Diego F. Cortes, John M. Pis-
ciotta, Joel L. Shuman, Kenneth Blakeslee, Do-
minique Rasoloson, Oluwatosin Ogunbiyi, David
J. Sullivan, and Vladimir Shulaev. Sensitive and
Rapid Method for Amino Acid Quantitation in
Malaria Biological Samples Using AccQ•Tag Ultra
Performance Liquid Chromatography-Electrospray
Ionization-MS/MS with Multiple Reaction Monitor-
ing. Analytical Chemistry 82.2 (2010), pp. 548–558.
[131] Carolina Salazar, Jenny M. Armenta, and
Vladimir Shulaev. An UPLC-ESI-MS/MS Assay Us-
ing 6-Aminoquinolyl-N-Hydroxysuccinimidyl Car-
bamate Derivatization for Targeted Amino Acid
Analysis: Application to Screening of Arabidopsis
thaliana Mutants. Metabolites 2.3 (2012), pp. 398–428.
[132] Ran Liu, Kaishun Bi, Ying Jia, Qian Wang, Ran Yin,
and Qing Li. Determination of polyamines in human
plasma by high-performance liquid chromatography
coupled with Q-TOF mass spectrometry. Journal of
Mass Spectrometry 47.10 (2012), pp. 1341–1346.
[133] Christoph Magnes, Alexander Fauland, Edgar
Gander, Sophie Narath, Maria Ratzer, Tobias
Eisenberg, Frank Madeo, Thomas Pieber, and
Frank Sinner. Polyamines in biological samples:
Rapid and robust quantification by solid-phase
extraction online-coupled to liquid chromatogra-
phy–tandem mass spectrometry. Journal of Chromatog-
raphy A 1331 (2014), pp. 44–51.
[134] Hidehiro Nakamura, Sachise Karakawa, Akiko
Watanabe, Yasuko Kawamata, Tomomi Kuwahara,
Kazutaka Shimbo, and Ryosei Sakai. Measurement
of 15N enrichment of glutamine and urea cycle
amino acids derivatized with 6-aminoquinolyl-N-
hydroxysuccinimidyl carbamate using liquid chro-
matography–tandem quadrupole mass spectrometry.
Analytical Biochemistry 476 (2015), pp. 67–77.
[135] JB Fenn, M Mann, CK Meng, SF Wong, and CM
Whitehouse. Electrospray ionization for mass spec-
trometry of large biomolecules. Science 246.4926
(1989), pp. 64–71.
[136] Jesper V. Olsen et al. A Dual Pressure Linear Ion
Trap Orbitrap Instrument with Very High Sequenc-
ing Speed. Molecular & Cellular Proteomics 8.12 (2009),
pp. 2759–2769.
[137] Jae C. Schwartz, Michael W. Senko, and John E. P.
Syka. A two-dimensional quadrupole ion trap mass
spectrometer. Journal of the American Society for Mass
Spectrometry 13.6 (2002), pp. 659–669.
[138] Chien-Wen Hung, Andreas Schlosser, Junhua Wei,
and Wolf D. Lehmann. Collision-induced reporter
fragmentations for identification of covalently mod-
ified peptides. Analytical and Bioanalytical Chemistry
389.4 (2007), pp. 1003–1016.
[139] P. Roepstorff and J. Fohlman. Proposal for a com-
mon nomenclature for sequence ions in mass spectra
of peptides. Biological Mass Spectrometry 11.11 (1984),
p. 601.
[140] © Matrix Science. Peptide fragmentation. 2016. url:
http://www.matrixscience.com/help/fragmentation_help.html (accessed on 04/30/2018).
[141] Hanno Steen and Matthias Mann. The ABC’s (and
XYZ’s) of peptide sequencing. Nature Reviews Molecu-
lar Cell Biology 5 (2004), pp. 699–711.
[142] Andreas Schlosser and Wolf D. Lehmann. Five-
membered ring formation in unimolecular reactions
of peptides: a key structural element controlling low-
energy collision-induced dissociation of peptides.
Journal of Mass Spectrometry 35.12 (2000), pp. 1382–
1390.
[143] Eric S Witze, William M Old, Katheryn A
Resing, and Natalie G Ahn. Mapping protein post-
translational modifications with mass spectrometry.
Nature Methods 4.10 (2007), pp. 798–806.
[144] Erik Ahrné, Markus Müller, and Frederique
Lisacek. Unrestricted identification of modified
proteins using MS/MS. PROTEOMICS 10.4 (2010),
pp. 671–686.
[145] Rovshan G Sadygov, Daniel Cociorva, and John R
Yates. Large-scale database searching using tandem
mass spectra: Looking up the answer in the back of
the book. Nature Methods 1.3 (2004), pp. 195–202.
[146] Alexey I. Nesvizhskii. Protein Identification by Tan-
dem Mass Spectrometry and Sequence Database
Searching. Mass Spectrometry Data Analysis in Proteo-
mics. Ed. by Rune Matthiesen. Totowa, NJ: Humana
Press, 2007, pp. 87–119.
[147] Jens Allmer. Algorithms for the de novo sequencing
of peptides from tandem mass spectra. Expert Review
of Proteomics 8.5 (2011), pp. 645–657.
[148] Matthias Mann, Chin Kai Meng, and John B. Fenn.
Interpreting mass spectra of multiply charged ions.
Analytical Chemistry 61.15 (1989), pp. 1702–1708.
[149] Marc Gentzel, Thomas Köcher, Saravanan Pon-
nusamy, and Matthias Wilm. Preprocessing of
tandem mass spectrometric data to support auto-
matic protein identification. PROTEOMICS 3.8 (2003),
pp. 1597–1610.
[150] Nedim Mujezinovic, Günther Raidl, James R. A.
Hutchins, Jan-Michael Peters, Karl Mechtler,
and Frank Eisenhaber. Cleaning of raw peptide
MS/MS spectra: Improved protein identification fol-
lowing deconvolution of multiply charged peaks, iso-
tope clusters, and removal of background noise. PRO-
TEOMICS 6.19 (2006), pp. 5117–5131.
[151] Ingvar Eidhammer, Kristian Flikka, Lennart
Martens, and Svein-Ole Mikalsen. Tandem MS or
MS/MS Analysis. Computational Methods for Mass Spec-
trometry Proteomics. John Wiley & Sons, Ltd, 2007.
Chap. 8, pp. 119–140.
[152] © Matrix Science. Modifications. 2016. url: http://
www.matrixscience.com/help/pt_mods_help.html
(accessed on 02/07/2018).
[153] Thomas Burger. Gentle Introduction to the Statisti-
cal Foundations of False Discovery Rate in Quantita-
tive Proteomics. Journal of Proteome Research 17.1 (2017),
pp. 12–22.
[154] Mikhail M. Savitski, Simone Lemeer, Markus
Boesche, Manja Lang, Toby Mathieson, Marcus
Bantscheff, and Bernhard Kuster. Confident Phos-
phorylation Site Localization Using the Mascot Delta
Score. Molecular & Cellular Proteomics 10.2 (2010),
p. M110.003830.
[155] Jesper V. Olsen, Blagoy Blagoev, Florian Gnad,
Boris Macek, Chanchal Kumar, Peter Mortensen,
and Matthias Mann. Global, In Vivo, and Site-
Specific Phosphorylation Dynamics in Signaling
Networks. Cell 127.3 (2006), pp. 635–648.
[156] Andrew J. Alverson, Bánk Beszteri, Matthew L.
Julius, and Edward C. Theriot. The model marine
diatom Thalassiosira pseudonana likely descended
from a freshwater ancestor in the genus Cyclotella.
BMC Evolutionary Biology 11.1 (2011), p. 125.
[157] Andrew J. Alverson, Robert K. Jansen, and Edward
C. Theriot. Bridging the Rubicon: Phylogenetic anal-
ysis reveals repeated colonizations of marine and
fresh waters by thalassiosiroid diatoms. Molecular
Phylogenetics and Evolution 45.1 (2007), pp. 193–210.
[158] William M. McGee and Scott A. McLuckey. The or-
nithine effect in peptide cation dissociation. Journal
of Mass Spectrometry 48.7 (2013), pp. 856–861.
[159] Kangling Zhang, Peter M. Yau, Bhaskar Chan-
drasekhar, Ron New, Richard Kondrat, Brian
S. Imai, and Morton E. Bradbury. Differentia-
tion between peptides containing acetylated or tri-
methylated lysines by mass spectrometry: An ap-
plication for determining lysine 9 acetylation and
methylation of histone H3. PROTEOMICS 4.1 (2004),
pp. 1–10.
[160] Timothy A. Couttas, Mark J. Raftery, Giulia
Bernardini, and Marc R. Wilkins. Immonium Ion
Scanning for the Discovery of Post-Translational
Modifications and Its Application to Histones. Jour-
nal of Proteome Research 7.7 (2008), pp. 2632–2641.
[161] Morten B. Trelle and Ole N. Jensen. Utility
of Immonium Ions for Assignment of epsilon-
N-Acetyllysine-Containing Peptides by Tandem
Mass Spectrometry. Analytical Chemistry 80.9 (2008),
pp. 3422–3430.
[162] Olaf Kühl, ed. Phosphorus-31 NMR Spectroscopy:
A Concise Introduction for the Synthetic Organic
and Organometallic Chemist. Springer Berlin Heidel-
berg, 2009.
[163] Eike Brunner, Patrick Richthammer, Hermann
Ehrlich, Silvia Paasch, Paul Simon, Susanne Ue-
berlein, and Karl-Heinz van Pée. Chitin-Based
Organic Networks: An Integral Part of Cell Wall
Biosilica in the Diatom Thalassiosira pseudonana.
Angewandte Chemie International Edition 48.51 (2009),
pp. 9724–9727.
[164] Benoit Tesson and Mark Hildebrand. Characteri-
zation and Localization of Insoluble Organic Matri-
ces Associated with Diatom Cell Walls: Insight into
Their Roles during Cell Wall Formation. PLOS ONE
8.4 (2013), pp. 1–13.
[165] Aubrey K. Davis, Mark Hildebrand, and Brian
Palenik. A Stress-Induced Protein Associated With
The Girdle Band Region Of The Diatom Thalas-
siosira Pseudonana (Bacillariophyta). Journal of Phy-
cology 41.3 (2005), pp. 577–589.
[166] Alexander Kotzsch, Philip Gröger, Damian Pa-
wolski, Paul H. H. Bomans, Nico A. J. M. Som-
merdijk, Michael Schlierf, and Nils Kröger.
Silicanin-1 is a conserved diatom membrane protein
involved in silica biomineralization. BMC Biology 15.1
(2017), p. 65.
[167] A. J. Michael. Biosynthesis of polyamines and
polyamine-containing molecules. Biochemical Journal
473.15 (2016), pp. 2315–2329.
[168] Danielle L. Swaney, Craig D. Wenger, and Joshua J.
Coon. Value of Using Multiple Proteases for Large-
Scale Mass Spectrometry-Based Proteomics. Journal
of Proteome Research 9.3 (2010), pp. 1323–1329.
[169] M. J. MacCoss et al. Shotgun identification of pro-
tein modifications from protein complexes and lens
tissue. Proceedings of the National Academy of Sciences
99.12 (2002), pp. 7900–7905.
[170] Tao Xu, Catherine C L Wong, Anna Kashina, and
John R Yates. Identification of N-terminally arginy-
lated proteins and peptides by mass spectrometry.
Nature Protocols 4.3 (2009), pp. 325–332.
[171] Mukesh Kumar, Shai R. Joseph, Martina Augsburg,
Aliona Bogdanova, David Drechsel, Nadine L. Vas-
tenhouw, Frank Buchholz, Marc Gentzel, and An-
drej Shevchenko. MS Western, a Method of Multi-
plexed Absolute Protein Quantification is a Practical
Alternative to Western Blotting. Molecular & Cellular
Proteomics 17.2 (2017), pp. 384–396.
[172] © Abcam. Protein dephosphorylation protocol. 2018.
url: http://www.abcam.com/protocols/protein-dephosphorylation-protocol
(accessed on 02/16/2018).
[173] Carol A. Olson, Richard Krueger, and Nancy B.
Schwartz. Deglycosylation of chondroitin sulfate
proteoglycan by hydrogen fluoride in pyridine. An-
alytical Biochemistry 146.1 (1985), pp. 232–237.
[174] Hiroki Kuyama, Chikako Toda, Makoto Watanabe,
Koichi Tanaka, and Osamu Nishimura. An efficient
chemical method for dephosphorylation of phospho-
peptides. Rapid Communications in Mass Spectrometry
17.13 (2003), pp. 1493–1496.
[175] Eileen M. Woo, David Fenyo, Benjamin H. Kwok, Hi-
ronori Funabiki, and Brian T. Chait. Efficient Iden-
tification of Phosphorylation by Mass Spectrometric
Phosphopeptide Fingerprinting. Analytical Chemistry
80.7 (2008), pp. 2419–2425.
[176] Bin Ma and Richard Johnson. De Novo Sequencing
and Homology Searching. Molecular & Cellular Proteo-
mics 11.2 (2012).
[177] Matthias Mann and Matthias Wilm. Error-Tolerant
Identification of Peptides in Sequence Databases
by Peptide Sequence Tags. Analytical Chemistry 66.24
(1994), pp. 4390–4399.
[178] © Matrix Science. Error tolerant search. 2016. url:
http://www.matrixscience.com/help/error_tolerant_help.html (accessed on 02/06/2018).
[179] Xin Huang et al. ISPTM: An Iterative Search
Algorithm for Systematic Identification of Post-
translational Modifications from Complex Proteome
Mixtures. Journal of Proteome Research 12.9 (2013),
pp. 3831–3842.
[180] Liana Tsiatsiani and Albert J. R. Heck. Proteomics
beyond trypsin. FEBS Journal 282.14 (2015), pp. 2612–
2626.
[181] Jesper V. Olsen, Shao-En Ong, and Matthias Mann.
Trypsin Cleaves Exclusively C-terminal to Arginine
and Lysine Residues. Molecular & Cellular Proteomics
3.6 (2004), pp. 608–614.
[182] Xue Jun Tang, Pierre Thibault, and Robert
K. Boyd. Fragmentation reactions of multiply-
protonated peptides and implications for sequenc-
ing by tandem mass spectrometry with low-energy
collision-induced dissociation. Analytical Chemistry
65.20 (1993), pp. 2824–2834.
[183] Vladimir Gorshkov, Thiago Verano-Braga, and
Frank Kjeldsen. SuperQuant: A Data Processing Ap-
proach to Increase Quantitative Proteome Coverage.
Analytical Chemistry 87.12 (2015), pp. 6319–6327.
[184] Ralph Wieneke, Anja Bernecker, Radostan Riedel,
Manfred Sumper, Claudia Steinem, and Armin
Geyer. Silica precipitation with synthetic silaffin
peptides. Org. Biomol. Chem. 9.15 (2011), pp. 5482–5486.
[185] Nicole Poulsen, André Scheffel, Vonda C. Shep-
pard, Patrick M. Chesley, and Nils Kröger. Pentaly-
sine Clusters Mediate Silica Targeting of Silaffins in
Thalassiosira pseudonana. Journal of Biological Chem-
istry 288.28 (2013), pp. 20100–20109.
[186] Tony Hunter. The Age of Crosstalk: Phosphoryla-
tion, Ubiquitination, and Beyond. Molecular Cell 28.5
(2007), pp. 730–738.
[187] Regev Schweiger and Michal Linial. Cooperativ-
ity within proximal phosphorylation sites is revealed
from large-scale proteomics data. Biology Direct 5.1
(2010), p. 6.
[188] Pablo Minguez, Luca Parca, Francesca Diella,
Daniel R Mende, Runjun Kumar, Manuela
Helmer-Citterich, Anne-Claude Gavin, Vera van
Noort, and Peer Bork. Deciphering a global net-
work of functionally associated post-translational
modifications. Molecular Systems Biology 8 (2012).
[189] Pedro Beltrao, Véronique Albanèse, Lillian R.
Kenner, Danielle L. Swaney, Alma Burlingame, Ju-
dit Villén, Wendell A. Lim, James S. Fraser, Ju-
dith Frydman, and Nevan J. Krogan. Systematic
Functional Prioritization of Protein Posttranslational
Modifications. Cell 150.2 (2012), pp. 413–425.
[190] Pablo Minguez, Ivica Letunic, Luca Parca, and
Peer Bork. PTMcode: a database of known and
predicted functional associations between post-
translational modifications in proteins. Nucleic Acids
Research 41.D1 (2012), pp. D306–D311.
[191] Mao Peng, Arjen Scholten, Albert J. R. Heck,
and Bas van Breukelen. Identification of Enriched
PTM Crosstalk Motifs from Large-Scale Experimen-
tal Data Sets. Journal of Proteome Research 13.1 (2013),
pp. 249–259.
[192] A. Saskia Venne, Laxmikanth Kollipara, and René
P. Zahedi. The next level of complexity: Crosstalk of
posttranslational modifications. PROTEOMICS 14.4-5
(2014), pp. 513–524.
[193] Veit Schwämmle, Claudia-Maria Aspalter, Simone
Sidoli, and Ole N. Jensen. Large Scale Analy-
sis of Co-existing Post-translational Modifications
in Histone Tails Reveals Global Fine Structure of
Cross-talk. Molecular & Cellular Proteomics 13.7 (2014),
pp. 1855–1865.
[194] Yuanhua Huang, Bosen Xu, Xueya Zhou, Ying
Li, Ming Lu, Rui Jiang, and Tingting Li. Sys-
tematic Characterization and Prediction of Post-
Translational Modification Cross-Talk. Molecular &
Cellular Proteomics 14.3 (2015), pp. 761–770.
[195] Veit Schwämmle, Simone Sidoli, Chrystian Rumi-
nowicz, Xudong Wu, Chung-Fan Lee, Kristian He-
lin, and Ole N. Jensen. Systems Level Analysis of
Histone H3 Post-translational Modifications (PTMs)
Reveals Features of PTM Crosstalk in Chromatin
Regulation. Molecular & Cellular Proteomics 15.8 (2016),
pp. 2715–2729.
[196] Thomas D. Schneider and R. Michael Stephens. Se-
quence logos: a new way to display consensus se-
quences. Nucleic Acids Research 18.20 (1990), pp. 6097–
6100.
[197] Cong Wu, John C Tran, Leonid Zamdborg, Ken-
neth R Durbin, Mingxi Li, Dorothy R Ahlf, Bryan
P Early, Paul M Thomas, Jonathan V Sweedler,
and Neil L Kelleher. A protease for 'middle-down'
proteomics. Nature Methods 9.8 (2012), pp. 822–824.
[198] Pitter F. Huesgen, Philipp F. Lange, Lindsay D.
Rogers, Nestor Solis, Ulrich Eckhard, Oded
Kleifeld, Theodoros Goulas, F. Xavier Gomis-
Rüth, and Christopher M. Overall. LysargiNase
mirrors trypsin for protein C-terminal and
methylation-site identification. Nature Methods 12
(2014), pp. 55–58.
[199] Piero Giansanti, Liana Tsiatsiani, Teck Yew Low,
and Albert J R Heck. Six alternative proteases
for mass spectrometry–based proteomics beyond
trypsin. Nature Protocols 11.5 (2016), pp. 993–1006.
[200] Lloyd M Smith and Neil L Kelleher. Proteoform:
a single term describing protein complexity. Nature
Methods 10.3 (2013), pp. 186–187.
[201] Lloyd M. Smith and Neil L. Kelleher. Proteoforms
as the next proteomics currency. Science 359.6380
(2018), pp. 1106–1107.
[202] Oyo Mitsunobu and Masaaki Yamada. Preparation
of Esters of Carboxylic and Phosphoric Acid via Qua-
ternary Phosphonium Salts. Bulletin of the Chemical So-
ciety of Japan 40.10 (1967), pp. 2380–2382.
[203] Canadian Centre for the Culture of Microorgan-
isms. HESNW/ESAW Recipe. 2018. url: http://cccm.
botany.ubc.ca/resources/marine-media-receipes/
hesnwesaw-recipe/ (accessed on 04/05/2018).
[204] Hermann Schägger. Tricine-SDS-PAGE. Nat. Proto-
cols 1.1 (2006), pp. 16–22.
[205] F. William Studier. Protein production by auto-
induction in high-density shaking cultures. Protein
Expression and Purification 41.1 (2005), pp. 207–234.
[206] Waters Corporation. AccQ·Fluor Reagent Kit care and
use manual. 2008. url: http://www.waters.com/waters/download.htm?lid=10069610&id=10069609&fileName=wat0052881&fileUrl=%2fwebassets%2fcms%2fsupport%2fdocs%2fwat0052881.pdf
(accessed on 04/12/2018).
[207] © Eidgenössische Technische Hochschule Functional
Genomics Center Zürich. Amino Acid Analysis. 2018. url:
http://www.fgcz.ch/omics_areas/prot/applications/protein-quantitation/amino-acid-analysis.html
(accessed on 02/14/2018).
[208] Sigma-Aldrich Co. GlycoProfile™ IV chemical deglycosylation
kit. 2004. url: https://www.sigmaaldrich.com/content/dam/sigma-aldrich/docs/Sigma/Bulletin/pp0510bul.pdf
(accessed on 04/12/2018).
[209] David N. Perkins, Darryl J. C. Pappin, David
M. Creasy, and John S. Cottrell. Probability-
based protein identification by searching sequence
databases using mass spectrometry data. Electrophore-
sis 20.18 (1999), pp. 3551–3567.
[210] Andrew Keller, Alexey I. Nesvizhskii, Eugene
Kolker, and Ruedi Aebersold. Empirical Statistical
Model To Estimate the Accuracy of Peptide Identifi-
cations Made by MS/MS and Database Search. Ana-
lytical Chemistry 74.20 (2002), pp. 5383–5392.
[211] Alexey I. Nesvizhskii, Andrew Keller, Eugene
Kolker, and Ruedi Aebersold. A Statistical Model
for Identifying Proteins by Tandem Mass Spectrom-
etry. Analytical Chemistry 75.17 (2003), pp. 4646–4658.
[212] Juan Antonio Vizcaíno et al. The Proteomics Iden-
tifications (PRIDE) database and associated tools:
status in 2013. Nucleic Acids Research 41.D1 (2012),
pp. D1063–D1069.
[213] E. Beitz. TeXshade: shading and labeling of multi-
ple sequence alignments using LaTeX2e. Bioinformat-
ics 16.2 (2000), pp. 135–139.
A C K N O W L E D G M E N T S
I wish to thank, first and foremost, my supervisor Andrej Shevchenko for giving me
the opportunity to work on this fantastic project and for guiding me through it. I consider
it an honour to have worked with the members of the Shevchenko Lab, who created
the best environment to do science. Indeed, this thesis is the result of many people’s hard
work and collaboration, and I acknowledge their tireless contributions to my doctoral
thesis, as summarized below.
I am indebted to many of my colleagues for their technical support, especially Marc
Gentzel at the Biotechnology Center (BIOTEC, Dresden), for help with techniques and
for fruitful discussions. I would also like to thank Oskar Knittelfelder (Shevchenko
Lab) and Alastair Skeffington at the MPI of Molecular Plant Physiology (Potsdam) for
critically reading the manuscript of this dissertation and improving its language.
This project could never have advanced so far without the invaluable contribu-
tions of our collaborators. Therefore, I would like to express my gratitude
to Nils Kröger, Nicole Poulsen, Alexander Kotzsch, Christoph Heintze, and
Damian Pawolski at the B CUBE Center for Molecular Bioengineering (Dresden). I also
thank Eike Brunner and Marcus Rauche at TU Dresden for help with NMR analysis,
and all members of Diatom Forschergruppe 2038 (nanomee.de).
In particular, I would like to thank my thesis advisory committee members Bernard
Hoflack and Gaia Pigino for their regular feedback on the progress of my research.
I am indebted to my friends Daria Ezeriņa at the VIB-VUB Center for Structural Biology
(Brussels) and Maxim Fomin at the MPI for Biophysical Chemistry (Göttingen) for
their professional assistance and moral support.
Last but not least, I thank my parents, my sister, and my friends for everything
they have done for me.
The project is supported by FOR 2038 ‘Nanopatterned Organic Matrices in Biological Silica
Mineralization’, funded by the Deutsche Forschungsgemeinschaft (DFG).
P U B L I C A T I O N S
The following papers originated from this thesis work:
• Alexander Kotzsch, Damian Pawolski, Alexander Milentyev, Anna Shevchenko,
André Scheffel, Nicole Poulsen, Andrej Shevchenko, and Nils Kröger. Biochemical
Composition and Assembly of Biosilica-associated Insoluble Organic Matrices
from the Diatom Thalassiosira pseudonana. J Biol Chem. 2015 Dec.
• Alexander Milentyev, Christoph Heintze, Maryna Abacilar, Marc Gentzel, Nicole
Poulsen, Marcus Rauche, Eike Brunner, Armin Geyer, Nils Kröger, and Andrej Shevchenko.
Biosilicome-wide profiling of lysine modifications reveals compositional simi-
larity in three diatom species. Mol Cell Proteomics (in preparation).
D E C L A R A T I O N / E R K L Ä R U N G
Declaration according to § 5.5 of the doctorate regulations
I herewith declare that I have produced this paper without the prohibited assistance of third parties and without making use of aids other than those specified; notions taken over directly or indirectly from other sources have been identified as such. This paper has not previously been presented in identical or similar form to any other German or foreign examination board.
The thesis work was conducted from January 6, 2014 to January 6, 2018 under the supervision of Dr. Andrej Shevchenko at Max Planck Institute of Molecular Cell Biology and Genetics.
I declare that I have not undertaken any previous unsuccessful doctorate proceedings.
I declare that I recognize the doctorate regulations of the Faculty of Science of Dresden University of Technology.
Dresden, July 1, 2018
Erklärung entsprechend § 5.5 der Promotionsordnung
Hiermit versichere ich, dass ich die vorliegende Arbeit ohne unzulässige Hilfe Dritter und ohne Benutzung anderer als der angegebenen Hilfsmittel angefertigt habe; die aus fremden Quellen direkt oder indirekt übernommenen Gedanken sind als solche kenntlich gemacht. Die Arbeit wurde bisher weder im Inland noch im Ausland in gleicher oder ähnlicher Form einer anderen Prüfungsbehörde vorgelegt.
Die Dissertation wurde im Zeitraum vom 6. Januar 2014 bis 6. Januar 2018 verfasst und von Dr. Andrej Shevchenko am Max-Planck-Institut für Molekulare Zellbiologie und Genetik betreut.
Meine Person betreffend erkläre ich hiermit, dass keine früheren erfolglosen Promotionsverfahren stattgefunden haben.
Ich erkenne die Promotionsordnung der Fakultät für Mathematik und Naturwissenschaften der Technischen Universität Dresden an.
Dresden, den 1. Juli 2018
Alexander Milentyev