Accepted Manuscript
Title: Performance assessment of chemometric resolution methods utilized for extraction of pure components from overlapped signals in gas chromatography-mass spectrometry
Author: Hooman Seifi, Saeed Masoum, Soodabe Seifi
PII: S0021-9673(14)01377-6
DOI: http://dx.doi.org/doi:10.1016/j.chroma.2014.08.095
Reference: CHROMA 355786
To appear in: Journal of Chromatography A
Received date: 11-2-2014
Revised date: 22-8-2014
Accepted date: 27-8-2014
Please cite this article as: H. Seifi, S. Masoum, S. Seifi, Performance assessment of chemometric resolution methods utilized for extraction of pure components from overlapped signals in gas chromatography-mass spectrometry, Journal of Chromatography A (2014), http://dx.doi.org/10.1016/j.chroma.2014.08.095
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Performance assessment of chemometric resolution methods utilized for extraction of pure
components from overlapped signals in gas chromatography-mass spectrometry

Hooman Seifi a, Saeed Masoum *a, Soodabe Seifi b
a Department of Analytical Chemistry, Faculty of Chemistry, University of Kashan, Kashan, Iran
b Technical and Vocational University of Kermanshah, Girl's Technical College, Kermanshah, Iran
Corresponding author Tel.: +98 361 5912338; fax: +98 361 5912397.
E-mail address: [email protected] (Saeed Masoum)
Abstract
Multivariate resolution is a set of mathematical tools that uncovers the underlying profiles from
a set of measurements of time-evolving chemical systems. It has been proposed for resolving
overlapping GC-MS peaks into pure chromatograms and mass spectra. In this paper, several chemometric
resolution techniques commonly used for GC-MS data, namely mean field-independent component
analysis (MF-ICA), multivariate curve resolution-alternating least squares (MCR-ALS) and multivariate
curve resolution-objective function minimization (MCR-FMIN), were investigated. The solutions obtained
with these methods were assessed by lack of fit (LOF) and R2. The results show that, when the same
constraints are fulfilled, all solutions perform equally well in resolving highly overlapping peaks, and
that the differences obtained in each case should be attributed to unresolved rotational ambiguity.
Among the different ambiguities in resolution methods (intensity, permutation and rotation),
rotational ambiguity is the most difficult and critical one. Because of rotational ambiguity, there is a set of
feasible MCR solutions that explain the observed experimental data equally well and sufficiently fulfill
the imposed constraints of the system, so a range of feasible solutions exists.
Rotational ambiguity of the profiles complicates the development of
stable and universal self-modeling curve resolution (SMCR) algorithms. The relative component
contribution (RCC) function values for the component profiles obtained by the different methods were
calculated by MCR-BANDS. The RCC values for the three methods are equivalent. Rotational
ambiguities of the solutions of SMCR methods can be reduced by applying suitable constraints. The
results show that arranging the data sets in a single augmented data matrix could be the
best approach for reducing or removing rotational ambiguity.
Keywords: Chemometric resolution methods, Rotational ambiguities, Co-elution, Gas chromatography–mass
spectrometry, Augmentation
1. Introduction
Historically, the analysis of essential oils has been a significant challenge for analytical chemists and
pharmacologists. Hyphenated chromatographic techniques such as high performance liquid
chromatography-mass spectrometry (HPLC-MS) and gas chromatography-mass spectrometry (GC-MS) have
been powerful tools in the analysis of essential oils (EOs) [1]. However, in GC-MS analysis of EOs, even
under the best experimental conditions, there are various problems such as baseline drift, low
signal-to-noise (S/N) ratio, different types of noise and co-elution [2-7]. In practice, peak tailing and
fronting also appear in some of the total ion chromatograms (TICs). The main reasons for this phenomenon
may be poor column cutting and the high concentration of the corresponding components. These
difficulties can arise from the complexity of EOs and the variability of GC-MS systems, such as the
detection system, chromatographic device and experimental conditions. Among these problems, co-elution
(overlapped and/or embedded peaks) is one of the most frequently observed chromatographic difficulties;
it is due to the complexity of samples, inadequate peak capacity and the need for faster chromatographic
analysis. There are two main approaches for solving the co-elution problem. On the one hand, scientists
attempt to reach perfect separation by improving the temperature programming and stationary phase
composition (generally, the classical chromatographic parameters). On the other hand, the co-eluted peaks
are decomposed into the contributions of the pure components using chemometric resolution methods.
However, the first approach is time consuming, and it is common to find co-eluted chromatographic peaks
that still remain after re-programming of the methods. Also, the second approach depends on the
selective nature of the chromatographic profiles [3,8-10].
Chemometric resolution techniques based on bilinear models try to resolve the chemical constituents in a
mixture by mathematical resolution of their signal contributions. The goal of these techniques is to
mathematically decompose the instrumental response of a mixture into the pure contribution of each
component involved in the system [11-15]. Chemometric resolution methods are divided into two
groups. The first group comprises non-iterative methods such as evolving factor analysis (EFA), evolving
window orthogonal projection (EWOP), orthogonal projection resolution (OPR), sub-window factor analysis
(SFA), window factor analysis (WFA) and heuristic evolving latent projections (HELP). The second
group comprises iterative methods such as multivariate curve resolution-alternating least squares
(MCR-ALS), mean field-independent component analysis (MF-ICA) and multivariate curve
resolution-objective function minimization (MCR-FMIN) [14,16-23].
Because of randomly generated initial values and the local minimization problem, different runs of the
MF-ICA and MCR-ALS procedures may produce different results. Furthermore, MF-ICA and MCR-ALS
sometimes fail to converge, and suitable results may not be achieved [24]. There are numerous reasons
for this problem, one of which is the substantial dependence on the initial estimates of the spectral or
concentration profiles. As a result, these algorithms may reach a local minimum instead of the global
minimum, and being stuck in a local minimum can cause insufficient curve resolution. One powerful
solution to this problem is the use of non-linear optimization techniques. These approaches are applied
extensively for finding the best solution in self-modeling curve resolution (SMCR) techniques. MCR-FMIN
is one of these methods and was recently proposed by Tauler [18].
To run the iterative resolution methods, initial estimates of the concentration or spectral profiles are
needed. Good estimates help to achieve faster convergence, obtain reproducible results and avoid being
stuck in local minima [24-26]. Hence, methods based on pure variable selection, such as the orthogonal
projection approach (OPA) and SIMPLISMA, and approaches based on the evolutionary nature of the data,
such as EFA, are used for initial estimation [27-29].
However, although chemometric resolution solutions have sufficient physical meaning and are easy to
interpret, they are not “unique” in the general case and carry an unknown amount of ambiguity.
Three types of ambiguity are distinguished in multivariate curve resolution techniques:
permutation (exchangeable order of the constituents in the rows and columns of the results), intensity and
rotational ambiguities. The permutation and intensity ambiguities are eliminated by the analyst's
knowledge and by external calibration information, respectively. Rotational ambiguity is a more
significant problem in the application of soft-modeling techniques, and it is the chief reason for
“non-unique” solutions. Rotational ambiguity can be reduced, and in some cases totally avoided, by
intelligent use of the data structure and by imposing appropriate constraints [30]. Constraints such as
unimodality, non-negativity, selectivity, local rank and equality or inequality have been proposed to limit
rotational ambiguity [17,31-33]. In the absence of rotational ambiguity, all multivariate resolution
techniques should give the same unique solution, which should be equal to the true one [34]. The
computational (and sometimes analytical) determination of the range of feasible solutions is therefore
advantageous in the analysis of chemical systems.
The range of acceptable solutions for a two-component system (F=2) was partially analyzed by Lawton
and Sylvestre [35]. In addition, Maeder et al. [36,37] and Rajko and Istvan [36,37] investigated
two-component systems. For F > 2, a comparable analysis is much harder. Borgen and his
coworkers did pioneering work on three-component systems [38,39]. The
analytical solution for these systems was improved with computer tools by Rajko et al. [40,41]. A new
approach to calculating the boundaries of the set of solutions for these systems has been proposed by
Abdollahi et al. [42].
Essential oils (EOs) are complex mixtures of volatile organic compounds produced (as secondary
metabolites) in plants. EOs consist of hydrocarbons (terpenes and sesquiterpenes) and oxygenated
compounds (ethers, alcohols, esters, aldehydes, lactones, ketones, phenols and phenol ethers) [43-45].
Comprehensive knowledge of the components of an EO leads to its safer and better application, because
the characteristics and properties of EOs are directly related to their compositions. Nevertheless, such
comprehensive knowledge can only be obtained by using appropriate extraction techniques and
accurately performed chromatographic analysis [43,46,47]. In other words, the effects of an herb
arise from synergy (components of an essential oil that have no direct activity at
least have a synergic effect on the activities of the others), so comprehensive knowledge is needed for
correct use in biological, pharmacological and food applications.
In this study, the volatile constituents of essential oils extracted from six Myrtus communis L. samples
were analyzed by GC-MS, and their components were identified with the assistance of chemometric methods.
Then the reliability of the results from several multivariate curve resolution methods in the presence of
rotational ambiguities was investigated, and the effects of applying constraints and multi-set
configurations on reducing the ambiguity of GC-MS data were surveyed.
2. Material and instrumentation
2.1. Reagents
All reagents and chemicals used in this work were of analytical grade. Anhydrous sodium
sulphate was purchased from Merck. Normal alkane standards were purchased from ULTRA Scientific
(North Kingstown, USA).
2.2. Sample preparation
Green parts of Myrtus communis L. were collected during September 2011 from different areas
(Noorabad and Khoramabad) of Iran, and the plant specimen was confirmed at the Barij
Essence Company (Kashan, Iran).
Two of the three samples (W: Noorabad wild and K: Khoramabad wild) were collected from hillsides where
the plant grows wild, and the third sample (E: cultivated, from Noorabad) was collected from a cultivated
area instead. In the next step, each sample was divided into two groups: fresh (F) and sun-dried (D). The
volatile constituents of Myrtus communis L. were extracted using a Clevenger-type apparatus as proposed
by the European Pharmacopoeia [48]. The obtained essential oils were dried over anhydrous sodium
sulphate and, after filtration, stored in dark glass at low temperature (4 °C) prior to analysis.
2.3. Gas chromatography–mass spectrometry
Samples were analyzed with an Agilent HP-6890 gas chromatograph coupled to an Agilent HP-5973
mass spectrometer (equipped with an electron ionization source and a quadrupole analyzer) on an HP-5MS
fused silica capillary column (30 m × 0.25 mm i.d., 0.25 µm film thickness). The temperature of the ion
source was set at 230 °C. Helium was used as the carrier gas with a total flow of 5 ml/min. The interface
temperature was set at 280 °C. The mass scan range was 40-300 amu (ionization energy: 70 eV). A
split injection (split ratio 1:20) and an injector temperature of 250 °C were employed. The oven
temperature was programmed to increase from 50 to 250 °C at a rate of 3 °C/min and, as a post-run step,
was kept constant at 280 °C for 10 min.
2.4. Data analysis
Data analysis was carried out on a 4-GHz Pentium IV personal computer; all programs were coded in
MATLAB 7.14 for Windows. Resolved spectra were identified by matching against the standard mass
spectral NIST 05 database, which contains 209311 compounds.
3. Theory and methodology
3.1. Theory
The basic assumption of bilinear models is that multivariate experimental data are a linear sum of the
pure individual contributions of the different chemical components in the system. The contribution of each
component can be described by the product of two factors, one related to its pure spectrum and the other
related to its concentration.
The two-way data matrix X obtained from GC-MS can be represented by bilinear models as follows:
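\[ \text{MCR-ALS:}\quad X = X_{\text{MCR-ALS}} + E_{\text{MCR-ALS}} = C S^{T} + E_{\text{MCR-ALS}} \tag{1} \]
\[ \text{MF-ICA:}\quad X = X_{\text{MF-ICA}} + E_{\text{MF-ICA}} = A S^{T} + E_{\text{MF-ICA}} \tag{2} \]
\[ \text{MCR-FMIN:}\quad X = X_{\text{MCR-FMIN}} + E_{\text{MCR-FMIN}} = C S^{T} + E_{\text{MCR-FMIN}} \tag{3} \]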
where X_MCR-ALS, X_MF-ICA and X_MCR-FMIN are the data matrices reconstructed by MCR-ALS, MF-ICA and
MCR-FMIN, respectively. C and A are the concentration profiles, and S^T is the spectral profile obtained
by these three methods. E_MCR-ALS and E_MCR-FMIN are residual matrices containing the data variance
unexplained by C S^T. E_MF-ICA is the residual matrix containing the part of the data unexplained by
A S^T. The dimensions of the matrices are: X (m×n), C and A (m×p), S^T (p×n) and E (m×n); in GC-MS data,
m is the number of time points in the chromatographic direction (the number of rows of X), n is the number
of mass-to-charge (m/z) values in the spectral direction (the number of columns of X), and p is the number
of chemical constituents. The difference between
MCR and ICA is that MCR methods attempt to maximize the data variance explained by the different
constituents under a set of constraints (e.g., non-negativity and other constraints); ICA, on the
other hand, attempts to maximize the independence among the resolved constituents with a
minimum loss of information [49].
Assessment of the solutions is the main step in the chemometric resolution analysis of analytical data
such as GC–MS data. There are five different ways of evaluating chemometric resolution solutions for
GC–MS data: (1) chromatographic shape recovery, the most important aspect from the analytical
chemistry point of view; (2) the statistical parameters lack of fit (LOF) and variance explained (R2);
(3) the reverse match factor (RMF), for assessing the resolved mass spectra against the corresponding
standards in the MS database; (4) assessment of the extent of rotational ambiguities; and
(5) co-injection (injection of standards of the identified components under the same instrumental
conditions and comparison of the retention indices) [8]. Among these, chromatographic shape
recovery, statistical parameters and RMF are the most prevalent for assessing the profiles resolved
by chemometric resolution methods in the GC–MS analysis of EOs. Investigation of the feasible regions
of MCR results for real data is limited, because the proposed methods apply only to two- and/or
three-component systems with a low level of artifacts. Co-injection is also a reliable
method for confirming the results of chemometric resolution techniques, although the availability of
standards for EO components is limited.
The lack of fit (LOF) and R2 parameters were used to test the adequacy of the results. LOF and R2 can be
calculated using Eqs. (4) and (5), respectively.
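\[ \mathrm{LOF}\,(\%) = 100 \times \sqrt{\frac{\sum_{i=1}^{I}\sum_{j=1}^{J} \left(d_{i,j} - \hat{d}_{i,j}\right)^{2}}{\sum_{i=1}^{I}\sum_{j=1}^{J} d_{i,j}^{2}}} \tag{4} \]
\[ R^{2} = \left(1 - \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} \left(d_{i,j} - \hat{d}_{i,j}\right)^{2}}{\sum_{i=1}^{I}\sum_{j=1}^{J} d_{i,j}^{2}}\right) \times 100 \tag{5} \]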
where d_i,j is an element of the original data matrix and d̂_i,j is the corresponding value recovered by
the resolution technique. The sums extend over the i = 1, ..., I rows and j = 1, ..., J columns of the data
matrix d. The closer LOF is to zero and R2 is to one (100 when expressed as a percentage), the better the
model fits the experimental data. The mathematical solutions of Eqs. (1), (2) and (3) for C (or A) and S^T
are obviously “not unique” if no additional information is available. In other words, the decomposition of
matrix X according to these equations is ambiguous. Here, the rotational ambiguity of MCR solutions
is described (the rotational ambiguity of ICA is similar, and its description is omitted for brevity).
Physical constraints such as non-negativity, or other natural constraints such as
closure, unimodality, selectivity and especially local rank, significantly reduce the range of possible
solutions [30,34,40,50,51]. Rotational ambiguities can be described using the following equation:
\[ X = X_{\text{old}} + E = C S^{T} + E = C\,T\,T^{-1} S^{T} + E = C^{*} (S^{T})^{*} + E = X_{\text{new}} + E \tag{6} \]
In this equation, from the initial solutions C and S^T, a new set of solutions C* = C T and
(S^T)* = T^-1 S^T is obtained by linear combination of the former using a nonsingular matrix T, the
so-called transformation (rotation) matrix. All such C and S^T solutions fit the data matrix X equally
well; they result in the same matrix X_old. For
a system that has no unique solution due to the presence of rotational ambiguity, a range of feasible MCR
solutions can be obtained via a range of different matrices T, and there is no way to be sure that one
solution is better than another, i.e., that one particular solution is the true one, unless more
information is provided to the system [11,52].
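As an illustration of Eq. (6) together with Eqs. (4) and (5), the following minimal Python sketch builds a hypothetical two-component data set and checks that a rotated pair of non-negative profiles reproduces the data with the same LOF and R2. The profiles, the residual pattern and the matrix T are invented for illustration and are not taken from the data of this study.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lof_and_r2(D, D_hat):
    """Eqs. (4) and (5): lack of fit (%) and explained variance (%)."""
    ss_res = sum((d - dh) ** 2 for rd, rh in zip(D, D_hat) for d, dh in zip(rd, rh))
    ss_tot = sum(d * d for rd in D for d in rd)
    return 100.0 * math.sqrt(ss_res / ss_tot), 100.0 * (1.0 - ss_res / ss_tot)

# Hypothetical system: m = 5 time points, n = 3 m/z channels, p = 2 components.
C = [[0.1, 0.0], [0.6, 0.2], [1.0, 0.7], [0.4, 1.0], [0.0, 0.3]]   # m x p
St = [[1.0, 0.3, 0.5], [0.1, 1.0, 0.4]]                            # p x n

# Measured data = bilinear model plus a small, fixed residual pattern (assumed).
X = [[v + 0.01 * (-1) ** (i + j) for j, v in enumerate(row)]
     for i, row in enumerate(matmul(C, St))]

# An arbitrary nonsingular transformation matrix and its inverse (assumed values).
T = [[1.0, 0.2], [0.0, 1.0]]
T_inv = [[1.0, -0.2], [0.0, 1.0]]

C_new = matmul(C, T)           # C* = C T
St_new = matmul(T_inv, St)     # (S^T)* = T^-1 S^T

lof_old, r2_old = lof_and_r2(X, matmul(C, St))
lof_new, r2_new = lof_and_r2(X, matmul(C_new, St_new))
print(lof_old, lof_new)   # the two fits are numerically indistinguishable
print(r2_old, r2_new)
```

Both sets of profiles are non-negative, so the non-negativity constraint alone cannot decide between them; this is the sense in which the fit statistics are blind to rotational ambiguity.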
3.2. Methodology
To study the performance of the chemometric resolution methods, the following steps were performed:
1. Data preprocessing: GC-MS data obtained in full scan mode include many noise channels, which
consist of random signals; deleting these channels results in quicker computation.
Shen et al. [53] proposed a method, the so-called morphological score, that uses the frequency
difference between noise and signal to distinguish the noise channels from the signal channels in
GC-MS data. It was used for decreasing the homoscedastic noise in the
GC-MS data. In addition, the Savitzky–Golay filter [54], a least-squares smoothing filter for
smoothing and differentiating data, was used for decreasing the heteroscedastic noise in the GC-MS
data. Baseline drift during chromatographic elution is another problem in GC–MS analysis;
its presence requires more complicated analysis to resolve
the data [55]. Several methods can deal with spectral background and
baseline drift simultaneously. Congruence analysis and least-squares fitting for correcting
baseline drift and spectral background were developed by Liang et al. [56]. In a further
preprocessing step, each data matrix is scaled to have a maximum signal intensity of 1.0.
2. Determination of the number of chemical constituents: The chemical rank was determined by the
morphological score method. In this technique, key spectra are used instead of full rank matrices;
therefore, the obtained results are more reliable [53].
3. Chemometric resolution: Using chemometric resolution methods, the pure profiles
(chromatographic and spectral profiles) of the constituents are recovered.
4. Evaluation of the results: To confirm the reliability and quality of the results, similarity matching
against the MS database was used. The results were also evaluated by the statistical
parameters lack of fit (LOF) and variance explained (R2). When LOF and R2 are close to
zero and one, respectively, the best model fit of the multivariate instrumental measurements has
been achieved.
5. Verification of rotational ambiguities: Tauler showed that the MCR solutions lie within the boundaries
estimated by MCR-BANDS [18], and he also stated that Gemperline’s concept is equivalent
to minimizing and maximizing the function in the following equation [57]:
\[ f_{p}(T) = \frac{\lVert c_{p}\, s_{p}^{T} \rVert}{\lVert C S^{T} \rVert} \tag{7} \]
where T is the rotation (transformation) matrix and the subscript p denotes the chemical constituent
under consideration. Maximizing and minimizing this function give the maximum and the minimum
boundary, respectively. For a particular species, these boundaries describe the feasible concentration
and spectral profiles that fulfill the constraints of the problem and give a maximum and a minimum
signal contribution [18,50,51,58].
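To make Eq. (7) concrete, the short Python sketch below evaluates the function for two equivalent sets of profiles related by a transformation matrix T. It only evaluates f for two hand-picked solutions; it does not perform the constrained optimization over T that MCR-BANDS carries out, and all numerical values (profiles and T) are hypothetical.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frobenius(A):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(v * v for row in A for v in row))

def rcc(C, St, p):
    """Eq. (7): signal contribution of component p relative to the whole model."""
    c_p = [[row[p]] for row in C]     # m x 1 concentration profile
    s_p = [St[p]]                     # 1 x n spectrum
    return frobenius(matmul(c_p, s_p)) / frobenius(matmul(C, St))

# Hypothetical non-negative profiles (5 time points, 3 m/z channels, 2 components).
C = [[0.1, 0.0], [0.6, 0.2], [1.0, 0.7], [0.4, 1.0], [0.0, 0.3]]
St = [[1.0, 0.3, 0.5], [0.1, 1.0, 0.4]]

# A second feasible solution obtained with an assumed transformation matrix T.
T = [[1.0, 0.2], [0.0, 1.0]]
T_inv = [[1.0, -0.2], [0.0, 1.0]]
C_rot, St_rot = matmul(C, T), matmul(T_inv, St)

f_initial = [rcc(C, St, p) for p in range(2)]
f_rotated = [rcc(C_rot, St_rot, p) for p in range(2)]
print(f_initial, f_rotated)
```

The denominator is identical for both solutions because the reconstructed matrix does not change, whereas the numerator of each component does; the spread of f over all feasible T is what the band boundaries summarize.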
4. Results and discussion
The six total ion chromatograms (TICs) of the essential oils from Myrtus communis L. are shown in Figure
1. The TICs show that some of the peaks overlap with each other. Moreover, the
similarity indices (SIs) obtained from direct searching of the MS database are very low for many
chromatographic peaks, and the same component may be found at different chromatographic scan
points. In this study, three peak clusters (labeled A, B and C) are presented as examples to show the
efficiency and performance of chemometric resolution methods for extracting more information from
multi-component overlapping GC–MS data of essential oils. Here, the results for peak cluster A
(Figures and Tables) are selected to illustrate the chemometric steps; the Tables for peak clusters B and
C are available as supplementary data. These specific peak clusters were selected to show the
performance, and then assess the application, of chemometric resolution methods in two- and
three-component systems with various degrees of overlap.
Figure 1
The steps of the resolution process are illustrated with peak cluster A from the EF sample. There appears
to be one component in peak cluster A, but inspection of the mass spectra in different parts of the cluster
indicates that there could be more than one component or severe noise. As a result of co-elution, the
presence of noise (heteroscedastic and homoscedastic) and baseline drift in these peak clusters, the
similarity indices for a simple search through the MS database were low. Preprocessing was therefore
performed for each data matrix as the first step in the resolution process. A critical step for reducing the
complexity of the data matrix is removing the chromatographic baseline drift. In this study, baseline
correction was done using the methodology proposed by Liang et al. [56]. Then, the morphological score
method [53] and the Savitzky–Golay filter [54] were used for removing the homoscedastic and the
heteroscedastic noise in this peak cluster, respectively. In the second step, the chemical rank was
determined using the morphological score method. In this method, the number of constituents is obtained
by counting the number of singular vectors with morphological scores above the noise level. Figure 2
shows the result of the morphological score method, which indicates the presence of
two components in this peak cluster. The morphological score method also shows that peak
clusters B and C contain two and three components, respectively (results not shown).
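The smoothing and scaling steps of the preprocessing can be sketched in a few lines of Python. The window length and polynomial order used in this study are not stated, so the 5-point quadratic Savitzky–Golay window below is an assumption, as are the example trace and the function names `savgol5` and `scale_to_unit_max`.

```python
def savgol5(y):
    """5-point quadratic Savitzky-Golay smoothing of one trace.

    Uses the standard convolution weights (-3, 12, 17, 12, -3) / 35;
    the two points at each end are left unchanged for simplicity.
    """
    coeffs = (-3.0, 12.0, 17.0, 12.0, -3.0)
    out = list(y)
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(coeffs, window)) / 35.0
    return out

def scale_to_unit_max(X):
    """Scale a data matrix so that its maximum signal intensity is 1.0."""
    peak = max(max(row) for row in X)
    return [[v / peak for v in row] for row in X]

# Hypothetical noisy chromatographic trace for a single m/z channel.
trace = [0.0, 0.1, 0.0, 0.9, 2.1, 2.9, 2.0, 1.1, 0.0, 0.1, 0.0]
smoothed = savgol5(trace)
scaled = scale_to_unit_max([smoothed])
print(scaled[0])
```

In a full data matrix the same filter would be applied along the chromatographic direction of every retained m/z channel before scaling; that choice of direction is likewise an assumption of this sketch.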
Figure 2
In the next step, the reliability of the MCR-ALS method under different initial estimation methods was
surveyed. In this study, SIMPLISMA, EFA and OPA [57] were used to provide the initial estimates for the
iterative process. The SMCR process starts with the estimation of initial pure variables, i.e.,
concentration and/or pure spectral profiles. The ALS procedure is then carried out using this initial set
of estimates, either concentration or spectral profiles. Several studies have claimed that with a poor
initial estimate the ALS algorithm reaches a suboptimal local minimum (which can cause inadequate curve
resolution), whereas with a strong initial estimate it reaches the global minimum [24,25].
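The alternating scheme described above can be sketched as follows. This is a simplified illustration, not the MCR-ALS implementation used in this study: it handles two components only, imposes non-negativity by clipping negative values to zero (rather than by a proper non-negative least-squares step), omits spectral normalization, and runs on simulated noise-free data. The initial spectra are simply taken from two rows of X near the edges of the cluster, as a stand-in for a pure-variable method such as SIMPLISMA.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of a 2 x 2 matrix (two-component case only)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def clip(A):
    """Crude non-negativity constraint: negative entries are set to zero."""
    return [[max(v, 0.0) for v in row] for row in A]

def lof(X, X_hat):
    """Lack of fit (%) as in Eq. (4)."""
    num = sum((x - xh) ** 2 for rx, rh in zip(X, X_hat) for x, xh in zip(rx, rh))
    den = sum(x * x for rx in X for x in rx)
    return 100.0 * math.sqrt(num / den)

def als(X, St, n_iter=50):
    """Alternate least-squares updates of C (m x 2) and S^T (2 x n)."""
    for _ in range(n_iter):
        S = transpose(St)
        C = clip(matmul(matmul(X, S), inv2(matmul(St, S))))      # C = X S (S^T S)^-1
        Ct = transpose(C)
        St = clip(matmul(inv2(matmul(Ct, C)), matmul(Ct, X)))    # S^T = (C^T C)^-1 C^T X
    return C, St

# Simulated two-component peak cluster (hypothetical Gaussian profiles and spectra).
C_true = [[math.exp(-(t - 12) ** 2 / 32.0), math.exp(-(t - 18) ** 2 / 32.0)]
          for t in range(30)]
St_true = [[1.0, 0.2, 0.0, 0.5], [0.0, 0.6, 1.0, 0.3]]
X = matmul(C_true, St_true)

St_init = [X[5], X[25]]        # rows where one component dominates
C_als, St_als = als(X, St_init)
fit = lof(X, matmul(C_als, St_als))
print(fit)
```

Replacing `St_init` with a poorer guess is a simple way to explore how strongly the final profiles depend on the initial estimate.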
Because of inevitable experimental uncertainties (noise, etc.), this decomposition is not perfect, and the
differences between the measured data and their decomposition are collected in a matrix of errors E (or
residuals). Comparing the results for the different samples shows that the three initial estimation
methods (SIMPLISMA, EFA and OPA) extract the same amount of signal contribution and lose the same
amount of information (presumably noise). Table 1 and Figure 3 show the results of changing the
initial estimation method for peak cluster A. The corresponding results for
peak clusters B and C are available in the supplementary data (Table S1B and Table S1C, respectively).
Table 1
These results confirm that the various initial estimation methods used in the iterative MCR-ALS process,
which serve to prevent it from getting stuck in local minima, lead to the same results in terms of the
statistical parameters LOF and R2; however, because of rotational ambiguities, the RMF values of the
constituents differ. Allowing the analysis to reach convergence is important. The results show that
SIMPLISMA can give faster convergence than the other initial estimation methods (results not shown).
For resolving the peak clusters, the chemometric resolution methods MCR-ALS [17], MF-ICA [16]
and MCR-FMIN [18] were applied. The initial estimates required for MCR-ALS and MF-ICA were obtained
by SIMPLISMA [59], and the initial estimate required for MCR-FMIN was obtained from PCA, as prescribed
by the underlying theory of that method [18]. Non-negativity of the chromatographic and spectral profiles
and normalization of the spectral profiles were used as constraints. The results of these decompositions
for peak cluster A are shown in Table 2. The results of these chemometric resolution methods for peak
clusters B and C are available in the supplementary data (Table S2B and Table S2C, respectively).
Figure 3 shows the pure concentration profiles for peak cluster A from the EF sample.
Figure 3
Because of severe co-elution and because only non-negativity and normalization were applied as
constraints, the resolved profiles have peak shapes that are far from ideal; for example, the
concentration profiles of some components have more than one maximum. In Figure 3, the concentration
profiles resolved by SIMPLISMA-MCR-ALS and OPA-MCR-ALS are shown with a blue dotted line for
1,8-Cineol and an orange triangle line for D-limonene. For EFA-MCR-ALS, the blue starred line and the
blue circled line show the concentration profiles of 1,8-Cineol and D-limonene, respectively. For the
MF-ICA solutions, the red dash-dot line indicates the concentration profile of 1,8-Cineol and the blue
crossed line shows the concentration profile of D-limonene, and for the MCR-FMIN solutions the violet
solid line shows the concentration profile of 1,8-Cineol and the green rhombic line indicates the
concentration profile of D-limonene.
Table 2
Once each pure spectrum has been obtained, the constituents can be identified by similarity searches
in the NIST mass spectral database and confirmed with retention indices [60]. Figure 4 shows the resolved
mass spectra of the constituents in peak cluster A from the EF sample (obtained by SIMPLISMA-MCR-ALS)
together with the corresponding standard spectra. The library searches showed that D-limonene
and 1,8-Cineol are present in peak cluster A, Linalool and Hotrienol in
peak cluster B, and α-Terpineol, Estragole and Myrtenol in peak cluster C. The
reliability of the results was also confirmed by comparing the Kovats retention indices of the resolved
components with those of the pure compounds.
Figure 4
In this study, Tauler’s algorithm [30], called MCR-BANDS, was used for computing the minimum and the
maximum boundaries. The effect of the different initial estimation methods on the rotational ambiguity of
MCR-ALS in resolving peak cluster A is shown in Table 3 through the relative component contribution
(RCC) function values (see caption), obtained when only non-negativity and spectral normalization
constraints were applied (Table S3B and Table S3C for peak clusters B and C, respectively). In these
Tables, the function value for the initial profiles (finic), the maximum (fmax) and minimum (fmin)
optimized function values and the difference between them (fmax-fmin) are given for each component,
allowing easy inspection of the extent of the remaining rotational ambiguity. A difference close to zero
means that practically no rotational ambiguity remains.
Table 3
In all cases, the MCR-ALS solutions obtained with different initial estimates show the same loss of
experimental data (lack of fit) and the same extent of ambiguity (feasible region). However, the
qualitative results of the final solutions are not the same: the resolved components have different RMF
values, i.e., the solutions end at different points within the feasible region.
Table 4 shows the rotational ambiguity values of the SMCR methods for peak cluster A. The results for peak
clusters B and C are available in the supplementary data (Tables S4B and S4C, respectively). The results
show that the performance of multivariate resolution techniques can be influenced by the degree of
overlapping or embedding and by the noise level of the data set (these peak clusters have different noise
levels and degrees of overlapping in each sample). Abdollahi and Tauler have claimed that, in the presence
of rotational ambiguity and under a given set of constraints, the solutions obtained by different
multivariate curve resolution methods can differ from each other and from the true solution, depending on
the applied algorithm and the initial estimates [42, 46]. However, our results show that all iterative
chemometric resolution methods, with non-negativity applied to the chromatographic and spectral profiles
and normalization applied to the spectral profiles, have the same, large extent of rotational ambiguity.
Tauler reported that the maximum and minimum band boundaries calculated for the solutions of MCR-ALS and
MCR-FMIN were different [18]. Rajkó, however, argued that such a difference is impossible, because the
rotational ambiguity belongs to the data matrix and is independent of the curve resolution algorithm [51].
Comparison of the RCC values in Table 4 for peak cluster A (Tables S4B and S4C for peak clusters B and C,
respectively) shows that the solution of each method has RCC values within the same feasible region. For
example, fmin for peak cluster A in the EF sample is 0.754 and 0.397 for the first and second components,
respectively, by any of the three investigated methods. Likewise, fmax for peak cluster A in the EF sample
is 0.847 and 0.493 for the first and second components, respectively, by any of the three methods. In
other words, the range of the feasible region for this data set is independent of the algorithms that
were used for resolution (0.093 and 0.096 for the first and second components, respectively).
Tables 3, S3B, S3C, 4, S4B and S4C show that the fmin and fmax values are very similar when the same
system is investigated by any of the three methods (MCR-ALS, MF-ICA or MCR-FMIN). The results of this
study support Rajkó's claim that the maximum and minimum band boundaries calculated for the solutions of
MF-ICA, MCR-ALS and MCR-FMIN are the same; in other words, the rotational ambiguity belongs to the data
matrix and is independent of the curve resolution algorithm.
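The band widths quoted for the EF sample can be checked directly from the fmin and fmax rows of Table 3. The short sketch below does this arithmetic for all six samples of peak cluster A; the dictionary layout and variable names are illustrative only.

```python
# fmin and fmax for components 1 and 2 of peak cluster A (values from Table 3).
fmin = {"EF": (0.754, 0.397), "WD": (0.811, 0.332), "ED": (0.203, 0.851),
        "KD": (0.834, 0.339), "WF": (0.869, 0.347), "KF": (0.724, 0.349)}
fmax = {"EF": (0.847, 0.493), "WD": (0.888, 0.424), "ED": (0.281, 0.897),
        "KD": (0.885, 0.502), "WF": (0.896, 0.382), "KF": (0.867, 0.533)}

# Width of the feasible band per sample and component, rounded to 3 decimals.
width = {s: tuple(round(hi - lo, 3) for lo, hi in zip(fmin[s], fmax[s]))
         for s in fmin}

for sample, (w1, w2) in width.items():
    print(f"{sample}: component 1 = {w1:.3f}, component 2 = {w2:.3f}")
```

The computed widths agree with the fmax-fmin row of Table 3.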
By comparing the fragmentation pattern, composed of all fragment ions of an unknown component, with those
of its standards, the NIST library lists the formula and name of the most likely compound (supported by
both reasonably good spectral matches and close time matching with a database). When one or more of the
resolved fragment ions (abundance or pattern in the spectral profile) are affected by common problems such
as rotational ambiguity and noise, a different component may be identified. For example, in peak cluster A
of the WF sample, despite good values of LOF and R2, the similarity index (RMF2) is poor (denoted by (-) in
Tables 1 and 2). This means that the model fits the data well but the solutions are not the true ones,
because rotational ambiguities are still present. The rotational ambiguity led to an odd profile for peak
cluster A of the WF sample that does not correspond to D-limonene or 1,8-Cineol, and the obtained profile
could not be identified. As another example, in peak cluster B of the EF sample, despite good values of
LOF and R2, the similarity index (RMF4) is poor (denoted by (-) in Tables S1B and S2B in the supplementary
data). As mentioned above, this means that rotational ambiguities are still present. The results show that
SMCR methods with only non-negativity and normalization as constraints are not reliable for quantitative
and qualitative analysis.
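To make the point about fit quality concrete, the sketch below computes lack of fit and explained variance in the form commonly used with MCR-ALS, LOF = 100·sqrt(Σe²/Σd²) and R2 = 100·(1 − Σe²/Σd²), where d are the measured data and e the residuals. Equations (4) and (5) are not reproduced in this section, so these formulas are an assumption, although they are consistent with most of the LOF and R2 pairs reported in Table 1 (for example, LOF = 8.79 corresponds to R2 ≈ 99.23). The function name and the toy matrices are hypothetical.

```python
import math

def lof_and_r2(D, D_hat):
    """Return (LOF %, R2 %) for data D and its reconstruction D_hat."""
    ss_res = sum((d - m) ** 2 for rd, rm in zip(D, D_hat) for d, m in zip(rd, rm))
    ss_tot = sum(d ** 2 for rd in D for d in rd)
    ratio = ss_res / ss_tot
    return 100.0 * math.sqrt(ratio), 100.0 * (1.0 - ratio)

# Toy data: a 2 x 2 matrix and a slightly imperfect reconstruction.
D = [[3.0, 4.0], [0.0, 0.0]]
D_hat = [[3.0, 3.5], [0.0, 0.0]]
lof, r2 = lof_and_r2(D, D_hat)

# Under these definitions the two figures of merit are linked:
# R2 = 100 * (1 - (LOF / 100)^2).
r2_from_lof = 100.0 * (1.0 - (8.79 / 100.0) ** 2)
```

Both quantities measure only how well the product C S^T reproduces the data, which is why a good LOF does not by itself guarantee a good RMF.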
Table 4
Iterative techniques try to find a rational solution. The important differences between the iterative
approaches relate to the initial estimates used, the type of profiles that are iteratively optimized, or
the nature and application of the constraints. In these methods, at each cycle of the optimization
process, constraints such as non-negativity, normalization, unimodality and local rank/selectivity are
imposed to obtain meaningful solutions. Constraints force the profiles in C and/or S^T to comply with
predefined mathematical or chemical properties. Some constraints, such as non-negativity and
normalization, are common to MF-ICA, MCR-ALS and MCR-FMIN. However, the MCR-ALS and MCR-FMIN algorithms
have the advantage that they can impose other constraints such as unimodality, local rank/selectivity and
trilinearity (the MF-ICA algorithm cannot apply these important constraints). In addition, in contrast to
the MF-ICA algorithm, the MCR algorithms can resolve augmented data. In other words, MCR algorithms are
preferred for the resolution of several runs and/or several samples.
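A minimal sketch of how two of these constraints might be applied within one iteration is given below: negative entries are clipped to zero and each spectrum is then scaled to unit length. Clipping is a simple stand-in for the non-negative least squares solvers used in practice, unit Euclidean length is only one possible normalization, and the function names are hypothetical.

```python
import math

def non_negativity(profiles):
    """Set negative entries to zero (simple clipping)."""
    return [[max(x, 0.0) for x in row] for row in profiles]

def normalize_rows(spectra):
    """Scale each spectrum (row) to unit Euclidean length."""
    out = []
    for row in spectra:
        norm = math.sqrt(sum(x * x for x in row))
        # Leave an all-zero row unchanged to avoid dividing by zero.
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out

# Hypothetical spectra estimate after an unconstrained least squares step.
S_raw = [[3.0, -0.2, 4.0], [-1.0, 2.0, 0.0]]
S = normalize_rows(non_negativity(S_raw))
```

In an alternating scheme, a step of this kind would follow each least squares update of the spectra, with the corresponding non-negativity step applied to the elution profiles.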
Several factors, such as the degree of complexity, the intensity of the components and the noise level,
affect the ambiguity of the resolved results [42]. Therefore, each type of chromatogram, depending on the
nature of the data set, has a different amount of rotational ambiguity.
Because the concentrations of the components in GC-MS are unknown, the closure constraint cannot be used.
Non-negativity, unimodality and selectivity therefore play a major role in these analyses [52,61]. In this
paper, the following constraints were applied to reduce the rotational ambiguity: non-negativity for the
concentration and spectral profiles, unimodality and selectivity for the concentration profiles, and
normalization for the spectral profiles. For the analysis of experimental systems, the information needed
to apply the local rank/selectivity constraints can be obtained from a preliminary evolving factor
analysis (EFA) of the data sets [62]. The results for peak cluster A of the EF sample are shown in Table 5.
Tables S5B and S5C in the supplementary data show the results of applying the constraints to peak clusters
B and C of the EF sample, respectively.
Table 5
The results show that rotational ambiguity is reduced by applying appropriate constraints. Imposing only
non-negativity and normalization constraints leaves a wide range of possible elution profiles (see
Tables 3, S3B and S3C, and Tables 4, S4B and S4C). The spectral profiles can also be very different if only
these constraints were applied. When unimodality was applied, the range of possible solutions narrowed
(see Tables 5, S5B and S5C), and when the local rank constraint was applied, it narrowed significantly.
From these results it can be clearly concluded that imposing any additional information on the chemical
system can reduce the rotational ambiguity of the solutions of an SMCR method [57,61]. The results also
show that MCR-ALS and MCR-FMIN did not produce unique solutions, because the rotational ambiguity could
not be completely removed in these cases by the applied constraints.
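For illustration, a unimodality constraint on an elution profile can be sketched as follows: starting from the maximum, values are forced to be non-increasing towards both ends, so that secondary maxima are cut off. This is one simple variant, assumed here for clarity; it is not necessarily the implementation used in the MCR-ALS or MCR-FMIN programs, and the function name is hypothetical.

```python
def unimodal(profile):
    """Force a single maximum by flattening any rise away from the apex."""
    out = list(profile)
    apex = out.index(max(out))
    # Moving left from the apex, values may not increase.
    for i in range(apex - 1, -1, -1):
        out[i] = min(out[i], out[i + 1])
    # Moving right from the apex, values may not increase.
    for i in range(apex + 1, len(out)):
        out[i] = min(out[i], out[i - 1])
    return out

# Hypothetical elution profile with a small secondary maximum on each side.
c = [0.0, 0.3, 0.1, 0.6, 1.0, 0.5, 0.7, 0.2]
c_uni = unimodal(c)
```

Profiles that already have a single maximum pass through unchanged, so the constraint only removes solutions with chromatographically implausible shapes.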
As mentioned above, the first possible strategy to decrease or overcome the influence of rotational
ambiguity is the use of additional information (applying constraints) for each of the co-eluted
components. The second strategy is to arrange the data sets in a single augmented data matrix (data
augmentation of several samples or runs). Data can be augmented in three ways (column-wise, row-wise, or
column- and row-wise augmented data matrices). When the data matrix is arranged as a row-wise or a
column-wise augmented matrix, the algorithm is started with spectral or concentration profiles as the
initial estimate, respectively [42]. GC-MS data, including those of essential oils, are naturally
three-way (samples × retention time × m/z) and are therefore candidates for analysis by chemometric
resolution methods.
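As an illustration of the second strategy, the sketch below builds a column-wise augmented matrix by stacking the (retention time × m/z) matrices of several samples on top of each other, so that all samples share one set of spectra while each keeps its own elution profiles, and a row-wise augmented matrix by placing matrices side by side. The orientation follows the usual MCR convention and is an assumption here, as are the function names and the toy data.

```python
def augment_column_wise(matrices):
    """Stack matrices vertically; all must have the same number of columns."""
    n_cols = len(matrices[0][0])
    if any(len(row) != n_cols for m in matrices for row in m):
        raise ValueError("all matrices need the same number of m/z channels")
    return [list(row) for m in matrices for row in m]

def augment_row_wise(matrices):
    """Place matrices side by side; all must have the same number of rows."""
    n_rows = len(matrices[0])
    if any(len(m) != n_rows for m in matrices):
        raise ValueError("all matrices need the same number of rows")
    return [[x for m in matrices for x in m[i]] for i in range(n_rows)]

# Two hypothetical samples: 3 and 2 retention times, 2 m/z channels each.
D1 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
D2 = [[7.0, 8.0], [9.0, 10.0]]
D_col = augment_column_wise([D1, D2])    # 5 rows x 2 columns
D_row = augment_row_wise([D2, D2])       # 2 rows x 4 columns
```

In the column-wise arrangement the elution profiles of each sample occupy their own block of rows, so differences in peak position or shape between samples do not have to be described by a single common profile.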
The positions and shapes of the elution profiles of the two co-eluted components in peak clusters A and B
and of the three co-eluted components in peak cluster C differ among the six samples. Hence, they cannot
be described by only two elution profiles for peak clusters A and B and only three elution profiles for
peak cluster C. In other words, each component in each sample needs a specific concentration profile to
describe its elution. In this study, because several samples were used, the experimental data matrices
show some deviation from trilinearity. MCR-ALS belongs to the methods that can work even with some
deviation from trilinearity in the data set: it works with the unfolded three-way data array, which breaks
the trilinear structure of the data set [63].
The results from the augmentation of these six samples confirmed that peak cluster A includes two
components, D-limonene and 1,8-Cineol, with RMF values of 880 and 945, respectively. Peak cluster B
includes two components, Linalool and Hotrienol, with RMF values of 925 and 931, respectively. Peak
cluster C includes three components, α-Terpineol, Estragole and Myrtenol, with RMF values of 945, 932 and
905, respectively. By augmenting the data sets from only six samples, the rotational ambiguity decreases
to zero. This demonstrates the importance of using augmented data to obtain optimal solutions in MCR
analysis. Thus, by applying the augmentation strategy, the true concentration and spectral profiles of the
main components were obtained. Therefore, data set augmentation is one of the best approaches for reducing
or removing rotational ambiguity.
5. Conclusion
Gas chromatography-mass spectrometry is a common hyphenated chromatographic technique for identifying the
volatile constituents of essential oils, but baseline drift, noise and co-elution make its performance
unreliable. Chemometric resolution methods were applied to resolve overlapping peaks into pure profiles.
In this work, the results of common resolution techniques such as MF-ICA, MCR-ALS and MCR-FMIN were
surveyed. Self-modeling curve resolution techniques are free of hard models, in the sense that no a priori
information on the chemical system is needed for the construction of the profiles. The results confirm
that, when non-negativity and normalization are applied, the results of these three methods are almost the
same. In general, because of the so-called rotational ambiguity, a range of feasible solutions exists. The
rotational ambiguity of the profiles is a challenging fact that complicates the development of stable and
universal SMCR algorithms. Computing the range of feasible solutions is an advantage in the analysis of
chemical systems. The results confirm that chemometric resolution techniques with non-negativity and
normalization imposed as constraints have a large extent of rotational ambiguity that is approximately the
same for all methods. The results also show that using additional information on the chemical system can
reduce the rotational ambiguity of the solutions of chemometric resolution techniques. From the obtained
results, it is deduced that data augmentation considerably decreases the extent of rotational ambiguity in
multivariate resolution methods. Therefore, using augmented data with the MCR-ALS technique and applying
appropriate constraints is the best approach for obtaining resolved profiles that are most similar to the
true profiles.
6. Acknowledgements
The authors are grateful to the University of Kashan for supporting this work through Grant No. 256727/1.
They also thank Dr. Hadi Parastar of the Department of Chemistry, Sharif University of Technology, for
helpful discussions, and Barij Essence Pharmaceutical Company (Kashan, Iran) for help in preparing the
samples.
References
[1] M. Jalali-Heravi, B. Zekavat, H. Sereshti, Use of gas chromatography–mass spectrometry combined with resolution methods to characterize the essential oil components of Iranian cumin and caraway, J. Chromatogr. A 1143 (2007) 215-226.
[2] C. Brunelli, C. Bicchi, A. Di Stilo, A. Salomone, M. Vincenti, High-speed gas chromatography in doping control: Fast-GC and fast-GC/MS determination of β-adrenoceptor ligands and diuretics, J. Sep. Sci. 29 (2006) 2765-2771.
[3] J.M. Amigo, M.J. Popielarz, R.M. Callejón, M.L. Morales, A.M. Troncoso, M.A. Petersen, T.B. Toldam-Andersen, Comprehensive analysis of chromatographic data by using PARAFAC2 and principal components analysis, J. Chromatogr. A 1217 (2010) 4422-4429.
[4] X. Shao, G. Wang, S. Wang, Q. Su, Extraction of mass spectra and chromatographic profiles from overlapping GC/MS signal with background, Anal. Chem. 76 (2004) 5143-5148.
[5] W. Windig, J.M. Phalp, A.W. Payne, A noise and background reduction method for component detection in liquid chromatography/mass spectrometry, Anal. Chem. 68 (1996) 3602-3606.
[6] C.-J. Xu, Y.-Z. Liang, F.-T. Chau, Identification of essential components of Houttuynia cordata by gas chromatography/mass spectrometry and the integrated chemometric approach, Talanta 68 (2005) 108-115.
[7] L.W. Hantao, H.G. Aleme, M.P. Pedroso, G.P. Sabin, R.J. Poppi, F. Augusto, Multivariate curve resolution combined with gas chromatography to enhance analytical separation in complex samples: A review, Anal. Chim. Acta 731 (2012) 11-23.
[8] M. Jalali-Heravi, H. Parastar, Recent trends in application of multivariate curve resolution approaches for improving gas chromatography–mass spectrometry analysis of essential oils, Talanta 85 (2011) 835-849.
[9] J.M. Amigo, T. Skov, R. Bro, ChroMATHography: solving chromatographic issues with mathematical models and intuitive graphics, Chem. Rev. 110 (2010) 4582-4605.
[10] G. Lubec, L. Afjehi-Sadat, Limitations and pitfalls in protein identification by mass spectrometry, Chem. Rev. 107 (2007) 3568-3584.
[11] A. de Juan, R.A. Tauler, Multivariate curve resolution (MCR) from 2000: progress in concepts and applications, Crit. Rev. Anal. Chem. 36 (2006) 163-176.
[12] M. Garrido, F. Rius, M. Larrechi, Multivariate curve resolution–alternating least squares (MCR-ALS) applied to spectroscopic data from monitoring chemical reactions processes, Anal. Bioanal. Chem. 390 (2008) 2059-2066.
[13] J.-H. Jiang, Y. Liang, Y. Ozaki, Principles and methodologies in self-modeling curve resolution, Chemom. Intell. Lab. Syst. 71 (2004) 1-12.
[14] S. Masoum, H. Ghasemi-Estarki, H. Seifi, E.H. Ebrahimabadi, H. Parastar, Analysis of the volatile chemical constituents in Mindium laevigatum by Gas chromatography–Mass spectrometry and correlative chemometric resolution methods, Microchem. J. (2012) 276-281.
[15] H. Seifi, S. Masoum, S. Seifi, E.H. Ebrahimabadi, Chemometric Resolution Approaches in Characterisation of Volatile Constituents in Plantago ovata Seeds using Gas Chromatography–
Mass Spectrometry: Methodology and Performance Assessment, Phytochem. Anal. 25 (2014) 273-281.
[16] A. Hyvärinen, E. Oja, Independent component analysis: algorithms and applications, Neural Networks 13 (2000) 411-430.
[17] J. Jaumot, R. Gargallo, A. de Juan, R. Tauler, A graphical user-friendly interface for MCR-ALS: a new tool for multivariate curve resolution in MATLAB, Chemom. Intell. Lab. Syst. 76 (2005) 101-110.
[18] R. Tauler, Application of non-linear optimization methods to the estimation of multivariate curve resolution solutions and of their feasible band boundaries in the investigation of two chemical and environmental simulated data sets, Anal. Chim. Acta 595 (2007) 289-298.
[19] H. Li, J. Hou, K. Wang, F. Zhang, Resolution of multicomponent overlapped peaks: A comparison of several curve resolution methods, Talanta 70 (2006) 336-343.
[20] S. Masoum, H. Seifi, E.H. Ebrahimabadi, Characterization of volatile components in Calligonum comosum by coupling gas chromatography-mass spectrometry and mean field approach independent component analysis, Anal. Methods 5 (2013) 4639-4647.
[21] M. Jalali-Heravi, B. Zekavat, H. Sereshti, Characterization of essential oil components of Iranian geranium oil using gas chromatography–mass spectrometry combined with chemometric resolution techniques, J. Chromatogr. A 1114 (2006) 154-163.
[22] G. Wang, W. Cai, X. Shao, A primary study on resolution of overlapping GC-MS signal using mean-field approach independent component analysis, Chemom. Intell. Lab. Syst. 82 (2006) 137-144.
[23] M. Vosough, Using mean field approach independent component analysis to fatty acid characterization with overlapped GC–MS signals, Anal. Chim. Acta 598 (2007) 219-226.
[24] M. Jalali-Heravi, H. Parastar, H. Ebrahimi-Najafabadi, Self-modeling curve resolution techniques applied to comparative analysis of volatile components of Iranian saffron from different regions, Anal. Chim. Acta 662 (2010) 143-154.
[25] M. Jalali-Heravi, H. Parastar, Assessment of the co-elution problem in gas chromatography-mass spectrometry using non-linear optimization techniques, Chemom. Intell. Lab. Syst. 101 (2010) 1-13.
[26] A. Cichocki, R. Zdunek, Advances in Neural Networks, Springer, 2007.
[27] S. Navea, R. Tauler, A. de Juan, Monitoring and modeling of protein processes using mass spectrometry, circular dichroism, and multivariate curve resolution methods, Anal. Chem. 78 (2006) 4768-4778.
[28] A. de Juan, M. Maeder, T. Hancewicz, R. Tauler, Use of local rank-based spatial information for resolution of spectroscopic images, J. Chemometrics 22 (2008) 291-298.
[29] T. Azzouz, R. Tauler, Application of multivariate curve resolution alternating least squares (MCR-ALS) to the quantitative analysis of pharmaceutical and agricultural samples, Talanta 74 (2008) 1201-1210.
[30] J. Jaumot, R. Tauler, MCR-BANDS: A user friendly MATLAB program for the evaluation of rotation ambiguities in Multivariate Curve Resolution, Chemom. Intell. Lab. Syst. 103 (2010) 96-107.
[31] M.H. Van Benthem, M.R. Keenan, D.M. Haaland, Application of equality constraints on variables during alternating least squares procedures, J. Chemometrics 16 (2002) 613-622.
[32] A. de Juan, M. Maeder, M. Martínez, R. Tauler, Combining hard- and soft-modelling to solve kinetic problems, Chemom. Intell. Lab. Syst. 54 (2000) 123-141.
[33] M. Sawall, C. Fischer, D. Heller, K. Neymeyr, Reduction of the rotational ambiguity of curve resolution techniques under partial knowledge of the factors. Complementarity and coupling theorems, J. Chemometrics 26 (2012) 526-537.
[34] H. Abdollahi, R. Tauler, Uniqueness and rotation ambiguities in Multivariate Curve Resolution methods, Chemom. Intell. Lab. Syst. 108 (2011) 100-111.
[35] W.H. Lawton, E.A. Sylvestre, Self modeling curve resolution, Technometrics 13 (1971) 617-633.
[36] R. Rajkó, Additional knowledge for determining and interpreting feasible band boundaries in self-modeling/multivariate curve resolution of two-component systems, Anal. Chim. Acta 661 (2010) 129-132.
[37] R. Rajkó, Comments on Near-Infrared Hyperspectral Unmixing Based on a Minimum Volume Criterion for Fast and Accurate Chemometric Characterization of Counterfeit Tablets, Anal. Chem. 82 (2010) 8750-8752.
[38] O.S. Borgen, N. Davidsen, Z. Mingyang, Ø. Øyen, The multivariate N-component resolution problem with minimum assumptions, Microchim. Acta 89 (1986) 63-73.
[39] O.S. Borgen, B.R. Kowalski, An extension of the multivariate component-resolution method to three components, Anal. Chim. Acta 174 (1985) 1-26.
[40] R. Rajkó, Some surprising properties of multivariate curve resolution-alternating least squares (MCR-ALS) algorithms, J. Chemometrics 23 (2009) 172-178.
[41] R. Rajkó, Studies on the adaptability of different Borgen norms applied in self-modeling curve resolution (SMCR) method, J. Chemometrics 23 (2009) 265-274.
[42] A. Golshan, H. Abdollahi, M. Maeder, Resolution of rotational ambiguity for three-component systems, Anal. Chem. 83 (2011) 836-841.
[43] S. Burt, Essential oils: their antibacterial properties and potential applications in foods - a review, Int. J. Food Microbiol. 94 (2004) 223-253.
[44] L.S. Nerio, J. Olivero-Verbel, E. Stashenko, Repellent activity of essential oils: a review, Bioresour. Technol. 101 (2010) 372-378.
[45] E. Guenther, The production of essential oils, The essential oils 1 (1972) 87-226.
[46] D.J. Charles, J.E. Simon, Comparison of extraction methods for the rapid determination of essential oil content and composition of basil, J. Am. Soc. Hortic. Sci. 115 (1990) 458-462.
[47] M. Godefroot, P. Sandra, M. Verzele, New method for quantitative essential oil analysis, J. Chromatogr. A 203 (1981) 325-335.
[48] S. A. Maisonneuve, European Pharmacopoeia, Sainte-Ruffine, France 3 (1975) 68-80.
[49] H. Parastar, M. Jalali-Heravi, R. Tauler, Is independent component analysis appropriate for multivariate resolution in analytical chemistry?, TrAC, Trends Anal. Chem. 31 (2012) 134-143.
[50] H. Abdollahi, M. Maeder, R. Tauler, Calculation and meaning of feasible band boundaries in multivariate curve resolution of a two-component system, Anal. Chem. 81 (2009) 2115-2122.
[51] R. Rajkó, Computation of the range (band boundaries) of feasible solutions and measure of the rotational ambiguity in self-modeling/multivariate curve resolution, Anal. Chim. Acta 645 (2009) 18-24.
[52] R. Tauler, Multivariate curve resolution applied to second order data, Chemom. Intell. Lab. Syst. 30 (1995) 133-146.
[53] H. Shen, L. Stordrange, R. Manne, O.M. Kvalheim, Y. Liang, The morphological score and its application to chemical rank determination, Chemom. Intell. Lab. Syst. 51 (2000) 37-47.
[54] A. Savitzky, M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures, Anal. Chem. 36 (1964) 1627-1639.
[55] P.H.C. Eilers, Parametric time warping, Anal. Chem. 76 (2004) 404-411.
[56] Y.Z. Liang, O.M. Kvalheim, A. Rahmani, R.G. Brereton, A two-way procedure for background correction of chromatographic/spectroscopic data by congruence analysis and least-squares fit of the zero-component regions: comparison with double-centering, Chemom. Intell. Lab. Syst. 18 (1993) 265-279.
[57] R. Tauler, Calculation of maximum and minimum band boundaries of feasible solutions for species profiles obtained by multivariate curve resolution, J. Chemometrics 15 (2001) 627-646.
[58] G. Ahmadi, H. Abdollahi, A systematic study on the accuracy of chemical quantitative analysis using soft modeling methods, Chemom. Intell. Lab. Syst. 120 (2012) 59-70.
[59] W. Windig, C. Heckler, F. Agblevor, R. Evans, Self-modeling mixture analysis of categorized pyrolysis mass spectral data with the SIMPLISMA approach, Chemom. Intell. Lab. Syst. 14 (1992) 195-207.
[60] R.P. Adams, Identification of essential oil components by gas chromatography/mass spectrometry, R.P. Adams (Ed.), Allured Publishing Corporation, 2007.
[61] R. Tauler, A. Smilde, B. Kowalski, Selectivity, local rank, three-way data analysis and ambiguity in multivariate curve resolution, J. Chemometrics 9 (1995) 31-58.
[62] M. Maeder, A.D. Zuberbuehler, The resolution of overlapping chromatographic peaks by evolving factor analysis, Anal. Chim. Acta 181 (1986) 287-291.
[63] A. Olivieri, G. Escandar, A. Peña, Second-order and higher-order multivariate calibration methods applied to non-multilinear data using different algorithms, TrAC, Trends Anal. Chem. 30 (2011) 607-617.
Table 1. Effects of different initial estimation methods to launch MCR-ALS algorithm in resolving of peak cluster A.

Initial estimate  Parameter  EF(a)   WD(b)   ED(c)   KD(d)   WF(e)   KF(f)
SIMPLISMA         RMF1       938     923     974     885     954     911
                  RMF2       782     799     844     801     -       759
                  LOF        8.79    7.27    5.44    9.56    10.55   12.45
                  R2         99.23   99.47   99.70   99.09   99.89   98.45
OPA               RMF1       938     906     974     885     954     911
                  RMF2       799     798     844     801     -       759
                  LOF        8.79    7.27    5.44    9.56    10.55   12.45
                  R2         99.23   99.47   99.70   99.09   99.89   98.45
EFA               RMF1       938     902     974     885     954     911
                  RMF2       782     799     844     801     -       759
                  LOF        8.79    7.27    5.44    9.56    10.55   12.45
                  R2         99.23   99.47   99.70   99.09   99.89   98.45

(a) Fresh sample cultivated in Noorabad; (b) Dried sample collected from hillsides of Noorabad; (c) Dried sample cultivated in Noorabad; (d) Dried sample collected from hillsides of Khoramabad; (e) Fresh sample collected from hillsides of Noorabad; (f) Fresh sample collected from hillsides of Khoramabad.
The database lists the name and formula of the most likely compound in accordance with the NIST search that matches an unknown compound and how similar the two compounds are in terms of their peaks (match/reverse match factor). Reverse match factor (RMF) is a peak comparison between the unknown and the known spectra. As a general guide, 900 or greater is an excellent match; 800–900, a good match; 700–800, a fair match. Less than 600 is a very poor match. RMF1 and RMF2 show the reverse match factor (RMF) of 1,8-Cineol and D-limonene, respectively.
LOF and R2 values are calculated according to Equations (4) and (5), respectively.
Table 2. RMF of resolution chemometric methods for peak cluster A.

Method     Parameter  EF(a)   WD(b)   ED(c)   KD(d)   WF(e)   KF(f)
MCR-ALS    RMF1       938     923     974     885     954     911
           RMF2       782     799     844     801     -       759
MF-ICA     RMF1       946     940     975     878     954     911
           RMF2       800     752     892     812     843     800
MCR-FMIN   RMF1       943     941     930     816     948     850
           RMF2       800     741     909     816     -       804
Table 3. Effect of different initial estimation methods with their relative component contribution (RCC) function values on rotational ambiguity of MCR-ALS in resolving of peak cluster A.

Sample           EF            WD            ED            KD            WF            KF
Component        1      2      1      2      1      2      1      2      1      2      1      2
fmin             0.754  0.397  0.811  0.332  0.203  0.851  0.834  0.339  0.869  0.347  0.724  0.349
finic-SIMPLISMA  0.825  0.423  0.864  0.366  0.203  0.897  0.885  0.464  0.869  0.382  0.724  0.533
finic-OPA        0.825  0.423  0.864  0.366  0.203  0.897  0.885  0.339  0.869  0.382  0.724  0.533
finic-EFA        0.842  0.427  0.871  0.351  0.203  0.897  0.884  0.341  0.879  0.367  0.807  0.410
fmax             0.847  0.493  0.888  0.424  0.281  0.897  0.885  0.502  0.896  0.382  0.867  0.533
fmax-fmin        0.093  0.096  0.077  0.092  0.078  0.046  0.051  0.163  0.027  0.035  0.143  0.184

Values of the relative component contribution (RCC) function (fp), for component p obtained using different techniques under non-negativity constraints. fp is defined as $f_p = \dfrac{\lVert C_p S_p^{T} \rVert}{\lVert C S^{T} \rVert}$, where $\lVert C_p S_p^{T} \rVert$ is the norm of the contribution of component p, calculated as the norm of the product of its concentration profile $C_p$ by its spectrum profile $S_p^{T}$. $\lVert C S^{T} \rVert$ provides the global contribution of all resolved components, calculated as the norm of the product of the concentration matrix C by the spectra matrix $S^{T}$. fmax – fmin corresponds to the difference between fmax and fmin values.
Table 4. The results of MCR-BANDS with their relative component contribution (RCC) function values in resolving of peak cluster A with multivariate resolution techniques.

Sample           EF            WD            ED            KD            WF            KF
Component        1      2      1      2      1      2      1      2      1      2      1      2
fmin             0.754  0.397  0.811  0.332  0.203  0.851  0.834  0.339  0.869  0.347  0.724  0.349
finic-MCR-ALS    0.825  0.423  0.864  0.366  0.203  0.897  0.885  0.464  0.869  0.382  0.724  0.533
finic-MF-ICA     0.815  0.493  0.811  0.424  0.203  0.897  0.836  0.479  0.877  0.377  0.792  0.486
finic-MCR-FMIN   0.830  0.441  0.853  0.382  0.203  0.897  0.847  0.472  0.869  0.347  0.724  0.533
fmax             0.847  0.493  0.888  0.424  0.281  0.897  0.885  0.502  0.896  0.382  0.867  0.533
Table 5. Effects of different constraints on rotational ambiguity for peak cluster A of EF sample.

Constraints   fmax-fmin (a)
              Component 1   Component 2
1, 2          0.093         0.096
1, 2, 3       0.051         0.063
1, 2, 4       0.014         0.012

Constraints: 1 normalization; 2 non-negativity; 3 unimodality; 4 selectivity/local rank.
(a) fmax – fmin corresponds to the difference between fmax and fmin values for MCR-ALS and MCR-FMIN solutions.
Figure Captions
Figure 1. Total ion chromatogram of all samples.
Figure 2. Morphological score plot for peak cluster A.
Figure 3. Results of resolution for peak cluster A of EF sample by different multivariate resolution methods.
Figure 4. Resolved mass spectra and their corresponding standard mass spectra for peak cluster A of EF sample. Resolved (a) and standard (c) mass spectra of 1,8-Cineol; resolved (b) and standard (d) mass spectra of D-limonene.
Highlights:
Initial estimates have no effect on extraction of information from overlapped signals.
MCR-ALS results are favorably compared to results obtained by MF-ICA and MCR-FMIN.
Physical constraints were implemented to reduce the rotational ambiguities.
Using augmented data is the best way to overcome rotational ambiguity.
Figure 1
Figure 2
Figure 3
Figure 4