Decoding complex multicomponent chromatograms by fourier analysis

CHROMATOGRAPHY

NOVEMBER 1, 2001 / ANALYTICAL CHEMISTRY 619 A

Natural samples often contain hundreds of components. For example, crude oil products can contain thousands or tens of thousands of components. The complete chromatographic resolution of such complex samples requires tens of millions of theoretical plates, considering that peak capacity is roughly proportional to the square root of the number of theoretical plates. Because peaks do not elute equidistantly, analyzing a complex sample creates many coeluting components even with an extremely high-efficiency separation system. The severe peak overlap often observed in such multicomponent separations arises mostly because of the random distribution of retention times and the limited peak capacity of the separation system.

Routine one-dimensional (1-D) chromatographic methods cannot handle a complete qualitative and quantitative analysis of complex mixtures. For example, a separation on a 450-m-long open tubular GC column with more than 1.3 million effective plates identified ~970 compounds in a gasoline standard, yet that separation still had many unresolved peaks (1).

To better understand the complexity of multicomponent separations, serious efforts have been made in the past 20 years to describe the retention pattern of complex chromatograms. In this report, we summarize the most important characteristics of multicomponent chromatograms and show how much information is hidden, usually neglected and buried, behind the forest of peaks. Different approaches are presented for decoding complex chromatograms, showing how analytical information can be extracted by going beyond conventional data such as retention time and peak area.

We also describe an approach for decoding complex chromatograms that differs from the standard deconvolution methods. In a deconvolution process, a short section of a chromatogram, usually one cluster of several overlapping peaks, is investigated, and the profiles of the individual single-component peaks are estimated with an algorithm. In a statistical analysis, by contrast, no specific information on a particular component is obtained, and the presence or absence of a compound cannot be determined, nor can its concentration be estimated. Instead, the total chromatogram is regarded as a statistical ensemble whose common attributes, such as peak width, peak shape, extent of separation, number of detectable components, saturation of the separation space, and order/disorder of the peaks, are estimated.

New statistical approaches can extract more information from complex chromatograms with overlapping peaks.

Attila Felinger, University of Veszprém (Hungary)
Maria Chiara Pietrogrande, University of Ferrara (Italy)

A statistical model of overlap

Although chromatography is a deterministic process (that is, repeated injections of the same sample will lead to practically identical chromatograms), the concept of randomness provides a very powerful way to describe the extent of peak overlap. For example, when a sample is composed of many compounds, the interval between adjacent peaks varies; peak clusters and void spaces can also be observed in the chromatogram. The retention patterns of complex mixtures can be remarkably different. This is because the distribution of the standard free energy differences between the stationary and mobile phases defines a pseudorandom retention time distribution (2). Accordingly, a complex chromatogram looks like a random series of peaks.

When there is no chemical similarity between the sample components because they come from numerous chemical families, the chromatogram is considered disordered. In this case, it is assumed that single-component peaks can be found with a constant probability per unit time, λ, at any point in the chromatogram. This assumption leads to the Poisson distribution of the number of single-component peaks. The probability density function of the Poisson distribution is given by

$$P(m) = \frac{(\lambda x)^m}{m!}\,e^{-\lambda x} \tag{1}$$

in which P(m) expresses the probability that there are exactly m single-component peaks within the length x. (A glossary of terms is found on p 621 A.) Due to the fundamental properties of the Poisson distribution, both the mean value and the variance of the number of components are λx.

For an ordered chromatogram, the probability of finding a peak at a given location is not constant. For instance, if compounds of a homologous series are separated, peaks are found at regular intervals, and the retention time increments can be forecast.

On the other hand, for a disordered multicomponent chromatogram, the distribution of the intervals (that is, the retention time increments) can be derived by applying Poisson statistics. In this case, the retention time increments are given by the exponential distribution. The distribution of the retention times themselves is uniform, provided that peak density is constant along the chromatogram.

Davis et al. proposed and tested a three-part model in which the enthalpy change followed a Poisson distribution. The average entropy change depends on the enthalpy change, and the actual entropy change is uniformly distributed about the average entropy change (3). This model also confirms that the retention pattern of complex chromatograms is controlled by Poisson statistics.

To characterize the complexity of a multicomponent chromatogram, Giddings introduced the concept of dimensionality (4). The chromatogram of a series of homologues is quite ordered and only slightly complex. The dimensionality of such a sample is low; increasing the dimension of the separation space would not really improve the extent of separation. The higher the complexity of the mixture, the higher the sample dimensionality. In many instances, the separation pattern is less disordered when the dimension of the separation space increases. For really complex mixtures, the repetitive retention patterns disappear and the retention times become irregular.

FIGURE 1. The superposition of ordered chromatograms leads to a disordered complex chromatogram with serious peak overlap. On the right-hand side of the figures, the density functions of the change in retention times, ΔtR, are plotted. (a)–(d) Ordered chromatograms with normally distributed ΔtR (relative standard deviation = 0.2) and average intervals Ta, Tb, Tc, and Td. (e) Pooled chromatogram with average interval T. The colors of the lines below the peaks in (e) identify their origin. Horizontal axis: time.

Klein and Tyler first applied the concept of Poisson statistics to complex chromatograms and determined the probabilities of finding several peaks in a given interval of a chromatogram (5).
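The Poisson probabilities of Equation 1 are easy to tabulate. The following is a minimal sketch, not the calculation of ref 5; the peak density of 0.05 peaks per second and the 60-s window are hypothetical values chosen only for illustration.

```python
import math

def poisson_pmf(m: int, lam: float, x: float) -> float:
    """Equation 1: probability of exactly m single-component peaks in a window of length x."""
    mean = lam * x
    return mean ** m / math.factorial(m) * math.exp(-mean)

# Hypothetical values: 0.05 peaks per second observed over a 60-s window (lam * x = 3)
lam, x = 0.05, 60.0
probs = [poisson_pmf(m, lam, x) for m in range(60)]

# Mean and variance of the number of peaks; both should equal lam * x
mean = sum(m * p for m, p in enumerate(probs))
var = sum((m - mean) ** 2 * p for m, p in enumerate(probs))

for m in range(6):
    print(f"P({m}) = {probs[m]:.4f}")
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

The equality of mean and variance is the property stated in the text for the Poisson distribution.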

When a mixture containing a homologous series is separated, the multicomponent chromatogram, of course, is not disordered, but mixing a few homologous series can result in a pseudodisordered chromatogram. Figure 1 illustrates that the superposition of four ordered chromatograms yields a disordered chromatogram. Unlike the elementary, ordered chromatograms, in which the average intervals between adjacent peaks are equal (Ta = Tb = Tc = Td), the complex chromatogram shows incomplete separation, as evidenced by the presence of doublets and triplets. In the pooled chromatogram (Figure 1e), the average interval between adjacent peaks is one-fourth of that in the elementary chromatograms, and the probability density function of the intervals has become exponential; that is, the superposition process results in a Poisson chromatogram. Complex multicomponent chromatograms are, therefore, generally disordered, either because the compounds are dissimilar or because the number of homologous series is high. In either case, the severity of peak overlap can be estimated by Poisson statistics (6).
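The superposition argument can be checked with a small simulation. This is only a sketch under stated assumptions: four series with the same mean interval T = 100 (arbitrary units), normally distributed increments with a relative standard deviation of 0.2 as in Figure 1, 500 peaks per series, and a fixed random seed. With only four series, the pooled intervals approach, but do not exactly reach, the exponential limit (coefficient of variation of 1).

```python
import random
import statistics

def ordered_series(mean_interval, rsd, n_peaks, rng):
    """Retention times of one ordered series with normally distributed increments."""
    t = rng.uniform(0.0, mean_interval)  # random start so the series are not in phase
    times = []
    for _ in range(n_peaks):
        times.append(t)
        t += max(0.0, rng.gauss(mean_interval, rsd * mean_interval))
    return times

def intervals(times):
    """Differences between consecutive (sorted) retention times."""
    return [b - a for a, b in zip(times, times[1:])]

def cv(values):
    """Coefficient of variation: 1 for an exponential distribution."""
    return statistics.pstdev(values) / statistics.fmean(values)

rng = random.Random(1)
T = 100.0
series = [ordered_series(T, 0.2, 500, rng) for _ in range(4)]

# Pool only the time range covered by all four series
t_end = min(s[-1] for s in series)
pooled = sorted(t for s in series for t in s if t <= t_end)

single_iv = intervals(series[0])
pooled_iv = intervals(pooled)
print("single series: mean interval", round(statistics.fmean(single_iv), 1),
      "CV", round(cv(single_iv), 2))
print("pooled:        mean interval", round(statistics.fmean(pooled_iv), 1),
      "CV", round(cv(pooled_iv), 2))
```

The pooled mean interval comes out close to T/4, and the spread of the pooled intervals is far larger than in any single series, which is the qualitative behavior shown in Figure 1e.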

This statistical model of peak overlap was originally based on the assumption that multicomponent chromatograms are completely disordered (2). This restriction has since been lifted, and the extent of peak overlap can now be estimated for more-or-less ordered, complex chromatograms (7).

Very simple equations can be derived to determine the number of stand-alone peaks (singlets), doublets, triplets, etc., in a chromatogram or even the total number of observable peaks. The average number, pn, of n-tets (fused peaks that are composed of n single-component peaks) is

$$p_n = m\,e^{-2\alpha}\left(1 - e^{-\alpha}\right)^{n-1} \tag{2}$$

$$\alpha = m/n_c \tag{3}$$

$$n_c = X/x_0 \tag{4}$$

in which m is the total number of detectable components, α is the saturation level of the chromatogram, nc is the peak capacity, x0 is the distance needed for the resolution of two adjacent peaks, and X is the total length of the chromatogram. Thus, the total number of peaks (either stand-alone or the fused cluster of several overlapping single components) that can be counted in the chromatogram is

$$p = \sum_{n=1}^{\infty} p_n \tag{5}$$

The critical distance needed for resolution is determined by the peak width and the critical resolution by

$$x_0 = 4\sigma R_s \tag{6}$$

$$R_s = \Delta t_R / w; \quad w = 4\sigma \tag{7}$$

in which Rs is the peak resolution, ΔtR is the change in retention time, w is the average baseline width of the individual overlapping peaks, and σ is the standard deviation of the Gaussian peak, which we refer to as peak width. The value of the critical resolution needed for valley separation (shoulder limit) is typically taken as Rs = 0.5 when two Gaussian peaks of identical height overlap. In a multicomponent chromatogram, the critical value of Rs will be different for each overlapping peak pair as neighboring peak heights vary. An average value can be determined for the critical Rs value. For the shoulder limit, the minimum Rs value is 0.726, provided that the peak height distribution is exponential (8).

FIGURE 2. Using statistics to study complex chromatograms. (a) Plot of the ratio of the number of singlets, doublets, triplets, and peaks to nc (pn/nc) against α. (b) Plot of the probability that all sample components will be separated (p = m) against nc, for m = 10, 20, 30, 40, and 50.

Glossary of terms

aa: Average value of peak areas
AT: Total area of a chromatogram
c(t): Autocovariance function of a signal
dh,c: Half width of c(t) at half height
m: Number of observable components
nc: Peak capacity
p: Total number of peaks
pn: Average number of fused peaks containing n single-component peaks
P(m): Probability density function
Rs: Peak resolution
t: Time
tR: Retention time
T: Average interval between adjacent peaks
u: Auxiliary variable
w: Average baseline width of individual overlapping peaks
x: Length
x0: Distance needed for resolving two adjacent peaks
X: Total length of the chromatogram
y(t): Total chromatogram signal
ȳ: Mean intensity of isolated peaks
ȳobs: Computed mean intensity of observed peaks
α: Saturation level of the chromatogram
λ: Probability per unit time
σ: Gaussian peak standard deviation (peak width)
σa: Standard deviation of peak area
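Equations 2–7 can be combined into a few helper functions. The sketch below is illustrative only: the function names are invented here, and the inputs (σ = 2.4 s, X = 1800 s, m = 240) are hypothetical; only Rs = 0.5 is taken from the text.

```python
import math

def critical_distance(sigma: float, rs: float) -> float:
    """Equation 6: distance needed to resolve two adjacent peaks."""
    return 4.0 * sigma * rs

def peak_capacity(total_length: float, x0: float) -> float:
    """Equation 4."""
    return total_length / x0

def n_tets(m: float, nc: float, n: int) -> float:
    """Equation 2: average number of fused peaks made of n single-component peaks."""
    alpha = m / nc  # Equation 3, saturation level
    return m * math.exp(-2.0 * alpha) * (1.0 - math.exp(-alpha)) ** (n - 1)

def total_peaks(m: float, nc: float) -> float:
    """Closed form of the sum in Equation 5: p = m * exp(-m / nc)."""
    return m * math.exp(-m / nc)

# Hypothetical inputs: sigma in seconds, chromatogram length in seconds
sigma, rs, X, m = 2.4, 0.5, 1800.0, 240
x0 = critical_distance(sigma, rs)
nc = peak_capacity(X, x0)

print("x0 =", x0, "s; nc =", nc, "; alpha =", m / nc)
for n, name in enumerate(("singlets", "doublets", "triplets"), start=1):
    print(name, round(n_tets(m, nc, n), 1))
print("total peaks:", round(total_peaks(m, nc), 1))
```

Summing Equation 2 over all n is a geometric series, which is why the closed form m·exp(−m/nc) can stand in for the infinite sum.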

Davis has extended the statistical model of peak overlap to poorly resolved separations and stated that the average minimum resolution varies with the saturation level of the chromatogram (9, 10). This variation occurs because multicomponent peaks, but not single-component peaks, overlap as separation efficiency decreases. The concept of resolution should be defined in a completely different manner for overlapping multiplets than for overlapping single-component peaks.

Several surprising results can be deduced from these equations (2). The numbers of singlets, doublets, triplets, and peaks are plotted in Figure 2a. The straight white line would correspond to an ideal separation in which there are no fused peaks. The number of singlets very soon deviates from this line when α is increased. The plot of p/nc indicates that its maximum value is 0.37 when m equals nc. This indicates that in a complex multicomponent chromatogram, p is never higher than 37% of nc. The number of stand-alone peaks is still smaller. In a disordered chromatogram, even in the most favorable cases, the number of single-component peaks is no more than about 18% of nc. For 90% of the components to appear as stand-alone peaks, nc should be ~20 times higher than m.
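The three numbers quoted above follow directly from Equations 2, 3, and 5. As a short worked derivation (using the closed form of the sum in Equation 5, p = m e^{-α}):

```latex
% Total peaks relative to peak capacity
\frac{p}{n_c} = \alpha e^{-\alpha}, \qquad
\frac{d}{d\alpha}\left(\alpha e^{-\alpha}\right) = (1-\alpha)e^{-\alpha} = 0
\;\Rightarrow\; \alpha = 1, \quad \frac{p}{n_c} = e^{-1} \approx 0.37

% Singlets relative to peak capacity
\frac{p_1}{n_c} = \alpha e^{-2\alpha}, \qquad
\frac{d}{d\alpha}\left(\alpha e^{-2\alpha}\right) = (1-2\alpha)e^{-2\alpha} = 0
\;\Rightarrow\; \alpha = \tfrac{1}{2}, \quad \frac{p_1}{n_c} = \frac{1}{2e} \approx 0.18

% Peak capacity needed for 90% of the components to elute as singlets
\frac{p_1}{m} = e^{-2\alpha} = 0.9
\;\Rightarrow\; \alpha = -\tfrac{1}{2}\ln 0.9 \approx 0.053, \quad
\frac{n_c}{m} = \frac{1}{\alpha} \approx 19
```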

It is also possible to determine the probability that all sample components will be resolved from a complex sample (11). The results of this calculation are demonstrated in Figure 2b. The plots clearly illustrate that the total resolution of a mixture that contains 30–50 components is already hopeless even with extremely efficient systems.
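The trend in Figure 2b can be reproduced with a simple model. The sketch below is not the derivation of ref 11: it assumes m retention times distributed uniformly and independently over the chromatogram, for which the probability that every adjacent gap exceeds x0 has the classical closed form (1 − (m − 1)/nc)^m. A small Monte Carlo run with a fixed seed checks the closed form.

```python
import random

def prob_all_resolved(m: int, nc: float) -> float:
    """Closed form for m uniform random retention times (an assumption of this sketch)."""
    frac = 1.0 - (m - 1) / nc
    return frac ** m if frac > 0.0 else 0.0

def simulate(m: int, nc: float, trials: int, rng: random.Random) -> float:
    """Fraction of random chromatograms in which every adjacent gap is at least x0."""
    x0 = 1.0 / nc  # chromatogram length normalised to X = 1
    hits = 0
    for _ in range(trials):
        times = sorted(rng.random() for _ in range(m))
        if all(b - a >= x0 for a, b in zip(times, times[1:])):
            hits += 1
    return hits / trials

estimate = simulate(10, 1000, 20000, random.Random(0))
print("m = 10, nc = 1000: simulated", round(estimate, 3),
      "closed form", round(prob_all_resolved(10, 1000), 3))
for m in (10, 20, 30, 40, 50):
    print("m =", m, "nc = 1000:", round(prob_all_resolved(m, 1000), 3))
```

Under this model, even a peak capacity of 1000 leaves less than a 10% chance of resolving all 50 components, in line with the conclusion drawn from Figure 2b.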

Having seen the seriousness of the problem of peak overlap in complex chromatograms, one wonders whether there are tools to estimate how far a given separation is from the complete isolation of each sample component and whether there are methods to extract the hidden analytical information on the composition of the mixture and the separation system's performance.

Estimating m using the number of counted peaks

Davis and Giddings (2) and Herman et al. (12) devised a relatively simple calculation to estimate the number of components in a complex chromatogram. On the basis of Equation 2, we can calculate the total number of peaks that we can count in a chromatogram by

$$p = \sum_{n=1}^{\infty} p_n = m\,e^{-m/n_c} \tag{8}$$

This relationship estimates the number of detectable components in the sample. When chromatograms are recorded at different values of nc, the peaks should be counted in each chromatogram, and m can be determined by fitting Equation 8 to the data obtained by counting peaks at different peak capacities. The simplest way to alter nc, although still slow and laborious, is to change the mobile-phase velocity.

The value of m can also be estimated from a single chromatogram when the peak maxima are counted at different resolutions (13). Several arbitrary x0 intervals of varying lengths can be chosen, and all peaks that fall into one interval are taken as an unresolved cluster of overlapping single-component peaks rather than as separate ones. With this calculation, chromatograms of varying nc values are emulated, and m can be determined by Equation 8.

FIGURE 3. Chromatogram of a diesel fuel sample obtained from a programmed temperature GC analysis. The solid line in the inset shows a Gaussian peak fitted to the last peak detected in the chromatogram. Horizontal axis: time (s), 0–1800 s; the major peaks are spaced at Δt = 230 s.
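The single-chromatogram procedure can be sketched as follows. This is an illustration, not the authors' implementation: the retention times are simulated (1000 uniformly distributed components, fixed seed), components closer than x0 to their predecessor are counted as fused, and Equation 8 is fitted in its linearized form, ln p = ln m − m·x0/X, with a hand-written least-squares helper.

```python
import math
import random

def count_peaks(times, x0):
    """Observable peaks: a component closer than x0 to its predecessor is fused with it."""
    ordered = sorted(times)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > x0)

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Simulated disordered chromatogram (hypothetical data), length normalised to X = 1
rng = random.Random(42)
m_true, X = 1000, 1.0
times = [rng.uniform(0.0, X) for _ in range(m_true)]

# Emulate different peak capacities by varying x0 (saturation 0.1 to 1.0)
x0_values = [k * 1e-4 for k in range(1, 11)]
counts = [count_peaks(times, x0) for x0 in x0_values]

# Linearized Equation 8: slope = -m, intercept = ln m
slope, intercept = fit_line([x0 / X for x0 in x0_values],
                            [math.log(p) for p in counts])
m_from_slope = -slope
m_from_intercept = math.exp(intercept)
print("counted peaks:", counts)
print("m from slope:", round(m_from_slope), " m from intercept:", round(m_from_intercept))
```

The linearization is a convenience of this sketch; a nonlinear fit of Equation 8 with m as the single parameter would serve the same purpose.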

This method gives a correct evaluation when peak density is constant along the time axis. However, in many cases, we observe chromatograms that are crowded at small or intermediate tR values with fewer peaks at longer times. For the evaluation of such inhomogeneous disordered chromatograms, Davis proposed an algorithm that takes into account the probability distribution function of peak density (14). Alternatively, the time axis can be transformed in such a manner that the inhomogeneity in the chromatograms can be removed (6).

Isocratic LC and isothermal GC are not really appropriate methods for the separation of complex mixtures because they offer rather limited peak capacities. The statistical handling of such chromatograms is further complicated because peak width and density are not constant, but rather depend on tR. Nevertheless, the decoding of isocratic complex chromatograms is possible (15).

Estimating m using peak heights or areas

The approach to estimating m just described is based on peak positions along the time axis (retention time distribution). The complete representation of a multicomponent chromatogram also requires the description of the peak area or height distribution. In fact, peak abundance (area or height) contains information on the quantitative composition of the mixture, because it is related to the amount of the compound, at least in the linear range of the detector response.

The relative abundance of a compound plays a fundamental role in defining the overlapping process. When two or more peaks overlap, the area of the resulting cluster peak is the sum of the areas of the single components in the cluster. Therefore, a multicomponent chromatogram can be completely described by combining the distributions of the retention pattern and the relative abundance. As a consequence, information on the mixture complexity can also be extracted by the quantitative interpretation of experimental chromatograms, using the integrator output to report tR and area/height for each peak. It is evident that a correct quantitative analysis requires that each peak be an isolated "pure" peak.

Theoretical and experimental evidence shows that the exponential distribution of peak abundance can be considered as a reference and general model. In fact, that distribution corresponds to a condition of maximum probability. Applying this hypothesis, Dondi et al. derived

$$\ln \bar{y}_{obs} = \ln \bar{y} + m\,\frac{x_0}{X} \tag{9}$$

which relates the mean peak intensity (area or height) computed from the peaks observed in the chromatogram, ȳobs, and the mean intensity corresponding to isolated peaks, ȳ (16).

By selecting different critical values of x0 (related to Rs) and using them for counting peaks and evaluating ȳobs, different pairs of ȳobs and x0/X values are obtained. According to Equation 9, the plot of ln ȳobs against x0/X should be a straight line, whose slope represents a statistical estimation of m. The strict correspondence between Equations 8 and 9 is evident: They represent two independent approaches for estimating m, starting from the experimental chromatogram, by means of the observable number of peaks or the mean value of the observable peak intensity (area).

Fourier analysis of complex chromatograms

Felinger et al. have proposed using Fourier analysis to decode complex multicomponent chromatograms (17, 18). Fourier transformation has been widely used for processing signals of analytical instruments, because several calculations are simpler in the frequency domain than in the time domain. Some models of chromatography also offer simpler solutions in the frequency domain.

FIGURE 4. Estimating m. (a) Plot of the peaks counted (p) for the chromatogram in Figure 3 against x0/X, at different criteria for the minimum distance needed for resolution. The solid green line indicates Equation 8 fitted to the data in the region of fit. (b) Plot of ln ȳobs against x0/X, calculated at different criteria for the minimum distance needed for resolution. The solid blue line indicates Equation 9 fitted to the data in the region of fit.
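The peak-intensity estimate of Equation 9 (the fit shown in Figure 4b) can be sketched in the same spirit. The data are simulated under assumptions made only for this example: 1000 uniformly distributed components with exponentially distributed areas and a fixed seed. Components closer than x0 to their predecessor are fused and their areas added; the slope of ln ȳobs against x0/X then estimates m.

```python
import math
import random

def observed_peak_areas(times, areas, x0):
    """Fuse components closer than x0 to their predecessor; fused areas add up."""
    pairs = sorted(zip(times, areas))
    fused = [pairs[0][1]]
    for (t_prev, _), (t, area) in zip(pairs, pairs[1:]):
        if t - t_prev > x0:
            fused.append(area)   # starts a new observable peak
        else:
            fused[-1] += area    # joins the current cluster
    return fused

def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Simulated data (hypothetical): uniform retention times, exponential peak areas
rng = random.Random(7)
m_true, X = 1000, 1.0
times = [rng.uniform(0.0, X) for _ in range(m_true)]
areas = [rng.expovariate(1.0) for _ in range(m_true)]

x0_values = [k * 1e-4 for k in range(1, 11)]
log_mean_obs = []
for x0 in x0_values:
    fused = observed_peak_areas(times, areas, x0)
    log_mean_obs.append(math.log(sum(fused) / len(fused)))

# Equation 9: slope = m, intercept = ln of the mean intensity of isolated peaks
slope, intercept = fit_line([x0 / X for x0 in x0_values], log_mean_obs)
print("estimated m:", round(slope), " estimated mean isolated intensity:",
      round(math.exp(intercept), 2))
```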


In this new approach, Fourier analysis considers the chromatogram as a finite-length fraction of a stochastic process; that is, the chromatogram of a complex mixture can be handled as a random series of peaks, in which the position, height, and width of the single-component peaks are uncorrelated random variables. The power spectrum of such a multicomponent chromatogram is calculated as either the time or the ensemble average of the "random" chromatogram.

Models have been derived for the power spectrum of various multicomponent chromatograms. Fourier analysis can be applied to both ordered and disordered chromatograms. The varying peak width and the peak height dispersion are taken into account by Fourier analysis theory. By means of the power spectrum or the autocovariance function (ACVF) of the chromatograms, the mean peak width and the retention pattern can also be determined. The power spectrum is the square of the absolute value of the Fourier transformed signal. The ACVF, c(t), of a signal is calculated from a mean-centered signal as

$$c(t) = \frac{1}{X}\int_0^X \left[y(u) - \bar{y}\right]\left[y(u + t) - \bar{y}\right]du \tag{10}$$

in which y(t) is the chromatogram, ȳ is here the mean value of the signal, t is time, and u is an auxiliary variable.

Unlike estimates on the basis of p or ȳobs, in which information is drawn solely from the retention times or the peak heights, Fourier analysis uses the total chromatographic signal. Therefore, the shapes of the single-component peaks are also contained in the model.

According to the Wiener–Khinchin theorem, the power spectrum and ACVF form a "Fourier pair". Thus, the ACVF and the power spectrum are equivalent tools to characterize multicomponent chromatograms. Accordingly, we do not even have to calculate the Fourier transform or the power spectrum of the chromatogram; in many instances, it is sufficient to analyze the ACVF.

The ACVF can be numerically computed from the experimental chromatogram and represents the correlations in the chromatogram. Usually, three informative regions of the ACVF can be distinguished (19). The region that contains the shortest-range correlations only depends on the shape of the single-component peaks. The next part of the ACVF is mostly determined by the distribution of the retention time increments. The deterministic part is significant if the chromatogram is ordered, that is, if there is a constant and repetitive peak pattern. When the peaks are ordered, the ACVF will exhibit a series of peaks shifted by the same delay as major peaks in the chromatogram. In the final region of the ACVF, random fluctuations will occur mostly because a finite-length chromatogram is analyzed.

Expressions were derived to compute the ACVF in terms of the chromatographic parameters, such as m, average peak-shape parameters, and retention pattern. When σ is constant in a disordered complex chromatogram, the ACVF becomes

$$c(t) = \frac{A_T^2\left(\sigma_a^2/a_a^2 + 1\right)}{2\sigma\sqrt{\pi}\,X\,m}\,e^{-t^2/4\sigma^2} \tag{11}$$

in which AT is the total area of the chromatogram, σa is the standard deviation of the peak areas, and aa is their average value. By fitting Equation 11 to the ACVF calculated from the recorded chromatogram, we can evaluate m and σ of single-component peaks. AT and the peak-height distribution (σa/aa) should be calculated directly from the chromatogram prior to estimating the parameters m and σ.

The concentration of the components present in a multicomponent mixture may span a huge range. Experimental data show that the distribution of the single-component peak areas is very close to the exponential distribution (20, 21). The sample size for a chromatographic analysis must be chosen so that the number of detectable components is as large as possible. If the sample size is too small, many components will be lost in the baseline noise. On the other hand, if the sample size is too large, the concentration of some components will be so high that the column will be overloaded. Because of column overload, σ increases, and column efficiency drops. Therefore, the probability of peak overlap will increase.

Fourier analysis is an excellent tool to determine the concentration range or sample size that is optimum for the highest value of m and that can extract the maximum amount of information (22). As sample concentration increases, more and more detectable peaks will emerge out of the baseline noise. The average value of σ is nearly constant in this region. When the sample concentration further increases, the major sample peaks broaden and distort. Therefore, the average σ suddenly rises, and the estimated value of m drops. Thus, the sample size at which the global overload of the column is reached is simply identified by Fourier analysis.

FIGURE 5. The autocovariance function (ACVF) of the chromatogram in Figure 3. The arrows indicate the characteristic distance between the major chromatographic peaks (Δt = 230 s). The inset shows the beginning of the ACVF (dots) and the fitted Equation 11 (solid blue line). The arrows in the inset indicate the measurement of c(0) and dh,c, which is needed for estimating m by Equation 12. Horizontal axis: time (s).
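A discretized version of Equation 10 takes only a few lines. The sketch below is an illustration on simulated data, not the authors' code: it builds a disordered chromatogram from 240 Gaussian peaks of constant σ = 2.4 s with exponentially distributed areas (all hypothetical values, fixed seed), computes the ACVF up to a lag of 20 s, and prints c(0) next to the value predicted by Equation 11. The two agree only statistically, because a single finite chromatogram is analyzed.

```python
import math
import random
import statistics

def gaussian_chromatogram(positions, areas, sigma, n_points, dt):
    """Sum of Gaussian peaks with the given areas, sampled every dt seconds."""
    signal = [0.0] * n_points
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    for t0, area in zip(positions, areas):
        lo = max(0, int((t0 - 6.0 * sigma) / dt))
        hi = min(n_points, int((t0 + 6.0 * sigma) / dt) + 1)
        for i in range(lo, hi):
            signal[i] += area * norm * math.exp(-((i * dt - t0) ** 2) / (2.0 * sigma ** 2))
    return signal

def acvf(signal, max_lag):
    """Discretized Equation 10: mean-center, multiply by the shifted copy, divide by N."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [y - mean for y in signal]
    return [sum(centered[i] * centered[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

# Hypothetical disordered chromatogram
rng = random.Random(3)
m, sigma, X, dt = 240, 2.4, 1800.0, 0.5
n_points = int(X / dt)
positions = [rng.uniform(0.0, X) for _ in range(m)]
areas = [rng.expovariate(1.0) for _ in range(m)]
signal = gaussian_chromatogram(positions, areas, sigma, n_points, dt)
c = acvf(signal, max_lag=40)  # lags 0 to 20 s

# Equation 11 at t = 0, with A_T and sigma_a/a_a taken from the simulated data
total_area = sum(signal) * dt
area_rsd = statistics.pstdev(areas) / statistics.fmean(areas)
c0_model = (total_area ** 2 * (area_rsd ** 2 + 1.0)
            / (2.0 * sigma * math.sqrt(math.pi) * X * m))
print("c(0) from the signal:  ", c[0])
print("c(0) from Equation 11: ", c0_model)
```

For long chromatograms, an FFT-based computation is the usual choice; the direct sum is used here only because it mirrors Equation 10 term by term.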

A simple procedure to estimate m was also developed, on the basis of a graphical inspection of the plot of the ACVF (23). The first part of the ACVF resembles a descending half of a Gaussian peak; its width depends on the average values of σ for all the peaks in the chromatogram. Therefore, the half-height width of the ACVF can simply be related to the mean σ value. Experimental determination of this parameter is difficult because of peak overlap and time-consuming because of the numerous peaks in the chromatogram. However, the chemometric approach allows a direct statistical evaluation of the mean σ value as an average value computed for all the peaks in the chromatogram. The average peak width is fundamental for evaluating the system performance and for detecting effects, such as strong retention sites or column overload, which increase peak width. Furthermore, this procedure can be used to select the best-programmed temperature conditions that lead to constant peak width because the ACVF can also estimate the minimum and the maximum peak widths (24).

Moreover, a simple mathematical expression can be obtained to relate the value of the ACVF at t = 0 to m for the mixture, so this value can simply be estimated

$$m = \frac{A_T^2\left(\sigma_a^2/a_a^2 + 1\right)}{2.129\,c(0)\,d_{h,c}\,X} \tag{12}$$

in which dh,c is the half width of the ACVF, c(t), at half height.

By means of a simple visual analysis of the autocovariance plot, it is possible to recognize repetitive retention patterns. The region of the ACVF for distances greater than 4σ contains information on the retention pattern. If an ordered retention pattern exists in the chromatogram, with peaks appearing at repeated distances as a sequence (homologous series) or at constant distances located in different parts of the chromatogram (functional group increment), some positive peaks are present at the corresponding distance values in the ACVF.

In contrast to the original chromatogram that is crowded with peaks hiding repetitions in the retention pattern under overlapping peaks, the ACVF plot retains only a small number of repeated peaks, which correspond to the most abundant or the most repeated components. The result is that the ACVF plot is much simpler and allows regularities in the retention pattern and specific retention–molecular structure relationships to be identified.

Decoding a complex chromatogram

These procedures are illustrated by investigating a complex chromatogram obtained by a programmed temperature capillary GC analysis of a diesel fuel sample on an SPB-1 polydimethylsiloxane stationary phase (Figure 3). There are 180 peaks in the chromatogram. We analyzed this chromatogram with the independent statistical methods discussed in this feature.

An estimation using counted peaks is illustrated in Figure 4a. The dots represent the peaks counted at different nc values. Equation 8 was fitted in the region indicated in Figure 4. The data points at the very beginning of the figure should be neglected during the fitting because the widths of the peaks hinder information characteristic of high nc. In essence, the region that is neglected during the fitting is that which is responsible for the peak loss due to overlap, because in that interval, peaks are closer to each other than x0. The best-fit parameter estimates that m = 247 in the sample.

When the chromatogram is analyzed on the basis of the peak areas, Equation 9 can be used to estimate m. Figure 4b shows that fitting Equation 9 to the logarithm of ȳobs results in a very similar value to that obtained by Equation 8. Equation 9 estimates m = 244. The agreement between the two independent methods is remarkable. The data points at small values of x0/X are again neglected. In theory, the points in Figure 4b should fall on a straight line, but in reality, the data slightly deviate from linearity. Most probably, the deviation arises because peak areas are not completely random in the sample chromatogram; instead, major peaks are regularly repeated.

The ACVF of the chromatogram is shown in Figure 5. The constant repetition of the peaks in the ACVF demonstrates that in the original chromatogram, a periodic behavior is present with a Δt = 230 s. This corresponds to Δt between the major sample components (Figure 3). The beginning of the ACVF should be used for parameter estimation. The fitted ACVF is plotted in the inset in Figure 5, and the best-fit parameters indicate that m = 242 and σ = 2.38 s. We can consider this latter value as an average peak width because, although temperature was programmed to maintain constant peak width during the elution, slight deviations occurred. When identifying a stand-alone peak in the chromatogram, we can verify the result of the nonlinear fitting. The inset in Figure 3 shows the last peak of the chromatogram, with a Gaussian curve fitted to the peak. The width of that peak is σ = 2.31 s, which is in extremely good agreement with the value estimated from the ACVF. Finally, with a simple measurement of the ACVF at the origin and the half width at half height, Equation 12 gives m = 237.
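The factor 2.129 in Equation 12 follows from Equation 11: the Gaussian term falls to half its height at dh,c = 2σ√(ln 2), so 2σ√π = 2.129 dh,c. The sketch below checks this consistency numerically. The values m = 242 and σ = 2.38 s are the article's best-fit parameters; X = 1800 s, a total area of 1, and σa/aa = 1 (exponential areas) are assumptions made only for this example, and the function names are invented here.

```python
import math

def acvf_model(t, m, sigma, total_area, area_rsd, X):
    """Equation 11: ACVF of a disordered chromatogram with constant peak width."""
    amplitude = (total_area ** 2 * (area_rsd ** 2 + 1.0)
                 / (2.0 * sigma * math.sqrt(math.pi) * X * m))
    return amplitude * math.exp(-t ** 2 / (4.0 * sigma ** 2))

def estimate_m(c0, half_width, total_area, area_rsd, X):
    """Equation 12: m from the ACVF at the origin and its half width at half height."""
    return total_area ** 2 * (area_rsd ** 2 + 1.0) / (2.129 * c0 * half_width * X)

def sigma_from_half_width(half_width):
    """Mean peak standard deviation implied by the half width of the ACVF."""
    return half_width / (2.0 * math.sqrt(math.log(2.0)))

m, sigma = 242, 2.38                          # best-fit values quoted in the article
X, total_area, area_rsd = 1800.0, 1.0, 1.0    # assumed for illustration only

c0 = acvf_model(0.0, m, sigma, total_area, area_rsd, X)
half_width = 2.0 * sigma * math.sqrt(math.log(2.0))   # where Equation 11 equals c(0)/2

m_back = estimate_m(c0, half_width, total_area, area_rsd, X)
print("half width at half height:", round(half_width, 2), "s")
print("m recovered by Equation 12:", round(m_back, 1))
print("sigma recovered:", round(sigma_from_half_width(half_width), 2), "s")
```

In practice, c(0) and dh,c are read from the measured ACVF, so the estimate carries the experimental scatter that this idealized round trip does not show.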

It is obvious that the total number of compounds in the sample is much higher than 240, but the applied analytical conditions (sample size, column length, mobile-phase velocity, etc.) set a limit. Numerous components are buried in the baseline noise and therefore are undetectable.

Perspectives

With these statistical analyses of complex chromatograms, the extent of separation, the complexity or structure of the sample mixture, and the average peak-shape parameters can be estimated. We can also use these analyses to optimize the sample size and the gradient or temperature program. The information obtained this way improves our understanding of the separation system's performance. It is obvious, however, that the information can be improved by completely isolating and identifying all sample components.

The statistical model of peak overlap clearly demonstrates that the resolving power of 1-D separation systems is not enough for the complete resolution of complex mixtures. This justifies the need for orthogonal multicolumn separation systems in which the two dimensions are independent, orthogonal separation axes. The better resolving power of orthogonal multidimensional separations is due to their nc values, which are the product of the nc values of the individual 1-D systems.

An example of such multidimensional separations is comprehensive GC × GC methods, in which the retention on the individual columns is due to completely different molecular properties. With a properly tuned comprehensive GC × GC system, nc = ~18,000. Alternatively, LC/GC procedures, in which LC constitutes a preseparation step using a noncorrelated retention mechanism, also permit a preconcentration of the less-abundant components prior to GC analysis. The extent of spot overlap can be estimated in multidimensional separations, too. Therefore, the decoding of multidimensional chromatograms is similar to that of 1-D separations (25, 26).

The growing need to analyze combinatorial synthesis libraries and the 2-D gel separation of proteins will pose further challenges to multicomponent separations. Decoding such signals and data may yield substantial information even without complete identification.

This work was supported by research grants T025458 and T034353 from the Hungarian Science Foundation and by a grant from the Italian Ministry of the University and Scientific Research.

Attila Felinger is an associate professor at the University of Veszprém (Hungary). His research interests focus on the fundamentals of chromatography, including nonlinear, preparative, and analytical separations, as well as the statistical analysis of measurements by analytical chromatography. Maria Chiara Pietrogrande is an associate professor at the University of Ferrara (Italy). Her research studies concern the analysis of complex mixtures by optimizing separations in gas (GC, GC/MS) and liquid (TLC, HPLC) phases, and the statistical analysis of complex chromatograms. Address correspondence to Felinger, Department of Analytical Chemistry, University of Veszprém, Egyetem utca 10, H–8200 Veszprém, Hungary ([email protected]).

References

(1) Berger, T. A. Chromatographia 1996, 42, 63–71.
(2) Davis, J. M.; Giddings, J. C. Anal. Chem. 1983, 55, 418–424.
(3) Davis, J. M.; Pompe, M.; Samuel, C. Anal. Chem. 2000, 72, 5700–5713.
(4) Giddings, J. C. J. Chromatogr., A 1995, 703, 3–15.
(5) Klein, P. D.; Tyler, S. A. Anal. Chem. 1965, 37, 1280–1281.
(6) Felinger, A. Anal. Chem. 1995, 67, 2078–2087.
(7) Pietrogrande, M. C.; Dondi, F.; Felinger, A.; Davis, J. M. Chemom. Intell. Lab. Syst. 1995, 28, 239–258.
(8) Felinger, A. Anal. Chem. 1997, 69, 2976–2979.
(9) Davis, J. M. Anal. Chem. 1997, 69, 3796–3805.
(10) Davis, J. M. J. Chromatogr., A 1999, 831, 37–49.
(11) Martin, M.; Herman, D. P.; Guiochon, G. Anal. Chem. 1986, 2200–2207.
(12) Herman, D. P.; Gonnord, M.-F.; Guiochon, G. Anal. Chem. 1984, 56, 995–1003.
(13) Davis, J. M.; Giddings, J. C. Anal. Chem. 1985, 57, 2168–2177.
(14) Davis, J. M. Anal. Chem. 1994, 66, 735–746.
(15) Davis, J. M. J. Microcolumn. Sep. 1997, 9, 193–203.
(16) Dondi, F.; Bassi, A.; Cavazzini, A.; Pietrogrande, M. C. Anal. Chem. 1998, 70, 766–773.
(17) Felinger, A.; Pasti, L.; Dondi, F. Anal. Chem. 1990, 62, 1846–1853.
(18) Felinger, A. Data Analysis and Signal Processing in Chromatography; Elsevier: Amsterdam, 1998.
(19) Pietrogrande, M. C.; Pasti, L.; Dondi, F.; Rodriguez, M. H. B.; Diaz, M. A. C. J. High Resolut. Chromatogr. 1994, 17, 2–13.
(20) El Fallah, M. Z.; Martin, M. Chromatographia 1987, 24, 115–122.
(21) Nagels, L. J.; Creten, W. L. Anal. Chem. 1983, 55, 216–220.
(22) Felinger, A.; Vigh, E.; Gelencsér, A. J. Chromatogr., A 1999, 839, 129–139.
(23) Pietrogrande, M. C.; Dondi, F.; Felinger, A. J. High Resolut. Chromatogr. 1996, 19, 327–332.
(24) Felinger, A.; Pasti, L.; Dondi, F. Anal. Chem. 1991, 63, 2627–2633.
(25) Davis, J. M. Anal. Chem. 1991, 63, 2141–2152.
(26) Davis, J. M. Anal. Chem. 1993, 65, 2014–202.