Analytica Chimica Acta 539 (2005) 1–10

Review

Selection of orthogonal chromatographic systems based on parametric and non-parametric statistical tests

Péter Forlay-Frick a, Elke Van Gyseghem a, Károly Héberger b,∗∗, Yvan Vander Heyden a,∗

a Department of Pharmaceutical and Biomedical Analysis, Pharmaceutical Institute, Vrije Universiteit Brussel-VUB, Laarbeeklaan 103, B-1090 Brussels, Belgium
b Chemical Research Center, Hungarian Academy of Sciences, H-1525 Budapest, P.O. Box 17, Hungary

Received 17 December 2004; received in revised form 15 February 2005; accepted 18 February 2005
Available online 31 March 2005

Abstract

The orthogonal/similar character of 38 chromatographic systems using retention factor data of 68 drug substances was determined using different parametric and non-parametric methods. Non-parametric methods can be recommended, as the majority of retention data cannot be considered as normally distributed. The generalized pairwise correlation method (GPCM) with different statistical tests was applied to examine the similarity/orthogonality of the tested systems. Besides this, the Pearson’s (product moment) correlation coefficient, the Spearman’s rho and the Kendall’s tau were also used as conventional correlation parameters. The Williams’ t-test (as a parametric statistical test), and the Conditional Fisher’s, the McNemar’s and the Chi-square tests (as non-parametric statistical tests) were used for hypotheses testing. Except for the selection using correlation coefficients all other measures are non-parametric. A new procedure was applied to establish orthogonality of the chromatographic systems. From the different correlation measures so-called orthogonality ratios were calculated. The ratios originating from GPCM with McNemar’s test were found to be the best to select orthogonal (dissimilar) and similar systems. The method was compared with various alternatives to validate the findings (among others with the ratios from Pearson’s correlation coefficients). The non-parametric options for correlation coefficients (Spearman’s rho and Kendall’s tau) are found not to be sensitive enough to define the orthogonality of these chromatographic systems.
© 2005 Elsevier B.V. All rights reserved.

Keywords: HPLC; Variable selection; Pairwise correlations; Non-parametric methods; Generalized pairwise correlation method; Spearman’s rho; Kendall’s tau

Contents

1. Introduction
2. Theory part
3. Experimental
   3.1. Drugs and systems
   3.2. Calculation procedure by GPCM
   3.3. Calculations
4. Results and discussion
5. Conclusions
Acknowledgement
References

∗ Corresponding author. Tel.: +32 2 477 4723; fax: +32 2 477 4735.
∗∗ Co-corresponding author. Tel.: +36 1 325 7900; fax: +36 1 325 7554.
E-mail addresses: [email protected] (Y. Vander Heyden), [email protected] (K. Héberger).

0003-2670/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.aca.2005.02.058



1. Introduction

The determination of impurities in drug substances is very important in pharmaceutical analysis in order to avoid the undesired effects of the impurities for the patients. For new substances the ICH guideline defines thresholds, above which the impurities should be reported, identified and qualified [1]. These limits depend on the maximum daily dose, i.e. the amount of drug substance daily administered [1]. The current drugs, which can already be found in different Pharmacopoeias (for instance in the European Pharmacopoeia, USP Pharmacopoeia), have acceptance criteria referring to their impurity contents and impurity profile [2,3]. The drugs not fulfilling these requirements cannot be used for medicinal purposes. Analytically, separation of a drug and its impurities is a big challenge, which is not evident to achieve. The fraction of the impurities (e.g. remaining reaction intermediates, degradation products and reaction by-products) is relatively small and they can have very similar chemical structures to that of the active substances, which might complicate separation. Moreover, neither the structure nor the number of impurities is known initially [4]. Accordingly, the development of an adequate method, with which all impurities can be measured according to the requirements, could be very labor-intensive and time-consuming.

Different chromatographic systems might be applied in parallel to determine initial starting conditions for method development and to obtain an idea concerning the composition of the tested sample [5–8]. Many different systems potentially could be used as a primary screening to determine the impurity profile of new drugs and the limitation of the number of systems applied is very important [8]. Therefore, there is an urgent need to select systems, which are as orthogonal as possible [5–8]. The term orthogonal or orthogonality is not used in its strict mathematical sense here [8]. Strictly speaking two vectors (parameters) are orthogonal when they are uncorrelated (r = 0) and they are either orthogonal or not. In situations as ours where various systems are compared, e.g. as potential starting points for method development, often a less strict definition is applied. Orthogonal systems are defined as systems that differ significantly in chromatographic selectivity [9]. Two systems for which the correlation coefficient for retention data is low are also considered or called orthogonal. It also means that, e.g. when comparing pairs of systems, terms as more orthogonal (or more dissimilar) and fairly orthogonal can be applied. To keep analogy with previous publications [5–9] preferably the term orthogonal (rather than dissimilar) is used. Orthogonal systems differ in selectivity, because the retention of solutes is caused by different interactions between solutes and stationary phases. Application of a set of orthogonal systems allows obtaining separations that are as diverse as possible, implying that the chromatographic systems complement each other in the information provided.

A method was published earlier to select the orthogonal systems out of 38 chromatographic systems using the retention data of 68 drug substances for classification [5,6]. The systems differed in stationary phases, mobile phase compositions, pHs and buffers (Table 1). The 68 drugs were chosen in such a way that they represent a broad range of drug molecules in order to disclose the generic orthogonality of systems. The orthogonal systems and groups of similar ones were selected from interpreting color maps and weighted-average-linkage dendrograms, both based on the Pearson’s correlation coefficients (r) calculated between the retention factors of all substances. The systems were ranked in the color map of correlation coefficients according to decreasing dissimilarities (1 − |r|) seen in the dendrogram. This facilitated the definition of orthogonal systems and similar groups from color map. In the original input matrix, retention data of 68 drug substances were enumerated in the rows (objects) and chromatographic systems were arranged in the columns (variables). In such a way each column (variable or chromatographic system) had 68 elements. The classification was based on the implicit assumption of normal distribution of these variables.

In this study the distributions of the retention data of the drug substances have been checked first. Then, different parametric and non-parametric test methods were used and compared to select orthogonal chromatographic systems out of the 38 systems. Different other measures of correlation were also considered: the Spearman’s rho and the Kendall’s tau. In this work a new term, the orthogonality ratio has also been defined. This ratio expresses the similarity of the tested chromatographic system to all others. It provides an additional help to select orthogonal systems besides the earlier reported color map technique [6]. The calculation of orthogonality ratio for systems depends on the applied method (Pearson’s correlation coefficient, Spearman’s rho, Kendall’s tau or generalized pairwise correlation method), hence, this is reviewed further on.

The aim of this study was to evaluate how the various statistical techniques: generalized pairwise correlation method (GPCM) with different statistical tests and different correlation measures: the Spearman’s rho and the Kendall’s tau can be used to define classes of orthogonal and similar systems. The results were compared and validated with those obtained using correlation coefficients to select orthogonal systems [6].

2. Theory part

The Pearson’s product moment correlation coefficient [10], the Spearman’s rho [11] and the Kendall’s tau [12] have been applied in this study as measures for correlation. The results have been compared to the results of the generalized pairwise correlation method (GPCM) [13–15] using different statistical tests (Williams’ t, Conditional Fisher’s, McNemar’s and Chi-square tests) [15] for examination of (dis)similar classes.


Table 1
Stationary phases and conditions, which define the 38 chromatographic systems

System number | Stationary phase | Mobile phase composition and chromatographic conditions
1 | Chromolith Performance^a | Methanol/0.08 M sodium phosphate buffer pH 3.0 from 10:90 to 75:25% (v/v) in 4 min; flow rate 2.0 mL/min; 40 °C
2 | Chromolith Performance^a | Methanol/0.08 M sodium phosphate buffer pH 6.8 from 10:90 to 75:25% (v/v) in 3 min; flow rate 2.0 mL/min; 40 °C
3 | Zorbax Extend-C18^b | Methanol/0.08 M sodium borate buffer pH 10.0 from 10:90 to 75:25% (v/v) in 6 min; flow rate 1.0 mL/min; 40 °C
4 | ZirChrom-PS^c | Methanol/0.08 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 6 min; flow rate 1.5 mL/min; 40 °C
5 | ZirChrom-PS^c | Methanol/0.08 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 4 min; flow rate 1.5 mL/min; 40 °C
6 | ZirChrom-PS^c | Methanol/0.08 M sodium borate buffer pH 10.0 from 10:90 to 70:30% (v/v) in 4 min; flow rate 1.5 mL/min; 40 °C
7 | ZirChrom-PS^c | Methanol/0.08 M sodium borate buffer pH 10.0 from 10:90 to 70:30% (v/v) in 4 min; flow rate 1.2 mL/min; 75 °C
8 | ZirChrom-PS^c | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
9 | ZirChrom-PS^c | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
10 | Platinum C18^d | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 5 min; flow rate 3.0 mL/min; 40 °C
11 | Platinum EPS C18^d | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 5 min; flow rate 3.0 mL/min; 40 °C
12 | Zorbax Eclipse XDB-C8^e | Methanol/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
13 | Zorbax Eclipse XDB-C8^e | Methanol/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
14 | Zorbax Eclipse XDB-C8^e | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
15 | Zorbax Eclipse XDB-C8^e | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
16 | Betasil Phenyl Hexyl^f | Methanol/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
17 | Betasil Phenyl Hexyl^f | Methanol/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
18 | Betasil Phenyl Hexyl^f | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
19 | Betasil Phenyl Hexyl^f | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
20 | Suplex pKb-100^e | Methanol/Britton–Robinson buffer pH 2.5 from 30:70 to 75:25% (v/v) in 20 min; flow rate 1.0 mL/min; 40 °C
21 | Suplex pKb-100^e | Methanol/Britton–Robinson buffer pH 7.5 from 30:70 to 70:30% (v/v) in 10 min; flow rate 2.0 mL/min; 40 °C
22 | ZirChrom-PBD^c | Methanol/Britton–Robinson buffer pH 2.5 from 30:70 to 75:25% (v/v) in 20 min; flow rate 1.0 mL/min; 40 °C
23 | ZirChrom-PBD^c | Methanol/Britton–Robinson buffer pH 7.5 from 30:70 to 70:30% (v/v) in 20 min; flow rate 1.0 mL/min; 40 °C
24 | ZirChrom-PBD^c | Methanol/0.016 M borate buffer pH 10.0 from 30:70 to 75:25% (v/v) in 8 min; flow rate 1.5 mL/min; 40 °C
25 | Chromolith Performance^a | Acetonitrile/0.08 M sodium phosphate buffer pH 3.0 from 10:90 to 60:40% (v/v) in 6 min; flow rate 2.0 mL/min; 40 °C
26 | Chromolith Performance^a | Acetonitrile/0.08 M sodium phosphate buffer pH 7.5 from 10:90 to 60:40% (v/v) in 6 min; flow rate 2.0 mL/min; 40 °C
27 | Aqua^c | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
28 | Aqua^c | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 75:25% (v/v) in 4 min; flow rate 2.0 mL/min; 40 °C
29 | Suplex pKb-100^e | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
30 | Suplex pKb-100^e | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
31 | PLRP-S^e | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
32 | PLRP-S^e | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
33 | Luna CN^c | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
34 | Luna CN^c | Acetonitrile/0.08 M sodium phosphate buffer pH 5.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
35 | ZirChrom-PBD^c | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 5 min; flow rate 2.0 mL/min; 75 °C
36 | ZirChrom-PBD^c | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 5 min; flow rate 2.0 mL/min; 75 °C
37 | Zorbax Extend-C18^b | Acetonitrile/0.04 M sodium phosphate buffer pH 3.0 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C
38 | Zorbax Extend-C18^b | Acetonitrile/0.04 M sodium phosphate buffer pH 6.8 from 10:90 to 70:30% (v/v) in 8 min; flow rate 1.0 mL/min; 40 °C

a 100 mm × 4.6 mm i.d.
b 150 mm × 4.6 mm i.d., 3.5 μm.
c 100 mm × 4.6 mm i.d., 3 μm.
d 53 mm × 7 mm i.d., 3 μm.
e 150 mm × 4.6 mm i.d., 5 μm.
f 100 mm × 4.6 mm i.d., 5 μm.

The Pearson’s correlation coefficient is the standard correlation coefficient, which is frequently used without knowing the assumptions on two random variables. If two variables of a parent population are uncorrelated, the probability, P(r, n), that a random sample of n observations will yield a correlation coefficient for those two variables greater in magnitude than |r| is given by [16]:

$$P(r, n) = \frac{1}{\sqrt{\pi}} \, \frac{\Gamma[(\nu + 1)/2]}{\Gamma[\nu/2]} \int_{|r|}^{1} (1 - x^2)^{(\nu - 2)/2} \, dx \tag{1}$$

where the degrees of freedom ν = n − 2.


If the probability is predefined, say at 5%, then at a given number of observations the limit value of correlation coefficients can be calculated above which the correlation cannot be considered as random. Bevington gathered the limit values in a table [16]. Later, Pecka and Ponec developed a method, which allows evaluating the statistical importance of independent correlations for multilinear relationships [17].
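As a rough numerical illustration of Eq. (1), the sketch below evaluates the integral with Simpson's rule and then searches by bisection for the limit value of |r| at a chosen probability. The function names, the step count and the `two_sided` switch are assumptions made for this illustration and do not come from the source: with `two_sided=False` the integral is evaluated exactly as printed in Eq. (1), while `two_sided=True` doubles it, which follows from the symmetry of the integrand about zero when both signs of r are counted.

```python
import math


def tail_probability(r, n, two_sided=True, steps=2000):
    """Evaluate the integral of Eq. (1) for a correlation r from n observations (n >= 4)."""
    nu = n - 2  # degrees of freedom
    # Gamma[(nu+1)/2] / (sqrt(pi) * Gamma[nu/2]), computed via log-gamma for stability
    log_c = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2) - 0.5 * math.log(math.pi)
    c = math.exp(log_c)

    def integrand(x):
        # max() guards against tiny negative values caused by rounding near x = 1
        return max(0.0, 1.0 - x * x) ** ((nu - 2) / 2)

    a, b = abs(r), 1.0
    h = (b - a) / steps  # Simpson's rule; steps must be even
    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    p = c * total * h / 3
    return 2 * p if two_sided else p


def critical_r(n, alpha=0.05, two_sided=True):
    """Bisection for the |r| at which the tail probability drops to alpha."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if tail_probability(mid, n, two_sided) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions, `critical_r(68)` gives the limit value at the 5% level for 68 observations, the number of drug substances used in this study.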

As it will be seen later, the measured retention data were not normally distributed. Hence, in principle, non-parametric methods are recommended instead. The Spearman’s rho and the Kendall’s tau can be reckoned as non-parametric alternatives of the correlation coefficient.

The Spearman’s rho (ρ) [11] uses the same equation as Pearson’s correlation coefficient but instead of the random variables (measured values) their rank numbers (R, lowest value gets the lowest rank, and so on) are used in the formula:

$$\rho = \frac{\sum_{i=1}^{n} R(x_i) R(y_i) - n \left( \frac{n+1}{2} \right)^2}{A(x) A(y)} \tag{2}$$

where

$$A(x) = \left( \sum_{i=1}^{n} R(x_i)^2 - n \left( \frac{n+1}{2} \right)^2 \right)^{1/2}$$

and

$$A(y) = \left( \sum_{i=1}^{n} R(y_i)^2 - n \left( \frac{n+1}{2} \right)^2 \right)^{1/2}$$

The Kendall’s tau (τ) [12] is based on calculating the difference between the number of concordant (Nc: xi > xj if yi > yj or xi < xj if yi < yj) and discordant (Nd: xi > xj if yi < yj or xi < xj if yi > yj) pairs:

$$\tau = \frac{N_c - N_d}{n(n-1)/2} \tag{3}$$

It is computed as the excess of concordant over discordant pairs, divided by a term representing the geometric mean between the number of pairs not tied on X and the number not tied on Y. Kendall’s tau has similar efficiency (power) to find significant correlations as Spearman’s rho, although the numerical values can be very different [11,12].

GPCM is also a non-parametric method taking into account the pairwise (cor)relations of the above mentioned chromatographic systems without calculating any correlation coefficients.

The pairwise correlation method (PCM) [13–15] selects from two independent variables (X1 and X2) which one is superior, i.e. “correlates” better to the dependent variable, Y. Three vectors are defined Y, X1 and X2 (the dependent and the independent variables, respectively). The task is to choose the superior one from the X1 and X2. First, it is assumed that both of the independent variables correlate positively with the dependent variable, Y. Other cases are discussed in refs [13–15] exhaustively. All the possible element pairs of the Y vector are considered that can occur when the differences ΔX1 for Y versus X1, and ΔX2 for Y versus X2 are determined. Only the signs of the differences are taken into account:

$$\Delta X_1 = (X_{1i} - X_{1j}) \, \mathrm{sgn}(Y_i - Y_j), \qquad \Delta X_2 = (X_{2i} - X_{2j}) \, \mathrm{sgn}(Y_i - Y_j), \qquad 1 \le i \le j \le n \tag{4}$$

$$\mathrm{sgn}(Y_i - Y_j) = \begin{cases} 0 & \text{if } Y_i = Y_j \\ \dfrac{|Y_i - Y_j|}{Y_i - Y_j} & \text{otherwise} \end{cases}$$

where n is the number of measurements.

There will be $m = \binom{n}{2} = n(n-1)/2$ point pairs and differences ΔX1 as well as ΔX2. The frequencies for the four possible different signs of ΔX1 and ΔX2 are arranged in a 2 × 2 contingency table. If both differences are positive (or both are negative), the distinction cannot be made between X1 and X2. However, if the frequency value for opposite signs of differences for X1 is significantly greater, then, X1 is termed as superior, otherwise X2. Whether the frequency value is significant or not, this can be determined using a suitable statistical test.

In its generalized form (GPCM) all possible independent variable pairs are compared and a number of “superiority” is determined. The number of “superiority” is termed as the number of wins, i.e. how many times a given X variable was “superior” to the other X variables. The number of “inferiority” is termed as the number of losses; i.e. how many times X variable was “inferior” to the other X variables. The number of wins is simply summed for all possible comparisons of variable pairs. The “superiority” was determined using statistical tests: the Williams’ t-test as a parametric test and the McNemar’s, the Chi-square and the Conditional Fisher’s tests as non-parametric statistical tests. Several ranking methods ((a) simple ranking according to the number of wins, (b) ranking according to the differences in wins and losses, and (c) probability weighted ranking according to the differences in wins and losses) were elaborated for the generalized pairwise correlation method [13,14]. Here, the ranking based on the number of wins minus the number of losses (b) was used for comparison of 38 chromatographic systems and the four above-mentioned statistical tests (Williams’ t, Conditional Fisher’s, McNemar’s and Chi-square tests) [15] for discrimination of variables were applied to classify/select orthogonal and similar chromatographic systems.
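Returning to the rank-based coefficients, Eqs. (2) and (3) can be written out directly. The following is a minimal sketch, not the implementation used by the authors; the function names and the handling of ties (mean ranks in `ranks`, tied pairs counted as neither concordant nor discordant in `kendall_tau`) are assumptions made for illustration.

```python
import math
from itertools import combinations


def ranks(values):
    """Rank numbers starting at 1 for the lowest value; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Eq. (2): Pearson's formula applied to the rank numbers R(x_i), R(y_i)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    centre = n * ((n + 1) / 2) ** 2
    a_x = math.sqrt(sum(v * v for v in rx) - centre)
    a_y = math.sqrt(sum(v * v for v in ry) - centre)
    return (sum(p * q for p, q in zip(rx, ry)) - centre) / (a_x * a_y)


def kendall_tau(x, y):
    """Eq. (3): (Nc - Nd) / (n(n-1)/2) over all point pairs."""
    n = len(x)
    n_conc = n_disc = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            n_conc += 1
        elif s < 0:
            n_disc += 1
    return (n_conc - n_disc) / (n * (n - 1) / 2)
```

For x = [1, 2, 3, 4, 5] and y = [2, 1, 4, 3, 5] the sketch gives ρ = 0.8 and τ = 0.6, a small example of how the two measures can differ numerically on the same data.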

3. Experimental

3.1. Drugs and systems

The 38 chromatographic systems and the 68 drug substances used to determine the orthogonality and similarity of systems based on normalized retention times of drugs were published earlier [5,6]. Accordingly, only the systems are specified here in order to make the presentation of results understandable and traceable. Table 1 contains an overview of the systems.

3.2. Calculation procedure by GPCM

Each of the 38 chromatographic systems was considered as supervisor, i.e. as dependent variable (Y), once and only once. All remaining chromatographic systems (37) were ranked using GPCM. During this procedure, the number of superiority of the given system was determined pairwise against all the remaining systems (37). The numbers of wins and losses have been added together and the systems were ranked according to the differences in the number of wins and losses. In that way 38 rankings were received and arranged in a matrix form. The procedure was executed four times using different tests to establish the ordering.
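The procedure for a single supervisor can be sketched as follows. This is an illustrative outline only, not the PCM 1.0 software: the function names are hypothetical, pairs with a zero difference are simply skipped, and the McNemar statistic is assumed to be formed from the two cells of the 2 × 2 table with opposite signs, without continuity correction.

```python
import math
from itertools import combinations


def sign_table(y, x1, x2):
    """Count the sign patterns of (dX1, dX2) from Eq. (4) over all point pairs i < j."""
    only_x1 = only_x2 = same = 0
    for i, j in combinations(range(len(y)), 2):
        s = (y[i] > y[j]) - (y[i] < y[j])  # sgn(Yi - Yj)
        d1 = (x1[i] - x1[j]) * s
        d2 = (x2[i] - x2[j]) * s
        if d1 > 0 and d2 < 0:
            only_x1 += 1  # only X1 follows the ordering of Y
        elif d1 < 0 and d2 > 0:
            only_x2 += 1  # only X2 follows the ordering of Y
        elif d1 * d2 > 0:
            same += 1  # same sign: no distinction between X1 and X2
    return only_x1, only_x2, same


def mcnemar_winner(only_x1, only_x2, alpha=0.05):
    """Return 1 if X1 is superior, 2 if X2 is superior, 0 if there is no significant difference."""
    if only_x1 + only_x2 == 0:
        return 0
    chi2 = (only_x1 - only_x2) ** 2 / (only_x1 + only_x2)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 degree of freedom
    if p >= alpha:
        return 0
    return 1 if only_x1 > only_x2 else 2


def gpcm_rank(y, columns, alpha=0.05):
    """Rank the candidate columns against supervisor y by wins minus losses."""
    score = {name: 0 for name in columns}
    for name1, name2 in combinations(columns, 2):
        b, c, _ = sign_table(y, columns[name1], columns[name2])
        winner = mcnemar_winner(b, c, alpha)
        if winner == 1:
            score[name1] += 1
            score[name2] -= 1
        elif winner == 2:
            score[name2] += 1
            score[name1] -= 1
    return sorted(score.items(), key=lambda item: item[1], reverse=True)
```

Calling `gpcm_rank` once with each system in turn as `y` and the remaining systems as `columns` would give one ranking per supervisor, which is the matrix of rankings described above.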

3.3. Calculations

Two kinds of software were used for the calculations. The calculation of Spearman’s rho and Kendall’s tau, as well as the tests for normality (Kolmogorov–Smirnov, Lilliefors and Shapiro–Wilk’s tests) were performed by Statistica 6.0, data analysis software system (Tulsa, OK, USA, 2004, http://www.statsoft.com/). The GPCM calculations using different statistical tests (the McNemar’s, the Chi-square, the Conditional Fisher’s and the Williams’ t-tests) were executed by pairwise correlation method PCM 1.0 software (Károly Héberger and Rajko Robert, Hungary, 1998).

The color maps were made by Matlab 6.5 software (The MathWorks Inc., MA, USA).

4. Results and discussion

The classification of systems, i.e. which systems are orthogonal or similar to others, was performed on the basis of (a) Pearson’s correlation coefficient, (b) Spearman’s rho and Kendall’s tau, and (c) Generalized pairwise correlation method applying four statistical tests (Conditional Fisher’s, McNemar’s, Chi-square and Williams’ t-tests).

The examination of the normal distribution for retention data of the drugs might be important to obtain information about the best method to choose. There are several tests available in the literature to check normality since the normal distribution is the most important one in statistics. Some of these tests are only sensitive to certain kinds of deviations from normality being suitable for some special situations (for example directional tests based on skewness and kurtosis), while others are applicable to general cases, such as the Kolmogorov–Smirnov test, Lilliefors test and Shapiro–Wilk’s test [18].

The most powerful test for normality is the Shapiro–Wilk’s test, which is essentially evaluating the squared ratio of the best linear unbiased estimator for scaling to the standard deviation. Unfortunately, this test is limited to small samples (n ≤ 20), but valid approximations exist up to n = 50 [19]. The efficiency of the methods for checking the normality decreases in the following order: Shapiro–Wilk’s test > Lilliefors test > Kolmogorov–Smirnov test. Even though the most reliable results can be obtained by Shapiro–Wilk’s test, we used all three tests to evaluate the data distribution.

The p-values by the different tests are summarized in Table 2. The applied retention parameter was the retention factor (k). If all three p-values for a given chromatographic system were higher than 0.05 (5%), we accepted the null hypothesis (H0: the distribution of retention data is normal), and the system was considered as normally distributed. If two p-values were lower than 0.05, we rejected the null hypothesis and the distribution of retention parameters was considered not normal. The other cases represent an ambiguous situation with contradictory results of the tests. Therefore, the distribution of these systems has not been decided on. Still, the non-parametric methods are more suitable for these ambiguous cases [18].

Table 2
The p-values for the distribution of retention on the different chromatographic systems when applying Kolmogorov–Smirnov, Lilliefors and Shapiro–Wilk’s tests

System number | p-value, Kolmogorov–Smirnov | p-value, Lilliefors | p-value, Shapiro–Wilk’s | Distribution
1 | >0.20 | >0.20 | 0.279 | Normal
2 | >0.20 | <0.10 | 0.010 | Ambiguous
3 | >0.20 | <0.05 | 0.135 | Ambiguous
4 | >0.20 | >0.20 | 0.091 | Normal
5 | <0.10 | <0.01 | 0.000 | Non-normal
6 | >0.20 | <0.05 | 0.004 | Non-normal
7 | <0.15 | <0.01 | 0.000 | Non-normal
8 | <0.15 | <0.01 | 0.001 | Non-normal
9 | >0.20 | <0.05 | 0.000 | Non-normal
10 | >0.20 | >0.20 | 0.168 | Normal
11 | >0.20 | <0.10 | 0.027 | Ambiguous
12 | >0.20 | >0.20 | 0.085 | Normal
13 | >0.20 | <0.05 | 0.015 | Non-normal
14 | >0.20 | <0.10 | 0.038 | Ambiguous
15 | >0.20 | >0.20 | 0.353 | Normal
16 | >0.20 | >0.20 | 0.331 | Normal
17 | >0.20 | <0.15 | 0.035 | Ambiguous
18 | >0.20 | <0.01 | 0.001 | Non-normal
19 | >0.20 | >0.20 | 0.045 | Normal
20 | <0.01 | <0.01 | 0.000 | Non-normal
21 | <0.01 | <0.01 | 0.000 | Non-normal
22 | <0.15 | <0.01 | 0.000 | Non-normal
23 | <0.05 | <0.01 | 0.000 | Non-normal
24 | <0.10 | <0.01 | 0.000 | Non-normal
25 | >0.20 | <0.05 | 0.006 | Non-normal
26 | >0.20 | <0.05 | 0.000 | Non-normal
27 | >0.20 | <0.10 | 0.027 | Ambiguous
28 | <0.10 | <0.01 | 0.000 | Non-normal
29 | >0.20 | <0.05 | 0.011 | Non-normal
30 | >0.20 | <0.15 | 0.000 | Ambiguous
31 | >0.20 | <0.05 | 0.031 | Non-normal
32 | >0.20 | >0.20 | 0.128 | Normal
33 | <0.15 | <0.01 | 0.000 | Non-normal
34 | <0.20 | <0.01 | 0.001 | Non-normal
35 | >0.20 | >0.20 | 0.005 | Ambiguous
36 | >0.20 | <0.05 | 0.000 | Non-normal
37 | >0.20 | <0.10 | 0.012 | Ambiguous
38 | >0.20 | >0.20 | 0.471 | Normal

The retention data for more than half of the chromatographic systems were found not to be normally distributed (20 cases) compared to the number of normal distributions (nine cases). Accordingly, the non-parametric methods are recommended, although the results [4–8] based on the Pearson’s correlation coefficient used earlier have been kept for comparison. The proper application of the Williams’ t-test also requires normal distribution (this test was used within the framework of the non-parametric GPCM). Therefore, determining the orthogonal character of systems based on Pearson’s correlation coefficients or GPCM using Williams’ t-test is questionable from a statistical point of view. The results from these methods risk not completely reflecting the actual similar or orthogonal character between systems. As the normal distribution cannot be proven in the majority of cases, the application of non-parametric methods is to be recommended.
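The decision rule applied to the three normality tests can be stated compactly. The sketch below is an illustration with a hypothetical function name; it assumes numeric p-values (several entries of Table 2 are only reported as bounds such as “<0.10”) and reads “two p-values lower than 0.05” as “at least two”.

```python
def classify_distribution(p_values, alpha=0.05):
    """Classify one system from its normality-test p-values (one per test)."""
    above = sum(p > alpha for p in p_values)
    below = sum(p < alpha for p in p_values)
    if above == len(p_values):
        return "Normal"  # null hypothesis accepted by every test
    if below >= 2:
        return "Non-normal"  # null hypothesis rejected by at least two tests
    return "Ambiguous"  # contradictory test results
```

For example, the illustrative triple (0.25, 0.25, 0.005), with only the last test rejecting normality, is classified as ambiguous.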

The correlation coefficients data matrix contained 38 × 38 (n × m) values (between different chromatographic systems) from which the correlation between every two systems could be read out. In the “n-direction” (rows) the ith system was the reference system for calculating the correlation, and in the “m-direction” (columns) the results (correlation coefficients) relative to the ith were given. As the word orthogonal means uncorrelatedness, we cannot find real orthogonal variables (r = 0) in practical systems. Experimental selectivity differences and numerical uncertainties do not allow zero correlation coefficients in practical cases [8,9]. Moreover, non-significant correlation can be tolerated for orthogonal systems. Therefore, a limit value has been chosen, below which the correlation coefficient can be considered as zero (i.e. not significantly different from zero). Such limit value, given by Eq. (1), can be found in the linear correlation coefficient table of Bevington [16]. The table gathers the linear correlation coefficients (r) versus the number of observations and the corresponding probability of exceeding r in a random sample of observations taken from an uncorrelated parent population. Alternatively, the computer code of Pecka and Ponec [17] can be used iteratively instead of Bevington’s table. In our case the limit value was 0.32 at the 5% significance level and at n = 38 (number of systems). This means that if the correlation coefficient is higher than 0.32, the correlation (similarity) between the two systems cannot be considered to appear by chance. In practice, even higher values still might be considered as “orthogonal”, if the selectivity differences are high enough [8].

To make the information about the orthogonality of each system towards the others easily understandable, the matrix of correlation coefficients was transformed as follows: the correlation coefficients were replaced with values of −1 and +1. We used +1 if the correlation coefficient was higher than 0.32, and −1 if it was smaller than or equal to 0.32. The −1 or +1 values represented orthogonality or similarity, respectively, between the considered systems. After this, we counted the number of −1 values in each row, and obtained one number for each single system. The values for each system were divided by 37 (= 38 − 1, namely the number of all systems minus one, the tested system) and multiplied by 100. These percentage values, called orthogonality ratios (Table 3), represent the orthogonal character of the tested system to all others on the basis of Pearson’s correlation coefficient. High values represent systems with orthogonality towards most other systems, whereas low values reflect similarity to most systems. On the basis of the above, system 8 (orthogonality ratio = 97.3%) can be considered orthogonal to most others.

The retention factors for the 68 drugs were also used to determine the Spearman’s rho and Kendall’s tau values in order to obtain non-parametric information about the orthogonality/similarity of the systems. The Spearman’s rho and the Kendall’s tau are the counterparts of the correlation coefficient for not-normally distributed data. The matrix (38 × 38) in both cases was also transformed as in the preceding paragraph. If the Spearman’s rho value was higher than 0.237 (significance limit), it was exchanged for +1. If the value was smaller than or equal to 0.237, it was replaced with −1. The significance limit for Kendall’s tau was 0.166. The limits were determined at 5% significance level and n = 38 degrees of freedom (the number of systems) in both cases. The orthogonality ratio for each system (Table 3) was calculated similarly to the method used for Pearson’s correlation coefficients. These values represent the full orthogonality of the systems to all others on the basis of Spearman’s rho and Kendall’s tau measures. The orthogonality ratios were significantly lower than with the correlation coefficients, i.e. none of the systems (except for system 8) can be considered as really orthogonal to all other systems according to Spearman’s rho and Kendall’s tau. Both parameters showed less discrimination power for the different systems.
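The transformation of a correlation matrix into orthogonality ratios, as used for Pearson’s correlation coefficient and Spearman’s rho, can be sketched as follows. This is only an illustration under stated assumptions: the four "systems" and their five retention values are invented toy data, the function names are hypothetical, and the limits 0.32 and 0.237 are simply reused from the text although they were derived for the real data set. Kendall’s tau is left out for brevity; it would be handled in the same way with its own limit.

```python
from math import sqrt
from typing import Callable, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(x: Sequence[float]) -> List[float]:
    """Average ranks (tied values share the mean rank)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def orthogonality_ratios(
    systems: Sequence[Sequence[float]],
    corr: Callable[[Sequence[float], Sequence[float]], float],
    limit: float,
) -> List[float]:
    """Percentage of other systems whose correlation is <= limit (coded -1)."""
    n = len(systems)
    ratios = []
    for i in range(n):
        minus_ones = sum(
            1
            for j in range(n)
            if j != i and corr(systems[i], systems[j]) <= limit
        )
        ratios.append(100.0 * minus_ones / (n - 1))
    return ratios


# Toy retention data: 4 hypothetical systems x 5 hypothetical compounds.
toy = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [1, 3, 2, 5, 4],
    [1, 4, 0, 4, 1],
]
pearson_or = orthogonality_ratios(toy, pearson, limit=0.32)
spearman_or = orthogonality_ratios(toy, spearman, limit=0.237)
```

In this toy example the fourth system is uncorrelated with the first two, so it receives the highest ratio (two of three other systems coded −1).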

Table 3
Orthogonality ratios (OR, %) and classification of the tested systems based on the definition of three groups

| System number | Pearson’s correlation coefficient | Spearman’s rho | Kendall’s tau | GPCM with Williams’ t-test | GPCM with Conditional Fisher’s test | GPCM with McNemar’s test | GPCM with Chi-square test |
|---|---|---|---|---|---|---|---|
| 8 | **97.3** | *35.1* | *32.4* | **81.1** | **89.2** | **89.2** | **89.2** |
| 5 | **91.9** | 16.2 | 18.9 | **75.7** | **94.6** | **91.9** | **91.9** |
| 4 | **89.2** | 10.8 | 8.1 | **73.0** | **83.8** | **81.1** | **81.1** |
| 6 | **86.5** | 10.8 | 5.4 | **78.4** | **86.5** | **86.5** | **86.5** |
| 9 | **81.1** | 0.0 | 0.0 | *59.5* | **67.6** | **67.6** | **67.6** |
| 2 | **73.0** | 5.4 | 8.1 | *56.8* | **94.6** | **91.9** | **91.9** |
| 3 | **64.9** | 5.4 | 5.4 | **64.9** | **89.2** | **89.2** | **89.2** |
| 22 | **62.2** | 5.4 | 5.4 | *59.5* | **78.4** | **78.4** | **78.4** |
| 20 | *45.9* | 10.8 | 8.1 | *54.1* | *48.6* | *48.6* | *48.6* |
| 7 | *37.8* | 0.0 | 2.7 | *56.8* | **81.1** | **81.1** | **81.1** |
| 15 | *32.4* | 2.7 | 5.4 | *48.6* | *59.5* | *59.5* | *59.5* |
| 19 | *32.4* | 2.7 | 0.0 | 18.9 | *32.4* | 29.7 | 29.7 |
| 21 | 27.0 | 0.0 | 0.0 | 18.9 | 8.1 | 5.4 | 5.4 |
| 14 | 24.3 | 2.7 | 2.7 | 16.2 | *40.5* | *40.5* | *37.8* |
| 16 | 24.3 | 2.7 | 2.7 | 0.0 | 24.3 | 24.3 | 24.3 |
| 25 | 24.3 | 2.7 | 2.7 | 8.1 | 5.4 | 5.4 | 2.7 |
| 27 | 24.3 | 2.7 | 2.7 | 0.0 | 18.9 | 16.2 | 16.2 |
| 29 | 24.3 | 5.4 | 5.4 | 5.4 | 18.9 | 16.2 | 13.5 |
| 1 | 21.6 | 5.4 | 5.4 | 8.1 | *37.8* | *37.8* | *37.8* |
| 10 | 21.6 | 5.4 | 2.7 | 2.7 | 21.6 | 21.6 | 21.6 |
| 12 | 21.6 | 2.7 | 2.7 | 0.0 | 16.2 | 13.5 | 13.5 |
| 31 | 21.6 | 5.4 | 5.4 | 5.4 | 8.1 | 8.1 | 8.1 |
| 37 | 21.6 | 2.7 | 2.7 | 0.0 | 5.4 | 5.4 | 5.4 |
| 13 | 18.9 | 2.7 | 0.0 | 5.4 | 2.7 | 0.0 | 0.0 |
| 17 | 18.9 | 0.0 | 0.0 | 5.4 | 2.7 | 2.7 | 2.7 |
| 18 | 18.9 | 0.0 | 0.0 | 0.0 | 2.7 | 2.7 | 2.7 |
| 24 | 18.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 26 | 18.9 | 0.0 | 0.0 | 2.7 | 2.7 | 2.7 | 2.7 |
| 11 | 16.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 33 | 16.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 36 | 16.2 | 0.0 | 0.0 | 18.9 | *45.9* | *45.9* | *45.9* |
| 28 | 13.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 30 | 13.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 32 | 13.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 35 | 13.5 | 0.0 | 0.0 | 2.7 | 24.3 | 24.3 | 24.3 |
| 38 | 13.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 23 | 10.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 34 | 8.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |

Group I: OR ≤ 30%, similar systems – regular style; Group II: OR = 30–60%, ambiguous systems – italic style; and Group III: OR ≥ 60%, orthogonal systems – bold style.

The retention factors for the 68 drugs were used for the generalized pairwise correlation method, too. First, for each system, the values that were obtained using GPCM and one of the four earlier mentioned statistical tests were sorted according to their system number (1–38). Thus, a matrix with dimensions of 38 × 38 (n × m) was constructed for each statistical test. In the “n-direction” 38 systems occur where the ith system is the reference system. In the “m-direction” the results relative to the ith system are given. The matrices were transformed to be comparable with previous matrices: all values higher than −13 were exchanged for +1, and those smaller than or equal to −13 were replaced with −1. The number 13 depended on the number of wins and losses, i.e. on the given correlation structure. In our case the differences between number of wins and number of losses changed in the range from −37 to +37. In general, the interval ranges from −(n − 1) to n − 1, where “n” is the number of systems. Altogether there were 74 possibilities and, applying a limit similar to the 0.32 used for the correlation coefficient, a limit at −13 (= −37 + 0.32 × 74) was defined. After this, the number of −1 values was summed, and relative values were calculated. The percentage values, which can also be seen in Table 3, again represent the orthogonality of the tested systems to others. The original matrix contained empty spaces at the diagonal. These missing values represented equality, i.e. +1 in the transformed matrix. These values thus were neglected in the calculation of above-mentioned percentages.
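The limit on the wins-minus-losses scale and the resulting orthogonality ratios can be sketched as follows. The function names and the 4 × 4 toy matrix are hypothetical; the sketch assumes that the limit is rounded to an integer before comparison, as suggested by the value −13 quoted in the text, and that diagonal entries (empty in the original matrix) are stored as `None` and skipped.

```python
from typing import List, Optional, Sequence


def gpcm_limit(n_systems: int, r_limit: float = 0.32) -> float:
    """Map a correlation limit onto the wins-minus-losses scale."""
    lowest = -(n_systems - 1)   # -37 for 38 systems
    span = 2 * (n_systems - 1)  # 74 possibilities for 38 systems
    return lowest + r_limit * span


def gpcm_orthogonality_ratios(
    diff: Sequence[Sequence[Optional[int]]], limit: int
) -> List[float]:
    """Percentage of non-diagonal entries per row that are <= limit (coded -1)."""
    n = len(diff)
    ratios = []
    for i, row in enumerate(diff):
        minus_ones = sum(
            1
            for j, d in enumerate(row)
            if j != i and d is not None and d <= limit
        )
        ratios.append(100.0 * minus_ones / (n - 1))
    return ratios


# Limit for the 38 systems of the study: -37 + 0.32 * 74 = -13.32, i.e. -13.
limit_38 = round(gpcm_limit(38))

# Hypothetical, asymmetric wins-minus-losses matrix for 4 systems
# (row = reference system, None = empty diagonal).
toy_diff = [
    [None, 3, -2, 1],
    [2, None, -3, -1],
    [-3, -2, None, -3],
    [1, 0, -2, None],
]
toy_limit = round(gpcm_limit(4))  # -3 + 0.32 * 6 = -1.08, i.e. -1
toy_ratios = gpcm_orthogonality_ratios(toy_diff, toy_limit)
```

Because the matrix is read row by row with the ith system as reference, the ratios do not have to be symmetric between two systems.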

We compared the systems on the basis of the orthogonality ratios calculated using the different techniques to obtain information about the discrepancies existing in the results. For comparison reasons three groups were defined. The first group (I) contained those systems whose orthogonality ratio was smaller than or equal to 30%. The second group (II) contained those whose orthogonality ratio was higher than 30% but smaller than 60%, and for the third group (III) it was higher than or equal to 60%. The systems which belong to group III can be looked upon as orthogonal to most other systems using pairwise comparisons, while the systems belonging to group I can be considered very similar to most others. It is difficult to classify the systems belonging to group II. They represent intermediate situations. The systems belonging to each group can be seen in Table 3. They were sorted on the basis of decreasing orthogonality ratios obtained for Pearson’s correlation coefficient. It should be noted that the selection of the value 60% is arbitrary and not crucial. A similar result would be achieved using for instance 50%.
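The three-group classification can be written as a small helper. The function name is hypothetical; the class limits are the 30% and 60% values given in the text, and the example ratios are taken from Table 3.

```python
def classify_group(ratio_percent: float) -> str:
    """Assign an orthogonality ratio (in %) to group I, II or III."""
    if ratio_percent <= 30.0:
        return "I"    # similar to most other systems
    if ratio_percent < 60.0:
        return "II"   # ambiguous, intermediate situation
    return "III"      # orthogonal to most other systems


# Orthogonality ratios from Table 3.
system_7_pearson = classify_group(37.8)
system_7_mcnemar = classify_group(81.1)
system_8_pearson = classify_group(97.3)
system_34_pearson = classify_group(8.1)
```

System 7 illustrates how the classification can depend on the method: it falls in group II with Pearson’s correlation coefficient but in group III with GPCM using McNemar’s test.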

It can be seen from the data of Table 3 that the great majority of systems are correlated significantly. On the basis of Spearman’s rho and Kendall’s tau values as well as the above defined class limits, the systems (except for system 8) would be considered similar, i.e. no real orthogonal systems could be found. It can also be observed that rho and tau showed much less variability in their values, which does not allow differentiating well between the different systems. These non-parametric correlation coefficients are not appropriate for the determination of system orthogonality. Although GPCM belongs to the non-parametric methods, the four selection criteria gave similar results to those based on Pearson’s product moment correlation coefficients.

Fig. 1. Color map of the Pearson’s correlation coefficients between retention factors, measured on all 38 systems. Red: high correlation (similarity); blue: low correlation (orthogonality). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

According to the classification limits applied on the orthogonality ratios from Pearson’s correlation coefficients (Table 3), eight systems (2, 3, 4, 5, 6, 8, 9, 22) can be considered orthogonal (group III). In a previous study, the Pearson’s correlation coefficients have already been used to define orthogonal/similar systems [6]. Then, the correlation matrix for the 38 systems was transformed into color maps that represented the magnitude of correlation by a color scale. A color map based on correlation coefficients is given in Fig. 1. The colors were scaled from dark blue for low correlation coefficients to red, which represented high correlation (similarity). The correlation between any two systems can be read out from this color map, although information concerning the orthogonality of each system to all other systems can hardly be obtained from this map. The most orthogonal systems can be identified from the frequency of blue color (low correlation) in their rows and columns. System 8 is an evident example. System 7 can also be considered orthogonal to most systems on the basis of the color map, beside systems 2, 3, 4, 5, 6, 8, 9 and 22 (Fig. 1). However, system 7 cannot be considered orthogonal if our orthogonality ratio based on Pearson’s correlation coefficient is used. The ratio is only 37.8% (Table 3), hence, system 7 belongs to group II. The members of group II cannot be considered orthogonal to all other systems, if three groups are defined.

The Williams’ t-test, which is a parametric selection criterion used within a non-parametric method (GPCM), provides similar results to the correlation coefficient, but it is more conservative (Table 3). It did not mark some systems (e.g. 2, 9, 22) as orthogonal even if other methods did. Altogether five systems (3, 4, 5, 6, 8) were found orthogonal and six (2, 7, 9, 15, 20, 22) were ambiguous using Williams’ t-test.

The GPCM gave identical results (Table 3) for the three non-parametric selection criteria, except for system 19, which is borderline different. This system belonged to group II on the basis of the Conditional Fisher’s test, while it belonged to group I in case of the other two selection criteria. Nine systems were placed in group III (2, 3, 4, 5, 6, 7, 8, 9, 22) and five (1, 14, 15, 20, 36) or six (system 19 for the Conditional Fisher’s test) in the ambiguous group II. The remaining 23 or 24 systems were classified in group I, i.e. were considered similar to most other systems. The GPCM using non-parametric test methods seems to be somewhat more sensitive than the ranking method based on the correlation coefficient.

Fig. 2. Color map on the basis of the number of wins minus the number of losses for GPCM using non-parametric McNemar’s test as selection criterion. Red: high correlation (similarity); blue: low correlation (orthogonality). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The color map based on the numbers of wins minus the numbers of losses calculated using the GPCM with the non-parametric McNemar’s test as selection criterion can be seen in Fig. 2. This color map, in contrast to the one based on Pearson’s correlation coefficients (Fig. 1), is asymmetric. The asymmetric property of this color map can be understood, as GPCM provides a series of rank numbers (37) for a chromatographic system (when considered as reference system) and not just one coefficient for two systems. Evaluation of systems orthogonal or similar to most others is to be made in the rows of the color map. Comparing the above-diagonal parts of Figs. 1 and 2, some differences can be observed though mainly the same systems would be indicated as orthogonal. Evaluating Fig. 2, for instance, one could consider system 36 as fairly orthogonal to the rest, while from Fig. 1 this conclusion is less evident. The difference in conclusion is also reflected in the different orthogonality ratios found for system 36. Again the orthogonality ratios can be considered as a valuable tool to help deciding on the orthogonality of systems.

Fig. 3. Orthogonality ratios based on four correlation measures: (1) on Pearson’s correlation coefficients; (2) on GPCM using Williams’ t-test; (3) on GPCM using McNemar’s test; (4) on Spearman’s rho.

The orthogonality ratios for all 38 systems in case of Pearson’s correlation coefficients, Spearman’s rho and GPCM with Williams’ t-test and McNemar’s test are shown in Fig. 3. The results of GPCM for Conditional Fisher’s and Chi-square tests were essentially the same as for McNemar’s test, hence they were omitted for clarity. Similarly, Kendall’s tau provides essentially the same results as Spearman’s rho, hence, it was also omitted. The figure arranges the orthogonality ratios from Pearson’s correlation coefficients in decreasing order as in Table 3. It can be seen that the evaluation of systems based on the other criteria is not the same, and rather follows a zigzag pattern within the same general decreasing trend. The results for some systems (e.g. 1, 7, 14, 36) and hence their classification depend strongly on the applied method. The method based on the Spearman’s rho (and the Kendall’s tau) cannot be applied to define orthogonality, because their curves were too flat (Fig. 3).
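The remark that the Spearman’s rho curve is too flat can be made concrete by comparing the spread of two columns of Table 3. The sketch below is only an illustration: the values are copied from Table 3 in its row order, while the helper name and the choice of range and population standard deviation as spread measures are assumptions, not part of the original evaluation.

```python
from statistics import pstdev
from typing import Dict, Sequence

# Orthogonality ratios (%) from Table 3, in the table's row order.
pearson_or = [
    97.3, 91.9, 89.2, 86.5, 81.1, 73.0, 64.9, 62.2, 45.9, 37.8,
    32.4, 32.4, 27.0, 24.3, 24.3, 24.3, 24.3, 24.3, 21.6, 21.6,
    21.6, 21.6, 21.6, 18.9, 18.9, 18.9, 18.9, 18.9, 16.2, 16.2,
    16.2, 13.5, 13.5, 13.5, 13.5, 13.5, 10.8, 8.1,
]
spearman_or = [
    35.1, 16.2, 10.8, 10.8, 0.0, 5.4, 5.4, 5.4, 10.8, 0.0,
    2.7, 2.7, 0.0, 2.7, 2.7, 2.7, 2.7, 5.4, 5.4, 5.4,
    2.7, 5.4, 2.7, 2.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
]


def summarize(ratios: Sequence[float]) -> Dict[str, float]:
    """Spread of the ratios and number of systems reaching group III."""
    return {
        "range": max(ratios) - min(ratios),
        "pstdev": pstdev(ratios),
        "n_group_III": sum(1 for r in ratios if r >= 60.0),
    }


pearson_summary = summarize(pearson_or)
spearman_summary = summarize(spearman_or)
```

With these values the Pearson-based ratios span a much wider range than the Spearman-based ones, and only the former place any system in group III.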

As the Conditional Fisher’s test is usually reserved for small amount of samples [10] and the Chi-square test yields the same results for large number of data as the McNemar’s test, we recommend the GPCM with McNemar’s test to determine the orthogonal systems. Although the GPCM with McNemar’s test will not be so efficient as the parametric alternative, the arising error will be smaller than using the correlation coefficients, because the great majority of the tested systems was not normally distributed. Accordingly, the systems 2, 3, 4, 5, 6, 7, 8, 9 and 22 can be considered orthogonal to most others on the basis of calculated orthogonality ratios using GPCM with McNemar’s test. However, it can be noted that with the exception of system 7, the selection is equal to that based on the orthogonality ratios from the correlation coefficients.

Having knowledge on the orthogonality of systems is very important information to a user, because the time needed to develop a new method, e.g. for the separation of a drug substance from its impurities, might be reduced considerably. Method development could start, for instance, with the most orthogonal system and then, when separation is unacceptable, go on with the second most orthogonal on the basis of orthogonality ratios, etc. Method development alternatively also could start with whatever system preferred by the analyst, and then occasionally be continued with the different systems with a high orthogonality ratio, because the latter have the highest probability to be orthogonal towards the first selected and towards each other.

Which technique can be best used to select orthogonal systems? There is no general answer for this. As can be seen in Fig. 3, all methods provide different ordering (ranking) of orthogonality. The recommended procedure is: the distribution of data should be determined first. From this information, methods to be applied (parametric or non-parametric) can be inferred. Williams’ t-test, similarly to correlation coefficients, can be used, if the majority of data are normally distributed. If the distribution of the data is not normal, the usage of non-parametric tests is recommended. Naturally, one can use parametric tests for not normally distributed data, but the results should be interpreted carefully.
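The screening strategy described in this section (start with the most orthogonal system, or with a system preferred by the analyst, and continue in order of decreasing orthogonality ratio) could be sketched as follows. The function name and the tie-breaking by system number are assumptions; the ratios are the GPCM/McNemar values of the nine group III systems in Table 3.

```python
from typing import Dict, List, Optional

# Orthogonality ratios (%) for GPCM with McNemar's test (Table 3, group III).
mcnemar_or: Dict[int, float] = {
    2: 91.9, 3: 89.2, 4: 81.1, 5: 91.9, 6: 86.5,
    7: 81.1, 8: 89.2, 9: 67.6, 22: 78.4,
}


def screening_order(
    ratios: Dict[int, float], start: Optional[int] = None
) -> List[int]:
    """Order systems by decreasing ratio, optionally after a preferred start."""
    # Ties are broken by system number (an arbitrary, hypothetical choice).
    ranked = sorted(ratios, key=lambda s: (-ratios[s], s))
    if start is None:
        return ranked
    if start not in ratios:
        raise KeyError(f"unknown system: {start}")
    return [start] + [s for s in ranked if s != start]


default_order = screening_order(mcnemar_or)
preferred_first = screening_order(mcnemar_or, start=9)
```

The second call mimics an analyst who prefers to start with system 9 and then continues with the systems of highest orthogonality ratio.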


5. Conclusions

The vast majority of retention data for 38 chromatographic systems is not normally distributed.

A novel measure, the so-called orthogonality ratio, has been suggested for ranking orthogonality of chromatographic systems. The measure is flexible; it can be adjusted to give results similar to those obtained using color maps of correlation coefficients. In this way, the new measure was validated. The main advantage is that the orthogonality ratio can be used even if the distribution of the data is not normal.

The generalized pairwise correlation method with McNemar’s test, as a non-parametric method, seems to be best to select orthogonal systems.

The non-parametric alternatives of correlation coefficients (Spearman’s rho, Kendall’s tau) were found not sensitive enough to define the orthogonality of the chromatographic systems.

Acknowledgement

P. Forlay-Frick expresses his appreciation to the Flemish Community Project (BIL 01/23) for financial support. The research of K. Heberger was supported by the Hungarian Research Foundation (OTKA T 037684). The research of E. Van Gyseghem was funded by a Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen).

References

[1] International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, ICH Harmonised Tripartite Guideline, Impurities in New Drug Substances, Q3A(R), February 2002, http://www.ich.org.

[2] European Pharmacopoeia, 4th ed., 4.8, Monographs Chapter, European Pharmacopoeia Commission, Strasbourg Cedex, France, 2004.

[3] United States Pharmacopoeia, 27-NF 22 Main ed., Monographs Chapter, United States Pharmacopeial Convention Inc., Rockville, MD, USA, 2004.

[4] E. Van Gyseghem, M. Jimidar, R. Sneyers, D. Redlich, E. Verhoeven, W. Peys, D.L. Massart, Y. Vander Heyden, Validation of orthogonal RP-HPLC systems for method development using different drug impurity profiles, submitted for publication.

[5] E. Van Gyseghem, S. Van Hemelryck, M. Daszykowski, F. Questier, D.L. Massart, Y. Vander Heyden, J. Chromatogr. A 988 (2003) 77–93.

[6] E. Van Gyseghem, I. Crosiers, S. Gourvenec, D.L. Massart, Y. Vander Heyden, J. Chromatogr. A 1026 (2004) 117–128.

[7] E. Van Gyseghem, M. Jimidar, R. Sneyers, D. Redlich, E. Verhoeven, D.L. Massart, Y. Vander Heyden, J. Chromatogr. A 1042 (2004) 69–80.

[8] E. Van Gyseghem, M. Jimidar, R. Sneyers, D. Redlich, E. Verhoeven, D.L. Massart, Y. Vander Heyden, Orthogonality and similarity within silica-based reversed-phased chromatographic systems, J. Chromatogr. A, in press.

[9] G. Xue, A.D. Bendick, R. Chen, S.S. Sekulic, J. Chromatogr. A 1050/2 (2004) 159–171.

[10] D.L. Massart, B.G.M. Vandeginste, L.M.C. Buydens, S. De Jong, P.J. Lewi, J. Smeyers-Verbeke, Handbook of Chemometrics and Qualimetrics, Part A, Elsevier, 1997.

[11] W.J. Conover, Measures of rank correlation, in: Practical Nonparametric Statistics, 2nd ed., Wiley, New York, 1980, pp. 250–256 (Chapter 5.4).

[12] ibid, pp. 256–263.

[13] K. Heberger, R. Rajko, SAR QSAR Environ. Res. 13 (2002) 541–554.

[14] K. Heberger, R. Rajko, J. Chemometr. 16 (2002) 436–443.

[15] R. Rajko, K. Heberger, Chemometr. Intell. Lab. 57 (2001) 1–14.

[16] P.R. Bevington, Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, New York, 1969, pp. 310–311.

[17] J. Pecka, R. Ponec, J. Mathem. Chem. 27 (2000) 13–22.

[18] W.J. Conover, Statistics of the Kolmogorov–Smirnov type, in: Practical Nonparametric Statistics, 2nd ed., Wiley, New York, 1980, pp. 344–368 (Chapter 6).

[19] J. Zhang, Y. Wu, Likelihood-ratio tests for normality, Comput. Stat. Data Anal., in press.