Fuzzy analysis for detection of inconsistent data in experimental datasets employed at the...

PLEASE SCROLL DOWN FOR ARTICLE This article was downloaded by: [Melgosa, Manuel] On: 4 September 2009 Access details: Access Details: [subscription number 914485824] Publisher Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Modern Optics Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713191304 Fuzzy analysis for detection of inconsistent data in experimental datasets employed at the development of the CIEDE2000 colour-difference formula Samuel Morillas a ; Luis Gómez-Robledo b ; Rafael Huertas b ; Manuel Melgosa b a Centro de Investigación en Tecnologías Gráficas, Universidad Politécnica de Valencia, Camino de Vera s/n, 46022 Valencia, Spain b Departamento de Óptica, Universidad de Granada, 18071 Granada, Spain First Published:July2009 To cite this Article Morillas, Samuel, Gómez-Robledo, Luis, Huertas, Rafael and Melgosa, Manuel(2009)'Fuzzy analysis for detection of inconsistent data in experimental datasets employed at the development of the CIEDE2000 colour-difference formula',Journal of Modern Optics,56:13,1447 — 1456 To link to this Article: DOI: 10.1080/09500340902944038 URL: http://dx.doi.org/10.1080/09500340902944038 Full terms and conditions of use: http://www.informaworld.com/terms-and-conditions-of-access.pdf This article may be used for research, teaching and private study purposes. Any substantial or systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses should be independently verified with primary sources. 
The publisher shall not be liable for any loss, actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.


Journal of Modern Optics, Vol. 56, No. 13, 20 July 2009, 1447–1456

Fuzzy analysis for detection of inconsistent data in experimental datasets employed at the development of the CIEDE2000 colour-difference formula

Samuel Morillas (a), Luis Gómez-Robledo (b), Rafael Huertas (b) and Manuel Melgosa (b)*

(a) Centro de Investigación en Tecnologías Gráficas, Universidad Politécnica de Valencia, Camino de Vera s/n, 46022 Valencia, Spain; (b) Departamento de Óptica, Universidad de Granada, Edificio Mecenas, Campus de Fuentenueva s/n, 18071 Granada, Spain

(Received 14 January 2009; final version received 1 April 2009)

Relating instrumental measurements to visually perceived colour differences, under specific illuminating and viewing conditions, is one of the challenges of advanced colorimetry. Experimental data are used to devise new colour-difference formulas as well as to assess the performance of other colour-difference formulas. In this paper, we analyse the consistency of the experimental data employed in the development of the latest CIE-recommended colour-difference formula, CIEDE2000. Because of the subjective and imprecise nature of these data, we adopt a fuzzy approach, so that finally, for each experimental datum, we establish the fuzzy degree to which it can be considered consistent with the remaining data. The results of our analyses show that only a few data are associated with a rather low degree of consistency. These data in many cases correspond to colour pairs with a very small colour difference for which visual assessments seem to be overestimated.

Keywords: CIEDE2000; colour-difference formula; fuzzy logic; perceived colour differences

1. Introduction

Improved correlation between visually perceived ($\Delta V$) and instrumentally measured ($\Delta E$) colour differences is an important problem in modern colorimetry. Undoubtedly, a very important step in this search is to have a wide set of reliable experimental data, well distributed in all regions of colour space, and obtained through a common methodology (i.e. illuminating and viewing conditions, fitting procedures, etc.). Currently, new experimental datasets are requested by the Technical Committee 1-55 of the International Commission on Illumination (CIE) ‘Uniform color space for industrial color difference evaluation’ [1].

Research on classical and modern datasets has shown that constant visual differences ($\Delta V$) with respect to a given colour centre do not correspond to constant computed colour differences ($\Delta E$) in a colour space [2,3]. Traditionally, points with a constant visual difference with respect to a fixed colour centre are considered to be placed on the surface of an ellipsoid in a given colour space, but the orientation, shape, and size of this ellipsoid change with the fixed colour centre [4]. In short, to date we do not have a uniform colour space that is well related to visual perception.

The CIELUV and CIELAB colour spaces, recommended by the CIE in 1976 [5], as well as other recent colour spaces [6,7] are only approximately uniform.

In any case, it is reasonable to assume that colour discrimination in a colour space changes in a smooth and regular way. Thus, for example, experimental colour discrimination ellipses reported in previous experiments [4,8,9], in each case follow a quite regular pattern in the CIE x, y chromaticity diagram, although relevant differences (attributable to different parametric factors such as viewing modes, sizes of colour differences, etc.) may be noted when comparing ellipses from different experiments. When data from different experimental datasets are put together to construct a combined dataset, the assumption of a regular overall pattern may become false in some regions of colour space, even if global scale factors are used for each individual dataset before combination. This might be the case in the development of the last CIE-recommended formula, CIEDE2000 [10], where four experimental datasets from different laboratories (BFD-P [9], Leeds [11], RIT-DuPont [12], and Witt [13]), using different illuminating and viewing conditions, were weighted and combined [14], constituting the so-called combined dataset (COM dataset).

*Corresponding author. Email: [email protected]
ISSN 0950–0340 print/ISSN 1362–3044 online. © 2009 Taylor & Francis.

After the development of CIEDE2000, it was found [15] that the original RIT-DuPont dataset [12] was not correctly employed. In this paper, we use the correct RIT-DuPont dataset, which leads to the corrected COM dataset [15]. Some characteristics of the individual datasets constituting this corrected COM dataset are shown in Table 1. For example, the BFD-P dataset is the result of the combination of three subsets (called BFD-D65, BFD-M and BFD-C), with colour differences in relatively different ranges. Threshold, suprathreshold, and colour pairs with large colour differences ($\Delta E^*_{ab} > 5$), were put together in BFD-P (as well as in the COM dataset), while the diverse performance of the human visual system for colour differences of very different sizes is not well known. Table 1 also provides STRESS (a recently proposed index for the measurement of the relationship between perceived and computed colour differences [16]), for the different datasets and three colour-difference formulas: CIELAB [5], CIEDE2000 [10], and DIN99d [17]. STRESS values are always in the range [0, 1], although they are usually given as percentages: perfect agreement between perceived and computed colour differences leads to a STRESS value of zero, higher STRESS values indicating worse agreement.
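As an illustration of how such an index can be computed, the following is a minimal sketch, assuming the usual definition of STRESS from [16]: $\mathrm{STRESS} = \left(\sum_i (\Delta E_i - F\,\Delta V_i)^2 / \sum_i F^2 \Delta V_i^2\right)^{1/2}$ with scale factor $F = \sum_i \Delta E_i^2 / \sum_i \Delta E_i \Delta V_i$. The formula is not reproduced in this paper, so it should be checked against [16]; the function name `stress` is our own.

```python
import math


def stress(delta_e, delta_v):
    """STRESS between computed (delta_e) and perceived (delta_v) differences.

    Assumed definition (see [16]): the result lies in [0, 1], with 0 meaning
    that delta_e and delta_v are exactly proportional.
    """
    if len(delta_e) != len(delta_v) or len(delta_e) == 0:
        raise ValueError("delta_e and delta_v must be non-empty and equally long")
    sum_ee = sum(e * e for e in delta_e)
    sum_ev = sum(e * v for e, v in zip(delta_e, delta_v))
    f = sum_ee / sum_ev  # scale factor between the two sets of differences
    numerator = sum((e - f * v) ** 2 for e, v in zip(delta_e, delta_v))
    denominator = sum((f * v) ** 2 for v in delta_v)
    return math.sqrt(numerator / denominator)
```

Multiplying the returned value by 100 gives the percentage form mentioned in the text.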

The goal of this paper is to analyse the corrected COM dataset and its individual subsets, using a fuzzy methodology to detect colour pairs having inconsistent values with respect to their neighbours. Fuzzy procedures have been successfully employed in different fields of colour science such as colour naming [18], and constitute a good tool to represent imprecise concepts expressed in natural language. Therefore, we will use fuzzy techniques to analyse the regularity of the distribution of the ratio between perceived and computed colour differences (i.e. $\Delta V/\Delta E$) in datasets employed at CIEDE2000 development. From previous papers [7,15], the four datasets employed at CIEDE2000 development (BFD-P, Leeds, RIT-DuPont and Witt) seem to have important differences, which could be better understood from current analyses.

2. Fuzzy consistency analysis

We undertake the analysis of the corrected COM dataset from a fuzzy approach in order to determine the degree to which the data in this set can be considered consistent. Since consistent is a linguistic term, it can be modelled by means of a fuzzy set. This means that data consistency is identified with the membership degree of the data to this fuzzy set.

The experimental dataset, which we denote by $S$, consists of a number of single experimental data, denoted as $S_i$, each of them representing the perceptual colour difference between two colour samples. Our objective is to determine the degree of consistency of each single experimental datum $S_i$. Our analysis is based on checking whether the perceptual colour difference observed between two colour samples agrees with the perceptual differences of near pairs of colour samples.

Each single experimental datum in $S$ is in turn represented as a set $S_i = \{A_i, B_i, C_i, R_i\}$, where $A_i$ and $B_i$ denote the CIELAB coordinates of the two colour samples, given by $A_i = \{L^{*}_{1,i}, a^{*}_{1,i}, b^{*}_{1,i}\}$ and $B_i = \{L^{*}_{2,i}, a^{*}_{2,i}, b^{*}_{2,i}\}$; $C_i$ denotes the mean point between $A_i$ and $B_i$, given by $C_i = (A_i + B_i)/2$; and $R_i$, which includes the perceptual colour difference mentioned above, denotes the ratio between the perceptual difference $\Delta V_i$ observed between $A_i$ and $B_i$ and the colour difference computed with the CIEDE2000 colour-difference formula ($\Delta E^{i}_{00}$), that is, $R_i = \Delta V_i / \Delta E^{i}_{00}$.

We use CIEDE2000 because it is the last CIE-recommended colour-difference formula to provide the best estimation of the perceptual colour difference between any two colour samples, under specific illuminating and viewing conditions, especially for small–medium colour differences ($\Delta E^*_{ab} < 5.0$ CIELAB units).
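The representation of a single datum can be sketched as a small record type. This is only an illustration: the class and field names are our own, and $\Delta E_{00}$ is assumed to be supplied already computed (the CIEDE2000 formula itself is not implemented here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColourPair:
    """One experimental datum S_i (hypothetical field names)."""
    lab_a: tuple      # CIELAB coordinates (L*, a*, b*) of sample A_i
    lab_b: tuple      # CIELAB coordinates (L*, a*, b*) of sample B_i
    delta_v: float    # perceived colour difference, Delta V_i
    delta_e00: float  # CIEDE2000 colour difference, assumed precomputed

    @property
    def centre(self):
        """Mean point C_i = (A_i + B_i) / 2 in CIELAB."""
        return tuple((p + q) / 2 for p, q in zip(self.lab_a, self.lab_b))

    @property
    def ratio(self):
        """R_i = Delta V_i / Delta E00_i."""
        return self.delta_v / self.delta_e00
```

For example, a made-up pair `ColourPair((50, 10, -10), (52, 12, -14), 1.5, 3.0)` has centre `(51.0, 11.0, -12.0)` and ratio `0.5`.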

Table 1. Some characteristics of the datasets employed at CIEDE2000 development.

                            CIELAB colour differences    STRESS
Dataset      Colour pairs   Max     Min     Average      CIELAB   CIEDE2000   DIN99d
COM          3813           18.2    0.04    2.6          0.43     0.29        0.31
BFD-D65      2028           16.1    0.04    2.6          0.41     0.24        0.26
BFD-M        548            18.2    0.05    5.2          0.43     0.35        0.37
BFD-C        200            3.9     0.07    0.9          0.54     0.29        0.29
BFD-P        2776           18.2    0.04    3.0          0.42     0.30        0.32
Leeds        307            4.7     0.40    1.6          0.40     0.19        0.23
RIT-DuPont   312            4.4     0.78    1.4          0.33     0.19        0.21
Witt         418            10.6    0.12    1.9          0.52     0.30        0.30

For each $S_i$, we first determine the set of near $S_j$ data. Again, near is a linguistic term and therefore it can be represented by a fuzzy set. Fuzzy sets, in turn, are represented by membership functions. Thus, the membership degree of $S_j$ to the fuzzy set of samples near $S_i$, named $N^{S_i}$, is given by

$$
N^{S_i}(S_j) = 1 - \mu(\|C_i - C_j\|) =
\begin{cases}
1 & \text{if } \|C_i - C_j\| \le \alpha, \\[4pt]
1 - 2\left(\dfrac{\|C_i - C_j\| - \alpha}{\beta - \alpha}\right)^{2} & \text{if } \alpha < \|C_i - C_j\| \le \dfrac{\alpha + \beta}{2}, \\[8pt]
2\left(\dfrac{\|C_i - C_j\| - \beta}{\beta - \alpha}\right)^{2} & \text{if } \dfrac{\alpha + \beta}{2} < \|C_i - C_j\| \le \beta, \\[8pt]
0 & \text{if } \|C_i - C_j\| > \beta,
\end{cases}
\tag{1}
$$

where $\mu$ is an S-type membership function [19] and $\|C_i - C_j\|$ denotes the CIELAB colour difference (Euclidean distance) between the mean points of $S_i$ and $S_j$. We consider two possibilities for this fuzzy set, one for taking into account only very near data associated with small to medium colour differences that we denote by $N^{S_i}_1$ and for which we set $\alpha_1 = 5$, $\beta_1 = 15$, and another one for considering larger colour differences, denoted by $N^{S_i}_2$, with $\alpha_2 = 5$, $\beta_2 = 50$. Figure 1 shows the curves associated with these membership functions.
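The neighbour membership can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are ours, and `alpha` and `beta` are assumed labels for the two parameters of the S-type function, set to (5, 15) for the first fuzzy set and (5, 50) for the second, as stated in the text.

```python
import math


def s_function(d, alpha, beta):
    """S-type function: 0 up to alpha, rising smoothly to 1 at beta."""
    if d <= alpha:
        return 0.0
    if d <= (alpha + beta) / 2:
        return 2 * ((d - alpha) / (beta - alpha)) ** 2
    if d <= beta:
        return 1 - 2 * ((d - beta) / (beta - alpha)) ** 2
    return 1.0


def nearness(d, alpha, beta):
    """Membership degree of a datum at centre distance d to the 'near' set."""
    return 1.0 - s_function(d, alpha, beta)


def centre_distance(c_i, c_j):
    """Euclidean CIELAB distance between two mean points (Python 3.8+)."""
    return math.dist(c_i, c_j)


N1_PARAMS = (5, 15)  # very near data, small to medium colour differences
N2_PARAMS = (5, 50)  # wider neighbourhood, larger colour differences
```

With these parameters, a neighbour whose centre is 10 CIELAB units away has membership 0.5 in the first set, while in the second set the membership only falls to 0.5 at 27.5 units.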

To check whether the datum $S_i$ agrees with its near neighbours represented by the fuzzy set $N^{S_i}$, we first compute the fuzzy mean and fuzzy standard deviation of the $R$ components of the data in $N^{S_i}$, which we denote by $\tilde{R}_i$ and $\tilde{\sigma}_i$, respectively. These statistics are computed analogously to the classical mean and standard deviation, but weighting the contribution of each datum $S_j$ according to its membership degree $N^{S_i}(S_j)$, as follows:

$$
\tilde{R}_i = \frac{\sum_{S_j \in S,\, S_j \neq S_i} N^{S_i}_1(S_j)\, R_j}{\sum_{S_j \in S,\, S_j \neq S_i} N^{S_i}_1(S_j)}, \tag{2}
$$

and

$$
\tilde{\sigma}_i = \left( \frac{\sum_{S_j \in S,\, S_j \neq S_i} N^{S_i}_1(S_j)\, (R_j - \tilde{R}_i)^2}{\sum_{S_j \in S,\, S_j \neq S_i} N^{S_i}_1(S_j)} \right)^{1/2}, \tag{3}
$$

and analogously for $N^{S_i}_2$.

The assumption behind our analysis is that $R_i$ should be similar to the corresponding $R$ components of data $S_j$ near $S_i$. To measure the degree to which $S_i$ is consistent with respect to its near neighbours in $N^{S_i}$, we will compare the value of $R_i$ with the mean value observed in $N^{S_i}$, i.e. $\tilde{R}_i$. However, we also wish to take into account the standard deviation $\tilde{\sigma}_i$ of the $R_j$ values in $N^{S_i}$. To achieve this, we propose to use a fuzzy metric [20–23]. Fuzzy metrics are functions of the form $M(x, y, t)$ that measure the degree of nearness between $x$ and $y$ with respect to a contextual parameter $t > 0$ in a fuzzy way [20–23]. In particular, we compute this nearness according to the best-known fuzzy metric, given by

$$
FM(x, y, t) = \frac{t}{t + |x - y|}. \tag{4}
$$

$FM(R_i, \tilde{R}_i, \tilde{\sigma}_i)$ provides a value in $]0, 1]$ that can be interpreted as the degree to which $R_i$ is near to $\tilde{R}_i$, i.e. the degree to which the value of $R_i$ agrees with the values of its nearest neighbours in $N^{S_i}$. A high value (close to 1) of $FM(R_i, \tilde{R}_i, \tilde{\sigma}_i)$ indicates that the agreement is good and the value of $R_i$ is as might be expected. On the other hand, if $FM(R_i, \tilde{R}_i, \tilde{\sigma}_i)$ is lower, this implies that the agreement is worse and therefore $R_i$ may be noisy or inconsistent. We have used this fuzzy metric because it is based on the usual metric in $\mathbb{R}$ (absolute value) [20–22], and because it has been successfully applied in other engineering problems [23,24]. In addition, it has the property that, when the absolute difference between $R_i$ and $\tilde{R}_i$ equals $\tilde{\sigma}_i$, the value given by FM is exactly 0.5. This implies that when the difference is lower than the sample fuzzy standard deviation $\tilde{\sigma}_i$, the degree of consistency is closer to 1 than to 0, and, if the difference is larger than the standard deviation, then the degree of consistency is closer to 0 than to 1.

In this way, since the FM value for the datum $S_i$ is given by $FM(R_i, \tilde{R}_i, \tilde{\sigma}_i)$, we can identify the degree of consistency of the sample $S_i$ that we were seeking with the fuzzy agreement given by $FM(R_i, \tilde{R}_i, \tilde{\sigma}_i)$.
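The computation described in this section (membership-weighted mean and standard deviation of the neighbours' ratios, followed by the fuzzy metric) can be sketched as follows. The function names are ours, and the neighbour weights are assumed to have been obtained beforehand from the membership function of the chosen fuzzy neighbour set, with the datum itself excluded.

```python
import math


def fuzzy_mean_std(ratios, weights):
    """Membership-weighted mean and standard deviation of neighbour ratios."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("the datum has no neighbours with non-zero membership")
    mean = sum(w * r for w, r in zip(weights, ratios)) / total
    variance = sum(w * (r - mean) ** 2 for w, r in zip(weights, ratios)) / total
    return mean, math.sqrt(variance)


def fm(x, y, t):
    """Fuzzy metric t / (t + |x - y|); requires t > 0."""
    if t <= 0:
        raise ValueError("the contextual parameter t must be positive")
    return t / (t + abs(x - y))


def consistency(r_i, neighbour_ratios, neighbour_weights):
    """Degree of consistency of a datum with ratio r_i, in ]0, 1]."""
    mean, std = fuzzy_mean_std(neighbour_ratios, neighbour_weights)
    return fm(r_i, mean, std)  # raises if all neighbours share one ratio
```

As a check of the property stated above, two equally weighted neighbours with ratios 1 and 3 give a fuzzy mean of 2 and a fuzzy standard deviation of 1, so a datum with ratio 3 lies exactly one standard deviation from the mean and `consistency(3.0, [1.0, 3.0], [1.0, 1.0])` evaluates to 0.5.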

Figure 1. Membership functions $N^{S_i}_1$ (solid) and $N^{S_i}_2$ (dotted) used to represent the fuzzy set of neighbours of sample $S_i$. [Horizontal axis: $\|C_i - C_j\|$, from 0 to 60; vertical axis: membership degree, from 0 to 1.]


3. Results of fuzzy consistency analysis and discussion

We performed the proposed fuzzy consistency analysis on the corrected COM dataset. In so doing, we computed the FM values of all data $S_i$ in the set using the two possible fuzzy sets $N^{S_i}_1$ and $N^{S_i}_2$ to represent the near neighbours of each $S_i$. The results are shown in Figure 2 by means of two accumulated histograms of the FM values. This figure represents the number of $S_i$ data which have FM values lower than a threshold ranging from 0.01 to 1. For example, in the $N^{S_i}_1$ case, we have about 750 data $S_i$ with FM values lower than 0.5. It can be seen that in most of the cases the FM values are high, but for some data FM values are quite low (lower than 0.20). Also, we see that, in general, FM values are higher for the $N^{S_i}_2$ case than for the $N^{S_i}_1$ case. This may be due to the agreement condition imposed by $N^{S_i}_1$, which is more restrictive than for $N^{S_i}_2$. However, it also means that if we take into account only small colour differences, it is easier to find disagreements (low values of FM) in the data.

Next, we assessed the effect of removing from the dataset those single data for which the FM value is lower than a given threshold $T$. Once these data were removed, we computed the STRESS [16] value given by the colour-difference formulas CIEDE2000 ($\Delta E_{00}$), CIELAB ($\Delta E^*_{ab}$), and DIN99d ($\Delta E_{99d}$) for the remainder of the dataset. The FM values were computed using the ratio $\Delta V/\Delta E_{00}$, so that each FM value depends on both $\Delta V$ and $\Delta E_{00}$. Thus, if FM is low, it may be due to inconsistency related to $\Delta V$ or to $\Delta E_{00}$. To distinguish between these two cases, we checked the STRESS values of the three above-mentioned colour-difference formulas. We expect that, if the removed data are inconsistent because of $\Delta V$, the STRESS should decrease similarly for the three formulas. However, if the cause of the inconsistency is $\Delta E_{00}$, then STRESS for $\Delta E_{00}$ may decrease but STRESS for $\Delta E^*_{ab}$ and $\Delta E_{99d}$ will not decrease or, at least, will not decrease to the same degree. We performed this study by varying the value of $T$ from 0.01 to 0.70 with a step of 0.01. The results are shown in Figure 3 for both $N^{S_i}_1$ and $N^{S_i}_2$.

Figure 2. Accumulated histograms for the FM values ($\times 10^2$) computed using the fuzzy sets $N^{S_i}_1$ (solid line) and $N^{S_i}_2$ (dotted line). [Horizontal axis: FM value ($\times 10^2$); vertical axis: number of single data.]

Figure 3. STRESS value of $\Delta E_{00}$ (solid), $\Delta E^*_{ab}$ (dashed), and $\Delta E_{99d}$ (dotted) when removing from the experimental set all single data with FM ($\times 10^2$) lower than $T$ for (a) $N^{S_i}_1$, and (b) $N^{S_i}_2$ fuzzy neighbour sets. [Horizontal axis: value of threshold $T$ for FM; vertical axis: STRESS.]

The curves in Figure 3 represent the STRESS values found for the dataset after eliminating the data $S_i$ that give FM values lower than threshold $T$, which varies in [0.01, 0.70]. For instance, in the $N^{S_i}_1$ case, if the $S_i$ with FM values lower than 0.5 are removed from the dataset, CIEDE2000 gives a STRESS value of approximately 0.25 for the rest of the dataset, whereas for the whole dataset the STRESS was about 0.29. Figure 3 also shows that when we remove the data with FM up to approximately $T = 0.3$, the three colour-difference formulas exhibit the same behaviour: a very slight decrease in STRESS values, as expected given that this index is not very sensitive to outliers [16]. On the other hand, for $T > 0.4$, we observe clear differences amongst the three formulas. This fact may be interpreted as indicating that inconsistency due to $\Delta V$ values is associated with FM values lower than $T = 0.3$, whereas for higher values of $T$ the disagreement is most probably due to relevant difference(s) amongst the tested formulas.
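The threshold study can be sketched as a simple sweep. This is an illustration under stated assumptions: the function names are ours, the STRESS helper uses the usual definition from [16] (not reproduced in this paper), and the FM values are assumed to be already available for each datum. Running it once per colour-difference formula gives one curve per formula.

```python
import math


def stress(delta_e, delta_v):
    """STRESS in [0, 1], assuming the usual definition from [16]."""
    f = sum(e * e for e in delta_e) / sum(e * v for e, v in zip(delta_e, delta_v))
    numerator = sum((e - f * v) ** 2 for e, v in zip(delta_e, delta_v))
    denominator = sum((f * v) ** 2 for v in delta_v)
    return math.sqrt(numerator / denominator)


def stress_after_removal(fm_values, delta_e, delta_v, thresholds):
    """For each threshold T, drop data with FM < T and score the remainder."""
    results = []
    for t in thresholds:
        kept = [(e, v) for fm, e, v in zip(fm_values, delta_e, delta_v) if fm >= t]
        if len(kept) < 2:
            results.append((t, None))  # too few data left to compute STRESS
            continue
        kept_e, kept_v = zip(*kept)
        results.append((t, stress(kept_e, kept_v)))
    return results


# Thresholds used in the text: 0.01 to 0.70 with a step of 0.01.
THRESHOLDS = [k / 100 for k in range(1, 71)]
```

Comparing the resulting curves for the three formulas is what separates the two cases discussed above: a similar decrease for all three points to $\Delta V$, while a decrease for CIEDE2000 alone points to $\Delta E_{00}$.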

3.1. Analysis of COM subsets

As mentioned above, the corrected COM dataset includes the results of experiments from four different sources. Thus, we can identify four different subsets in corrected COM, named BFD-P (which in turn is composed of another three subsets: BFD-D65, BFD-C, and BFD-M), Leeds, RIT-DuPont, and Witt. For each subset, we performed the fuzzy consistency analysis in terms of the FM values. We again used both $N^{S_i}_1$ and $N^{S_i}_2$ and computed the FM values. As above, the results are represented with an accumulated histogram for each subset. Also, we performed the STRESS-based assessment as described in the previous section. Figures 4–7 show these results.

By analysing the accumulated histograms of FM values of each subset, we find that the FM values in the Leeds and RIT-DuPont subsets are in general larger than in the BFD-P and Witt subsets. This is shown in Figures 5(a) and 6(a), revealing that Leeds and RIT-DuPont give almost no data with FM values lower than 0.25. In particular, the lowest FM values in general are observed for the BFD-P and Witt subsets, implying that BFD-P and Witt subsets by themselves are less consistent than Leeds and RIT-DuPont. With respect to the STRESS-based assessment, results on the subsets agree in general with the observations on the whole corrected COM dataset except for the Leeds and RIT-DuPont subsets, where we see that inconsistencies in the $\Delta E_{00}$ begin to appear for values of $T$ larger than about 0.2 while only FM values lower than 0.2 are related to the inconsistency of the $\Delta V$ values.
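The accumulated histograms used for this comparison simply count, for each threshold, how many data have an FM value below it. A minimal sketch follows; the function name is ours.

```python
from bisect import bisect_left


def accumulated_histogram(fm_values, thresholds):
    """Number of data with FM strictly below each threshold."""
    ordered = sorted(fm_values)
    # bisect_left gives the index of the first value >= t,
    # which equals the count of values strictly below t.
    return [bisect_left(ordered, t) for t in thresholds]
```

Applying the function separately to the FM values of each subset would give one curve per subset, as in panels (a) of Figures 4–7.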

Figure 4. (a) Accumulated histograms for the FM values ($\times 10^2$) for (solid line) $N^{S_i}_1$, and (dotted line) $N^{S_i}_2$ for the data in the BFD-P subset. (b) and (c) show STRESS value of $\Delta E_{00}$ (solid), $\Delta E^*_{ab}$ (dashed), and $\Delta E_{99d}$ (dotted) when removing from the experimental set all single data with FM value lower than $T$ for (b) $N^{S_i}_1$, and (c) $N^{S_i}_2$. [Axes: (a) FM value ($\times 10^2$) against number of single data; (b), (c) value of threshold $T$ for FM against STRESS.]

Figure 5. (a) Accumulated histograms for the FM values ($\times 10^2$) for (solid line) $N^{S_i}_1$, and (dotted line) $N^{S_i}_2$ for the data in the Leeds subset. (b) and (c) show STRESS value of $\Delta E_{00}$ (solid), $\Delta E^*_{ab}$ (dashed), and $\Delta E_{99d}$ (dotted) when removing from the experimental set all single data with FM value lower than $T$ for (b) $N^{S_i}_1$, and (c) $N^{S_i}_2$. [Axes: (a) FM value ($\times 10^2$) against number of single data; (b), (c) value of threshold $T$ for FM against STRESS.]

Figure 6. (a) Accumulated histograms for the FM values ($\times 10^2$) for (solid line) $N^{S_i}_1$, and (dotted line) $N^{S_i}_2$ for the data in the RIT-DuPont subset. (b) and (c) show STRESS value of $\Delta E_{00}$ (solid), $\Delta E^*_{ab}$ (dashed), and $\Delta E_{99d}$ (dotted) when removing from the experimental set all single data with FM value lower than $T$ for (b) $N^{S_i}_1$, and (c) $N^{S_i}_2$. [Axes: (a) FM value ($\times 10^2$) against number of single data; (b), (c) value of threshold $T$ for FM against STRESS.]

3.2. Selection of candidates to be considered for removal in the datasets

To select a set of inconsistent candidates to be removed from the different datasets, we inspected the FM values found for all $S_i$. In Tables 2–5 we show the $S_i$ data giving the lowest FM in each of the four main subsets of the corrected COM dataset: BFD-P (pairs 1 to 2776), Leeds (pairs 2777 to 3083), RIT-DuPont (pairs 3084 to 3395), and Witt (pairs 3396 to 3813). We chose 50 pairs for BFD-P (Table 2) and 15 pairs for each one of the three remaining subsets (Tables 3 to 5). Note that the lowest mean FM values are 0.249 and 0.389 for the Leeds and RIT-DuPont datasets, respectively, while these values are considerably lower for the BFD-P and Witt datasets: 0.015 and 0.093, respectively. On this basis, we selected as candidates to be removed those $S_i$ with a mean FM value for $N^{S_i}_1$ and $N^{S_i}_2$ lower than 0.2. We also selected all $S_i$ with a mean FM value lower than 0.3 whose FM value for $N^{S_i}_1$, where the sensitivity to local heterogeneities is greater, was lower than 0.2. These thresholds were set according to the previous STRESS-based analysis to avoid removing data for which the FM values were low due to some inconsistency caused by $\Delta E_{00}$. The selected data and their corresponding FM values are given in Table 6, where we see that some data have a low FM in their subset but a higher FM in the corrected COM dataset. Also, some data with low FM in the corrected COM dataset have a higher FM in their subset. When the selected data (Table 6) were removed from the corrected COM dataset, the STRESS value of $\Delta E_{00}$ dropped from 29.20% to 28.77%. Since we are removing only 29 data (0.76% of the dataset) and STRESS is not very sensitive to outliers [16], the STRESS value does not notably decrease (a relative decrease of only 1.47%). On the other hand, Figure 8 shows that data for which the mean FM is low correspond to cases of low colour differences for which the perceptual colour difference $\Delta V$ is overestimated. This result agrees with the ones reported in [25].
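The selection rule described above can be written as a short predicate. This is a sketch with a function name and default arguments of our own choosing; the two inputs are the FM values obtained with the first and second fuzzy neighbour sets.

```python
def is_removal_candidate(fm_n1, fm_n2, low=0.2, high=0.3):
    """True if a datum meets the removal criteria stated in the text.

    Criteria: mean FM below `low`, or mean FM below `high` together with
    an FM value for the first (more restrictive) neighbour set below `low`.
    """
    mean_fm = (fm_n1 + fm_n2) / 2
    return mean_fm < low or (mean_fm < high and fm_n1 < low)
```

Applied to the FM values listed in Tables 2 and 3, the rule reproduces the asterisks shown there: pair 2738 (0.018, 0.013) and pair 2043 (0.192, 0.248) are selected, as is Leeds pair 2945 (0.193, 0.304), whereas pair 2530 (0.218, 0.199) and pair 2624 (0.153, 0.464) are not.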

Figure 7. (a) Accumulated histograms for the FM values ($\times 10^2$) for (solid line) $N^{S_i}_1$, and (dotted line) $N^{S_i}_2$ for the data in the Witt subset. (b) and (c) show STRESS value of $\Delta E_{00}$ (solid), $\Delta E^*_{ab}$ (dashed), and $\Delta E_{99d}$ (dotted) when removing from the experimental set all single data with FM value lower than $T$ for (b) $N^{S_i}_1$, and (c) $N^{S_i}_2$. [Axes: (a) FM value ($\times 10^2$) against number of single data; (b), (c) value of threshold $T$ for FM against STRESS.]

4. Conclusions

In this paper, we employed fuzzy methodology to analyse the consistency of experimental data (corrected COM dataset) employed at the development of the CIEDE2000 colour-difference formula. The method analyses the regularity of the distribution of the ratio between perceived and computed colour differences (i.e. $\Delta V/\Delta E$) in the corrected COM dataset and its four main subsets. In this way, for each single experimental datum in a given dataset, we have computed the fuzzy degree to which it can be considered consistent. Thus, we have introduced a methodology which can be used to analyse inconsistencies in experimental datasets of perceived colour differences.

This methodology identified the data which did not correlate well with their neighbours. These data seem to correspond mainly to cases where $\Delta V$ is overestimated, in agreement with previous findings [25]. We have been trying to find other relationships between these inconsistent data, but with no clear result. By analysing the subsets comprising the corrected COM dataset, we found that the consistency of the data included in the Leeds and RIT-DuPont subsets is in general higher than that of the data in the BFD-P and Witt subsets. We observed that, unlike the BFD-P and Witt subsets, in the Leeds and RIT-DuPont subsets there are almost no data with low degrees of consistency. Results from 13 experiments were put together in

Table 2. Single experimental data in the BFD-P dataset with the lowest FM values. *Indicates selected data considered for removal in this dataset: mean FM values lower than 0.2, or mean FM values lower than 0.3 with FM value for $N^{S_i}_1$ lower than 0.2.

Subset     Pair number   FM for N1   FM for N2   Mean FM   ΔE00    ΔV
BFD-M      2738*         0.018       0.013       0.015     0.024   0.630
BFD-M      2401*         0.045       0.038       0.041     0.164   1.888
BFD-M      2706*         0.046       0.037       0.042     0.162   1.713
BFD-D65    1330*         0.040       0.076       0.058     0.031   0.203
BFD-M      2683*         0.109       0.057       0.083     0.160   1.430
BFD-C      2153*         0.079       0.104       0.091     0.050   0.215
BFD-M      2484*         0.125       0.089       0.107     0.584   3.081
BFD-M      2554*         0.132       0.145       0.138     0.163   0.707
BFD-M      2342*         0.178       0.149       0.163     0.287   1.072
BFD-M      2432*         0.134       0.198       0.166     0.662   1.987
BFD-M      2473*         0.206       0.129       0.168     0.560   2.203
BFD-D65    770*          0.098       0.244       0.171     0.222   0.550
BFD-M      2492*         0.211       0.132       0.172     1.007   3.898
BFD-M      2703*         0.204       0.141       0.172     0.548   2.166
BFD-M      2709*         0.204       0.141       0.173     0.433   1.712
BFD-D65    612*          0.147       0.204       0.175     0.541   1.442
BFD-M      2380*         0.229       0.147       0.188     0.461   2.057
BFD-M      2337*         0.207       0.171       0.189     0.208   0.708
BFD-M      2353*         0.187       0.209       0.198     1.358   3.566
BFD-M      2530          0.218       0.199       0.208     0.259   0.873
BFD-D65    1742          0.248       0.182       0.215     0.039   0.124
BFD-C      2043*         0.192       0.248       0.220     0.104   0.235
BFD-C      2095*         0.125       0.319       0.222     0.056   0.127
BFD-M      2766          0.214       0.249       0.232     0.510   1.199
BFD-D65    1964*         0.186       0.278       0.232     0.102   0.224
BFD-M      2699          0.273       0.190       0.232     0.431   1.338
BFD-D65    1111          0.201       0.264       0.233     0.728   1.797
BFD-C      2212*         0.122       0.354       0.238     0.127   0.271
BFD-M      2666          0.324       0.163       0.243     0.078   0.359
BFD-D65    1796*         0.128       0.360       0.244     0.053   0.146
BFD-C      2147*         0.139       0.355       0.247     0.060   0.122
BFD-M      2344          0.279       0.225       0.252     0.266   0.748
BFD-D65    615           0.255       0.265       0.260     0.579   1.344
BFD-M      2368          0.268       0.260       0.264     1.412   3.155
BFD-D65    1945          0.213       0.315       0.264     0.105   0.212
BFD-D65    613           0.247       0.285       0.266     0.808   1.736
BFD-D65    1898          0.217       0.321       0.269     0.060   0.120
BFD-M      2776          0.254       0.289       0.272     0.877   1.867
BFD-D65    1950          0.220       0.325       0.273     0.080   0.159
BFD-M      2702          0.327       0.228       0.278     0.339   0.916
BFD-M      2288          0.288       0.282       0.285     1.022   2.274
BFD-M      2254          0.266       0.335       0.300     0.580   1.156
BFD-M      2452          0.247       0.358       0.303     0.818   1.334
BFD-C      2044          0.278       0.330       0.304     0.227   0.424
BFD-C      2045          0.279       0.331       0.305     0.208   0.386
BFD-D65    637           0.281       0.332       0.306     0.880   1.554
BFD-M      2624          0.153       0.464       0.308     0.901   2.363
BFD-M      2773          0.295       0.322       0.308     1.498   2.968
BFD-M      2685          0.411       0.212       0.311     0.611   2.257
BFD-D65    717           0.264       0.362       0.313     0.473   0.820

Table 4. Single experimental data in the RIT-DuPont dataset with the lowest FM values. No data in this dataset meet the criterion for removal: a mean FM value lower than 0.2, or a mean FM value lower than 0.3 with an FM value for $N_1^{S_i}$ lower than 0.2.

Pair number   FM for $N_1^{S_i}$   FM for $N_2^{S_i}$   Mean FM   ΔE00    ΔV
3357          0.266                0.511                0.389     0.669   1.020
3343          0.381                0.398                0.389     0.651   1.020
3192          0.314                0.468                0.391     1.618   1.020
3187          0.375                0.408                0.392     0.648   1.020
3201          0.272                0.517                0.394     0.673   1.020
3138          0.358                0.462                0.410     1.750   1.020
3228          0.377                0.475                0.426     1.966   1.020
3361          0.384                0.477                0.431     0.726   1.020
3348          0.373                0.502                0.438     1.500   1.020
3384          0.392                0.485                0.439     1.869   1.020
3360          0.353                0.536                0.445     1.461   1.020
3205          0.408                0.485                0.446     0.733   1.020
3294          0.401                0.493                0.447     1.608   1.020
3204          0.365                0.552                0.458     1.419   1.020
3328          0.464                0.507                0.485     1.322   1.020

Table 3. Single experimental data in the Leeds dataset with the lowest FM values. *Indicates selected data considered for removal in this dataset: mean FM values lower than 0.2, or mean FM values lower than 0.3 with FM value for $N_1^{S_i}$ lower than 0.2.

Pair number   FM for $N_1^{S_i}$   FM for $N_2^{S_i}$   Mean FM   ΔE00    ΔV
2945*         0.193                0.304                0.249     0.337   0.773
3024          0.209                0.309                0.259     0.448   0.912
2890          0.248                0.363                0.305     0.448   0.820
3011          0.205                0.486                0.346     0.459   0.936
2961          0.322                0.470                0.396     0.627   1.006
2846          0.373                0.474                0.423     1.089   0.661
3012          0.271                0.581                0.426     0.639   1.103
2953          0.362                0.513                0.437     0.576   0.922
3008          0.282                0.598                0.440     0.634   1.071
2851          0.360                0.526                0.443     1.330   0.950
3032          0.397                0.494                0.445     1.858   0.992
2925          0.403                0.490                0.446     1.029   0.726
2786          0.300                0.598                0.449     0.988   0.522
3042          0.419                0.511                0.465     1.029   0.762
2957          0.387                0.549                0.468     0.939   1.378


Table 5. Single experimental data in the Witt dataset with the lowest FM values. *Indicates selected data considered for removal in this dataset: mean FM values lower than 0.2, or mean FM values lower than 0.3 with FM value for $N_1^{S_i}$ lower than 0.2.

Pair number   FM for $N_1^{S_i}$   FM for $N_2^{S_i}$   Mean FM   ΔE00    ΔV
3421*         0.099                0.088                0.093     0.086   0.500
3790*         0.166                0.146                0.156     0.139   0.564
3471*         0.150                0.228                0.189     0.129   0.327
3406          0.260                0.230                0.245     0.251   0.720
3446          0.291                0.215                0.253     0.268   0.750
3529          0.248                0.261                0.255     0.103   0.271
3759          0.278                0.245                0.261     0.314   0.857
3416          0.294                0.260                0.277     0.214   0.560
3453          0.333                0.249                0.291     0.396   0.991
3548          0.310                0.275                0.293     0.266   0.668
3452          0.339                0.253                0.296     0.340   0.840
3547          0.320                0.284                0.302     0.293   0.720
3411          0.334                0.296                0.315     0.274   0.651
3461          0.268                0.392                0.330     0.184   0.323
3454          0.378                0.284                0.331     0.555   1.258
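
The selection rule quoted in the captions of Tables 3 to 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function and constant names are hypothetical, and the mean FM is assumed to be the arithmetic mean of the two tabulated FM values (an assumption that agrees with the tabulated means to within rounding). The sample rows are taken from Table 5 (Witt) and Table 3 (Leeds).

```python
# Hypothetical names; the thresholds are the ones quoted in the table captions.
MEAN_FM_STRICT = 0.2  # mean FM below this value: considered for removal
MEAN_FM_LOOSE = 0.3   # mean FM below this value AND FM for N1 below FM_N1_LIMIT
FM_N1_LIMIT = 0.2


def mean_fm(fm_n1: float, fm_n2: float) -> float:
    """Arithmetic mean of the two FM values (assumed definition of 'Mean FM')."""
    return (fm_n1 + fm_n2) / 2.0


def considered_for_removal(fm_n1: float, fm_n2: float) -> bool:
    """Apply the caption criterion to one experimental datum."""
    m = mean_fm(fm_n1, fm_n2)
    return m < MEAN_FM_STRICT or (m < MEAN_FM_LOOSE and fm_n1 < FM_N1_LIMIT)


# (pair number, FM for N1, FM for N2), copied from Tables 5 and 3.
rows = [
    (3421, 0.099, 0.088),  # Witt
    (3471, 0.150, 0.228),  # Witt
    (3406, 0.260, 0.230),  # Witt
    (2945, 0.193, 0.304),  # Leeds
    (3024, 0.209, 0.309),  # Leeds
]

flagged = [pair for pair, f1, f2 in rows if considered_for_removal(f1, f2)]
print(flagged)  # [3421, 3471, 2945]
```

The flagged pairs coincide with the rows marked with an asterisk in the tables. Because the tabulated values are rounded to three decimals, a datum lying exactly on a threshold could be classified differently from the published selection.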

Table 6. Single experimental data $S_i$ selected to be removed from the COM dataset and FM values computed for them. Low FM values of these data are interpreted as a lack of correlation with respect to near samples.

                        FM in COM dataset                     FM in corresponding subset
Pair number   Subset    $N_1^{S_i}$ case   $N_2^{S_i}$ case   $N_1^{S_i}$ case   $N_2^{S_i}$ case
612           BFD-D65   0.147              0.204              0.154              0.155
770           BFD-D65   0.098              0.244              0.093              0.190
1330          BFD-D65   0.040              0.076              0.040              0.048
1796          BFD-D65   0.128              0.360              0.138              0.131
1964          BFD-D65   0.186              0.278              0.185              0.235
2043          BFD-C     0.192              0.248              0.221              0.280
2095          BFD-C     0.125              0.319              0.125              0.125
2147          BFD-C     0.139              0.355              0.146              0.167
2153          BFD-C     0.079              0.104              0.146              0.096
2212          BFD-C     0.122              0.354              0.089              0.149
2337          BFD-M     0.207              0.171              0.251              0.251
2342          BFD-M     0.178              0.149              0.216              0.216
2353          BFD-M     0.187              0.209              0.187              0.319
2380          BFD-M     0.229              0.147              0.413              0.321
2401          BFD-M     0.045              0.038              0.070              0.074
2432          BFD-M     0.134              0.198              0.134              0.344
2473          BFD-M     0.206              0.129              0.263              0.221
2484          BFD-M     0.125              0.089              0.160              0.149
2492          BFD-M     0.211              0.132              0.269              0.228
2554          BFD-M     0.132              0.145              0.132              0.300
2683          BFD-M     0.109              0.057              0.109              0.074
2703          BFD-M     0.204              0.141              0.518              0.431
2706          BFD-M     0.046              0.037              0.094              0.079
2709          BFD-M     0.204              0.141              0.519              0.431
2738          BFD-M     0.018              0.013              0.020              0.018
2945          Leeds     0.193              0.304              0.116              0.122
3421          Witt      0.099              0.088              0.109              0.109
3471          Witt      0.150              0.228              0.117              0.194
3790          Witt      0.166              0.146              0.209              0.209


BFD-P [9], which may explain the lowest consistency found for this subset.

Acknowledgements

Samuel Morillas acknowledges the support of Generalitat Valenciana under Grants BEST/2008/144 and GVPRE/2008/257. The authors from the University of Granada are grateful to research project FIS2007-64266, Ministerio de Educación y Ciencia, Spain, with Fondo Europeo de Desarrollo Regional (FEDER) support.

References

[1] Melgosa, M. Color Res. Appl. 2007, 32, 159.

[2] Melgosa, M.; Hita, E.; Romero, J.; Jimenez del Barco, L. J. Opt. Soc. Am. A 1992, 9, 1247–1254.

[3] Melgosa, M.; Quesada, J.J.; Hita, E. Appl. Opt. 1994, 33, 8069–8077.

[4] Melgosa, M.; Hita, E.; Poza, A.J.; Alman, D.H.; Berns, R.S. Color Res. Appl. 1997, 22, 148–155.

[5] CIE Publication 15:2004. Colorimetry, 3rd ed.; CIE Central Bureau: Vienna, 2004.

[6] Luo, M.R.; Cui, G.; Li, C. Color Res. Appl. 2006, 31, 320–330.

[7] Oleari, C.; Melgosa, M.; Huertas, R. J. Opt. Soc. Am. A 2009, 26, 121–134.

[8] MacAdam, D.L. J. Opt. Soc. Am. 1942, 32, 247–274.

[9] Luo, M.R.; Rigg, B. Color Res. Appl. 1986, 11, 25–42.

[10] CIE Publication 142-2001. Improvement to Industrial Colour-difference Evaluation; CIE Central Bureau: Vienna, 2001.

[11] Kim, D.H.; Nobbs, J. New weighting functions for the weighted CIELAB color difference formula. Proceedings of AIC Colour 97; Color Science Association of Japan: Kyoto, Japan, May 25–30, 1997; Vol. 1, pp 446–449.

[12] Berns, R.S.; Alman, D.H.; Reniff, L.; Snyder, G.D.; Balonon-Rosen, M.R. Color Res. Appl. 1991, 16, 297–315.

[13] Witt, K. Color Res. Appl. 1999, 24, 78–92.

[14] Luo, M.R.; Cui, G.; Rigg, B. Color Res. Appl. 2001, 26, 340–350.

[15] Melgosa, M.; Huertas, R.; Berns, R.S. J. Opt. Soc. Am. A 2008, 25, 1828–1834.

[16] García, P.A.; Huertas, R.; Melgosa, M.; Cui, G. J. Opt. Soc. Am. A 2007, 24, 1823–1829.

[17] Cui, G.; Luo, M.R.; Rigg, B.; Roesler, G.; Witt, K. Color Res. Appl. 2002, 27, 282–290.

[18] Benavente, R.; Vanrell, M.; Baldrich, R. J. Opt. Soc. Am. A 2008, 25, 2582–2593.

[19] Kerre, E.E. Fuzzy Sets and Approximate Reasoning; Xian Jiaotong University Press: Jiaotong, 1998.

[20] George, A.; Veeramani, P. Fuzzy Sets and Systems. 1994, 64, 395–399.

[21] Gregori, V.; Romaguera, S. Fuzzy Sets and Systems. 2000, 115, 485–489.

[22] Gregori, V.; Romaguera, S. Fuzzy Sets and Systems. 2004, 144, 411–420.

[23] Morillas, S. Fuzzy Metrics and Fuzzy Logic for Colour Image Filtering. Ph.D. Thesis, Universidad Politécnica de Valencia, Valencia, 2007.

[24] Morillas, S.; Gregori, V.; Peris-Fajarnes, G.; Sapena, A. J. Electron. Imaging. 2007, 16, 33007.

[25] Melgosa, M.; Huertas, R.; García, P.A. Analysis of color-differences with different magnitudes using the visual data employed at CIEDE2000 development. Proceedings of AIC Colour 2008, Stockholm, Sweden, June 15–18, 2008; pp 279–280.

Figure 8. ΔV versus ΔE00 for all data in corrected COM. Data with the lowest mean FM for $N_1^{S_i}$ and $N_2^{S_i}$ in corrected COM correspond to cases of low colour difference for which ΔV is overestimated. On the other hand, data with the highest FM seem to match the cases of best linear correlation.
