Distinguishability quantification of fuzzy sets

This article was originally published in a journal published by Elsevier, and the attached copy is provided by Elsevier for the author’s benefit and for the benefit of the author’s institution, for non-commercial research and educational use including without limitation use in instruction at your institution, sending it to specific colleagues that you know, and providing a copy to your institution’s administrator. All other uses, reproduction and distribution, including without limitation commercial reprints, selling or licensing copies or access, or posting on open internet sites, your personal or institution’s website or repository, are prohibited. For exceptions, permission may be sought for such use through Elsevier’s permissions site at: http://www.elsevier.com/locate/permissionusematerial


Distinguishability quantification of fuzzy sets

Corrado Mencar *, Giovanna Castellano, Anna M. Fanelli

Department of Informatics, University of Bari, Campus Universitario, V.E. Orabona 4, 70126 Bari, BA, Italy

Received 25 July 2005; received in revised form 17 March 2006; accepted 11 April 2006

Abstract

Distinguishability is a semantic property of fuzzy sets that has great relevance in the design of interpretable fuzzy models. Distinguishability has been mathematically defined through different measures, which are addressed in this paper. Special emphasis is given to similarity, which exhibits sound theoretical properties but is usually computationally intensive to calculate, and to possibility, which can be calculated very efficiently but does not exhibit the same properties as similarity. It is shown that under mild conditions, usually met in interpretable fuzzy modeling, possibility can be used as a valid measure for assessing distinguishability, thus overcoming the computational inefficiencies of similarity measures. Moreover, procedures that minimize possibility also minimize similarity and, consequently, improve distinguishability. In this sense, the use of possibility is fully justified in interpretable fuzzy modeling.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Similarity measures; Possibility measure; Interpretable fuzzy modeling; Distinguishability of fuzzy sets

1. Introduction

Interpretability is one of the most central issues in fuzzy modeling and has recently led to the development of flourishing new research directions (see, e.g. [3] for some relevant contributions). While accuracy was the main concern of the first fuzzy model builders, in recent years interpretability has been recognized as the key feature of fuzzy models in the context of Soft Computing [2].

A common approach for designing interpretable fuzzy models is based on a collection of interpretability constraints, i.e. a set of formal properties that are imposed on model components (fuzzy sets, rules, etc.) so as to avoid incomprehensible configurations.

Distinguishability of fuzzy sets is one of the most common interpretability constraints adopted in the literature (see, e.g. [1,5,9–12,14–17,19,20,22,26]). Informally speaking, distinguishability is a relation between fuzzy sets (defined on the same Universe of Discourse) directly related to their overlap: the more two fuzzy sets overlap, the less distinguishable they become. Distinguishable fuzzy sets are well separated, so that they represent distinct concepts and can be associated with metaphorically¹ different linguistic labels. As stated in the first pioneering works on interpretable fuzzy modeling, well distinguishable fuzzy sets provide the following advantages:

• They obviate the subjective establishment of membership function/linguistic term association [25];
• They avoid potential inconsistencies in fuzzy models [27];
• They reduce the model's redundancy and, consequently, its computational complexity [22];
• They make the linguistic interpretation of the fuzzy sets easier [22].

0020-0255/$ - see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.ins.2006.04.008

This paper is a revised and extended version of the paper: C. Mencar, G. Castellano, A.M. Fanelli, A. Bargiela, "Similarity vs. Possibility in Measuring Fuzzy Sets Distinguishability", published in the Proceedings of RASC 2004 (Recent Advances in Soft Computing), Nottingham (UK), 16–18 December 2004, pp. 354–359.

* Corresponding author. Tel.: +39 080 5442456; fax: +39 080 5442135. E-mail address: [email protected] (C. Mencar).

Information Sciences 177 (2007) 130–149
www.elsevier.com/locate/ins

Distinguishability is hence a fundamental requirement for interpretable fuzzy modeling, especially for the design of Frames of Cognition [6]. A Frame of Cognition is a collection of fuzzy sets defined on the same Universe of Discourse, which can be labelled with natural language terms to represent human-oriented concepts. The fuzzy sets of a Frame of Cognition must be representative of the Universe of Discourse and semantically sound. Representativeness is usually formalized through the notion of coverage (i.e. for each element of the Universe of Discourse there exists a fuzzy set in the Frame of Cognition such that the element belongs to the fuzzy set with a degree greater than a user-specified level, usually set to 0.5). The notion of semantic soundness is not univocal and is usually formalized by the designer with a set of interpretability constraints, to which distinguishability very often belongs. It should be noted that the formal properties defined for a Frame of Cognition may conflict. As an example, completely disjoint fuzzy sets are maximally distinguishable, but a Frame of Cognition made of such fuzzy sets could violate the coverage property. As a consequence, distinguishability must be balanced with other interpretability constraints that may require a partial overlap of fuzzy sets.

Distinguishability is a property that can be formalized in different ways. The most widely adopted mathematical characterization is by means of similarity measures. In [22] similarity measures are discussed in depth in the context of fuzzy modeling. There, similarity is interpreted as a fuzzy relation defined over fuzzy sets and corresponds to the "degree to which two fuzzy sets are equal". Such an interpretation is then formally characterized by a set of axioms.

Similarity measures capture all the requirements for distinguishable fuzzy sets well, but their calculation is usually computationally intensive. As a consequence, most model building strategies that adopt similarity measures for interpretability enhancement are based on massive search algorithms such as Genetic Algorithms [15,20,23], Evolution Strategies [13], Symbiotic Evolution [11], Coevolution [19], or Multi-Objective Genetic Optimization [12]. Alternatively, distinguishability improvement is realized in a separate design stage, often after some data-driven procedure like clustering, in which similar fuzzy sets are usually merged together [17,22].

When the distinguishability constraint has to be included in less time-consuming learning paradigms, like neural learning, the similarity measure is seldom used. In [17], after a merging stage that uses similarity, fine tuning is achieved by simply imposing some heuristic constraints on the centers and widths of membership functions. In [14], a distance function between fuzzy sets (restricted to the Gaussian shape) is used in the regularization part of the RBF² cost function. In [10] a sophisticated distance function is used to merge fuzzy sets. In [4,9,16] the possibility measure is adopted to evaluate distinguishability.

The possibility measure has been extensively studied within Fuzzy Set Theory and has some attractive features that motivate a deeper investigation in the context of distinguishability assessment. Although it is not a similarity measure, it has a clear and well-established semantics, since it can be interpreted as the degree to which a flexible constraint "X is A" is satisfied [24,28]. In addition, the possibility between two fuzzy sets can often be expressed analytically in terms of the fuzzy set parameters. This makes possibility evaluation very efficient, so that it can be embedded in computationally inexpensive learning schemes with little effort.

¹ The metaphor of a linguistic label is the implicit semantics that it bears. As an example, the metaphor of the linguistic label TALL is the (fuzzy) set of people heights a person perceives as tall.

² Radial Basis Function neural network.

The objective pursued in this paper is to examine a number of distinguishability measures commonly adopted in the literature, with special emphasis on similarity and possibility (Section 2). For these two measures, some significant theoretical relationships are derived. Specifically, in Section 3 some sufficient conditions are proved, which positively correlate possibility and similarity in the worst case, i.e. the lower the possibility between two fuzzy sets, the lower the upper bound of the similarity between the same fuzzy sets. Finally, in Section 4 some theorems show that, under some mild conditions (such as normality, convexity and continuity of fuzzy sets), any transformation aimed at reducing possibility between fuzzy sets also reduces their similarity measure and, consequently, improves their distinguishability. In light of these theoretical results, the possibility measure emerges as a good candidate for interpretability analysis as well as for efficient interpretable fuzzy modeling.

2. Distinguishability measures

In this section the most common measures to quantify distinguishability are formalized and briefly described. A detailed description is provided for the similarity and the possibility measures, since they are the subject of further investigation in the following sections. Other measures proposed in the literature are also surveyed, and some theoretical results for them are established.

For the sake of clarity, the adopted formal notation is defined here. In this paper any fuzzy set is denoted with a capital letter ($A$, $B$, etc.) and the corresponding membership function with $\mu_A$, $\mu_B$, etc. Each membership function is defined on the same Universe of Discourse $U$, which is assumed to be a one-dimensional closed interval $[m_U, M_U] \subset \mathbb{R}$. The set of all possible fuzzy (sub-)sets defined over $U$ is denoted with $\mathcal{F}(U)$, while the finite family of fuzzy sets actually involved in a fuzzy model is called "Frame of Cognition" (or, briefly, "Frame") and is denoted with $F$.

2.1. Similarity measure

According to [22], a similarity measure between two fuzzy sets $A$ and $B$ is a fuzzy relation that expresses the degree to which $A$ and $B$ are equal. Put formally, in [22] any similarity measure is a function

$$S : \mathcal{F}(U) \times \mathcal{F}(U) \to [0, 1] \tag{1}$$

such that:

(1) Non-overlapping fuzzy sets are totally non-similar³:

$$\forall x \in U : \mu_A(x)\,\mu_B(x) = 0 \;\Rightarrow\; S(A,B) = 0 \tag{2}$$

Note that the right-to-left implication does not hold, since fuzzy sets that overlap only in a finite (or countable) number of points still have zero similarity.

(2) Only equal fuzzy sets have maximum similarity:

$$\forall x \in U : \mu_A(x) = \mu_B(x) \;\Rightarrow\; S(A,B) = 1 \tag{3}$$

Again, the right-to-left implication does not hold: two fuzzy sets whose membership values differ only in a finite (or countable) number of points still have maximal similarity.

(3) The similarity measure is invariant under linear scaling of the fuzzy sets, provided that the scaling is the same:

$$S(A,B) = S(A',B') \tag{4}$$

where

$$\exists k \neq 0 \;\; \exists l \in \mathbb{R} : \begin{cases} \mu_A(x) = \mu_{A'}(kx + l) \\ \mu_B(x) = \mu_{B'}(kx + l) \end{cases} \tag{5}$$

³ In [22] another definitory property states that overlapping fuzzy sets must have similarity greater than zero. However, this is a direct consequence of property 1 and the definition of $S$.

Property 1 ensures that mutually exclusive fuzzy sets are not similar at all, while property 2 ensures that equal fuzzy sets are totally similar. Property 3 establishes that the similarity between fuzzy sets does not change if the Universe of Discourse changes according to a linear transformation of its values (e.g. if two fuzzy temperatures are similar when expressed in degrees Celsius, they must remain similar when expressed in degrees Fahrenheit).

For the sake of completeness, the following symmetry property is added, which makes similarity adhere more closely to the informal definition of "degree of equality":

(4) Similarity is a symmetrical function:

$$S(A,B) = S(B,A) \tag{6}$$

Several similarity measures have been proposed in the literature, and some of them can be found in [7,21]. However, in interpretability analysis, the most commonly adopted similarity measure is the following:

$$S(A,B) = \frac{|A \cap B|}{|A \cup B|} \tag{7}$$

where intersection $\cap$ and union $\cup$ are defined by a proper pair of t-norm and t-conorm, and $|\cdot|$ is the cardinality of the resulting fuzzy set, usually defined as

$$|A| = \int_U \mu_A(x)\,dx \tag{8}$$

Often, minimum and maximum are used as t-norm and t-conorm respectively; hence definition (7) can be rewritten as

$$S(A,B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \tag{9}$$

The definition (9) of similarity will be considered hereafter. Using similarity, distinguishability in a Frame of Cognition can be formally defined as the complement of the similarity measure between two fuzzy sets, i.e.

$$D(A,B) = 1 - S(A,B) \tag{10}$$

Distinguishability between two distinct fuzzy sets is guaranteed by imposing that its value cannot be less than a user-given threshold $\delta$:

$$\forall A, B \in F : A \neq B \;\rightarrow\; D(A,B) \geq \delta \tag{11}$$

which is equivalent to imposing

$$\forall A, B \in F : A \neq B \;\rightarrow\; S(A,B) \leq \sigma \tag{12}$$

with $\sigma = 1 - \delta$.

The evaluation of the similarity measure, either in the general form (7) or in the specific form (9), can be done analytically only for particular classes of fuzzy sets (e.g. triangular fuzzy sets), but may become computationally intensive for other classes of fuzzy sets (e.g. Gaussian) because of the integration operation (8), which is necessary for determining the cardinality of the involved fuzzy sets. For this reason, similarity is not used in some learning schemes, where more efficient measures are adopted instead.
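To make the cost of definition (9) concrete, the following minimal sketch approximates the cardinalities (8) with a midpoint rule, using minimum as intersection. The helper names (`triangular`, `gaussian`, `similarity`), the number of integration cells and the example fuzzy sets are illustrative assumptions, not part of the original paper.

```python
import math

def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak at b (assumed a < b < c)."""
    return lambda x: max(0.0, min((x - a) / (b - a), (x - c) / (b - c)))

def gaussian(center, width):
    """Gaussian membership function with the given center and width."""
    return lambda x: math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def similarity(mu_a, mu_b, lo, hi, n=20000):
    """Approximate eq. (9) on U = [lo, hi] with the midpoint rule (n cells)."""
    h = (hi - lo) / n
    inter = card_a = card_b = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        a, b = mu_a(x), mu_b(x)
        inter += min(a, b) * h   # |A ∩ B| with minimum as intersection
        card_a += a * h          # |A|
        card_b += b * h          # |B|
    return inter / (card_a + card_b - inter)

# Illustrative fuzzy sets (hypothetical parameters).
s_tri = similarity(triangular(0, 1, 2), triangular(1, 2, 3), 0.0, 3.0)
s_gauss = similarity(gaussian(0.0, 1.0), gaussian(2.0, 1.0), -10.0, 12.0)
```

Every evaluation requires a full sweep over the Universe of Discourse, which is the overhead that motivates the cheaper measures discussed next.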

2.2. Possibility measure

The possibility measure between two fuzzy sets $A$ and $B$ is defined as the degree of applicability of the soft constraint "$x$ is $B$" for $x = A$. Possibility is evaluated according to the following definition:

$$\Pi(A,B) = \sup_{x \in U} \min\{\mu_A(x), \mu_B(x)\} \tag{13}$$

A useful interpretation of the possibility measure is the extent to which $A$ and $B$ overlap [18]. Furthermore, possibility can be employed to verify the representativeness of the fuzzy sets within a Frame of Cognition. Indeed, it is easy to show that if the possibility value of any two adjacent fuzzy sets in the Frame⁴ is greater than a threshold $\varepsilon$, then the coverage of the Frame is guaranteed at level $\varepsilon$.

Often the possibility measure (13) is used to evaluate distinguishability. As for the similarity measure, distinguishability can be formally defined as the complement of the possibility between two distinct fuzzy sets, which must not be less than a threshold $\delta$:

$$D(A,B) = 1 - \Pi(A,B) \geq \delta \tag{14}$$

which implies

$$\forall A, B \in F : A \neq B \;\rightarrow\; \Pi(A,B) \leq \vartheta \tag{15}$$

with $\vartheta = 1 - \delta$.
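Definition (13) and the threshold check (15) can be sketched as follows. This is a sampled approximation under stated assumptions: the grid size, the helper names and the three triangular fuzzy sets are hypothetical choices made only for illustration.

```python
def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak at b (assumed a < b < c)."""
    return lambda x: max(0.0, min((x - a) / (b - a), (x - c) / (b - c)))

def possibility(mu_a, mu_b, lo, hi, n=10001):
    """Approximate eq. (13): sup over U = [lo, hi] of min(mu_A, mu_B), on n grid points."""
    best = 0.0
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        best = max(best, min(mu_a(x), mu_b(x)))
    return best

def frame_is_distinguishable(frame, lo, hi, theta):
    """Check eq. (15): every pair of distinct fuzzy sets has possibility <= theta."""
    return all(
        possibility(frame[i], frame[j], lo, hi) <= theta
        for i in range(len(frame))
        for j in range(i + 1, len(frame))
    )

# Hypothetical Frame of three evenly spaced triangular fuzzy sets on U = [0, 4].
frame = [triangular(0, 1, 2), triangular(1, 2, 3), triangular(2, 3, 4)]
pi_adjacent = possibility(frame[0], frame[1], 0.0, 4.0)
ok_loose = frame_is_distinguishable(frame, 0.0, 4.0, 0.6)
ok_strict = frame_is_distinguishable(frame, 0.0, 4.0, 0.4)
```

Unlike the similarity sketch, no integration is involved: only a maximum over sampled points is taken, and for common shapes even the sampling can be replaced by a closed form.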

Some important considerations are noteworthy for the possibility measure, especially in comparison with the definition of the similarity measure. First of all, the possibility measure is not a similarity measure because property 2 for similarity measures is not verified. Moreover, in general there is no monotonic correlation between possibility and similarity, as shown in the following two examples.

Example 1. Let $A$ and $B$ be two crisp sets defined as the closed intervals $[0,1]$ and $[1,2]$ respectively (see Fig. 1). Suppose that intersection and union operations are defined by the minimum and maximum operators respectively. Then

$$S(A,B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|\{1\}|}{|[0,2]|} = \frac{0}{2} = 0 \tag{16}$$

On the other hand, the possibility measure is maximal for the two sets:

$$\Pi(A,B) = \sup_{x \in [0,2]} \min\{\mu_A(x), \mu_B(x)\} = \sup_{x \in [0,2]} \begin{cases} 0, & x \neq 1 \\ 1, & x = 1 \end{cases} = 1 \tag{17}$$

This result is coherent with the definition of the two measures. Indeed, the two intervals are quasi-disjoint (i.e. disjoint everywhere except in a single point), hence their similarity should be very low (zero using the standard definition (9)). On the other hand, there is the full possibility that an element of the Universe of Discourse belongs to both $A$ and $B$, hence the possibility measure is maximal.

Fig. 1. Example of fuzzy sets with full possibility but zero similarity (horizontal axis: $U$; vertical axis: membership degree).

⁴ In this context, two fuzzy sets $A$ and $B$ in a Frame of Cognition are said to be adjacent if there is no other fuzzy set in the Frame whose prototype (assumed to be unique) lies between the prototypes of $A$ and $B$.

Example 2. Let $A$ be a fuzzy set defined over the entire Universe of Discourse $U$ with constant membership function $0 < \varepsilon \ll 1$ (see Fig. 2). Then, by definition

$$S(A,A) = 1 \tag{18}$$

but

$$\Pi(A,A) = \sup_{x \in U} \mu_A(x) = \varepsilon \tag{19}$$

Here, again, the similarity is maximal, reflecting the idea of "degree of equality". On the other hand, the possibility is very low since it is very unlikely (or undesirable, etc.) that an element belongs to the fuzzy set $A$.
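The two examples can be reproduced numerically with a small sketch. The Universe of Discourse $U = [-1, 3]$, the level 0.1 used for the constant membership function, and the helper names are assumptions made for illustration; the possibility grid is chosen so that it contains the single contact point $x = 1$ of Example 1.

```python
def crisp_interval(left, right):
    """Membership function of the crisp closed interval [left, right]."""
    return lambda x: 1.0 if left <= x <= right else 0.0

def constant(level):
    """Fuzzy set with constant membership over the whole Universe of Discourse."""
    return lambda x: level

def possibility(mu_a, mu_b, lo, hi, n=4001):
    """Sampled sup-min, eq. (13); grid points are lo + (hi - lo) * i / (n - 1)."""
    return max(
        min(mu_a(lo + (hi - lo) * i / (n - 1)), mu_b(lo + (hi - lo) * i / (n - 1)))
        for i in range(n)
    )

def similarity(mu_a, mu_b, lo, hi, n=4000):
    """Midpoint-rule approximation of eq. (7) with min/max as intersection/union."""
    h = (hi - lo) / n
    inter = union = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        inter += min(mu_a(x), mu_b(x)) * h
        union += max(mu_a(x), mu_b(x)) * h
    return inter / union

lo, hi = -1.0, 3.0
a, b = crisp_interval(0.0, 1.0), crisp_interval(1.0, 2.0)
example_1 = (possibility(a, b, lo, hi), similarity(a, b, lo, hi))      # high possibility, zero similarity
low = constant(0.1)
example_2 = (possibility(low, low, lo, hi), similarity(low, low, lo, hi))  # low possibility, full similarity
```

The single shared point of Example 1 has measure zero, so it never contributes to the integral, while the sup-min picks it up as soon as it is sampled.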

The previous examples show that the possibility measure cannot be used to evaluate similarity between two fuzzy sets. Nevertheless, the possibility measure has features that are important in distinguishability analysis:

(1) The possibility threshold $\vartheta$ has a clear semantics, as it can be interpreted in the context of Possibility Theory [8]. On the other hand, the similarity threshold $\sigma$ has a more arbitrary nature (it also depends on the specific definition of the intersection and union operators).

(2) Numerical integration is not necessary when calculating possibility, in contrast with similarity calculation. Moreover, although the general definition (13) may require a numerical sampling of the Universe of Discourse, the possibility measure can be evaluated analytically for several classes of membership functions (e.g. triangular, Gaussian, bell-shaped, etc.; see Table 1 for an example). This feature enables the adoption of the possibility measure in efficient learning schemes.

The two aforementioned arguments motivate a deeper investigation of the possible relationships between similarity and possibility, especially in the context of interpretability analysis.

Fig. 2. Example of fuzzy set with low possibility (horizontal axis: $U$; vertical axis: membership degree).

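Table 1. Possibility and similarity measures for two common classes of fuzzy sets (special cases not considered)

Triangular
  Membership function: $\mu_i(x) = \max\left\{0, \min\left\{\dfrac{x - a_i}{b_i - a_i}, \dfrac{x - c_i}{b_i - c_i}\right\}\right\}$
  Possibility: $\Pi = \dfrac{c_1 - a_2}{b_2 - b_1 - a_2 + c_1}$
  Similarity: $S = -\dfrac{(c_1 - a_2)^2}{\alpha + \beta + \gamma}$, where
    $\alpha = a_1 (b_2 - b_1 - a_2 + c_1)$
    $\beta = -a_2 (b_1 - b_2 - c_2) + b_1 c_1$
    $\gamma = b_1 c_2 - b_2 c_1 - b_2 c_2 - c_1 c_2$

Gaussian
  Membership function: $\mu_i(x) = \exp\left(-\dfrac{(x - \omega_i)^2}{2\sigma_i^2}\right)$
  Possibility: $\Pi = \exp\left(-\dfrac{(\omega_1 - \omega_2)^2}{2(\sigma_1 + \sigma_2)^2}\right)$
  Similarity: $S$ not analytically definable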
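As a sanity check on closed-form possibility expressions of this kind, the sketch below compares them with a sampled sup-min. It assumes triangles given as tuples $(a_i, b_i, c_i)$ with $b_1 < b_2$ and $a_2 < c_1$, and Gaussians given as (center, width) pairs; the Gaussian closed form is the membership value at the crossing point located between the two centers. All function names and example parameters are hypothetical.

```python
import math

def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (x - c) / (b - c)))

def gaussian(center, width):
    """Gaussian membership function with the given center and width."""
    return lambda x: math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def possibility_triangular(t1, t2):
    """Closed form for triangles t1 = (a1, b1, c1), t2 = (a2, b2, c2), b1 < b2, a2 < c1."""
    (a1, b1, c1), (a2, b2, c2) = t1, t2
    return (c1 - a2) / (b2 - b1 - a2 + c1)

def possibility_gaussian(g1, g2):
    """Closed form for Gaussians g = (center, width): height of the crossing between the centers."""
    (w1, s1), (w2, s2) = g1, g2
    return math.exp(-((w1 - w2) ** 2) / (2.0 * (s1 + s2) ** 2))

def possibility_sampled(mu_a, mu_b, lo, hi, n=20001):
    """Reference value: sampled sup-min of eq. (13) on n grid points."""
    best = 0.0
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        best = max(best, min(mu_a(x), mu_b(x)))
    return best

t1, t2 = (0.0, 1.0, 3.0), (2.0, 4.0, 5.0)
g1, g2 = (0.0, 1.0), (3.0, 0.5)
tri_closed = possibility_triangular(t1, t2)
tri_sampled = possibility_sampled(triangular(*t1), triangular(*t2), 0.0, 5.0)
gauss_closed = possibility_gaussian(g1, g2)
gauss_sampled = possibility_sampled(gaussian(*g1), gaussian(*g2), -5.0, 10.0)
```

The closed forms need a handful of arithmetic operations, whereas the sampled reference needs thousands of membership evaluations for comparable accuracy.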


2.3. Other distinguishability measures

Similarity and possibility are not the only measures adopted in the literature for distinguishability quantification. In [5] a different characterization of distinguishability, called "overlap", is used. This measure is defined as

$$D(A,B) = \frac{\operatorname{diam}([A]_\alpha \cap [B]_\alpha)}{\operatorname{diam}([A]_\alpha)} \tag{20}$$

where $[A]_\alpha$ and $[B]_\alpha$ are the $\alpha$-cuts⁵ of $A$ and $B$ respectively, and "diam" evaluates the extension of the crisp set in its argument:

$$\operatorname{diam} X = \max X - \min X \tag{21}$$

According to the authors, the overlap constraint is satisfied when

$$L \leq D(A,B) \leq U \tag{22}$$

where $L$ and $U$ are two user-defined thresholds and $A$, $B$ are two adjacent fuzzy sets in the Frame of Cognition. This special measure of overlap is not commonly used in the literature (especially in interpretability analysis), conceivably because of the following drawbacks:

(1) The measure depends on three parameters ($\alpha$, $L$ and $U$). Moreover, the sensitivity of the measure with respect to these parameters is very high, as stated by the same authors.

(2) The measure is asymmetrical ($D(A,B) \neq D(B,A)$), due to the denominator in (20). This property is counter-intuitive, as distinguishability appears as a symmetrical relation between two fuzzy sets.
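The asymmetry noted in drawback (2) is easy to see in a small sketch. It assumes triangular fuzzy sets given as tuples $(a, b, c)$, for which the $\alpha$-cut is the interval $[a + \alpha(b - a),\, c - \alpha(c - b)]$, and a level $\alpha < 1$ so that the denominator of (20) is non-zero; names and parameters are illustrative.

```python
def alpha_cut_triangular(tri, alpha):
    """Alpha-cut of the triangular fuzzy set tri = (a, b, c), returned as an interval (lo, hi)."""
    a, b, c = tri
    return (a + alpha * (b - a), c - alpha * (c - b))

def overlap(tri_a, tri_b, alpha):
    """Overlap measure of eq. (20) for triangular fuzzy sets (requires alpha < 1)."""
    lo_a, hi_a = alpha_cut_triangular(tri_a, alpha)
    lo_b, hi_b = alpha_cut_triangular(tri_b, alpha)
    # Diameter of the intersection of the two alpha-cuts (zero if they are disjoint).
    common = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    return common / (hi_a - lo_a)

# Hypothetical adjacent fuzzy sets with different widths.
narrow, wide = (0.0, 1.0, 2.0), (1.0, 3.0, 5.0)
d_ab = overlap(narrow, wide, alpha=0.2)
d_ba = overlap(wide, narrow, alpha=0.2)
```

With these parameters the shared interval is the same in both calls, but it is divided by two different diameters, so the two values differ by a factor of two.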

In [26], another criterion is adopted to quantify distinguishability. It does not use a measure between fuzzy sets, but a pointwise property that must hold everywhere in the Universe of Discourse. Specifically, the following condition must be verified:

$$\forall x \in U : \sqrt[p]{\sum_{X \in F} \mu_X(x)^p} \leq 1 \qquad (p \geq 1) \tag{23}$$

This condition ensures that no element of $U$ has simultaneously high membership grades in different fuzzy sets of the same Frame of Cognition.

The parameter $p$ is user-defined and regulates the strength of the condition, like threshold values when similarity and possibility measures are used. For $p = 1$, the constraint (23) requires that the sum of the membership values of a point to all fuzzy sets is not greater than one. Hence it is satisfied only by highly distinguishable fuzzy sets. For instance, if an element has a high membership degree for a fuzzy set (i.e. it is well represented by a fuzzy concept), then it must have small membership degrees for all the remaining fuzzy sets. As a consequence, the fuzzy sets of the Frame of Cognition must be highly distinguishable. The strength of the constraint is reduced for higher values of $p$ and vanishes for $p \to +\infty$. In this case, it is simply required that the maximum membership degree of an element to all fuzzy sets of the Frame of Cognition does not exceed one. This is a trivial condition that does not impose any constraint on the distinguishability of fuzzy sets.

The advantage of using condition (23) lies in the possibility of evaluating the inequality when just one input example is given. The violation of the condition for an input example can lead to the modification of the fuzzy sets in the Frame of Cognition, through the application of an appropriate learning rule. As a consequence, the adoption of condition (23) for fuzzy modeling is advantageous for controlling the distinguishability of fuzzy sets during on-line training sessions. It has been effectively used in [26] for designing a neuro-fuzzy network trained with a regularized learning scheme (like an RBF network), together with other interpretability constraints.

⁵ The $\alpha$-cut of a fuzzy set $A$ is a crisp set defined as $[A]_\alpha = \{x \in U : \mu_A(x) \geq \alpha\}$.

It can be proved that this condition is strictly related to the possibility measure, as shown in the following:

Proposition 1. Let $F$ be a Frame of Cognition. If the constraint (23) holds, then

$$\forall A, B \in F : A \neq B \;\rightarrow\; \Pi(A,B) \leq 2^{-1/p} \tag{24}$$

The proof is given in Appendix A.1.

The proposition shows that the condition (23) is actually a constraint on the possibility measure between any two fuzzy sets in the Frame of Cognition. Condition (23) is therefore a clear example of application of the possibility measure for evaluating distinguishability between fuzzy sets and further motivates a deeper understanding of the role of possibility in the context of distinguishability analysis.
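A sampled check of condition (23) and of the bound (24) might look as follows. The grid size, the numerical tolerance, the helper names and the two example Frames are assumptions for illustration; the largest pairwise possibility is obtained from the second-largest membership value at each sampled point, which equals the maximum over pairs of the pointwise minimum.

```python
def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (x - c) / (b - c)))

def constraint_holds(frame, lo, hi, p, n=4001, tol=1e-12):
    """Check condition (23) on n grid points of U = [lo, hi]."""
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        if sum(mu(x) ** p for mu in frame) ** (1.0 / p) > 1.0 + tol:
            return False
    return True

def max_pairwise_possibility(frame, lo, hi, n=4001):
    """Largest sampled possibility over all pairs of distinct fuzzy sets (frame needs >= 2 sets)."""
    best = 0.0
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        values = sorted((mu(x) for mu in frame), reverse=True)
        best = max(best, values[1])  # second-largest value = best pairwise minimum at x
    return best

# Hypothetical Frames on U = [0, 4]: the first sums to at most one, the second overlaps heavily.
good = [triangular(0, 1, 2), triangular(1, 2, 3), triangular(2, 3, 4)]
bad = [triangular(0, 1, 2), triangular(0.5, 1.5, 2.5)]
p = 1.0
good_ok = constraint_holds(good, 0.0, 4.0, p)
bad_ok = constraint_holds(bad, 0.0, 4.0, p)
good_pi = max_pairwise_possibility(good, 0.0, 4.0)
bound = 2.0 ** (-1.0 / p)
```

For the first Frame the bound is attained: adjacent sets cross at height 0.5, which is exactly $2^{-1/p}$ for $p = 1$.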

In [14], distinguishability is evaluated according to a distance function especially suited for Gaussian membership functions. If $A$ and $B$ are Gaussian fuzzy sets with centers $\omega_A$ and $\omega_B$ respectively, and widths $\sigma_A$ and $\sigma_B$ respectively, then the distance is evaluated according to

$$d_J(A,B) = \sqrt{(\omega_A - \omega_B)^2 + (\sigma_A - \sigma_B)^2} \tag{25}$$

From the distance function $d_J$, a new similarity measure can be induced according to the set-theoretical approach described in [29]. More specifically, the following similarity measure can be defined:

$$S_J(A,B) = \frac{1}{1 + d_J(A,B)} \tag{26}$$

This kind of measure, while inexpensive to calculate, is strictly limited to Gaussian fuzzy sets. Moreover, it is not scale invariant, as shown by the following proposition.

Proposition 2. Given two Gaussian fuzzy sets $A$, $B$ of centers $\omega_A$, $\omega_B$ and widths $\sigma_A$, $\sigma_B$, and a rescaling of factor $k > 0$ and shift $l$, then

$$S_J(A',B') \neq S_J(A,B) \tag{27}$$

where $A'$, $B'$ are the re-scaled fuzzy sets.

The proof is given in Appendix A.2.

This and other issues are addressed in [10], where a more sophisticated metric is adopted for fuzzy partitioning with semantic constraints.
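The lack of scale invariance can be illustrated with a short sketch. It assumes that a Gaussian fuzzy set is represented by a (center, width) pair and that the rescaling of property 3, $\mu_A(x) = \mu_{A'}(kx + l)$, maps the center $\omega$ to $k\omega + l$ and the width $\sigma$ to $|k|\sigma$; the function names and the numeric values are hypothetical, and a factor $k \neq 1$ is used.

```python
import math

def distance_j(g_a, g_b):
    """Distance of eq. (25) between Gaussian fuzzy sets given as (center, width) pairs."""
    (w_a, s_a), (w_b, s_b) = g_a, g_b
    return math.sqrt((w_a - w_b) ** 2 + (s_a - s_b) ** 2)

def similarity_j(g_a, g_b):
    """Similarity induced by the distance, eq. (26)."""
    return 1.0 / (1.0 + distance_j(g_a, g_b))

def rescale(g, k, l):
    """Gaussian obtained by the change of variable x -> k * x + l (assumed mapping, see lead-in)."""
    center, width = g
    return (k * center + l, abs(k) * width)

a, b = (0.0, 1.0), (3.0, 2.0)
k, l = 2.0, 5.0
s_original = similarity_j(a, b)
s_rescaled = similarity_j(rescale(a, k, l), rescale(b, k, l))
```

Under this mapping the distance is multiplied by $k$, so the induced similarity changes whenever $k \neq 1$, whereas a measure satisfying property 3 would be unaffected.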

3. Theoretical relations between similarity and possibility

On the basis of the considerations discussed in the previous section, the possibility measure emerges as a potentially good candidate for distinguishability quantification. The main advantage of adopting possibility is a more efficient evaluation of distinguishability, which can be used in on-line learning schemes. However, the adoption of possibility for quantifying distinguishability is consistent only if there exists a monotonic relation between possibility and similarity, i.e. a relation that ensures low grades of similarity for small values of possibility and vice versa. Unfortunately, as previously shown in Examples 1 and 2, such a relation does not exist unless some restrictions are imposed on the involved fuzzy sets.

To find a relationship between possibility and similarity, in this section sufficient conditions will be proved, provided that the involved fuzzy sets verify the following interpretability constraints:

• Normality.
• Convexity.
• Continuity.

The three above-mentioned constraints are widely used in interpretable fuzzy modeling, since the fuzzy sets used in model design usually belong to specific classes (such as triangular, Gaussian, trapezoidal, etc.) that fulfill all three properties. As a consequence, assuming the validity of the above constraints does not limit the subsequent analysis to any specific class of interpretable fuzzy model.


Before stating the relation between similarity and possibility, two lemmas are necessary to establish some properties of convex fuzzy sets, which will prove useful in the subsequent discussion.

Lemma 3. For any convex fuzzy set $A$ there exists an element $p \in U$ such that the membership function $\mu_A$ restricted to the sub-domain

$$U_L = \{x \in U : x \leq p\} \tag{28}$$

is non-decreasing. Similarly, the membership function $\mu_A$ restricted to the sub-domain

$$U_R = \{x \in U : x \geq p\} \tag{29}$$

is non-increasing.

Proof. Let $p \in \arg\max_{x \in U} \mu_A(x)$. Consider two points $x_1 < x_2 \leq p$ and the corresponding membership values $\alpha_1 = \mu_A(x_1)$ and $\alpha_2 = \mu_A(x_2)$. By definition of convexity, $\alpha_2 \geq \min\{\mu_A(p), \alpha_1\} = \alpha_1$. Hence, the membership function $\mu_A$ is non-decreasing in $U_L$. With the same procedure, it can be proved that $\mu_A$ is non-increasing in $U_R$. □

Lemma 4. Let $A$ and $B$ be two continuous convex normal fuzzy sets with $p_A \in \arg\max \mu_A$ and $p_B \in \arg\max \mu_B$ such that $p_A < p_B$. Then, there exists a point between $p_A$ and $p_B$ whose membership degree (in both $A$ and $B$) corresponds to the possibility measure between $A$ and $B$:

$$\Pi(A,B) = \vartheta \;\rightarrow\; \exists x \in [p_A, p_B] : \mu_A(x) = \mu_B(x) = \vartheta \tag{30}$$

The proof is reported in Appendix A.3.

The last lemma states that for convex, continuous and normal fuzzy sets, the "intersection point" of the membership functions $\mu_A$ and $\mu_B$ lying between the modal values of $A$ and $B$ determines the possibility value (see Fig. 3).

In this way, the analysis can be limited to the interval between the modal values of the two fuzzy sets, thus simplifying further considerations on relating possibility and similarity.
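One practical reading of Lemma 4 is that the possibility can be located by a one-dimensional search restricted to $[p_A, p_B]$. The sketch below uses bisection, relying on the fact that on this interval $\mu_A$ is non-increasing and $\mu_B$ is non-decreasing (Lemma 3), so their difference changes sign at most once. The function names, the tolerance and the example fuzzy sets are illustrative assumptions; this is not a procedure taken from the paper.

```python
import math

def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (x - c) / (b - c)))

def gaussian(center, width):
    """Gaussian membership function with the given center and width."""
    return lambda x: math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def possibility_by_bisection(mu_a, mu_b, p_a, p_b, tol=1e-12):
    """Find the intersection point in [p_a, p_b] (with p_a < p_b) and its membership degree."""
    lo, hi = p_a, p_b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mu_a(mid) >= mu_b(mid):
            lo = mid   # still left of (or at) the intersection point
        else:
            hi = mid   # already right of the intersection point
    x = 0.5 * (lo + hi)
    return x, min(mu_a(x), mu_b(x))

x_tri, pi_tri = possibility_by_bisection(triangular(0, 1, 3), triangular(2, 4, 5), 1.0, 4.0)
x_gauss, pi_gauss = possibility_by_bisection(gaussian(0.0, 1.0), gaussian(3.0, 0.5), 0.0, 3.0)
```

The search needs only a few dozen membership evaluations and works for any membership shapes that satisfy the hypotheses of the lemma, including those without a closed-form possibility.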

Fig. 3. The membership degree of the intersection point between two membership functions corresponds to the possibility between the two fuzzy sets (the plot marks $p_A$ and $p_B$ on the horizontal axis $x$ and the level $\pi$ on the vertical axis, which ranges from 0 to 1).

Now it is possible to establish an important relation between possibility and similarity with the following theorem.

Theorem 5. Let $A$ and $B$ be two fuzzy sets that are continuous, normal and convex. Let $p_A \in \arg\max \mu_A$, $p_B \in \arg\max \mu_B$ and suppose $p_A < p_B$. Let $\vartheta = \Pi(A,B)$ and $x_\vartheta \in [p_A, p_B]$ such that $\mu_A(x_\vartheta) = \mu_B(x_\vartheta) = \vartheta$. In addition, suppose that:

$$\forall x \in \,]p_A, x_\vartheta[\, : \frac{d^2 \mu_A}{dx^2}(x) \leq 0 \tag{31}$$

and

$$\forall x \in \,]x_\vartheta, p_B[\, : \frac{d^2 \mu_B}{dx^2}(x) \leq 0 \tag{32}$$

Then, the similarity between $A$ and $B$ is upper-bounded by

$$S(A,B) \leq S_{\max} = \frac{2\vartheta}{r + 2\vartheta - r\vartheta} \tag{33}$$

where $r$ is the ratio between the distance $p_B - p_A$ and the length of the support⁶ of $A \cup B$:

$$r = \frac{p_B - p_A}{|\operatorname{supp}\, A \cup B|} \tag{34}$$

The proof is reported in Appendix A.4.

It should be remarked that the additional constraint on the second derivatives, required in the theorem hypothesis, is not particularly limiting, since commonly used fuzzy set shapes (triangular, trapezoidal, etc.) satisfy this requirement.

It should also be noted that the relationship between possibility and similarity established by the theorem holds only for the upper bound of the similarity measure, while the actual value is strictly related to the shape of the membership functions. Paradoxically, the actual similarity between two low-possibility fuzzy sets may be higher than that between two high-possibility fuzzy sets. However, relation (33) ensures that the similarity measure does not exceed a defined threshold that is monotonically related to the possibility measure. As a consequence, any modeling technique that ensures small values of possibility between fuzzy sets indirectly provides small values of similarity and, hence, good distinguishability between fuzzy sets (see Fig. 4). Thus, relation (33) justifies the adoption of the possibility measure in interpretable fuzzy modeling.
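The bound (33) can be compared with the actual similarity on a concrete case. The sketch below assumes two triangular fuzzy sets $(a_i, b_i, c_i)$ with $b_1 \leq a_2 < c_1 \leq b_2$, so that the overlap is a single triangle of base $c_1 - a_2$ and height $\vartheta$, and the support of the union is the interval $[a_1, c_2]$; the similarity is computed from areas according to definition (9). The function name and the example parameters are hypothetical.

```python
def similarity_and_bound(t1, t2):
    """Return (S, S_max) for triangles t1 = (a1, b1, c1), t2 = (a2, b2, c2)
    under the assumption b1 <= a2 < c1 <= b2."""
    (a1, b1, c1), (a2, b2, c2) = t1, t2
    theta = (c1 - a2) / (b2 - b1 - a2 + c1)             # possibility: height of the crossing point
    inter = 0.5 * (c1 - a2) * theta                     # area of the overlap triangle
    union = 0.5 * (c1 - a1) + 0.5 * (c2 - a2) - inter   # |A| + |B| - |A ∩ B|
    r = (b2 - b1) / (c2 - a1)                           # eq. (34): prototype distance / support length
    s_max = 2.0 * theta / (r + 2.0 * theta - r * theta) # eq. (33)
    return inter / union, s_max

s, s_max = similarity_and_bound((0.0, 1.0, 3.0), (2.0, 4.0, 5.0))
```

In this example the actual similarity lies well below the bound, which is consistent with the remark that (33) constrains only the worst case.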

Fig. 4. Contour plot of maximal similarity $S_{\max}$ with respect to $r$ and $\pi$ (both axes range from 0 to 1; contour levels from 0.1 to 1).

⁶ The support of a fuzzy set is the (crisp) set of all elements with non-zero membership, i.e. $\operatorname{supp} X = \{x \in U : \mu_X(x) > 0\}$. For convex fuzzy sets, the support is an interval.

A problem arises when Gaussian fuzzy sets are adopted, since the second derivatives of the respective mem-bership functions may not be negative as requested by the theorem. In order to satisfy the theorem hypothesis,it is necessary that the intersection point between two Gaussian fuzzy set must lay between the prototype andthe inflection point of each membership function. To guarantee this condition, the possibility threshold shouldnot be less than e�1/2 � 0.60653. However, a specific analysis can be made for Gaussian fuzzy sets, whichshows that possibility and similarity are still monotonically related. Indeed, the following proposition canbe proved.

Proposition 6. Let A and B be two Gaussian fuzzy sets with the same width. If the possibility value between A and B is $\vartheta$, then their similarity measure is

$$S(A,B) = \frac{1 - \operatorname{erf}\left(\sqrt{-\ln\vartheta}\right)}{1 + \operatorname{erf}\left(\sqrt{-\ln\vartheta}\right)} \tag{35}$$

where

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt \tag{36}$$

The proof is given in Appendix A.5.

It is noteworthy that the similarity of two Gaussian membership functions with equal width is related to their possibility measure but is independent of the value of the width. Fig. 5 depicts the relationship between possibility and similarity, showing the monotonic behavior that justifies the adoption of the possibility measure for distinguishability quantification even when Gaussian fuzzy sets (of equal width) are adopted.
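The width independence can be illustrated with a short sketch of Eq. (35). It assumes Gaussian membership functions of the form exp(−(x − c)²/(2σ²)), consistent with Appendix A.5. The helper `possibility_equal_width`, which evaluates the membership at the midpoint between the two centers (where equal-width Gaussians cross), is an illustrative addition and not part of the original text.

```python
import math


def gaussian_similarity(theta: float) -> float:
    """Similarity of two equal-width Gaussian fuzzy sets, Eq. (35).

    `theta` is the possibility value in (0, 1]; the width does not appear.
    """
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    e = math.erf(math.sqrt(-math.log(theta)))
    return (1.0 - e) / (1.0 + e)


def possibility_equal_width(x_a: float, x_b: float, sigma: float) -> float:
    """Possibility of two Gaussians with centers x_a, x_b and common width sigma.

    Illustrative helper: by symmetry the two curves cross at the midpoint of
    the centers, and the sup-min is attained at that crossing point.
    """
    half_gap = abs(x_b - x_a) / 2.0
    return math.exp(-(half_gap**2) / (2.0 * sigma**2))


if __name__ == "__main__":
    # Two pairs with different widths but the same gap-to-width ratio
    # share the same possibility, hence the same similarity.
    for x_a, x_b, sigma in ((0.0, 2.0, 1.0), (3.0, 7.0, 2.0)):
        theta = possibility_equal_width(x_a, x_b, sigma)
        print(f"sigma={sigma}: possibility={theta:.5f}, "
              f"similarity={gaussian_similarity(theta):.5f}")
```

Since only the possibility value enters `gaussian_similarity`, any learning scheme that controls the possibility of equal-width Gaussian sets controls their similarity as well.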

When Gaussian fuzzy sets have different widths, the relation between possibility and similarity becomes more complex and non-monotonic (see footnote 7). As can be seen from Fig. 6(a), when widths are very different the area delimited by the intersection of two fuzzy sets has a very different shape compared with that depicted in Fig. A.3, where equal-width fuzzy sets are intersected. As a consequence, the functional relationship between possibility and similarity does not follow the trend depicted in Fig. 5. In this case, the sufficient condition provided by Theorem 5 should be considered. However, if the widths of the Gaussian fuzzy sets are similar, as

Fig. 5. Functional relationship between possibility and similarity of two Gaussian fuzzy sets of equal width (horizontal axis: Π(A,B); vertical axis: S(A,B)).

7 It can be proved that the relationship between possibility and similarity depends on the specific widths of the Gaussian fuzzy sets. The proof is omitted since it does not convey further arguments.


exemplified in Fig. 6(b), the relationship between possibility and similarity can still be considered roughly monotonic.

4. Distinguishability improvement

During the design of a fuzzy model, a standard approach to increase distinguishability between two fuzzy sets is to reduce their similarity value. However, the calculation of similarity calls for computationally intensive methods (e.g. genetic algorithms) or separate stages (e.g. fuzzy sets merging). When efficient learning schemes are necessary, other measures are adopted in place of similarity. Hence, an interesting issue concerns how reducing non-similarity measures effectively reduces similarity. Here the analysis is focused on the possibility measure as an alternative quantification of distinguishability. The following lemma characterizes a wide class of possibility-reducing procedures.

Lemma 7. Let A and B be two fuzzy sets defined on the Universe of Discourse U, which are continuous, normal and convex. Let $p_A \in \arg\max \mu_A$, $p_B \in \arg\max \mu_B$ and suppose $p_A < p_B$. Let $\Phi : \mathcal{F}(U) \to \mathcal{F}(U)$ be a transformation such that $B' = \Phi(B)$ is a continuous, normal and convex fuzzy set, and $\forall x \in U : x \le p_B \rightarrow \mu_{B'}(x) \le \mu_B(x)$. Then,

$$\Pi(A,B') \le \Pi(A,B) \tag{37}$$

Conversely, if $B'$ is such that $\forall x \in U : x \le p_B \rightarrow \mu_{B'}(x) \ge \mu_B(x)$, then

$$\Pi(A,B') \ge \Pi(A,B) \tag{38}$$

The proof is given in Appendix A.6.

Two very common examples of transformations satisfying the lemma's hypothesis are the translation ($\mu_{B'}(x) = \mu_B(x - x_0)$, $x_0 > 0$) and the contraction ($\mu_{B'}(x) = \mu_B(x)^p$, $p > 1$) of the membership function. Such transformations can be effectively used to reduce possibility between two fuzzy sets, but in order to establish whether such transformations also reduce similarity an additional condition must be introduced, as proved in the following theorem.

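Theorem 8. Any transformation $\Phi : \mathcal{F}(U) \to \mathcal{F}(U)$ such that $B' = \Phi(B)$ satisfies the hypothesis of Lemma 7 and, additionally,

$$r' = \frac{p_{B'} - p_A}{|\operatorname{supp} A \cup B'|} \ge \frac{p_B - p_A}{|\operatorname{supp} A \cup B|} = r \tag{39}$$

produces a decrease of the maximal similarity $S_{\max}$.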

The proof is reported in Appendix A.7.
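The effect of the two transformations can be checked on a small example. The sketch below is illustrative only: it assumes triangular fuzzy sets (continuous, normal and convex, with finite support), approximates the possibility by a grid search, and uses the closed form of $S_{\max}$ restated in Appendix A.7. All function names and the numeric values are hypothetical choices made for this example.

```python
from typing import Callable

Membership = Callable[[float], float]


def triangular(left: float, peak: float, right: float) -> Membership:
    """Triangular (continuous, normal, convex) fuzzy set with prototype `peak`."""
    def mu(x: float) -> float:
        if left < x <= peak:
            return (x - left) / (peak - left)
        if peak < x < right:
            return (right - x) / (right - peak)
        return 0.0
    return mu


def possibility(mu_a: Membership, mu_b: Membership,
                lo: float, hi: float, steps: int = 20000) -> float:
    """Grid approximation of sup_x min(mu_a(x), mu_b(x)) over [lo, hi]."""
    best = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        best = max(best, min(mu_a(x), mu_b(x)))
    return best


def translate(mu: Membership, x0: float) -> Membership:
    """Translation mu'(x) = mu(x - x0), with x0 > 0."""
    return lambda x: mu(x - x0)


def contract(mu: Membership, p: float) -> Membership:
    """Contraction mu'(x) = mu(x) ** p, with p > 1."""
    return lambda x: mu(x) ** p


def ratio(p_a: float, p_b: float, supp_lo: float, supp_hi: float) -> float:
    """Ratio r of Eq. (39) when supp(A u B) is the interval (supp_lo, supp_hi)."""
    return (p_b - p_a) / (supp_hi - supp_lo)


def s_max(theta: float, r: float) -> float:
    """Maximal similarity, using the closed form restated in Appendix A.7."""
    return 2.0 * theta / (r + 2.0 * theta - r * theta)


if __name__ == "__main__":
    a = triangular(0.0, 2.0, 4.0)
    b = triangular(2.0, 4.0, 6.0)
    cases = {
        # name: (transformed B, ratio r computed from prototypes and support)
        "original": (b, ratio(2.0, 4.0, 0.0, 6.0)),
        "translated": (translate(b, 1.0), ratio(2.0, 5.0, 0.0, 7.0)),
        "contracted": (contract(b, 2.0), ratio(2.0, 4.0, 0.0, 6.0)),
    }
    for name, (mu_b, r) in cases.items():
        theta = possibility(a, mu_b, 0.0, 8.0)
        print(f"{name}: possibility={theta:.3f}, r={r:.3f}, "
              f"S_max={s_max(theta, r):.3f}")
```

In this example both transformations leave the ratio r unchanged or larger, so condition (39) holds and the computed $S_{\max}$ drops together with the possibility.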

Fig. 6. Two intersections between Gaussian fuzzy sets with different widths: (a) very different widths; (b) similar widths.


As a corollary of the theorem, every method aimed at reducing possibility actually reduces (maximal) similarity, thus improving distinguishability, provided that condition (39) holds. In this sense, the adoption of possibility as a measure of distinguishability is fully justified. The additional constraint required in the corollary (the ratio r must not decrease) is always fulfilled by any translation that lengthens the distance between the prototypes, as well as by contractions. However, attention must be paid to those transformations that reduce the support, for which relation (A.51) must be carefully taken into account.

5. Final remarks

In this paper it has been shown that possibility is both an effective and computationally efficient measure to quantify distinguishability, an important constraint for interpretable fuzzy modeling. Indeed, while similarity can be considered as the most representative measure for distinguishability of fuzzy sets, it has been proved that under mild conditions (such as continuity, convexity and normality) possibility and similarity are related monotonically, so that small values of possibility imply small values of maximal similarity. The added values of the possibility measure are its sound semantic meaning and the computational efficiency of the calculation procedure. As a matter of fact, in many cases the possibility measure can be expressed analytically in terms of fuzzy set parameters, so it can be used in many learning schemes without resorting to computationally intensive algorithms.

Finally, it is worth observing that the possibility measure can be used for the twofold objective of ensuring high distinguishability and high coverage of fuzzy sets. The first objective can be achieved by imposing an upper bound on the possibility value of two fuzzy sets, while coverage can be guaranteed by imposing a lower bound between adjacent fuzzy sets of a Frame of Cognition. For both objectives it is sufficient to compute the possibility value just once, thus saving valuable time during the design process. As an alternative approach, it is possible to use relation (33), or refined variants such as (35) for Gaussian fuzzy sets, to improve coverage without compromising distinguishability too much.

Appendix A. Proofs

A.1. Proposition 1

Proof. Consider the min function $(x,y) \in [0,1]^2 \mapsto \min(x,y)$ and let $\Phi \subseteq [0,1]^2$ be a compact set. In order to find the supremum of the min function within the region $\Phi$, the following line segments can be defined:

$$\varphi(x) = \{(x,y) \in [0,1]^2 : y \ge x\} \tag{A.1}$$

and

$$\sigma(y) = \{(x,y) \in [0,1]^2 : x \ge y\} \tag{A.2}$$

Let

$$x_1 = \sup\{x \in [0,1] : \varphi(x) \cap \Phi \ne \emptyset\} \tag{A.3}$$

and

$$y_2 = \sup\{y \in [0,1] : \sigma(y) \cap \Phi \ne \emptyset\} \tag{A.4}$$

The maximum between $x_1$ and $y_2$ is the supremum of min within $\Phi$. Indeed

$$\forall (x,y) \in [0,1]^2 : (x,y) \in \Phi \wedge y \ge x \rightarrow x \le x_1 \;\Longrightarrow\; \forall (x,y) \in [0,1]^2 : (x,y) \in \Phi \wedge y \ge x \rightarrow \min(x,y) \le x_1 \tag{A.5}$$

and similarly

$$\forall (x,y) \in [0,1]^2 : (x,y) \in \Phi \wedge x \ge y \rightarrow y \le y_2 \;\Longrightarrow\; \forall (x,y) \in [0,1]^2 : (x,y) \in \Phi \wedge x \ge y \rightarrow \min(x,y) \le y_2 \tag{A.6}$$


As a consequence

$$\forall (x,y) \in \Phi : \min(x,y) \le \max\{x_1, y_2\} \tag{A.7}$$

Since, by definition,

$$\exists y_1 \in [0,1] : (x_1, y_1) \in \Phi \tag{A.8}$$

and

$$\exists x_2 \in [0,1] : (x_2, y_2) \in \Phi \tag{A.9}$$

then

$$\exists (x,y) \in \Phi : \min(x,y) = \max\{x_1, y_2\} \tag{A.10}$$

Hence

$$\sup_{(x,y) \in \Phi} \min(x,y) = \max\{x_1, y_2\} \tag{A.11}$$

This general result can be applied to a specific class of sets, defined as follows:

$$\Phi_p = \{(x,y) \in\, ]0,1]^2 : x^p + y^p \le 1\} \tag{A.12}$$

If the constraint (23) is verified by the Frame of Cognition F, then the following inclusion holds:

$$\Psi_{A,B} \subseteq \Phi_p \tag{A.13}$$

where

$$\Psi_{A,B} = \{(\mu_A(u), \mu_B(u)) \in [0,1]^2 : u \in U\} \tag{A.14}$$

Fig. A.1 illustrates the relationships between the sets $\Psi_{A,B}$ and $\Phi_p$, as well as the contour of the min function. Relation (A.13) is due to the following inequality:

$$\mu_A(u)^p + \mu_B(u)^p \le 1 - \sum_{X \in F \setminus \{A,B\}} \mu_X(u)^p \le 1 \tag{A.15}$$

Given two sets A and B in the frame F, the possibility measure between A and B, namely $\Pi(A,B)$, verifies the following equality:

$$\Pi(A,B) = \sup_{(x,y) \in \Psi_{A,B}} \min(x,y) \tag{A.16}$$

Since $\Psi_{A,B} \subseteq \Phi_p$, then

$$\Pi(A,B) \le \sup_{(x,y) \in \Phi_p} \min(x,y) \tag{A.17}$$

In order to quantify the right side of (A.17), the values $x_1$ and $y_2$ should be calculated as in (A.3) and (A.4). It is actually unnecessary to calculate both values, since they coincide due to the symmetry of $\Phi_p$ w.r.t. the straight line y = x.

It is easy to prove that $x_1 = y_2 = 2^{-1/p}$. Indeed, for a given $\bar{x} > 2^{-1/p}$,

$$\forall (x,y) \in \varphi(\bar{x}) : x^p + y^p \ge 2\bar{x}^p > 1 \tag{A.18}$$

hence $\varphi(\bar{x}) \cap \Phi_p = \emptyset$. Furthermore, for a given $\bar{x} < 2^{-1/p}$ there exists $\bar{\bar{x}}$ such that $\bar{x} < \bar{\bar{x}} < 2^{-1/p}$ and $\varphi(\bar{\bar{x}}) \cap \Phi_p \ne \emptyset$. As a consequence:

$$\sup_{(x,y) \in \Phi_p} \min(x,y) = 2^{-1/p} \tag{A.19}$$

which proves the proposition. □
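The value $2^{-1/p}$ in (A.19) can be checked by brute force. The following sketch is an illustrative grid search over the region $x^p + y^p \le 1$; the function name and the grid resolution are assumptions of this example, and the result is only accurate up to the grid spacing.

```python
def sup_min_on_phi_p(p: float, steps: int = 400) -> float:
    """Grid estimate of sup{min(x, y) : x**p + y**p <= 1, 0 <= x, y <= 1}.

    The estimate never exceeds the true supremum and is accurate to
    roughly 1/steps.
    """
    best = 0.0
    for i in range(steps + 1):
        x = i / steps
        for j in range(steps + 1):
            y = j / steps
            if x**p + y**p <= 1.0:
                best = max(best, min(x, y))
    return best


if __name__ == "__main__":
    for p in (1.0, 2.0, 3.0):
        print(f"p={p}: grid={sup_min_on_phi_p(p):.4f}, "
              f"closed form={2.0 ** (-1.0 / p):.4f}")
```

The grid estimate approaches the closed form from below as `steps` grows, consistent with the supremum being attained on the line y = x.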

A.2. Proposition 2

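Proof. The membership functions of $A'$, $B'$ are

$$\mu_{A'}(x) = \mu_A(kx + l) = \exp\left(-\frac{(x - x'_A)^2}{2\sigma'^2_A}\right) \tag{A.20}$$

and

$$\mu_{B'}(x) = \mu_B(kx + l) = \exp\left(-\frac{(x - x'_B)^2}{2\sigma'^2_B}\right) \tag{A.21}$$

where

$$x'_A = \frac{x_A - l}{k}, \quad x'_B = \frac{x_B - l}{k} \tag{A.22}$$

and

$$\sigma'_A = \frac{\sigma_A}{k}, \quad \sigma'_B = \frac{\sigma_B}{k} \tag{A.23}$$

As a consequence:

$$d_J(A',B') = \frac{d_J(A,B)}{k} \tag{A.24}$$

and hence

$$S_J(A',B') = \frac{k}{k + d_J(A,B)} \ne S_J(A,B) \tag{A.25}$$

for $k \ne 1$. □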

Fig. A.1. The contour of the min function (gray lines), the set $\Phi_3$ (shadowed) and an example of $\Psi_{A,B}$ (dashed line).


A.3. Lemma 4

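Proof. By definition of possibility (13):

$$\vartheta = \sup_{x \in U} \min\{\mu_A(x), \mu_B(x)\} \tag{A.26}$$

Because of continuity of fuzzy set A, and by the Weierstrass theorem, the range of $\mu_A$ in the interval $[p_A, p_B]$ is a closed interval $[m_A, 1]$. Similarly, the range of $\mu_B$ in the same interval is $[m_B, 1]$. Let

$$\tilde{\vartheta} = \sup_{x \in [p_A, p_B]} \min\{\mu_A(x), \mu_B(x)\} \tag{A.27}$$

Let $\tilde{x} \in [p_A, p_B]$ be such that $\min\{\mu_A(\tilde{x}), \mu_B(\tilde{x})\} = \tilde{\vartheta}$. Without loss of generality, suppose that $\mu_A(\tilde{x}) = \tilde{\vartheta} < 1$. Because of convexity of A, $\forall x : p_A \le x < \tilde{x} \rightarrow \mu_A(x) > \tilde{\vartheta}$. As a consequence $\mu_B(\tilde{x}) \ge \tilde{\vartheta}$. If $\mu_B(\tilde{x}) > \tilde{\vartheta}$, then for any $\Delta\tilde{x} > 0$ sufficiently small, $\mu_B(\tilde{x} - \Delta\tilde{x}) > \tilde{\vartheta}$ and $\mu_A(\tilde{x} - \Delta\tilde{x}) > \tilde{\vartheta}$. Thus, $\min\{\mu_A(\tilde{x} - \Delta\tilde{x}), \mu_B(\tilde{x} - \Delta\tilde{x})\} > \tilde{\vartheta}$ and (A.27) would be invalidated. As a consequence, $\mu_B(\tilde{x}) = \mu_A(\tilde{x}) = \tilde{\vartheta}$. In the specific case of $\tilde{\vartheta} = 1$, the equality of the two membership values is trivial.

Suppose now, ab absurdo, that the possibility measure $\vartheta$ is different from $\tilde{\vartheta}$, i.e. $\vartheta > \tilde{\vartheta}$ (the case $\vartheta < \tilde{\vartheta}$ is impossible by definition of possibility measure). Two cases can be considered:

$$\text{(i)} \quad \exists x_\vartheta < p_A : \min\{\mu_A(x_\vartheta), \mu_B(x_\vartheta)\} = \vartheta \tag{A.28}$$

$$\text{(ii)} \quad \exists x_\vartheta > p_B : \min\{\mu_A(x_\vartheta), \mu_B(x_\vartheta)\} = \vartheta \tag{A.29}$$

We consider the first case (i). Then, $x_\vartheta < \tilde{x} \le p_B$ and $\mu_B(x_\vartheta) \ge \vartheta > \tilde{\vartheta} = \mu_B(\tilde{x})$. But since B is convex, by Lemma 3 $x_\vartheta < \tilde{x} \le p_B \rightarrow \mu_B(x_\vartheta) \le \mu_B(\tilde{x}) = \tilde{\vartheta}$. This is a contradiction, thus $\vartheta = \tilde{\vartheta}$. Case (ii) is symmetrical to case (i). □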

A.4. Theorem 5

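Proof. The first objective is to define two normal and convex fuzzy sets that are maximally similar but have possibility measure $\vartheta$. These fuzzy sets must be defined so that the cardinality of their intersection is the highest possible, while the cardinality of their union is the smallest possible. The following two fuzzy sets $A_{\max}$ and $B_{\max}$ satisfy such requirements (see Fig. A.2 for an illustrative example):

$$\mu_{A_{\max}}(x) = \begin{cases} \vartheta & \text{if } x \in [\min \operatorname{supp} A \cup B,\; p_A[ \\[4pt] \dfrac{x(\vartheta - 1) + x_\vartheta - p_A\vartheta}{x_\vartheta - p_A} & \text{if } x \in [p_A, x_\vartheta] \\[4pt] \vartheta & \text{if } x \in\, ]x_\vartheta,\; \max \operatorname{supp} A \cup B] \\[4pt] 0 & \text{elsewhere} \end{cases} \tag{A.30}$$

$$\mu_{B_{\max}}(x) = \begin{cases} \vartheta & \text{if } x \in [\min \operatorname{supp} A \cup B,\; x_\vartheta[ \\[4pt] \dfrac{x(\vartheta - 1) + x_\vartheta - p_B\vartheta}{x_\vartheta - p_B} & \text{if } x \in [x_\vartheta, p_B] \\[4pt] \vartheta & \text{if } x \in\, ]p_B,\; \max \operatorname{supp} A \cup B] \\[4pt] 0 & \text{elsewhere} \end{cases} \tag{A.31}$$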

Fig. A.2. Example of fuzzy sets with maximal similarity for a given possibility measure.


The membership functions so defined are such that intersection and union coincide in all the support except the interval $[p_A, p_B]$, while within such interval the membership functions have null second derivative. As a consequence, any couple of fuzzy sets A and B satisfying the hypothesis will have

$$\forall x \in [p_A, x_\vartheta] : \mu_A(x) \ge \mu_{A_{\max}}(x) \tag{A.32}$$

and

$$\forall x \in [x_\vartheta, p_B] : \mu_B(x) \ge \mu_{B_{\max}}(x) \tag{A.33}$$

In this way, the cardinality of the intersection fuzzy set is maximized while that of the union is minimized, as required above. More specifically, the intersection of the two fuzzy sets has the following membership function:

$$\mu_{A_{\max} \cap B_{\max}}(x) = \begin{cases} \vartheta & \text{if } x \in \operatorname{supp} A \cup B \\ 0 & \text{elsewhere} \end{cases} \tag{A.34}$$

The union of the two fuzzy sets has the following membership function:

$$\mu_{A_{\max} \cup B_{\max}}(x) = \begin{cases} \vartheta & \text{if } x \in [\min \operatorname{supp} A \cup B,\; p_A[ \\[4pt] \dfrac{x(\vartheta - 1) + x_\vartheta - p_A\vartheta}{x_\vartheta - p_A} & \text{if } x \in [p_A, x_\vartheta] \\[4pt] \dfrac{x(\vartheta - 1) + x_\vartheta - p_B\vartheta}{x_\vartheta - p_B} & \text{if } x \in [x_\vartheta, p_B] \\[4pt] \vartheta & \text{if } x \in\, ]p_B,\; \max \operatorname{supp} A \cup B] \\[4pt] 0 & \text{elsewhere} \end{cases} \tag{A.35}$$

The similarity of the two fuzzy sets is

$$S(A_{\max}, B_{\max}) = \frac{|A_{\max} \cap B_{\max}|}{|A_{\max} \cup B_{\max}|} = \frac{\vartheta\,|\operatorname{supp} A \cup B|}{\vartheta\,|\operatorname{supp} A \cup B| + \frac{1}{2}(1 - \vartheta)(p_B - p_A)} \tag{A.36}$$

By defining r as in (34), the similarity is shown to be equal to (33). Note that $A_{\max}$ and $B_{\max}$ are not continuous. However, continuous fuzzy sets may be defined so as to be arbitrarily similar to $A_{\max}$ and $B_{\max}$. Hence, by defining $S_{\max} = S(A_{\max}, B_{\max})$, the maximal similarity measure is the upper bound of the actual similarity between the original fuzzy sets A and B. □
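Equation (A.36) can be cross-checked numerically. The sketch below is only an illustration: it implements the piecewise definitions (A.30) and (A.31) on an assumed support interval `[lo, hi]`, integrates the pointwise minimum and maximum with a midpoint rule, and compares the ratio with the closed form. The function names and the sample values are hypothetical.

```python
def mu_a_max(x, theta, p_a, x_theta, lo, hi):
    """Membership of A_max, Eq. (A.30), with supp(A u B) = [lo, hi]."""
    if x < lo or x > hi:
        return 0.0
    if p_a <= x <= x_theta:
        return (x * (theta - 1.0) + x_theta - p_a * theta) / (x_theta - p_a)
    return theta


def mu_b_max(x, theta, p_b, x_theta, lo, hi):
    """Membership of B_max, Eq. (A.31), with supp(A u B) = [lo, hi]."""
    if x < lo or x > hi:
        return 0.0
    if x_theta <= x <= p_b:
        return (x * (theta - 1.0) + x_theta - p_b * theta) / (x_theta - p_b)
    return theta


def similarity_numeric(theta, p_a, p_b, x_theta, lo, hi, steps=20000):
    """Midpoint-rule estimate of |A_max n B_max| / |A_max u B_max|."""
    h = (hi - lo) / steps
    inter = union = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        a = mu_a_max(x, theta, p_a, x_theta, lo, hi)
        b = mu_b_max(x, theta, p_b, x_theta, lo, hi)
        inter += min(a, b) * h
        union += max(a, b) * h
    return inter / union


def similarity_closed_form(theta, p_a, p_b, lo, hi):
    """Right-hand side of Eq. (A.36) with |supp(A u B)| = hi - lo."""
    length = hi - lo
    return theta * length / (theta * length + 0.5 * (1.0 - theta) * (p_b - p_a))


if __name__ == "__main__":
    args = dict(theta=0.4, p_a=2.0, p_b=5.0, lo=0.0, hi=8.0)
    print("numeric    :", similarity_numeric(x_theta=3.0, **args))
    print("closed form:", similarity_closed_form(**args))
```

Note that the closed form does not depend on the position of the crossing point $x_\vartheta$, only on the possibility value, the prototypes and the length of the support.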

A.5. Proposition 6

Proof. Let $x_A$ and $x_B$ be the centers of the fuzzy sets A and B respectively. Without loss of generality, it can be assumed that $x_A \le x_B$. Let $\sigma > 0$ be the common width of the two fuzzy sets. By Lemma 4, it is possible to state that there exists a point $x_\vartheta \in [x_A, x_B]$ such that

$$\mu_A(x_\vartheta) = \vartheta = \mu_B(x_\vartheta) \tag{A.37}$$

Moreover, it is easy to prove that the intersection point between two fuzzy sets of equal width is unique. As a consequence, the cardinality of the intersection fuzzy set between the fuzzy sets A and B is the area depicted in Fig. A.3.

The expression for the cardinality of the intersection between A and B is therefore:

$$|A \cap B| = \int_{\mathbb{R}} \min\{\mu_A(x), \mu_B(x)\}\,dx = \int_{-\infty}^{x_\vartheta} \mu_B(x)\,dx + \int_{x_\vartheta}^{+\infty} \mu_A(x)\,dx \tag{A.38}$$

For the sake of simplicity, it is assumed that the Universe of Discourse coincides with the real line $\mathbb{R}$. Since each membership function is symmetrical w.r.t. its center, and since the membership function $\mu_B$ can be viewed as a horizontal translation of $\mu_A$ of length $x_B - x_A$, it is possible to simplify the previous relation into the following:

$$|A \cap B| = 2\int_{x_\vartheta}^{+\infty} \mu_A(x)\,dx \tag{A.39}$$


By making the definition of the Gaussian membership function explicit, Eq. (A.39) can be rewritten as

$$|A \cap B| = 2\int_{x_\vartheta}^{+\infty} \exp\left(-\frac{(x - x_A)^2}{2\sigma^2}\right) dx \tag{A.40}$$

The following change of variable can be conveniently defined:

$$t = \frac{x - x_A}{\sigma\sqrt{2}} \tag{A.41}$$

In this way, relation (A.40) can be rewritten as

$$|A \cap B| = 2\sqrt{2}\,\sigma \int_{\frac{x_\vartheta - x_A}{\sigma\sqrt{2}}}^{+\infty} e^{-t^2}\,dt = \sqrt{2\pi}\,\sigma\left(1 - \operatorname{erf}\left(\frac{x_\vartheta - x_A}{\sigma\sqrt{2}}\right)\right) \tag{A.42}$$

Now the point $x_\vartheta$ can be defined in terms of $\vartheta$. More specifically, since $\mu_A(x_\vartheta) = \vartheta$ and $x_\vartheta > x_A$, then

$$x_\vartheta = x_A + \sigma\sqrt{-2\ln\vartheta} \tag{A.43}$$

By substituting (A.43) into (A.42), it results

$$|A \cap B| = \sqrt{2\pi}\,\sigma\left(1 - \operatorname{erf}\left(\sqrt{-\ln\vartheta}\right)\right) \tag{A.44}$$

The cardinalities of A and B are identical, and can be calculated from (A.39) by taking $x_\vartheta = x_A$, resulting in

$$|A| = |B| = 2\int_{x_A}^{+\infty} \mu_A(x)\,dx = \sqrt{2\pi}\,\sigma \tag{A.45}$$

Finally, the similarity measure between A and B is calculated as

$$S(A,B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} = \frac{1 - \operatorname{erf}\left(\sqrt{-\ln\vartheta}\right)}{1 + \operatorname{erf}\left(\sqrt{-\ln\vartheta}\right)} \tag{A.46}$$

□
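The intermediate results (A.44) and (A.45) can be verified by numerical integration. The sketch below is illustrative: it assumes Gaussian membership functions exp(−(x − c)²/(2σ²)), truncates the real line at ten widths from the centers, and uses a midpoint rule. The function names, the truncation span and the step count are assumptions of this example.

```python
import math


def gaussian(x: float, center: float, sigma: float) -> float:
    """Gaussian membership function with the given center and width."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma**2))


def intersection_cardinality_numeric(x_a, x_b, sigma, steps=40000, span=10.0):
    """Midpoint-rule estimate of the area under min(mu_A, mu_B)."""
    lo = min(x_a, x_b) - span * sigma
    hi = max(x_a, x_b) + span * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += min(gaussian(x, x_a, sigma), gaussian(x, x_b, sigma)) * h
    return total


def intersection_cardinality_closed(theta: float, sigma: float) -> float:
    """Right-hand side of Eq. (A.44)."""
    return math.sqrt(2.0 * math.pi) * sigma * (
        1.0 - math.erf(math.sqrt(-math.log(theta))))


def similarity_from_cardinalities(inter: float, sigma: float) -> float:
    """Eq. (A.46) before simplification, using |A| = |B| from Eq. (A.45)."""
    card = math.sqrt(2.0 * math.pi) * sigma
    return inter / (2.0 * card - inter)


if __name__ == "__main__":
    x_a, x_b, sigma = 1.0, 4.0, 1.5
    # Equal widths: the curves cross at the midpoint of the centers.
    theta = gaussian((x_a + x_b) / 2.0, x_a, sigma)
    numeric = intersection_cardinality_numeric(x_a, x_b, sigma)
    closed = intersection_cardinality_closed(theta, sigma)
    print("intersection (numeric):", numeric)
    print("intersection (A.44)   :", closed)
    print("similarity            :", similarity_from_cardinalities(closed, sigma))
```

The two intersection values should agree up to the integration error, and the resulting similarity matches the simplified expression on the right of (A.46).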

A.6. Lemma 7

Proof. We consider only the first thesis (37), because the other can be proved similarly. Let $\vartheta$ be the possibility measure $\Pi(A,B)$. By virtue of Lemma 4,

$$\exists x_\vartheta \in [p_A, p_B] : \mu_A(x_\vartheta) = \mu_B(x_\vartheta) = \vartheta \tag{A.47}$$

Fig. A.3. The cardinality of the intersection between A and B is the area of the shaded region.


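By hypothesis, $\mu_{B'}(x_\vartheta) \le \vartheta$, and re-applying Lemma 4 to A and $B'$, it can be stated that:

$$\exists x' \in [p_A, p_{B'}] : \mu_A(x') = \mu_{B'}(x') = \Pi(A,B') = \vartheta' \tag{A.48}$$

where $p_{B'} \in \arg\max \mu_{B'}$. Since by hypothesis $\mu_{B'}(x_\vartheta) < \mu_B(x_\vartheta)$, then $\forall x'' < x_\vartheta : \mu_A(x'') \ge \vartheta \wedge \mu_{B'}(x'') \le \mu_{B'}(x_\vartheta) < \vartheta$. Hence, $\forall x'' < x_\vartheta : \mu_A(x'') \ne \mu_{B'}(x'')$. As a consequence, $x' \ge x_\vartheta$. In such case, $\vartheta' = \mu_A(x') \le \mu_A(x_\vartheta) = \vartheta$, i.e. $\Pi(A,B') \le \Pi(A,B)$. □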

A.7. Theorem 8

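Proof. The function $S_{\max}(\vartheta, r) = \dfrac{2\vartheta}{r + 2\vartheta - r\vartheta}$ given in (33) is continuous and differentiable for every $\vartheta \in [0,1]$. Its derivative is

$$\frac{\partial S_{\max}}{\partial \vartheta} = \frac{2r}{(r + 2\vartheta - r\vartheta)^2} \tag{A.49}$$

Since the ratio r is always positive, $S_{\max}$ is monotonically decreasing as $\vartheta$ decreases, for a fixed ratio r. However, the new fuzzy set $B'$ may determine a ratio $r'$ different from r. The directional derivative of $S_{\max}$ w.r.t. the direction $(\Delta\vartheta, \Delta r)$ is

$$\frac{dS_{\max}}{d(\Delta\vartheta, \Delta r)} = \nabla S_{\max} \cdot \frac{(\Delta\vartheta, \Delta r)}{\|(\Delta\vartheta, \Delta r)\|} = \frac{2r\,\Delta\vartheta + 2\vartheta(\vartheta - 1)\,\Delta r}{\|(\Delta\vartheta, \Delta r)\|\,(r + 2\vartheta - r\vartheta)^2} \tag{A.50}$$

Such derivative is negative (i.e. $S_{\max}$ is decreasing) when

$$\Delta r > \frac{2r\,\Delta\vartheta}{2\vartheta(1 - \vartheta)} \tag{A.51}$$

For $\Delta\vartheta < 0$ (reduced possibility), the second member of (A.51) is negative, hence any transformation that does not reduce the ratio ($\Delta r = r' - r \ge 0$) will effectively reduce the maximal similarity $S_{\max}$. □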
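The gradient used in (A.49) and (A.50) can be checked against central finite differences. The following is a small illustrative sketch; the function names, the step size `eps` and the sample point are assumptions made for this example.

```python
import math


def s_max(theta: float, r: float) -> float:
    """Maximal similarity S_max(theta, r) as given in relation (33)."""
    return 2.0 * theta / (r + 2.0 * theta - r * theta)


def grad_s_max(theta: float, r: float):
    """Analytic gradient: the two partial derivatives appearing in (A.50)."""
    denom = (r + 2.0 * theta - r * theta) ** 2
    return 2.0 * r / denom, 2.0 * theta * (theta - 1.0) / denom


def numeric_grad(theta: float, r: float, eps: float = 1e-6):
    """Central finite-difference approximation of the gradient."""
    d_theta = (s_max(theta + eps, r) - s_max(theta - eps, r)) / (2.0 * eps)
    d_r = (s_max(theta, r + eps) - s_max(theta, r - eps)) / (2.0 * eps)
    return d_theta, d_r


def directional_derivative(theta, r, d_theta, d_r):
    """Directional derivative of S_max along (d_theta, d_r), Eq. (A.50)."""
    g_theta, g_r = grad_s_max(theta, r)
    return (g_theta * d_theta + g_r * d_r) / math.hypot(d_theta, d_r)


if __name__ == "__main__":
    theta, r = 0.4, 0.3
    print("analytic:", grad_s_max(theta, r))
    print("numeric :", numeric_grad(theta, r))
    # Reduced possibility with a non-decreasing ratio: S_max decreases.
    print("direction (-0.1, 0.05):",
          directional_derivative(theta, r, -0.1, 0.05))
```

A direction that violates (A.51), for instance a small reduction of possibility combined with a large reduction of the ratio, yields a positive directional derivative instead.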

References

[1] Robert Babuska, Data-driven fuzzy modeling: transparency and complexity issues, in: Proceedings 2nd European Symposium onIntelligent Techniques ESIT’99, Crete, Greece, 1999, ERUDIT.

[2] Piero P. Bonissone, Y.-T. Chen, K. Goebel, P.S. Khedkar, Hybrid soft computing systems: industrial and commercial applications,Proceedings of the IEEE 87 (9) (1999) 1641–1667.

[3] Jorge Casillas, Oscar Cordon, Francisco Herrera, Luis Magdalena (Eds.), Interpretability Issues in Fuzzy Modeling, Springer,Germany, 2003.

[4] Giovanna Castellano, Anna Maria Fanelli, Corrado Mencar, A neuro-fuzzy network to generate human understandable knowledgefrom data, Cognitive Systems Research, Special Issue on Computational Cognitive Modeling 3 (2) (2002) 125–144.

[5] Mo-Yuen Chow, Sinan Altug, H. Joel Trussel, Heuristic constraints enforcement for training of and knowledge extraction from aFuzzy/Neural architecture – Part I: Foundation, IEEE Transactions on Fuzzy Systems 7 (2) (1999) 143–150.

[6] Krzysztof Cios, Witold Pedrycz, R. Swiniarski, Data Mining. Methods for Knowledge Discovery, Kluwer, 1998.[7] Valerie Cross, An analysis of fuzzy set aggregators and compatibility measures, Master’s thesis, Department of Electronic

Engineering, University of Delft, 1993.[8] Didier Dubois, Henri Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum Press, New York,

1988.[9] Jairo Espinosa, Joos Vandewalle, Constructing fuzzy models with linguistic integrity from numerical data – AFRELI algorithm,

IEEE Transactions on Fuzzy Systems 8 (5) (2000) 591–600.[10] Serge Guillaume, Brigitte Charnomordic, Generating an interpretable family of fuzzy partitions from data, IEEE Transactions on

Fuzzy Systems 12 (3) (2004) 324–335.[11] Masoud Jamei, Mahdi Mahfouf, Derek A. Linkens, Elicitation and fine tuning of Mamdani-type fuzzy rules using symbiotic

evolution, in: Proceedings of European Symposium on Intelligent Technologies, Hybrid Systems and their Implementation on SmartAdaptive Systems (EUNITE 2001), Tenerife, Spain, 2001.

[12] Fernando Jimenez, Antonio Gomez-Skarmeta, Hans Roubos, Robert Babuska, A multi-objective evolutionary algorithm for fuzzymodeling, in: Proceedings of 2001 Conference of the North American Fuzzy Information Processing Society (NAFIPS’01), NewYork, 2001, pp. 1222–1228.

[13] Yaochu Jin, Fuzzy modeling of high-dimensional systems: complexity reduction and interpretability improvement, IEEETransactions on Fuzzy Systems 8 (2) (2000) 212–221.

148 C. Mencar et al. / Information Sciences 177 (2007) 130–149

Autho

r's

pers

onal

co

py

[14] Yaouchu Jin, Bernard Sendhoff, Extracting interpretable fuzzy rules from RBF networks, Neural Processing Letters 17 (2003) 149–164.

[15] Phayung Meesad, Gary G. Yen, Quantitative measures of the accuracy, comprehensibility, and completeness of a fuzzy expert system,in: Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’02), Honolulu, Hawaii, 2002, pp. 284–289.

[16] Detlef Nauck, Rudolf Kruse, A neuro-fuzzy approach to obtain interpretable fuzzy systems for function approximation, in:Proceedings of IEEE International Conference on Fuzzy Systems (FUZZ-IEEE’98), Anchorage (AK), 1998, pp. 1106–1111.

[17] Rui P. Paiva, Antonio Dourado, Merging and constrained learning for interpretability in neuro-fuzzy systems, in: Proceedings of theFirst International Workshop on Hybrid Methods for Adaptive Systems, Tenerife, Spain, December 2001.

[18] Witold Pedrycz, Fernando Gomide, An Introduction to Fuzzy Sets. Analysis and Design, MIT Press, Cambridge, MA, 1998.[19] Carlos-Andres Pena-Reyes, Moshe Sipper, Fuzzy CoCo: balancing accuracy and interpretability of fuzzy models by means of

coevolution, in: J. Casillas, O. Cordon, F. Herrera, L. Magdalena (Eds.), Accuracy Improvements in Linguistic Fuzzy Modeling,Studies in Fuzziness and Soft Computing, Springer-Verlag, 2003, pp. 119–146.

[20] Hans Roubos, Magne Setnes, Compact and transparent fuzzy models and classifiers through iterative complexity reduction, IEEETransactions on Fuzzy Systems 9 (4) (2001) 516–524.

[21] Magne Setnes, Fuzzy rule-base simplification using similarity measures, Master’s thesis, Department of Electronic Engineering,University of Delft, 1995.

[22] Magne Setnes, Robert Babuska, Uzay Kaymak, Hans R. Van Nauta Lemke, Similarity measures in fuzzy rule base simplification,IEEE Transactions on Systems, Man and Cybernetics, Part B 28 (3) (1998) 376–386.

[23] Magne Setnes, Robert Babuska, H.B. Verbruggen, Rule-based modeling: precision and transparency, IEEE Transactions on Systems,Man and Cybernetics, Part C 28 (1) (1998) 165–169.

[24] Herbert Toth, Towards fixing some ‘fuzzy’ catchwords: a terminological primer, in: L.A. Zadeh, J. Kacprzyk (Eds.), Computing withWords in Information/Intelligent Systems 1. Foundations, Physica-Verlag, Heidelberg, New York, 1999, pp. 154–181.

[25] Jose Valente de Oliveira, On the optimization of fuzzy systems using bio-inspired strategies, in: Proceedings of the IEEE InternationalConference on Fuzzy Systems (FUZZ-IEEE’98), Anchorage, AK, May 1998, IEEE, 1998, pp. 1229–1234.

[26] Jose Valente de Oliveira, Semantic constraints for membership function optimization, IEEE Transactions on Systems, Man andCybernetics, Part A 29 (1) (1999) 128–138.

[27] Jose Valente de Oliveira, Towards neuro-linguistic modeling: Constraints for optimization of membership functions, Fuzzy Sets andSystems 106 (1999) 357–380.

[28] Lotfi A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1978) 3–28.[29] Rami Zwick, Edward Carlstein, David V. Budescu, Measures of similarity among fuzzy concepts: A comparative analysis,

International Journal of Approximate Reasoning 1 (1987) 221–242.

C. Mencar et al. / Information Sciences 177 (2007) 130–149 149