
Information Sciences 177 (2007) 408–435

www.elsevier.com/locate/ins

Regranulation: A granular algorithm enabling communication between granular worlds

Scott Dick a,*, Adam Schenker b, Witold Pedrycz a, Abraham Kandel b

a Department of Electrical and Computer Engineering, University of Alberta, 2nd Floor ECERF Building, Edmonton, AB, Canada T6G 2V4
b Department of Computer Science and Engineering, University of South Florida, Tampa, FL, USA

Abstract

In this paper, we describe a granular algorithm for translating information between two granular worlds, represented as fuzzy rulebases. These granular worlds are defined on the same universe of discourse, but employ different granulations of this universe. In order to translate information from one granular world to the other, we must regranulate the information so that it matches the information granularity of the target world. This is accomplished through the use of a first-order interpolation algorithm, implemented using linguistic arithmetic, a set of elementary granular computing operations. We first demonstrate this algorithm by studying the common "fuzzy-PD" rulebase at several different granularities, and conclude that the "3 × 3" granulation may be too coarse for this objective. We then examine the question of what the "natural" granularity of a system might be; this is studied through a 10-fold cross-validation experiment involving three different granulations of the same underlying mapping. For the problem under consideration, we find that a 7 × 7 granulation appears to be the minimum necessary precision.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Fuzzy systems; Granular computing; Granular worlds; Granular algorithms

1. Introduction

Granular computing is a new information-processing paradigm, which recognizes that precision is an expensive – and often unnecessary – goal in describing the world around us. In particular, granular computing recognizes that human thought does not operate at a numeric level of precision, but at a much more abstract level. Granular computing is a formalism for expressing that abstraction within computational processes, thus endowing computer systems with a more human-centric view of the world. The central notion in granular computing is that there are many "levels" of precision in which information about the real world can be expressed, with numeric precision being the most refined and a binary value the coarsest [3]. Bargiela and Pedrycz have formalized this idea through the mechanism of "granular worlds," which are frameworks for information processing. These frameworks incorporate all the mechanisms an intelligent agent would require for information processing, including a formalism for the information granules, a set of reference information granules, a universe of discourse, and a communication mechanism for exchanging information with other granular worlds (i.e. agents).



In granular computing, the atomic units of computation are groups of objects. Objects from some collection, such as real numbers, will be aggregated into groups, and then all computations are performed on these groups rather than the individual objects. These groups are the information granules manipulated in a granular world. The key questions in granular computing can be summed up as, how do we perform this aggregation, and how do we define operations for these groups once they have been assembled? The first question has normally been addressed through the use of clustering algorithms [3,20], while suggestions on the second have been put forward in [24,44,18,32,31,10,11,42,43].

The work reported in the current article represents a new contribution to the development of granular computing algorithms and applications. We have developed a novel granular computing algorithm for translating a rulebase at one particular granularity into an equivalent rulebase at either a coarser or more refined granularity. In this sense, our algorithm is a mechanism for communicating between granular worlds; at this time, our algorithm is restricted to the class of granular worlds defined by type-1 fuzzy rulebases. Our algorithm, termed "regranulation", is based on linear interpolation and operates directly on the linguistic rulebase of a fuzzy system, utilizing the operations of linguistic arithmetic [11] as elementary operations. We illustrate the operation of this new algorithm through two experiments: the first investigation uses the well-known fuzzy-PD rulebase at four different granulations, while the second investigates a two-dimensional periodic function. Both investigations focus on the idea of the "natural" granularity of a system, i.e. is there a minimum granularity for a system such that any coarser granularity will obscure major features of the system? Our results indicate that the regranulation algorithm can be a platform for answering this question; for both systems, we observed behaviors indicating that granulating the universes of discourse into three fuzzy sets did in fact miss important behaviors in the system, which were confirmed by visual inspection of the defuzzified reasoning surfaces.

The remainder of this paper is organized as follows. In Section 2, we review the relevant literature in granular computing. In Section 3, we provide an overview of the linguistic arithmetic developed in [11]. In Section 4, we present the regranulation algorithm, and describe our fuzzy-PD regranulation experiment to illustrate the operation of this novel technique. In Section 5, we present our second regranulation experiment, which involves rulebases created by a fuzzy rule induction algorithm over the 10 partitions of a 10-fold cross-validation experiment. We offer a summary and discussion of future work in Section 6.

2. Review of related work

In this section, we review the relevant literature in granular computing and fuzzy systems theory. We will discuss granular computing, including the formation of information granules, granular worlds, and granular operations. We will briefly discuss the related areas of qualitative reasoning and fuzzy mathematics, and then focus on linguistic variables and linguistic functions, which are at the heart of our current work.

2.1. Granular computing

Granular computing incorporates a number of mathematical areas, some of which (such as interval analysis) have been investigated for much longer than others. However, if we assert that the central characteristic of granular computing is that these various theories are regarded as constituent components of granular computing, then we can say that the formal investigation of granular computing began with a paper by Zadeh in 1979 [50]. In this early work, a granule was defined as a collection of indistinguishable objects, and information granularity was defined as the grouping of objects into granules. The size of the granules is determined by the precision with which we are able to measure (or just as importantly, desire to measure) some attributes of the underlying objects. The formal definition of a granule in this paper is as a possibility distribution over a set of objects. In a second, later paper [51], Zadeh outlines a method for executing computations on purely linguistic quantities, by means of constraint propagation. In this paper, granules are treated as the atomic units of computation, with each linguistic term being associated with a granule. Ideas from both of these papers are drawn together in [52], in which granules are treated as constraints on a variable. Possibilistic constraints (i.e. fuzzy sets) are one of several classes of constraints that can define a granule. In this paper, Zadeh describes information granulation as the unifying principle in fuzzy set theory, fuzzy systems and fuzzy logic, as well as a key mechanism in human cognition.



More recently, an attempt was made to establish a computational theory for granular computing by Ying in [45]. Ying points out that the word "computing" in granular computing (or "computing with words") refers only to computationally efficient mechanisms for modeling and reasoning under uncertain conditions, not to any formal theory of computing. Formal models of computing rely on automata theory (finite state machines, pushdown automata, Turing machines), which assumes crisp, unambiguous inputs to the automata. The contribution in this paper is to define automata which accept fuzzy inputs. More specifically, instead of accepting or rejecting a string in some language L with alphabet Σ, where L is a subset of Σ*, these fuzzy automata will accept, to a degree, strings from a language L′ that is a fuzzy subset of Σ*. Definitions for both fuzzy finite state machines and fuzzy pushdown automata are provided, and Ying proves that acceptance in a final state and acceptance with an empty stack are equivalent for the fuzzy pushdown automaton, as in the crisp automaton. These automata are proposed as formal models of granular computing.

2.1.1. Forming information granules

At this time, the single most important technique for creating information granules is the use of clustering algorithms. Clustering is a form of unsupervised learning, in which patterns in a feature space are grouped together based on their similarity to one another. The goal is to create groupings that accurately reflect the actual structure of the underlying data. There are a tremendous number of clustering algorithms in the literature, including hierarchical and mountain clustering [22], and a host of others. In the granular computing community, one of the most common clustering algorithms is the fuzzy c-means (FCM) algorithm, originally developed as a fuzzy variant of the ISODATA algorithm [20]. The FCM algorithm works by alternately optimizing two representations of a dataset: a set of cluster centers, and a fuzzy partition matrix, which records the degree to which each pattern belongs to the cluster defined by each cluster center. The optimization is driven by minimizing the cost function

J(f) = \sum_{x \in X} \sum_{k \in K} f^m(x)(k) \cdot d^2(x, k)    (2.1)

where f is a fuzzy partition of the dataset, f(x)(k) is the membership of pattern x in cluster k, m is the fuzzifier exponent, and d(x, k) is the distance between pattern x and the centroid of cluster k. The ultimate output of the FCM algorithm is a set of fuzzy clusters, to which an element may partly belong. The sum of any particular element's memberships over all clusters must equal 1. The FCM algorithm enjoys a good reputation as a basic clustering technique, but has its shortcomings. The most serious of these is the fact that it only creates hyperspherical clusters, whereas in reality clusters might not be spherical at all. Some variants of FCM can create hyperellipsoidal clusters by using a fuzzy covariance matrix, as in Gustafson–Kessel fuzzy c-means. The Gath–Geva algorithm takes this a step further, adding an exponentially decaying distance function instead of the usual Euclidean distance. Gustafson–Kessel and Gath–Geva clustering, however, are dependent on finding good initial values for the cluster centers. As a result, Hoppner [20] recommends that FCM should be applied first in a dataset, followed by Gustafson–Kessel clustering (initialized by the FCM cluster centers), and then Gath–Geva clustering (initialized by the Gustafson–Kessel cluster centers).
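As a concrete illustration of the alternating optimization just described, a minimal sketch of one FCM step and the cost of Eq. (2.1) might look as follows (Python; the array shapes, function names, and toy data are ours, not from [20]):

```python
import numpy as np

def fcm_step(X, centers, m=2.0, eps=1e-12):
    """One alternating-optimization step of fuzzy c-means (a minimal sketch).

    X: (n, p) patterns; centers: (c, p) cluster centres; m: fuzzifier exponent.
    Returns updated centres, the partition matrix U (rows sum to 1), and the
    cost of Eq. (2.1) evaluated at the current centres.
    """
    # Squared Euclidean distances d^2(x, k) between every pattern and centre.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    # Membership update: u_xk = 1 / sum_j (d2_xk / d2_xj)^(1/(m-1)).
    ratios = (d2[:, :, None] / d2[:, None, :]) ** (1.0 / (m - 1.0))
    U = 1.0 / ratios.sum(axis=2)
    # Cost function of Eq. (2.1): J = sum_x sum_k u_xk^m * d^2(x, k).
    J = float((U ** m * d2).sum())
    # Centre update: weighted means of the patterns, with weights u_xk^m.
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U, J

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(2, 0.3, (20, 2))])
centers = X[rng.choice(len(X), 2, replace=False)]
for _ in range(20):
    centers, U, J = fcm_step(X, centers)
print(np.round(centers, 2), round(J, 3))
```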

The distance measures used in clustering are measures of similarity. We assume that when two patterns are close to each other in feature space, then the objects those patterns represent are similar. In the same vein, we assume that when two patterns are distant in feature space, the underlying objects are dissimilar. Similarity plays a major role in granular computing, and so it is worthwhile to briefly review some underlying ideas about similarity. A similarity relation is a fuzzy binary relation, which must be reflexive, symmetric, and transitive. Similarity is often treated as a fuzzy analog of equality, in that it is an extension of equivalence relations to fuzzy relations. Furthermore, any α-cut of a similarity relation yields an equivalence relation [25]. A set of axioms for similarity relations presented in [37] includes the following:


1. S(¬A, ¬B) = S(A, B),
2. 0 ≤ S(A, B) ≤ 1,
3. A = B if and only if S(A, B) = 1,
4. S(A, B) = 0 if and only if A ∩ B = ∅,
5. If A ⊆ B ⊆ C, then S(A, B) ≥ S(A, C).

A similarity measure between fuzzy sets that meets these axioms is proposed in [37], and used to develop a method of approximate reasoning based on similarity measures, without using the compositional rule of inference.

There are also alternatives to clustering as a method of forming information granules. In [4], Bargiela and Pedrycz proposed a method for forming set-theoretic granules, which take the form of hyperboxes in feature space. The method is fairly similar to agglomerative clustering, in that the granules are combined by merging smaller granules together. However, since the granules must at all times be hyperboxes, two smaller hyperboxes will be merged by creating a new hyperbox that encloses both of the smaller ones. This process is guided by a cluster validity measure called the information density. The method could be applied to time series data, images, and spatial data (such as network topologies).

Bortolan and Pedrycz investigated the use of self-organizing maps (SOM) for converting a dataset from numeric to granular data [6]. A number of SOM were computed for each dataset, using variations on the learning scheme. This caused the SOM to visualize different facets of the dataset; there was a SOM for identifying clusters, one for displaying the data distribution, and a SOM for each attribute. Comparison of the latter group of SOM allows an analyst to visually identify dependencies between attributes. All this information was used as a means to qualitatively determine how the dataset should be granulated. The authors also suggest that using image-processing and computer vision techniques to process the SOM would be helpful; however, since the SOM has a number of parameters that must be selected by hand through trial and error, the authors do not propose an automated system for granulation based on SOM. Finally, Pedrycz and Vukovich discuss the generation of fuzzy membership functions in [35]. They propose that semantically sound fuzzy sets should first be roughed out (using the energy and entropy of a fuzzy set as guides), and then numeric optimization should be used to perfect the fuzzy set with respect to experimental evidence.

2.1.2. Granular worlds

Although the investigation of granular computing began in the fuzzy systems community, fuzzy sets are now recognized as only one of several possible mathematical formalisms (including rough sets, interval analysis, and shadowed sets) underlying granular computing. Zadeh [52], Pedrycz and Vukovich [33], and Bargiela and Pedrycz [3], have all proposed that granular computing refers both to the actual information granules in a problem (along with their underlying formalism), as well as to general computing techniques that are independent of the specific formalism used. Zadeh [52] first suggested that these more general techniques should take the form of constraint propagation; the specific constraints being propagated would depend on the formalism used. More recently, Refs. [33,3] have used the concept of a granular world to represent these general techniques. A granular world encapsulates a particular set of information granules, a formalism for those granules, a set of objects that can be represented using those information granules, and a mechanism for communicating with other granular worlds.

Formally, a granular world is a tuple ⟨X, G, A, C⟩ where G is the single mathematical formalism that will be used within the granular world, A is a reference set of information granules, C represents the communication mechanisms between this granular world and others, and X is the set of concepts that can be expressed using the formalism G and the set of reference granules A (i.e. a universe of discourse). An element x ∈ X of the universe of discourse, or a subset Y ⊆ X thereof, will be approximately represented in terms of the reference granules contained in A; equivalently, x or Y can be viewed as the image of the reference granules A under some mapping f, i.e. x ≅ f(A). The precision with which x can be represented depends on both the formalism G and reference granules A used in a particular granular world. For example, a digital computer using floating-point arithmetic constitutes a granular world in which the formalism is interval analysis, and the set of reference granules is given by the set of floating-point values that may be encoded in this computer. Each floating-point value actually represents a small interval of the real line R, within which all values are indistinguishable to the computer. Real-valued inputs will then be expressed in terms of these reference granules; in this case, by encoding the inputs according to which interval they fall into [3,33].



In order for two granular worlds to communicate, they must exchange not only specific data items, but also the formalism and reference granules used to express them; in other words, the full context of each granular world. The prototypical example of communication between granular worlds is the analog/digital and digital/analog converters used in signal processing. Communication between the analog world and a digital world is limited by the resolution available in the particular digital device. An analog signal must be quantized into a digital bitstream of some fixed resolution, whereas the digital code (representing an interval) must be reconstructed into an analog signal that best represents that interval. Information is lost during quantization, and can at best only be partially recovered during reconstruction. These two processes are specific to the digital device (information system) under consideration; the context of the physical universe is a constant. If we are communicating between two granular worlds which are both information systems, then the communication mechanisms become even more complex. As a practical matter, this means that two communicating granular worlds will in general exchange information according to a third formalism, representing a hybridization of the formalisms used in the two original granular worlds. For example, two granular worlds using the interval formalism would communicate using rough sets. Some special cases exist; for instance, Bortolan and Pedrycz [5] determined that communication without information loss was possible between either numeric or interval-valued granular worlds and a granular world using uniform triangular fuzzy sets as its reference granules. Numeric information was represented using possibility measures, while interval-valued information was represented using both possibility and necessity measures. This basic approach was applied to data fusion in [32].
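To make the quantization/reconstruction example concrete, a minimal sketch of an A/D-style encoder and a midpoint reconstruction might look as follows (the bit width, bounds, and function names are illustrative assumptions):

```python
def quantize(x: float, lo: float, hi: float, bits: int) -> int:
    """Map an analog value onto one of 2**bits reference intervals (codes)."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    code = int((min(max(x, lo), hi) - lo) / step)
    return min(code, levels - 1)          # clamp the upper edge into the last interval

def reconstruct(code: int, lo: float, hi: float, bits: int) -> float:
    """Return the midpoint of the coded interval: a 'best' analog representative."""
    step = (hi - lo) / 2 ** bits
    return lo + (code + 0.5) * step

x = 0.37
code = quantize(x, 0.0, 1.0, bits=3)               # 8 reference granules on [0, 1]
print(code, reconstruct(code, 0.0, 1.0, bits=3))   # 2 0.3125 -- information is lost
```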

2.1.3. Granular operations

In order to be of practical use, granular computing must produce operations and algorithms that create added value for a user. The development of these granular algorithms (including the regranulation algorithm described in the current paper) is still in its infancy, but a few important research directions have emerged. One of these important directions is the development of the ordered weighted averaging (OWA) operator. The OWA operator was first described by Yager in [42], and is a variation on the traditional weighted average. Weights in an OWA are associated with the position of an element in an ordered series, rather than with a specific element. For a collection X = {x1, x2, . . . , xn}, the OWA operator is defined as

\sum_{j=1}^{n} w_j b_j    (2.2)

where bj is the jth largest element of X and

\sum_{j} w_j = 1    (2.3)

The OWA operator subsumes the mean, median, maximum and minimum functions, among many others, and has been applied to multi-criteria decision-making problems. An important extension to the OWA operator is the induced ordered weighted averaging (IOWA) operator [43], in which elements of the set X are ordered pairs (ai, ui), where the ai are values to be aggregated, and the ui are an index used to order the set X. Importantly, the ui need only be drawn from an ordinal scale, whereas the ai are necessarily drawn from an interval scale.
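For illustration, the OWA and IOWA aggregations described above might be sketched as follows (the function names and toy weights are ours):

```python
from typing import Sequence, Tuple

def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average (Eq. 2.2): weights attach to rank positions."""
    assert abs(sum(weights) - 1.0) < 1e-9 and len(values) == len(weights)
    b = sorted(values, reverse=True)          # b_j = j-th largest element of X
    return sum(w * x for w, x in zip(weights, b))

def iowa(pairs: Sequence[Tuple[float, float]], weights: Sequence[float]) -> float:
    """Induced OWA: pairs (a_i, u_i); the ordinal u_i only order the a_i."""
    ordered = [a for a, u in sorted(pairs, key=lambda p: p[1], reverse=True)]
    return sum(w * a for w, a in zip(weights, ordered))

# Special cases: weights [1/n]*n give the mean, [1,0,...,0] the maximum,
# [0,...,0,1] the minimum.
print(owa([3.0, 9.0, 6.0], [0.5, 0.3, 0.2]))   # 0.5*9 + 0.3*6 + 0.2*3 = 7.2
```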

Data fusion has emerged as an important application area for granular computing. Sensor data may arrive at different levels of granularity, and fusing this data requires the translation of all these data into a common granular world for further processing. Refs. [18,32] examine the fusion of numeric, interval-valued, and fuzzy-valued data, through both parametric and clustering approaches. Auephanwiriyakul and Keller [1] develop an efficient fuzzy c-means algorithm for non-interactive vectors of fuzzy numbers (i.e. vectors that can be realized purely as cylindrical closures of their elements). Their method is based on fuzzy arithmetic using α-cuts, as a direct solution using the extension principle is computationally prohibitive. Finally, data fusion using the Choquet integral can be viewed as a type of granular operation, in that fuzzy sets are combined in a type of weighted sum, using a fuzzy measure [2].



Finally, a new direction in developing granular operations (which includes the current paper) has been the development of mathematical operations using linguistic terms. Some of the basic operations that have been developed include linguistic sum, difference, product, and a scalar product (see Section 3, or the expanded discussion in [11]), and a linguistic gradient operator [10]. The linguistic gradient has been applied to the development of a granular similarity detector for fuzzy rulebases [27], while the linguistic arithmetic was developed as part of a granular neural network in [11].

2.2. Qualitative reasoning

The human ability to operate under uncertain conditions, and to make decisions based on incomplete and imperfect information, has not escaped the notice of researchers outside the domain of computational intelligence. Qualitative reasoning is a branch of artificial intelligence research that also deals with information granularity. In qualitative reasoning, all granules are treated as compact intervals of the real line, and so qualitative reasoning is limited to the set-based formalism of granular computing. Each interval receives a label, and mathematical operators are developed for those labels. One example is the sign algebra (see [9], among others). The sign algebra contains the value set {−, 0, +, ?}, which represent the intervals (−∞, 0), 0, (0, ∞), and ambiguity, respectively. The usual operations of algebra are defined for these values in the following manner: the result of an expression in the sign algebra is the sign of the result, as determined by simple algebra. For instance, − multiplied by − is clearly +. However, an appeal to numeric precision in deciding between positive or negative is not allowed. Thus, − added to + is ambiguous, and the result is ?, because there is no way to determine if the result of the addition would have been positive or negative without actually performing the addition in numeric precision. Sugeno and Yasukawa have explored the relationship between qualitative reasoning and linguistic modeling in [39]; a critique and extension of their approach appears in [40].
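A minimal sketch of the sign algebra described above might look as follows (encoding the four values as characters is our own choice):

```python
# Value set of the sign algebra: '+', '-', '0', and '?' (ambiguous).
NEG, ZERO, POS, AMB = '-', '0', '+', '?'

def sign_mul(a: str, b: str) -> str:
    """Qualitative product: the sign of the numeric product, when determinable."""
    if AMB in (a, b):
        return AMB
    if ZERO in (a, b):
        return ZERO
    return POS if a == b else NEG

def sign_add(a: str, b: str) -> str:
    """Qualitative sum: '+' plus '-' is ambiguous without numeric precision."""
    if AMB in (a, b):
        return AMB
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if a == b else AMB

print(sign_mul(NEG, NEG))  # '+'
print(sign_add(NEG, POS))  # '?'
```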

2.3. Fuzzy arithmetic

Fuzzy arithmetic usually refers to arithmetic operations involving fuzzy numbers, which are convex, normal fuzzy subsets of the real numbers. Fuzzy numbers represent a value that is not precisely known, and a considerable amount of work has been done in defining arithmetic operators for them; see [25] or [23] for a thorough overview. The most general approach to defining arithmetic operations for fuzzy numbers is to use the extension principle, first proposed by Zadeh in [49] and explored in depth by Yager in [41]. The extension principle "fuzzifies" a given real-valued function, transforming the domain and codomain from the real numbers to fuzzy numbers. However, attempting to directly use the extension principle to define arithmetic operations generally reduces to a non-linear programming problem [19,16].

The second method for defining fuzzy arithmetic operators is restricted to the so-called LR fuzzy numbers. An LR fuzzy number is a fuzzy number whose membership function is of the form

A(x) = \begin{cases} L\left(\dfrac{a - x}{\alpha}\right) & \text{if } (a - \alpha) \le x \le a \\ R\left(\dfrac{x - a}{\beta}\right) & \text{if } a \le x \le (a + \beta) \\ 0 & \text{otherwise} \end{cases}    (2.4)

where a is the unique modal value, α, β > 0 are respectively the left and right spread of the fuzzy number, and L(·), R(·) are shape functions for which L(0) = R(0) = 1 and L(1) = R(1) = 0. In order to define fuzzy arithmetic operators on LR fuzzy numbers, we can use α-cuts of the fuzzy numbers. Since LR fuzzy numbers are convex, the family of α-cuts of a fuzzy number forms a nested family of compact intervals of the real line; this means that interval arithmetic can be directly applied, leading to a much more computationally efficient method for performing fuzzy arithmetic. Dubois and Prade have produced a large body of work on LR fuzzy numbers; see Refs. [13,15,14], for example.

It is also possible to derive arithmetic operators that act only on certain classes of fuzzy numbers. For instance, Meier defines sum, product and difference operators for the class of triangular fuzzy numbers in [30]. Members of this class of fuzzy numbers are described by three parameters (a, b, c), where a and c are respectively the left and right endpoints of the triangular function, and b is the modal value. Meier's operations are based on manipulating these parameters rather than the extension principle or α-cuts, and are thus highly efficient. In a similar vein, Giachetti and Young [16] develop a parameterized representation and arithmetic operators for several classes of fuzzy numbers, including triangular fuzzy numbers.


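As an illustration of such parameter-level arithmetic, a sketch of sum, difference, and non-negative scalar multiples of triangular fuzzy numbers might look as follows; these are the standard α-cut results for triangular shapes and are not claimed to reproduce Meier's exact operator definitions in [30]:

```python
from dataclasses import dataclass

@dataclass
class TriFN:
    """Triangular fuzzy number (a, b, c): left endpoint, modal value, right endpoint."""
    a: float
    b: float
    c: float

    def __add__(self, other: "TriFN") -> "TriFN":
        # Sum of triangular fuzzy numbers is again triangular (exact result).
        return TriFN(self.a + other.a, self.b + other.b, self.c + other.c)

    def __sub__(self, other: "TriFN") -> "TriFN":
        # Difference: each left endpoint pairs with the other number's right endpoint.
        return TriFN(self.a - other.c, self.b - other.b, self.c - other.a)

    def scale(self, s: float) -> "TriFN":
        # Scalar multiple by a non-negative real s.
        return TriFN(s * self.a, s * self.b, s * self.c)

x = TriFN(1.0, 2.0, 3.0)
y = TriFN(0.5, 1.0, 2.0)
print(x + y)   # TriFN(a=1.5, b=3.0, c=5.0)
print(x - y)   # TriFN(a=-1.0, b=1.0, c=2.5)
```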

2.4. Linguistic variables

Zadeh's recent interpretation of fuzzy set theory as the 'computational theory of perception' rests on the construct of the linguistic variable. A linguistic variable is a variable whose possible values are drawn from a set of words in natural language. A specific meaning is attached to each word by associating it with a fuzzy set on some appropriate universe of discourse. Linguistic variables were introduced by Zadeh in [48], and greatly elaborated in Zadeh's 1975 monograph [49]. Formally, a linguistic variable is a 5-tuple (X, U, T, S, M) where X is the name of the variable, U is the universe of discourse, T is a set of atomic terms and hedges, S is a syntactic rule, and M is a semantic rule. The syntactic rule S is a context-free grammar for generating linguistic terms (the values of X), using the elements of T as terminal symbols. The semantic rule M associates a fuzzy subset of U with each linguistic term. This is done by associating a fuzzy set with each atomic term in T, and a function H: [0, 1] → [0, 1] with each hedge in T. To determine the meaning of a linguistic term (i.e. its associated fuzzy set), we first determine the parse tree for that term, given the grammar S. We then apply hedge functions to the fuzzy set associated with the atomic portion of the linguistic value, in the order indicated by the parse tree [46].

The idea of linguistic variables has given rise to an entire industry that uses fuzzy logic to model and control a variety of systems. However, comparatively little attention has been paid to linguistic hedges. A hedge is a function

H: [0, 1] → [0, 1]    (2.5)

associated by the semantic rule M with a linguistic modifier, such as the word "very." The function H modifies the fuzzy set associated with an atomic term, and thus modifies its meaning. Hedges were first proposed by Zadeh in [46], and further explored in [47], where the idea of treating hedges as a function was introduced. The idea of coupling a change in a term (a syntactic operation) with a change in the meaning of the term (a semantic operation) is an essential feature of linguistic variables. Lakoff published a major work on hedges in 1973 [26]. His analysis was based on psychological experiments, in which subjects were asked to assign objects a degree of membership to a given category. He concluded that hedges must be represented as a vector-valued function, whose components are themselves membership functions, rather than the scalar functions proposed by Zadeh.

2.5. Linguistic space and linguistic functions

The ideas of linguistic space and linguistic functions can be traced to a pair of papers by Braae and Rutherford [7,8]. In these two papers, linguistic spaces of one and two dimensions were defined, representing the state space of a system. State spaces of one dimension were defined by a linguistic variable, while a two-dimensional state space is the cross-product of two linguistic variables. A linguistic trajectory is the sequence of states in this linguistic space that the system passes through. Linguistic spaces were defined by a partition of the system's numeric state space, induced by linguistic variables representing the system. These concepts are used to derive an optimal fuzzy controller for that system, and to examine the stability of the controlled system using geometric considerations. These ideas are summarized in [12], and criticized for not extending to higher-dimensional spaces.

A somewhat different interpretation of a linguistic space is given in [28]. This paper again defines a linguistic space as the cross-product of several linguistic variables. The term sets for each of these linguistic variables were ordered by using a standard vector. The standard vector contains all atomic terms in a linguistic variable, arranged in some "reasonable" ordering. For an N-dimensional space, there would thus be N standard vectors, each associated with one dimension of the space. The idea of the standard vector will be important in defining the linguistic arithmetic used in our current paper.



In general, a function is a mapping between two sets, having two properties: first, the mapping must include all elements of both the domain and codomain sets; and second, an element in the domain may only map to a single element in the codomain. Plainly, we could describe a linguistic function as a mapping between two sets of linguistic terms, subject to those same restrictions. This is the approach that will be taken in the current paper. An alternative idea would be to find a granular function between two sets by using a customized clustering algorithm. Instead of clustering two sets separately, and thus generating clusters that represent the individual sets, we might require that the clustering algorithm find clusters in the two datasets that preserve a known mapping between them. This is the idea of directional clustering, described in [3,34]. In directional clustering, we cluster the domain set of a mapping as usual, and alter the FCM objective function in clustering the codomain set. The modified objective function is

Q = \left( \sum_{y \in Y} \sum_{k \in K} \mu_{i,k}^2(y)\, d^2(y, k) \right) + \beta \left( \sum_{y \in Y} \sum_{k \in K} \left[ \mu_{i,k}(y) - \phi_i(U[1]) \right]^2 d^2(y, k) \right)    (2.6)

The additional term drives the FCM algorithm to find codomain clusters that both represent the codomain and maintain the mapping between domain and codomain at the level of information granules; the balance between these two objectives is represented by the constant β. Notice that, since only the codomain clusters are driven to maintain the mapping, there is no assurance that a bijective mapping at the numeric level would be maintained at the level of granules; it would seem necessary to have both domain and codomain clusters optimized to maintain such a mapping.

3. Linguistic arithmetic

The principal mechanism underlying our regranulation operator is the linguistic arithmetic developed in [11]. For the reader's convenience, we will now review the main concepts behind the linguistic arithmetic, and provide definitions of the operators in linguistic arithmetic. We will first discuss generalized linguistic variables (in which the syntactic rule is a phrase-structure grammar instead of the traditional context-free grammar) along with two novel linguistic hedges, first discussed in [11], which are at the heart of the linguistic arithmetic. With this basic material in place, we will then define the operations of linguistic arithmetic. We omit a discussion of the properties of linguistic arithmetic operators herein; readers interested in this material are directed to [11] for an expanded discussion.

3.1. Generalized linguistic variables

As noted in Section 2.4, a linguistic variable is a 5-tuple (X, U, T, S, M), with X the name of the variable, U the universe of discourse, T the term set, S a syntactic rule, and M a semantic rule. Virtually all practical applications of fuzzy systems have S equal to null for every LV, so that no terms other than atomic terms are generated. The semantic rule M simply associates a fuzzy set with each atomic term. This means that a fuzzy rulebase can be implemented as a lookup table, rather than having to compute a composition of multiple functions for each term. However, these rulebases are completely static; there is no need, nor any support, for interpolating a linguistic value between two terms. This, however, is precisely our goal in regranulating a rulebase, and so we must add a mechanism that will permit this interpolation.

The mechanism that we will use to permit interpolation between two terms in a linguistic variable is the use of two novel linguistic hedges, coupled with a set of rewrite rules to be inserted into the syntactic rule of the LV. The hedges are greater than and less than, as defined below for a linguistic variable L = (X, U, T, S, M):

Definition 3.1. The linguistic hedge greater than is defined by the function

\mu_{\text{greater than}}(x) = \begin{cases} (\mu(x))^2 & \text{if } x \le x_0 \\ (\mu(x))^{0.5} & \text{if } x > x_0 \end{cases}    (3.1)


where x ∈ U, μ(x) is the membership function of an LR fuzzy number whose core is the singleton x0 and which is associated with some term t ∈ T.

Definition 3.2. The hedge less than is defined by the function

\mu_{\text{less than}}(x) = \begin{cases} (\mu(x))^{0.5} & \text{if } x \le x_0 \\ (\mu(x))^2 & \text{if } x > x_0 \end{cases}    (3.2)

where x and μ obey the same provisions as in Definition 3.1. These hedges differ from most others in the literature, as they do not act uniformly over the entire support of a fuzzy set. One exception is the INT operator in [47], which also has a non-uniform behavior over the support of its argument. However, where the INT operator discriminates between different values of the membership function μ(x), the "less than" and "greater than" hedges discriminate between different values in the universe of discourse U. Zadeh also provided a different definition for the hedge "greater than" in [47]. To illustrate the effect of these hedges, we depict a term ("small") in Fig. 1, and the composite terms "less than small" and "greater than small" in Figs. 2 and 3, respectively.
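A small sketch of Definitions 3.1 and 3.2 applied to a triangular term might look as follows (the triangle parameters standing in for "small" are ours, chosen only to mimic Figs. 1–3):

```python
def hedge_greater_than(mu, x0):
    """'Greater than' hedge of Definition 3.1: concentrate membership below the
    core x0 (square it) and dilate it above x0 (square root)."""
    def hedged(x: float) -> float:
        m = mu(x)
        return m ** 2 if x <= x0 else m ** 0.5
    return hedged

def hedge_less_than(mu, x0):
    """'Less than' hedge of Definition 3.2: the mirror image of 'greater than'."""
    def hedged(x: float) -> float:
        m = mu(x)
        return m ** 0.5 if x <= x0 else m ** 2
    return hedged

def tri(a: float, b: float, c: float):
    """Triangular membership function with support [a, c] and core {b}."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

small = tri(1.0, 1.3, 1.6)                 # a term like "small" in Fig. 1
gt_small = hedge_greater_than(small, 1.3)  # cf. Fig. 3
lt_small = hedge_less_than(small, 1.3)     # cf. Fig. 2
print(small(1.2), gt_small(1.2), lt_small(1.2))
```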

We now encounter a problem with the syntactic rule of our linguistic variable. Linguistically, there is a point at which "greater than greater than . . . greater than SMALL" will become "less than less than . . . less than MEDIUM." However, this requires a replacement of the atomic term SMALL with the atomic term MEDIUM, and of each instance of the hedge "greater than" with an instance of "less than," all triggered by adding a threshold number of "greater than" hedges to SMALL. This operation (hereafter referred to as crossover) cannot be encoded into a linguistic variable using the standard definitions. The problem is that the syntactic rule S of a linguistic variable is a context-free grammar. In a context-free grammar, the left-hand side of any rewrite rule must be a single non-terminal, making it impossible to perform the syntactic operation of crossover.

Fig. 1. Triangular membership function "small".

Fig. 2. Less than small.

Fig. 3. Greater than small.



In order to include the crossover operation in the syntactic rule S of a linguistic variable, we must define S to be a phrase-structure grammar, which allows us to place an arbitrary combination of terminals and non-terminals on the left-hand side of a rewrite rule. Since the set of languages that can be defined by context-free grammars is a proper subset of the languages that can be formed by phrase-structure grammars, this definition of S makes our linguistic variables more general than the classical LV. Accordingly, we will refer to these constructs as generalized linguistic variables (GLV). Zadeh studied using phrase-structure grammars for the syntactic rule of an LV in [46], but rejected the idea as being computationally too expensive. However, in the specific case of having only atomic terms and the hedges "greater than" and "less than" in the term set of a GLV, the computational burden is not an issue, as we shall see.

Definition 3.3. A generalized linguistic variable (GLV) is a 5-tuple (X, U, T, S, M) where X is the name of the variable, U is the universe of discourse for X, T is a set of linguistic terms and hedges, S is a syntactic rule for generating new terms (a phrase-structure grammar), and M is a semantic rule that associates each string generated by S with a fuzzy subset of U.

An example of the syntactic rule S is given in Fig. 4. For this rule, the atomic terms are negative (N), zero (Z), and positive (P). Crossovers occur when three Greater Than (GT) hedges are added to a term, or when three Less Than (LT) hedges are added to a term. After the crossover, for instance, GT GT GT N would be rewritten as LT LT Z. Note also that GT LT t = LT GT t = t for any term t. Plainly, the hedges GT and LT are semantic inverses of each other; we have incorporated this into the syntactic rule as well. One of the very important implications of this is that any composite term generated by our grammar will be a homogeneous sequence of either GT or LT hedges added to an atomic term, which will greatly simplify the implementation of GLVs. Note also that there is no atomic term "larger" than positive, or "lesser" than negative. We therefore permit an infinite number of GT hedges to be attached to the positive term, and an infinite number of LT hedges for the negative term. The threshold for executing crossovers is a constant value for a given GLV, denoted v. When the number of GT or LT hedges for a term exceeds v, a crossover occurs.

<start> → <hedge> <term>
<term> → N
<term> → Z
<term> → P
<hedge> → $
<hedge> → <hedge> GT
<hedge> → <hedge> LT
GT LT → $
LT GT → $
GT GT GT N → LT LT Z
LT LT LT Z → GT GT N
GT GT GT Z → LT LT P
LT LT LT P → GT GT Z

Fig. 4. Syntactic rule.



3.2. Linguistic arithmetic

We define four operators for the linguistic arithmetic: linguistic sum, linguistic difference, scalar product (the product of a numeric value and a linguistic term) and linguistic product. These are the operations we will use to create the regranulation algorithm in Section 4. These operators accept linguistic terms from GLVs as their operands, and return a linguistic value from a GLV. Two further concepts are used to define the linguistic arithmetic operators: an ordering of the terms in a GLV, and a distance between two terms.

We will represent the ordering of terms in a GLV using a standard sequence, an analog of the standard vector in [28]. Recall that the standard vector is a listing of the terms of a linguistic variable, arranged in some "reasonable" ordering. The set of terms for the linguistic variable is assumed finite. The standard sequence extends this idea to term sets with an infinite number of entries. Take for instance the GLV in Fig. 4. We denote the term set of this GLV as τ. We first arrange the atomic terms in a standard vector x, as in [10]. We then build the standard sequence σ, consisting of all the elements of τ, arranged according to the following five rules:

1. Atomic terms are ordered by the standard vector x.
2. For any term t ∈ τ, LT t < t.
3. For any term t ∈ τ, t < GT t.
4. For any term t ∈ τ, if GT t triggers a crossover, then t < crossover(GT t).
5. For any term t ∈ τ, if LT t triggers a crossover, then crossover(LT t) < t,

where crossover(·) denotes the element of τ generated by a crossover operation. Clearly, these rules impose a total ordering on the elements of τ. In our example, the standard sequence is:

σ = (. . ., LT N, N, GT N, GT GT N, LT LT Z, LT Z, Z, GT Z, GT GT Z, LT LT P, LT P, P, GT P, . . .)

where N = negative, Z = zero, and P = positive. In general, a standard sequence must be constructed for each GLV.

The point of departure for our discussion of a distance between two linguistic terms comes from Stilman [38]. He defines a distance metric using a family of reachability sets, which are subsets of some universe of discourse whose elements represent locations in a network. The set of all points y in a network that can be reached from point x in a single transition is a reachability set for x. Stilman defines the distance between x and y as the minimum number of transitions required to reach x from y, regardless of any underlying physical distance. We can adapt this distance metric to GLV variables as follows: for any two entries x, y ∈ σ, the number of entries appearing between x and y will always be finite and known. For instance, in the example of Fig. 4, there are exactly four terms that separate the entries "N" and "Z" (GT N, GT GT N, LT LT Z, LT Z). We will henceforth refer to these in-between terms as intervening terms.

Definition 3.4. Given a GLV with term set τ, ordered by a standard sequence σ, the difference D(t1, t2) between two terms t1, t2 ∈ τ is an integer whose value is the number of intervening terms between t1 and t2 plus 1, and whose sign is positive if t1 < t2 and negative if t1 > t2.

Definition 3.4 establishes a numeric difference metric over the term set τ of a GLV, as ordered by the standard sequence σ. We could directly use Definition 3.4 to define our linguistic addition and subtraction operators; however, we cannot define linguistic or scalar products in this manner. Definition 3.4 provides an interval scale over the term set τ; the elements of τ are totally ordered, and a linear distance exists between any two elements.


Sum and difference operators can be defined for interval scales; however, product operations cannot be [17]. In order to define product operations, we must incorporate a multiplicative zero into τ, which will give us a ratio scale over τ. (More precisely, the "zero term" in τ must constitute an additive identity and a multiplicative zero under the operations of linguistic sum and linguistic product.) Sum, difference and product operations may then be defined. Accordingly, we will use Definition 3.4 as a building block for a mapping between τ and the set of integers; this function will provide a ratio scale over τ, and a framework for defining all operations in the linguistic arithmetic.

Definition 3.5. The mapping

C: τ → Z
C(t) = D(ZT, t)    (3.3)

where t ∈ τ, Z is the set of integers, ZT is the "zero term" of τ, and the function D is defined in Definition 3.4. When t = ZT, C(t) = 0, giving us a ratio scale. This function treats a linguistic value as a vector in a one-dimensional space of linguistic terms, which describes the location of that term relative to ZT. This corresponds to the classical interpretation of a number as a vector on the real line. We will define the operations of linguistic arithmetic in terms of the function C. We will first state a theorem concerning the cardinality of τ.

Theorem 1. The term set τ, as ordered by the standard sequence σ, is countably infinite [11].

Proof. We refer the reader to [11] for a proof of this statement; for the reader's convenience, we note that the proof relies on demonstrating that C is one-to-one and onto (a bijection); hence, the sets τ and Z must be identical in size. □

Corollary. The inverse of the function C(t), denoted C⁻¹(x), is a mapping from Z to the unique t ∈ τ for which C(t) = x.

Definition 3.4 provides a linear distance over linguistic terms, creating an interval scale. Definition 3.5 establishes a multiplicative zero for that interval scale, making it a ratio scale. Theorem 1 shows that the set of elements belonging to this scale is countably infinite, and its corollary establishes the inverse function C⁻¹. Therefore, we may now proceed to define sum, difference and product operators over this ratio scale.

Definition 3.6. We define the operations of linguistic arithmetic to be linguistic sum, linguistic difference, scalar product, and linguistic product.

(a) The linguistic sum is a mapping

⊕: τ × τ → τ
t1 ⊕ t2 = C⁻¹(C(t1) + C(t2))    (3.4)

(b) The linguistic difference is a mapping

⊖: τ × τ → τ
t1 ⊖ t2 = C⁻¹(C(t1) − C(t2))    (3.5)

(c) The scalar product is a mapping

⊙: R × τ → τ
s ⊙ t1 = C⁻¹(round(s · C(t1)))    (3.6)

(d) The linguistic product is a mapping

•: τ × τ → τ
t1 • t2 = C⁻¹(round((C(t1) · C(t2)) / v))    (3.7)


In the above, τ is the set of composite terms for a GLV, t1, t2 ∈ τ, s ∈ R is a real value, + is integer addition, − is integer subtraction, · is integer multiplication, / is real division, and round(x) is the nearest integer to the real value x. The linguistic product in Eq. (3.7) includes a scaling factor of 1/v, which causes terms having the zero term of τ as their atomic component to act as "fractions"; that is, when one of these terms is multiplied by another term t, the linguistic product y follows the rule |C(y)| < |C(t)|.

Example 3.1. The linguistic product GT GT P • LT Z is computed for the GLV in Fig. 4 as follows:

GT GT P • LT Z = C⁻¹(round((C(GT GT P) · C(LT Z)) / 2))
             = C⁻¹(round((7 · (−1)) / 2))
             = C⁻¹(−4)
             = GT N
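A compact sketch of how the mapping C, its inverse, and the operators of Definition 3.6 might be realized for the example GLV of Fig. 4 follows; the string representation of terms and the assumption v = 2 (consistent with Example 3.1) are ours:

```python
ATOMS = ["N", "Z", "P"]        # atomic terms of the example GLV (Fig. 4)
ZERO_TERM = "Z"                # the "zero term" ZT
V = 2                          # crossover threshold v assumed for this GLV
SPACING = 2 * V + 1            # offset between adjacent atomic terms in sigma

def C(term: str) -> int:
    """Map a composite term to its signed distance from ZT (Eq. (3.3))."""
    parts = term.split()
    hedges, atom = parts[:-1], parts[-1]
    k = hedges.count("GT") - hedges.count("LT")
    return SPACING * (ATOMS.index(atom) - ATOMS.index(ZERO_TERM)) + k

def C_inv(n: int) -> str:
    """Recover the unique term with C(term) = n, applying crossovers as needed."""
    i = ATOMS.index(ZERO_TERM)
    while n > V and i < len(ATOMS) - 1:      # crossover towards the larger atom
        i, n = i + 1, n - SPACING
    while n < -V and i > 0:                  # crossover towards the smaller atom
        i, n = i - 1, n + SPACING
    prefix = "GT " * n if n >= 0 else "LT " * -n
    return prefix + ATOMS[i]

def lsum(t1: str, t2: str) -> str:           # linguistic sum, Eq. (3.4)
    return C_inv(C(t1) + C(t2))

def ldiff(t1: str, t2: str) -> str:          # linguistic difference, Eq. (3.5)
    return C_inv(C(t1) - C(t2))

def sprod(s: float, t1: str) -> str:         # scalar product, Eq. (3.6)
    return C_inv(round(s * C(t1)))

def lprod(t1: str, t2: str) -> str:          # linguistic product, Eq. (3.7)
    return C_inv(round(C(t1) * C(t2) / V))

print(lsum("GT Z", "LT P"))       # terms combine through their integer codes
print(lprod("GT GT P", "LT Z"))   # GT N, reproducing Example 3.1
```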

4. Regranulating fuzzy rulebases

In this section, we provide a comprehensive overview of the regranulation algorithm. The algorithm is based on the idea of zooming in image processing, using linear interpolations between pixels. Regranulation is likewise a transformation from an original (source) rulebase to a regranulated (sink) rulebase. Rulebases are treated as functions in a linguistic space; the consequent of a rule in a sink rulebase is a weighted sum of the consequents from all overlapping rules in the source rulebase. The weighted sum is computed using the operations of linguistic arithmetic discussed in the preceding sections. We use the well-known fuzzy-PD rulebase to demonstrate our algorithm, and find that a 3 × 3 granulation is perhaps too coarse for this particular problem.

4.1. Regranulation algorithm

Linear interpolation is a very common technique in diverse fields of study. Our particular algorithm draws its inspiration from the use of linear interpolation in image processing, where it is used in image magnification. The basic consideration in such algorithms is to create a transformed image from an original, in which (a) a group of pixels in the transformed image represents a single pixel from the original (i.e. zooming in), or (b) a single pixel in the transformed image represents a group of pixels in the original image (zooming out). Let us first consider case (a); one simple approach is to simply replicate the value of the original pixel for all corresponding pixels in the transformed image, i.e. each pixel is a 0-order hold. The problem, of course, is that the image becomes grainy and unnatural to human eyes. A superior approach is to first interlace the original image with rows and columns of 0's. Then each element of a '0' column is treated as a first-order hold between the two pixels horizontally adjacent to it. Then, each element of a '0' row is likewise treated as a first-order hold between the pixels vertically adjacent to it. The resulting image is twice the size of the original, and is not grainy; rather, it appears slightly blurred, a visual effect that is much less startling to the human eye [21]. For case (b), a basic approach is to set the pixel to the average value of the corresponding pixels in the original image. Drawing these cases together, and considering the case of magnifications that are not integer multiples, we see that the ordinary weighted sum is the general operation underlying these algorithms.

In regranulating a fuzzy rulebase, we assume that the rulebase is complete and consistent (forming a function from the linguistic-valued domain to the consequent linguistic variable); that all linguistic variables induce a uniform partition of their universes of discourse; that the term sets of all linguistic variables contain only atomic terms; that the semantic rule for each linguistic variable associates each term with an LR fuzzy number; and that the term sets, when ordered by the standard vector of their LV, are either symmetric about some zero term or have the zero term as their least element. In close analogy with image processing, we must set the consequent value for each rule in the sink rulebase to represent the consequent value for some set of rules in the source rulebase. Our algorithm does not require the granularity of the LVs in the sink rulebase to be an integer multiple of the granularity of the corresponding LVs from the source rulebase; indeed, our example in Section 4.2 covers the case of different, prime granulations in each LV.


The consequent value of a rule in the sink rulebase will be the weighted sum of the consequents of all overlapping rules from the source rulebase. Overlapping rules are those source rules whose partitions overlap the partition of the sink rule being computed. The partitions in a rulebase are defined by the semantic rules of the underlying linguistic variables, and represent regions of the input space where one specific rule has the highest membership. This is best visualized in the two-dimensional case, as in Fig. 5. In this example, the rulebase is being coarsened from a 5 × 5 granulation (bold lines) to a 3 × 3 granulation (thin lines). Consider the sink rule in the upper left corner; this rule will overlap with four rules from the original 5 × 5 rulebase. We must now consider a weighting of the consequents from each of those overlapping rules. Our approach is to compute the fraction of the partition of the sink rule that overlaps each of the source rules; that fraction is then the weight assigned to the consequent of that source rule. More precisely, consider a linguistic antecedent from an overlapping rule in the source rulebase, belonging to an LV over the universe of discourse Ui, and having a partition interval of [u1, u2]. There will be a linguistic antecedent in the corresponding rule in the sink rulebase, also defined over Ui, and having partition interval [v1, v2], and the intervals [u1, u2] and [v1, v2] will overlap to some extent. Compute the normalized overlap in this dimension by

\frac{\min(u_2, v_2) - \max(u_1, v_1)}{v_2 - v_1}    (4.1)

and compute the final weight of the rule as the product of all the one-dimensional overlaps between the source and sink rule. Clearly, the final sum of the weights of all overlapping source rules will be exactly 1. The consequent of the sink rule is then calculated by

\sum_{i: w_i \ne 0} w_i \odot C(\tilde{x}_i)    (4.2)

where wi is the weight assigned to overlapping source rule x̃i, C(x̃i) is the consequent of source rule x̃i, ⊙ represents the scalar product of Eq. (3.6), and the summation represents the linguistic sum from Eq. (3.4). This algorithm is summarized in Fig. 6.

Examining Fig. 6, we can see that the computational complexity of the regranulation algorithm will be proportional to the product of the # of source rules, # of sink rules and the input dimensionality. Considering that we assume the rulebases are complete and consistent, the regranulation algorithm will definitely experience the "curse of dimensionality". The number of source and sink rules will increase exponentially as the number of input dimensions increases. Some possible strategies to reduce the computational burden of regranulating high-dimensional rulebases would be to eliminate rules with 0 overlap from the computation, and to delete the assumption of a complete rulebase.

Fig. 5. Overlapping source and sink rules.

For every rule yi in the sink rulebase
    For every rule xi in the source rulebase
        Product = 1
        For every input dimension
            Compute the normalized overlap ω (Eq. (4.1)) between the current source rule and the current sink rule in the current dimension
            Product = Product * ω
        Wi = Product
    C(yi) is computed using Eq. (4.2) for all Wi, xi
End

Fig. 6. Regranulation algorithm.


These strategies are, however, beyond the scope of the current paper.
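For illustration only, the following minimal Python sketch computes the weights of Eq. (4.1) and the weighted combination of Eq. (4.2) for uniformly partitioned LVs over normalized universes. The helper functions are our own, and the consequents are treated as ordinary triangular fuzzy numbers combined component-wise, a simplification of the linguistic sum and scalar product of Section 3.

import numpy as np
from itertools import product

def partition_intervals(n):
    # Partition intervals of a uniform n-term LV on [0, 1]: each term "owns" the
    # region where it has the highest membership (our own helper, not from the paper).
    centers = np.linspace(0.0, 1.0, n)
    edges = np.concatenate(([0.0], (centers[:-1] + centers[1:]) / 2.0, [1.0]))
    return [(edges[k], edges[k + 1]) for k in range(n)]

def normalized_overlap(src, snk):
    # Eq. (4.1), clipped at zero so non-overlapping rules simply receive weight 0.
    u1, u2 = src
    v1, v2 = snk
    return max(0.0, min(u2, v2) - max(u1, v1)) / (v2 - v1)

def regranulate(source_consequents, n_src, n_snk, dims=2):
    # Eq. (4.2): each sink-rule consequent is the overlap-weighted combination of
    # the overlapping source-rule consequents, here represented as triangular
    # fuzzy numbers (a, b, c) combined by component-wise fuzzy arithmetic.
    src_parts = partition_intervals(n_src)
    snk_parts = partition_intervals(n_snk)
    sink = {}
    for snk_idx in product(range(n_snk), repeat=dims):
        acc = np.zeros(3)
        for src_idx in product(range(n_src), repeat=dims):
            w = 1.0
            for d in range(dims):
                w *= normalized_overlap(src_parts[src_idx[d]], snk_parts[snk_idx[d]])
                if w == 0.0:          # skip rules with zero overlap early
                    break
            if w > 0.0:
                acc += w * np.asarray(source_consequents[src_idx], dtype=float)
        sink[snk_idx] = tuple(acc)
    return sink

For example, regranulate(consequents_5x5, 5, 3) coarsens a complete 5 · 5 rulebase to 3 · 3, where consequents_5x5 maps each pair of antecedent indices to a triangular consequent (a, b, c).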

4.2. Illustrative example: fuzzy-PD rulebases

In this section, we will examine the widely used fuzzy-PD rulebase using regranulation. The fuzzy-PD rulebase is normally a control element, whose inputs are the error signal from a plant and its time derivative. The control goal is to stabilize the plant at a given set point. Our experimental goal is to determine whether regranulation can provide quantifiable evidence of a qualitative difference between different granulations of the basic fuzzy-PD rulebase. Consider the decision surface plots (generated by MATLAB®) of the 3 · 3, 5 · 5, 7 · 7 and 9 · 9 fuzzy-PD rulebases, presented in Figs. 7-10, respectively. In all these figures, we use Mamdani max-min inferencing and the centroid defuzzifier.

Fig. 7. Fuzzy-PD rulebase, 3 · 3 granulation.

Fig. 8. Fuzzy-PD rulebase, 5 · 5 granulation.


As can be seen, with an increasingly refined granulation the control surface approaches the ideal of a smooth plane more and more closely. There is plainly an observable difference between the very coarse 3 · 3 rulebase and the refined 9 · 9 rulebase; indeed, there is already a very noticeable difference between the 3 · 3 and 5 · 5 rulebases. The question is whether quantitative evidence can be found to back up this assertion, and thus provide an empirically sound reason to favor more refined rulebases over the simple 3 · 3 fuzzy-PD rulebase. This is the topic we take up in the current section.
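For readers wishing to reproduce surfaces of this kind, the sketch below builds an n · n fuzzy-PD decision surface with Mamdani max-min inferencing and centroid defuzzification. The normalized universes and the usual "diagonal" fuzzy-PD rule table are our assumptions; the original figures were produced in MATLAB rather than with this code.

import numpy as np

def tri_mf(x, center, width):
    # Triangular membership function with the given peak and half-width.
    return np.clip(1.0 - np.abs(x - center) / width, 0.0, 1.0)

def fuzzy_pd_surface(n_terms=3, grid=101):
    # Decision surface of an n_terms x n_terms fuzzy-PD rulebase on [-1, 1]^2 using
    # Mamdani max-min inference and centroid defuzzification. The "diagonal" rule
    # table (output index = clipped sum of antecedent indices) is assumed here.
    centers = np.linspace(-1.0, 1.0, n_terms)
    width = centers[1] - centers[0]
    mid = (n_terms - 1) // 2
    xs = np.linspace(-1.0, 1.0, grid)
    zs = np.linspace(-1.0, 1.0, 501)            # output universe for defuzzification
    surface = np.zeros((grid, grid))
    for a, e in enumerate(xs):                  # error
        for b, de in enumerate(xs):             # change in error
            aggregated = np.zeros_like(zs)
            for i in range(n_terms):
                for j in range(n_terms):
                    k = int(np.clip(i + j - mid, 0, n_terms - 1))
                    firing = min(tri_mf(e, centers[i], width),      # min = AND
                                 tri_mf(de, centers[j], width))
                    clipped = np.minimum(firing, tri_mf(zs, centers[k], width))
                    aggregated = np.maximum(aggregated, clipped)    # max = aggregation
            if aggregated.sum() > 0.0:
                surface[a, b] = np.sum(zs * aggregated) / np.sum(aggregated)
    return surface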

One method for comparing fuzzy rulebases is the RMS difference between their reasoning surfaces. In Table 1, we present the pairwise RMS differences for all four rulebases, computed on a 200 point by 200 point grid.
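A minimal sketch of this comparison, assuming each reasoning surface is available as a vectorized function over the normalized input square (e.g., an interpolant of the surface sketch above):

import numpy as np

def rms_difference(surface_a, surface_b, grid_points=200):
    # RMS difference between two reasoning surfaces sampled on a uniform grid.
    # surface_a and surface_b map arrays of (x, y) in [-1, 1]^2 to crisp outputs.
    xs = np.linspace(-1.0, 1.0, grid_points)
    X, Y = np.meshgrid(xs, xs)
    diff = surface_a(X, Y) - surface_b(X, Y)
    return float(np.sqrt(np.mean(diff ** 2)))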

Fig. 9. Fuzzy-PD rulebase, 7 · 7 granulation.

Fig. 10. Fuzzy-PD rulebase, 9 · 9 granulation.


One might well expect the RMS differences to be smaller for closer granulations, but this is not always the case. For the 5 · 5, 7 · 7 and 9 · 9 granulations, the RMS difference generally increases as the difference between granularities increases. However, the 3 · 3 granulation is most similar to the 7 · 7 granulation, and most different from the 5 · 5 granulation, with the 9 · 9 granulation falling in between. The 5 · 5 granulation is also slightly more similar to the 9 · 9 rulebase than to the 7 · 7 rulebase, although the 9 · 9 and 7 · 7 rulebases are always more similar to rulebases with closer granulations.

Table 1
RMS differences between fuzzy-PD rulebases

          3 · 3     5 · 5       7 · 7       9 · 9
3 · 3     -         0.104997    0.098163    0.101241
5 · 5     -         -           0.063975    0.061826
7 · 7     -         -           -           0.050314
9 · 9     -         -           -           -

Table 2
RMS differences between source, derived, and goal rulebases

Source granularity   Goal granularity   Source vs. derived RMS   Derived vs. goal RMS
3 · 3                5 · 5              0.074698                 0.096029
3 · 3                7 · 7              0.100298                 0.129956
3 · 3                9 · 9              0.116395                 0.165046
5 · 5                3 · 3              0.075929                 0.065019
5 · 5                7 · 7              0.047428                 0.060447
5 · 5                9 · 9              0.042274                 0.061851
7 · 7                3 · 3              0.083522                 0.100690
7 · 7                5 · 5              0.069683                 0.071821
7 · 7                9 · 9              0.038697                 0.042085
9 · 9                3 · 3              0.094641                 0.115913
9 · 9                5 · 5              0.044398                 0.053554
9 · 9                7 · 7              0.047149                 0.050755


At a minimum, there seems to be some qualitative difference between the 3 · 3 granulation and the other rulebases; the question is whether we can find confirmatory evidence. We will use regranulation to answer this question.

In Table 2, we present the results of regranulating each of the rulebases. We report the RMS difference between the source and sink (derived) rulebases in each regranulation, and also between the derived rulebase and the original rulebase of that same granularity (the goal rulebase).

Examination of Tables 1 and 2 indicates that there is a difference between the 3 · 3 granulation of the fuzzy-PD rulebase and the others. This appears most clearly in the derived vs. goal results in Table 2. These values are often larger than the corresponding RMS differences between source and goal rulebases reported in Table 1. However, the relative change when regranulating with the 3 · 3 rulebase as the source is larger than when using any other granularity as the source rulebase. As an extreme example, regranulating from 3 · 3 to 9 · 9 produces a derived vs. goal difference roughly 63% greater than the Table 1 value of 0.101241. Combined with the behaviors identified in Table 1, we believe this constitutes evidence that the 3 · 3 granulation may be too coarse for the fuzzy-PD controller, and we would recommend using at least a 5 · 5 granulation for these control problems.
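For reference, the relative increase quoted above follows directly from the values in Tables 1 and 2:

$$\frac{0.165046 - 0.101241}{0.101241} \approx 0.63.$$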

5. Regranulation in mined rulebases

Our second experiment examines the case of rulebases created through fuzzy rule induction, specifically through the use of the info-fuzzy network (IFN) [29], a data-mining tool that can induce fuzzy rules from numerical data. The issue we wish to highlight in this experiment is what happens when a dataset contains oscillatory behavior. This might occur due to a periodic relationship between independent and dependent variables, or at the boundary of a non-convex decision region in classification problems. The problem oscillatory behavior poses for granular computing is illustrated in Fig. 11. If too coarse a granulation is used, it is not possible for a granular world to capture the oscillatory behavior; on the other hand, if the granulation is adequate, then the granular world can faithfully represent the oscillatory behavior.

Since we are interested in oscillatory behavior, we have chosen to sample a simple periodic function (given in Eq. (5.1)) in order to create our dataset. We have elected to use this dataset instead of one of the more commonly used classification datasets because the oscillatory behaviors we are interested in make up only a small fraction of those datasets. Oscillatory behaviors in a classification dataset will only occur at the class boundaries; the interior of a decision region will be essentially homogeneous. On the other hand, the regularity of a periodic function will serve to highlight this behavior, and facilitate our study.

$$z = \cos\left(\sqrt{x^2 + y^2}\right) \qquad (5.1)$$

Fig. 11. Oscillatory behavior of a variable.

Fig. 12. Function F1 (Eq. (5.1)).

Fig. 13. Semantic rule for 3 · 3 linguistic variables (terms Small, Med, Large).

Fig. 14. Semantic rule for 5 · 5 linguistic variables (terms NL, NM, Z, PM, PL).


We first sample the two-dimensional function given by Eq. (5.1) on a uniform 50 · 50 grid over the range [-10, 10] in each dimension of the domain, producing 2500 data vectors (see Fig. 12). We will induce rulebases with 3 · 3, 5 · 5, and 7 · 7 granulations from this dataset.

Fig. 15. Semantic rule for 7 · 7 linguistic variables (terms NL, NM, NS, Z, PS, PM, PL).


Each rulebase will have the same granularity for its output as for its inputs, and the semantic rules for all linguistic variables will be uniform triangular membership functions, as in Figs. 13-15. We have selected uniform triangular membership functions because of their widespread use, as well as the theoretical finding (in [3]) that uniform triangular membership functions can represent numeric data with no loss of precision. Our results, however, should be relatively insensitive to minor changes in the semantic rules, such as converting to Gaussian or trapezoidal membership functions.
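A minimal sketch of this setup follows; the normalization of each universe to [0, 1] and the helper names are our own assumptions.

import numpy as np

def f1(x, y):
    # Eq. (5.1): z = cos(sqrt(x^2 + y^2))
    return np.cos(np.sqrt(x ** 2 + y ** 2))

def sample_dataset(n=50, lo=-10.0, hi=10.0):
    # Sample F1 on a uniform n-by-n grid, yielding n*n rows of (x, y, z).
    xs = np.linspace(lo, hi, n)
    X, Y = np.meshgrid(xs, xs)
    return np.column_stack([X.ravel(), Y.ravel(), f1(X, Y).ravel()])

def triangular_terms(n_terms):
    # Uniform triangular membership functions on [0, 1] with peaks at k/(n_terms - 1),
    # mirroring the shapes in Figs. 13-15.
    centers = np.linspace(0.0, 1.0, n_terms)
    width = centers[1] - centers[0]
    return [lambda x, c=c: np.clip(1.0 - np.abs(x - c) / width, 0.0, 1.0)
            for c in centers]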

We first normalize our dataset and partition it for a 10-fold cross-validation experiment, using the xval routine supplied with the C4.5 release 8 decision-tree generator [36]. This is a stratified sampling routine that ensures that each of the 10 partitions of a dataset has the same class distribution as the original dataset. Since our dependent variable is numeric rather than categorical, we first classify each sample using the maximum-membership method on the dependent variable only. The xval routine then produces the training and testing files for the 10-fold cross-validation experiment, with the proportion of data vectors assigned to each output class balanced across all 10 partitions. We then use the IFN, version 1.4 beta, to induce complete and consistent rulebases from each training file. This yields a total of 30 rulebases, each with an associated test data file. We have also induced 3 · 3, 5 · 5 and 7 · 7 rulebases from the original dataset of 2500 data vectors; the reasoning surfaces for these rulebases are presented in Figs. 16-18, respectively. On visual inspection, we can see that the 3 · 3 rulebase clearly does not match Fig. 12, while the 7 · 7 rulebase appears to capture all the major behaviors present in Fig. 12. The 5 · 5 rulebase captures some of the behaviors of Fig. 12, but not all.
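The labeling and partitioning steps can be sketched as follows; the paper uses the xval routine from the C4.5 distribution, for which scikit-learn's StratifiedKFold stands in here purely for illustration.

import numpy as np
from sklearn.model_selection import StratifiedKFold

def max_membership_class(z, term_mfs):
    # Assign each (normalized) output value to the term with the highest membership.
    memberships = np.stack([mf(z) for mf in term_mfs], axis=1)
    return memberships.argmax(axis=1)

def stratified_folds(data, term_mfs, n_splits=10, seed=0):
    # 10-fold partitions with the output-class proportions balanced across folds;
    # data rows are (x, y, z) with z already normalized to [0, 1].
    labels = max_membership_class(data[:, 2], term_mfs)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return list(skf.split(data, labels))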

Fig. 16. Function F1, 3 · 3 granulation.

Fig. 17. Function F1, 5 · 5 granulation.

Fig. 18. Function F1, 7 · 7 granulation.


For our regranulation experiments, we will refine the 3 · 3 rulebase to the 5 · 5 and 7 · 7 granularities; we will refine the 5 · 5 rulebase to 7 · 7 and coarsen it to 3 · 3; and we will coarsen the 7 · 7 rulebase to the 3 · 3 and 5 · 5 granularities. For each of these six experiments, each of the 10 rulebases in the original granularity will be compared to each of the 10 rulebases in the target granularity. As in Section 4.2, we will measure the RMS difference between the source and derived rulebases' reasoning surfaces, and between the derived and goal reasoning surfaces. In addition, we will determine the sum of squared error (SSE) for the source and goal testing files, over the source, derived and goal rulebases. These results are presented in Tables 3-8.
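The SSE measure can be sketched as follows, assuming each rulebase's reasoning surface is available as a vectorized function:

import numpy as np

def sum_squared_error(rulebase, test_vectors):
    # SSE of a rulebase over a testing file. Rows of test_vectors are (x, y, z);
    # rulebase maps arrays of (x, y) to crisp (defuzzified) predictions.
    z_pred = rulebase(test_vectors[:, 0], test_vectors[:, 1])
    return float(np.sum((z_pred - test_vectors[:, 2]) ** 2))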

The first observation we can make from Tables 3-8 is that the sum of squared errors for the derived rulebase is always greater than that for either the source or goal rulebases.

Table 3
Refining 3 · 3 to 5 · 5

                                                     Mean      STD
Refine 3 · 3 to 5 · 5   Source vs. derived (RMS)     0.0947    0.0150
                        Derived vs. goal (RMS)       0.1489    0.0101
Source testing file     Source rulebase (SSE)        38.4183   2.7249
                        Derived rulebase (SSE)       42.8638   1.7479
                        Goal rulebase (SSE)          32.8204   1.1288
Goal testing file       Source rulebase (SSE)        38.4789   3.4382
                        Derived rulebase (SSE)       42.9086   2.3839
                        Goal rulebase (SSE)          33.2485   1.3400

Table 4
Refining 3 · 3 to 7 · 7

                                                     Mean      STD
Refine 3 · 3 to 7 · 7   Source vs. derived (RMS)     0.0905    0.0144
                        Derived vs. goal (RMS)       0.1689    0.0049
Source testing file     Source rulebase (SSE)        38.4183   2.7249
                        Derived rulebase (SSE)       44.2986   1.7379
                        Goal rulebase (SSE)          25.6915   0.9076
Goal testing file       Source rulebase (SSE)        38.4789   3.4576
                        Derived rulebase (SSE)       44.2444   1.8630
                        Goal rulebase (SSE)          25.7337   0.4946

Table 5
Refining 5 · 5 to 7 · 7

                                                     Mean      STD
Refine 5 · 5 to 7 · 7   Source vs. derived (RMS)     0.1146    0.0058
                        Derived vs. goal (RMS)       0.1612    0.0078
Source testing file     Source rulebase (SSE)        33.2485   1.3400
                        Derived rulebase (SSE)       43.1025   1.6924
                        Goal rulebase (SSE)          25.6915   0.8601
Goal testing file       Source rulebase (SSE)        32.8204   1.0645
                        Derived rulebase (SSE)       42.5619   1.8508
                        Goal rulebase (SSE)          25.7337   0.4946

Table 6
Coarsening 5 · 5 to 3 · 3

                                                      Mean      STD
Coarsen 5 · 5 to 3 · 3   Source vs. derived (RMS)     0.1385    0.0069
                         Derived vs. goal (RMS)       0.1413    0.0223
Source testing file      Source rulebase (SSE)        33.2485   1.3400
                         Derived rulebase (SSE)       44.4053   2.0734
                         Goal rulebase (SSE)          38.4789   3.4382
Goal testing file        Source rulebase (SSE)        32.8204   1.1288
                         Derived rulebase (SSE)       44.4516   2.1962
                         Goal rulebase (SSE)          38.4183   2.7249


Table 7
Coarsening 7 · 7 to 3 · 3

                                                      Mean      STD
Coarsen 7 · 7 to 3 · 3   Source vs. derived (RMS)     0.1341    0.0022
                         Derived vs. goal (RMS)       0.1292    0.0134
Source testing file      Source rulebase (SSE)        25.7337   0.4946
                         Derived rulebase (SSE)       43.0982   0.7046
                         Goal rulebase (SSE)          38.4789   3.4576
Goal testing file        Source rulebase (SSE)        25.6915   0.9076
                         Derived rulebase (SSE)       43.1705   1.3872
                         Goal rulebase (SSE)          38.4183   2.7249

Table 8
Coarsening 7 · 7 to 5 · 5

                                                      Mean      STD
Coarsen 7 · 7 to 5 · 5   Source vs. derived (RMS)     0.1383    0.0042
                         Derived vs. goal (RMS)       0.1310    0.0080
Source testing file      Source rulebase (SSE)        25.7337   0.4946
                         Derived rulebase (SSE)       40.9031   0.9208
                         Goal rulebase (SSE)          32.8204   1.0645
Goal testing file        Source rulebase (SSE)        25.6915   0.8601
                         Derived rulebase (SSE)       40.9207   1.1323
                         Goal rulebase (SSE)          33.2485   1.3400


However, a comparison of refinement and coarsening between two granularities reveals some important differences that bear directly on the question of what an adequate granularity for this problem is. Consider first the 3 · 3 vs. 5 · 5 granulations. When refining the 3 · 3 rulebase to 5 · 5, the average SSE for the derived rulebase is 42.8638 for the source testing file, and 42.9086 for the goal testing file. By comparison, when coarsening the 5 · 5 rulebase to 3 · 3, the average SSE of the derived rulebase was 44.4053 for the source testing file, and 44.4516 for the goal testing file. Thus, the rulebase derived by refining the 3 · 3 rulebase to 5 · 5 was more accurate than the one obtained by coarsening the 5 · 5 rulebase to 3 · 3 (see Tables 3 and 6). However, the other two possible pairings (3 · 3 to 7 · 7 and 5 · 5 to 7 · 7) show the opposite pattern: the average SSE is lower when coarsening the 7 · 7 rulebase to 3 · 3 or 5 · 5 than when refining either the 3 · 3 or the 5 · 5 rulebase to 7 · 7. We have tested these statements for statistical significance using the t-test, which compares two populations to determine whether they share the same mean value. The t-values for each of the refinement and coarsening combinations are presented in Table 9; these t-values are all significant at the 0.01 level. This indicates that substantially more important information is captured in the 7 · 7 granulation than in the 3 · 3 or 5 · 5 granulations, which is why coarsening a 7 · 7 rulebase leads to a lower SSE than refining a 3 · 3 or 5 · 5 rulebase to the 7 · 7 granularity. On the other hand, this pattern does not appear to hold when comparing the 3 · 3 and 5 · 5 granularities.
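A minimal sketch of this significance test on fold-wise SSE values follows; the arrays below are placeholders rather than the paper's data, and scipy's two-sample t-test stands in for the test as originally applied.

import numpy as np
from scipy import stats

# Hypothetical per-fold SSE values for two treatments (one value per fold).
refine_sse = np.array([42.9, 41.8, 43.5, 44.1, 42.2, 43.0, 41.5, 44.6, 42.7, 43.3])
coarsen_sse = np.array([44.4, 45.1, 43.9, 46.0, 44.8, 43.7, 45.5, 44.2, 45.9, 43.6])

# Two-sample t-test on the fold-wise SSE of the derived rulebases.
t_value, p_value = stats.ttest_ind(refine_sse, coarsen_sse)
print(f"t = {t_value:.4f}, p = {p_value:.4f}")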

Table 9
t-Values for comparing refinement and coarsening

Treatment                                                                t-Value
Refine 3 · 3 to 5 · 5 vs. coarsen 5 · 5 to 3 · 3: Source testing file    -5.6840
Refine 3 · 3 to 5 · 5 vs. coarsen 5 · 5 to 3 · 3: Goal testing file      -4.7609
Refine 3 · 3 to 7 · 7 vs. coarsen 7 · 7 to 3 · 3: Source testing file     6.4021
Refine 3 · 3 to 7 · 7 vs. coarsen 7 · 7 to 3 · 3: Goal testing file       4.6130
Refine 5 · 5 to 7 · 7 vs. coarsen 7 · 7 to 5 · 5: Source testing file    11.4136
Refine 5 · 5 to 7 · 7 vs. coarsen 7 · 7 to 5 · 5: Goal testing file       7.5631

Table 10
Refining 3 · 3 to 5 · 5

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS      -0.9491           <0.0001
                      Source SSE/derived to goal RMS         0.8459           <0.0001
                      Derived SSE/source to derived RMS     -0.8321           <0.0001
                      Derived SSE/derived to goal RMS        0.7211           <0.0001
                      Goal SSE/source to derived RMS        -0.1193            0.23
                      Goal SSE/derived to goal RMS          -0.0639            0.52
Goal testing file     Source SSE/source to derived RMS      -0.9175           <0.0001
                      Source SSE/derived to goal RMS         0.7920           <0.0001
                      Derived SSE/source to derived RMS     -0.8331           <0.0001
                      Derived SSE/derived to goal RMS        0.7578           <0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS          -0.0501            0.62

Table 11
Refining 3 · 3 to 7 · 7

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS      -0.9491           <0.0001
                      Source SSE/derived to goal RMS         0.7074           <0.0001
                      Derived SSE/source to derived RMS     -0.6170           <0.0001
                      Derived SSE/derived to goal RMS        0.4671           <0.0001
                      Goal SSE/source to derived RMS         0.1963            0.0503
                      Goal SSE/derived to goal RMS          -0.2106            0.0354
Goal testing file     Source SSE/source to derived RMS      -0.9123           <0.0001
                      Source SSE/derived to goal RMS         0.6072           <0.0001
                      Derived SSE/source to derived RMS     -0.8276           <0.0001
                      Derived SSE/derived to goal RMS        0.6975           <0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS           0.0695            0.49

Table 12
Refining 5 · 5 to 7 · 7

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS       0.4391           <0.0001
                      Source SSE/derived to goal RMS         0.1923            0.0553
                      Derived SSE/source to derived RMS      0.7905           <0.0001
                      Derived SSE/derived to goal RMS        0.5821           <0.0001
                      Goal SSE/source to derived RMS        -0.1089            0.2808
                      Goal SSE/derived to goal RMS          -0.2614            0.0086
Goal testing file     Source SSE/source to derived RMS       0.4313           <0.0001
                      Source SSE/derived to goal RMS         0.3674            0.0002
                      Derived SSE/source to derived RMS      0.7523           <0.0001
                      Derived SSE/derived to goal RMS        0.6821           <0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS          -0.0909            0.3687


Our interpretation of these results is that the 3 · 3 rulebase, and possibly the 5 · 5 rulebase, is too coarse to represent the underlying function, whereas the 7 · 7 rulebase does capture all the important behaviors.

Table 13
Coarsening 5 · 5 to 3 · 3

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS       0.2624            0.0084
                      Source SSE/derived to goal RMS         0.0277            0.7847
                      Derived SSE/source to derived RMS      0.9606           <0.0001
                      Derived SSE/derived to goal RMS        0.4257           <0.0001
                      Goal SSE/source to derived RMS        -0.0146            0.8852
                      Goal SSE/derived to goal RMS          -0.7631           <0.0001
Goal testing file     Source SSE/source to derived RMS       0.2672            0.0072
                      Source SSE/derived to goal RMS        -0.0365            0.7183
                      Derived SSE/source to derived RMS      0.8452           <0.0001
                      Derived SSE/derived to goal RMS        0.4562           <0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS          -0.7920           <0.0001

Table 14
Coarsening 7 · 7 to 3 · 3

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS       0.0981            0.3315
                      Source SSE/derived to goal RMS         0.0915            0.3651
                      Derived SSE/source to derived RMS      0.3926            0.0001
                      Derived SSE/derived to goal RMS        0.0659            0.5145
                      Goal SSE/source to derived RMS         0.0795            0.4316
                      Goal SSE/derived to goal RMS          -0.9397           <0.0001
Goal testing file     Source SSE/source to derived RMS      -0.1218            0.2275
                      Source SSE/derived to goal RMS         0.2500            0.0121
                      Derived SSE/source to derived RMS      0.0490            0.6281
                      Derived SSE/derived to goal RMS        0.3750            0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS          -0.9174           <0.0001

Table 15
Coarsening 7 · 7 to 5 · 5

                      Contrast                              Correlation (ρ)   p-Value
Source testing file   Source SSE/source to derived RMS       0.4483           <0.0001
                      Source SSE/derived to goal RMS         0.3443            0.0005
                      Derived SSE/source to derived RMS      0.6265           <0.0001
                      Derived SSE/derived to goal RMS        0.4338           <0.0001
                      Goal SSE/source to derived RMS        -0.3715            0.0001
                      Goal SSE/derived to goal RMS          -0.3486            0.0004
Goal testing file     Source SSE/source to derived RMS       0.4201           <0.0001
                      Source SSE/derived to goal RMS         0.4428           <0.0001
                      Derived SSE/source to derived RMS      0.6831           <0.0001
                      Derived SSE/derived to goal RMS        0.4353           <0.0001
                      Goal SSE/source to derived RMS         0                 1.0
                      Goal SSE/derived to goal RMS          -0.0801            0.4284


We can also examine the correlation between the testing SSE values in an experiment and the RMS differences between reasoning surfaces. We compute Pearson's correlation coefficient and the associated p-value between each of the testing SSE results and each of the RMS differences; the results are presented in Tables 10-15.
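A minimal sketch of this correlation analysis follows; the arrays are placeholders rather than the paper's data.

import numpy as np
from scipy import stats

# Hypothetical paired observations, one pair per source/target rulebase comparison:
# derived-rulebase testing SSE vs. the source-to-derived RMS difference.
derived_sse = np.array([42.1, 43.7, 41.9, 44.2, 42.8, 43.3, 41.5, 44.8, 42.4, 43.9])
source_to_derived_rms = np.array([0.091, 0.096, 0.089, 0.099, 0.093, 0.095,
                                  0.088, 0.101, 0.092, 0.097])

rho, p_value = stats.pearsonr(derived_sse, source_to_derived_rms)
print(f"rho = {rho:.4f}, p = {p_value:.4f}")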


Let us first consider the refinement experiments reported in Tables 10-12. In Tables 10 and 11 (refining the 3 · 3 rulebase) there are strong negative correlations between the source-to-derived RMS difference and both the source and derived rulebase testing SSE (for both the source and goal testing files). Similarly, there are strong positive correlations between the derived-to-goal RMS difference and both the source and derived rulebase testing SSE. No significant linear relationship was found between the RMS differences and the goal rulebase testing SSE. However, this pattern changes when refining the 5 · 5 rulebase to 7 · 7: the correlations between source testing SSE and the RMS differences all become weakly positive, while the derived SSE values show a strong positive correlation with both RMS differences. We can interpret these results by considering what the RMS differences reveal: they measure how different the derived rulebase is from the source and goal rulebases. A strong positive correlation between RMS differences and testing SSE may indicate that the source and derived rulebases generalize well on unseen samples from our underlying function; the regranulation process, which communicates information from one granular world to another, has shown that the original rulebase contained an adequate representation of the underlying system. A strong negative correlation may well indicate the opposite, since it means that testing SSE decreases as the difference between the source and derived rulebases increases. A similar argument can be applied to the RMS difference between the derived and goal rulebases. This indicates that, as before, the 3 · 3 rulebase definitely appears to be too coarse a granularity for this problem. The 5 · 5 granularity is not excluded by this analysis; however, the evidence that 5 · 5 is a sufficient granularity is somewhat weak.

Consider next the coarsening experiments analyzed in Tables 13-15. The only strong correlations are a positive correlation between derived SSE and the source-to-derived RMS when coarsening 5 · 5 to 3 · 3 and 7 · 7 to 5 · 5, and a negative correlation between goal SSE and the derived-to-goal RMS when coarsening 5 · 5 to 3 · 3 and 7 · 7 to 3 · 3. A coarsening experiment differs from a refinement: there is a definite loss of information in coarsening, whereas information should be preserved in refinement. We suspect the positive correlations we observe reflect a difference in the amount of information being lost; notice that these correlations appear for coarsening between adjacent granularities, whereas the correlation between derived SSE and source-to-derived RMS for coarsening 7 · 7 to 3 · 3 remains positive but weak. The negative correlations between goal testing SSE and derived-to-goal RMS may again reflect the apparent inability of a 3 · 3 rulebase to capture the important behaviors in this system. Overall, considering Tables 3-15, we would recommend the following: the 3 · 3 granulation definitely appears to be too coarse, while the 7 · 7 rulebase seems to capture all important behaviors. The case of the 5 · 5 rulebase is murkier, but a conservative approach would dictate that the 7 · 7 granularity be adopted for this problem. The evidence for this recommendation arises from an analysis of regranulation experiments, in which there are definite qualitative differences between the 3 · 3 and 7 · 7 rulebases, and important but less definite differences between the 5 · 5 and 7 · 7 rulebases.

6. Conclusions

In this paper, we have presented a new granular algorithm for communicating between granular worlds. The distinguishing characteristic of a granular algorithm is that its basic operations must treat information granules as atomic objects; in our algorithm, the basic operations are the linguistic arithmetic operators developed in [11]. Our algorithm permits communication between the subset of granular worlds defined by fuzzy rulebases, by allowing us to translate the information contained in one granular world into the granularity of another granular world. An experiment in regranulating fuzzy-PD rulebases demonstrates the operation of this algorithm, while a second experiment examines the question of what the minimum necessary granularity (which we might term the essential granularity) of a rulebase induced from data is. We believe that the essential granularity of a rulebase is data-dependent, and that mechanisms are required to distinguish adequately refined rulebases from inadequate ones. We believe there is an important role for the regranulation algorithm in this problem, as a means of comparing the information content of two rulebases using a granular computing approach.

Future work in this area will involve experimental and theoretical investigations of both linguistic arithmetic and algorithms built from it. The role of the crossover limit v is particularly interesting; in both the current paper and [11], a value of v = 7 was used. Experimental comparisons of different values of v in both regranulation and granular neural networks will provide important theoretical and practical guidance in the further development of granular algorithms based on linguistic arithmetic.


In addition, regranulation of incomplete rulebases will be an important extension of our work; a practical data-mining system necessarily produces an incomplete rulebase, to reduce the impact of the curse of dimensionality.

Acknowledgements

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant Nos. PGSB-222631-1999 and G230000109, in part by the National Institute for Systems Test and Productivity at USF under the USA Space and Naval Warfare Systems Command Grant No. N00039-01-1-2248, and in part by the Alberta Software Engineering Research Consortium under Grant No. G230000109.

References

[1] S. Auephanwiriyakul, J.M. Keller, Analysis and efficient implementation of a linguistic fuzzy c-means, IEEE Transactions on Fuzzy Systems 10 (5) (2002) 563-582.
[2] S. Auephanwiriyakul, J.M. Keller, P.D. Gader, Generalized Choquet fuzzy integral fusion, Information Fusion (2002).
[3] A. Bargiela, W. Pedrycz, Granular Computing: An Introduction, Kluwer Academic Publishers, Boston, MA, 2003.
[4] A. Bargiela, W. Pedrycz, Recursive information granulation: aggregation and interpretation issues, IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 33 (1) (2003) 96-112.
[5] G. Bortolan, W. Pedrycz, Reconstruction problem and information granularity, IEEE Transactions on Fuzzy Systems 5 (2) (1997) 234-248.
[6] G. Bortolan, W. Pedrycz, Fuzzy descriptive models: an interactive framework of information granulation, IEEE Transactions on Fuzzy Systems 10 (6) (2002) 743-755.
[7] M. Braae, D.A. Rutherford, Theoretical and linguistic aspects of the fuzzy logic controller, Automatica 15 (1979) 553-577.
[8] M. Braae, D.A. Rutherford, Selection of parameters for a fuzzy logic controller, Fuzzy Sets and Systems 2 (1979) 185-199.
[9] J. De Kleer, J.S. Brown, A qualitative physics based on confluences, Artificial Intelligence 24 (1-3) (1984) 7-83.
[10] S. Dick, W. Rodriguez, A. Kandel, A granular counterpart to the gradient operator, Soft Computing 6 (2) (2002) 124-140.
[11] S. Dick, A. Kandel, Granular computing in neural networks, in: W. Pedrycz (Ed.), Granular Computing: An Emerging Paradigm, Physica-Verlag, New York, 2001, pp. 275-305.
[12] D. Driankov, H. Hellendoorn, M. Reinfrank, An Introduction to Fuzzy Control, Springer-Verlag, New York, 1993.
[13] D. Dubois, H. Prade, Systems of linear fuzzy constraints, Fuzzy Sets and Systems 3 (1980) 37-48.
[14] D. Dubois, H. Prade, Additions of interactive fuzzy numbers, IEEE Transactions on Automatic Control 26 (4) (1981) 926-936.
[15] D. Dubois, H. Prade, On several definitions of the differential of a fuzzy mapping, Fuzzy Sets and Systems 24 (1987) 117-120.
[16] R.E. Giachetti, R.E. Young, A parametric representation of fuzzy numbers and their arithmetic operators, Fuzzy Sets and Systems 91 (1997) 185-202.
[17] M. Grabisch, S.A. Orlovski, R.R. Yager, Fuzzy aggregation of numerical preferences, in: R. Slowinski (Ed.), Fuzzy Sets in Decision Analysis, Operations Research, and Statistics, Kluwer Academic Pub., Norwell, MA, 1998, pp. 31-67.
[18] R.J. Hathaway, J.C. Bezdek, W. Pedrycz, A parametric model for fusing heterogeneous fuzzy data, IEEE Transactions on Fuzzy Systems 4 (3) (1996) 270-281.
[19] D.H. Hong, C. Hwang, A T-sum bound of LR fuzzy numbers, Fuzzy Sets and Systems 91 (1997) 239-252.
[20] F. Hoppner, F. Klawonn, R. Kruse, T. Runkler, Fuzzy Cluster Analysis: Methods for Classification, Data Analysis, and Image Recognition, John Wiley & Sons, Inc., New York, 1999.
[21] A.K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1989.
[22] J.-S.R. Jang, C.-T. Sun, E. Mizutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice-Hall, Inc., Upper Saddle River, NJ, 1997.
[23] A. Kandel, Fuzzy Mathematical Techniques with Applications, Addison-Wesley Pub. Co., Reading, MA, 1986.
[24] N.N. Karnik, J.M. Mendel, An introduction to Type-2 fuzzy logic systems, Technical Report, Signal and Image Processing Institute, Department of Electrical Engineering and Systems, University of Southern California, Los Angeles, CA, 1998.
[25] G.J. Klir, B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall PTR, Upper Saddle River, NJ, 1995.
[26] G. Lakoff, Hedges: a study in meaning criteria and the logic of fuzzy concepts, Journal of Philosophical Logic 2 (1973) 458-508.
[27] H. Li, S. Dick, A similarity measure for fuzzy rulebases based on linguistic gradients, Information Sciences, in press, doi:10.1016/j.ins.2005.09.003.
[28] C. Liu, A. Shindhelm, D. Li, K. Jin, A numerical approach to linguistic variables and linguistic space, in: Proceedings, 1996 IEEE Int. Conf. Fuzzy Systems, pp. 954-959.
[29] O. Maimon, M. Last, Knowledge Discovery and Data Mining, the Info-Fuzzy Network (IFN) Methodology, Kluwer Academic Publishers, Norwell, MA, 2000.
[30] K. Meier, Methods for decision making with cardinal numbers and additive aggregation, Fuzzy Sets and Systems 88 (1997) 135-159.
[31] Z. Pawlak, Granularity of knowledge, indiscernability, and rough sets, in: Proceedings of the 1998 IEEE Int. Conf. on Fuzzy Systems, pp. 106-110.
[32] W. Pedrycz, J.C. Bezdek, R.J. Hathaway, G.W. Rogers, Two nonparametric models for fusing heterogeneous fuzzy data, IEEE Transactions on Fuzzy Systems 6 (3) (1998) 411-425.
[33] W. Pedrycz, G. Vukovich, Granular worlds: representation and communication problems, International Journal of Intelligent Systems 15 (2000) 1015-1026.
[34] W. Pedrycz, Relational and directional aspects in the construction of information granules, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 32 (5) (2002) 605-614.
[35] W. Pedrycz, G. Vukovich, On elicitation of membership functions, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 32 (6) (2002) 761-767.
[36] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Pub., San Mateo, CA, 1993.
[37] S. Raha, N.R. Pal, K.S. Ray, Similarity-based approximate reasoning: methodology and application, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 32 (4) (2002) 541-547.
[38] B. Stilman, Linguistic geometry: methodology and techniques, Cybernetics and Systems 26 (1995) 535-597.
[39] M. Sugeno, T. Yasukawa, A fuzzy-logic-based approach to qualitative modeling, IEEE Transactions on Fuzzy Systems 1 (1) (1993) 7-31.
[40] D. Tikk, G. Biro, T.D. Gedeon, L.T. Koczy, J.D. Yang, Improvements and critique on Sugeno's and Yasukawa's qualitative modeling, IEEE Transactions on Fuzzy Systems 10 (5) (2002) 596-606.
[41] R.R. Yager, A characterization of the extension principle, Fuzzy Sets and Systems 18 (1986) 205-217.
[42] R.R. Yager, On ordered weighted averaging operators in multi-criteria decision making, IEEE Transactions on Systems, Man and Cybernetics 18 (1) (1988) 183-190.
[43] R.R. Yager, D.P. Filev, Induced ordered weighted averaging operators, IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 29 (2) (1999) 141-150.
[44] H. Wu, J.M. Mendel, Uncertainty bounds and their use in the design of interval type-2 fuzzy logic systems, IEEE Transactions on Fuzzy Systems 10 (5) (2002) 622-639.
[45] M. Ying, A formal model of computing with words, IEEE Transactions on Fuzzy Systems 10 (5) (2002) 640-652.
[46] L.A. Zadeh, Quantitative fuzzy semantics, Information Sciences 3 (1971) 159-176.
[47] L.A. Zadeh, A fuzzy-set-theoretic interpretation of linguistic hedges, Journal of Cybernetics 2 (3) (1972) 4-34.
[48] L.A. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on Systems, Man and Cybernetics 3 (1) (1973) 28-44.
[49] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning - Parts I, II, III, Information Sciences 8 (1975) 199-249; 8 (1975) 301-357; 9 (1975) 43-80.
[50] L.A. Zadeh, Fuzzy sets and information granularity, in: M.M. Gupta, R.K. Ragade, R.R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland, New York, 1979.
[51] L.A. Zadeh, Fuzzy logic = computing with words, IEEE Transactions on Fuzzy Systems 4 (2) (1996) 103-111.
[52] L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems 90 (1997) 111-127.