
838 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 33, NO. 6, DECEMBER 2003

POPFNN-CRI(S): Pseudo Outer Product Based Fuzzy Neural Network Using the Compositional Rule of Inference and Singleton Fuzzifier

Kai Keng Ang, Chai Quek, Member, IEEE, and Michel Pasquier

Abstract—A pseudo-outer-product-based fuzzy neural network using the compositional rule of inference and singleton fuzzifier [POPFNN-CRI(S)] is proposed in this paper. The correspondence of each layer in the proposed POPFNN-CRI(S) to the compositional rule of inference using the standard t-norm and fuzzy relation gives it a strong theoretical foundation. The proposed POPFNN-CRI(S) training consists of two phases, namely the fuzzy membership derivation phase, using the novel fuzzy Kohonen partition (FKP) and pseudo fuzzy Kohonen partition (PFKP) algorithms, and the rule identification phase, using the novel one-pass POP learning algorithm. The proposed two-phase learning process effectively constructs the membership functions and identifies the fuzzy rules. Extensive experimental results based on the classification performance of the POPFNN-CRI(S) using Anderson’s Iris data are presented for discussion. Results show that the POPFNN-CRI(S) takes only 15 training iterations and misclassifies only three of the 150 patterns in Anderson’s Iris data.

Index Terms—Anderson’s IRIS data, CRI, fuzzy logical foundation, POPFNN-CRI, pseudo fuzzy Kohonen partitioning of fuzzy sets, two-stage one-pass learning process.

I. INTRODUCTION

FUZZY neural networks are hybrid systems that possess the advantages of both neural networks and fuzzy systems. The integration of fuzzy systems and neural networks combines the human inference style and natural language description of fuzzy systems with the learning and parallel processing of neural networks. There are numerous approaches to integrating fuzzy systems and neural networks [16], [8], [13], [24], [10], [17], [27]. An extensive bibliography on fuzzy neural networks can be found in [1].

Zadeh suggested the use of fuzzy logic as a framework to manage vagueness and uncertainty [28]. When modeling vagueness, fuzzy predicates without well-defined boundaries on the set of objects may be applied. The rationale for using fuzzy logic is that the denotations of vague predicates are fuzzy sets rather than probability distributions. In many situations, vagueness and uncertainty are simultaneously present, since any precise or imprecise fact may be uncertain as well. Fuzzy set and possibility theories provide a unified framework to deal with vagueness and uncertainty [5]. In this paper, the pattern of approximate reasoning in a pseudo outer product fuzzy neural network using the compositional rule of inference and a singleton fuzzifier, called POPFNN-CRI(S), is developed within such a framework.

Manuscript received January 24, 1999; revised July 18, 2001. This paper was recommended by Associate Editor S. Lakshmivarahan.

The authors are with the Intelligent Systems Laboratory, School of Computer Engineering, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]).

Digital Object Identifier 10.1109/TSMCB.2003.812850

This paper is organized as follows. Section II gives a detailed description of POPFNN-CRI(S) and Section III introduces the two-phase learning process of POPFNN-CRI(S). Detailed descriptions of the fuzzy Kohonen partition (FKP) and pseudo fuzzy Kohonen partition (PFKP) algorithms [20] used in the fuzzy set derivation phase and the POP learning algorithm [18] used in the fuzzy rule identification phase are provided. Extensive experimental results and analysis are provided in Section IV. Finally, the conclusions are presented in Section V.

II. POPFNN-CRI(S)

The proposed singleton pseudo outer-product based fuzzy neural network [POPFNN-CRI(S)] is developed on the basis of possibility theory and the fuzzy compositional rule of inference. The structure of POPFNN-CRI(S) resembles the neural-network-like structure of POPFNN-TVR [18], but has strict correspondence to the inference steps in fuzzy rule-based systems that employ the compositional rule of inference using standard fuzzy operators and singleton fuzzifiers [12].

A. Fuzzy Logic and Possibility Theory Approach

A fuzzy set A is defined by a membership function \mu_A that assigns each element x of a universe of discourse U a degree of compatibility with the concept represented by A. Given the vague proposition “X is A_i,” where X is a linguistic variable and A_i is the fuzzy subset representing the i-th linguistic label from a universal set A, the possibility distribution induced by the proposition is characterized by the membership function given in

\pi_X(x) = \mu_{A_i}(x)   (1)

where \pi_X is the possibility distribution of X and \mu_{A_i} is the membership function of fuzzy set A_i. Compositions of propositions are represented by the fuzzy intersection, union, and negation of simple propositions of the form “X is A_i.” The possibility distribution of composed propositions is derived from the possibility distributions of component propositions by operations using a triangular norm, a triangular co-norm, and the negation connectives of fuzzy logic [14]. Various definitions for t-norms exist. In this paper we adopt the following expressions. Given two vague propositions “X is A” and “Y is B,” where A and B are fuzzy subsets of U and V respectively, (2)–(4) define the corresponding composed possibility distributions of X and Y for conjunction, disjunction, and negation

\pi_{X \wedge Y}(x, y) = t(\mu_A(x), \mu_B(y))   (2)

\pi_{X \vee Y}(x, y) = s(\mu_A(x), \mu_B(y))   (3)

\pi_{\neg X}(x) = n(\mu_A(x)) = 1 - \mu_A(x)   (4)

where t is the triangular norm, s is the triangular co-norm, and n is the negation function. The triangular norm and triangular co-norm operations in (2) and (3) that involve multiple fuzzy sets can be performed cumulatively. For convenience, the notations given in (5) and (6) are adopted for these operations

T_{i=1}^{n} \mu_{A_i}(x_i) = t(\mu_{A_1}(x_1), t(\mu_{A_2}(x_2), \ldots))   (5)

S_{i=1}^{n} \mu_{A_i}(x_i) = s(\mu_{A_1}(x_1), s(\mu_{A_2}(x_2), \ldots))   (6)

where n is the number of component propositions used to derive the composed proposition.

1083-4419/03$17.00 © 2003 IEEE

ANG et al.: POPFNN-CRI(S): PSEUDO OUTER PRODUCT BASED FUZZY NEURAL NETWORK 839
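The cumulative operations in (5) and (6) can be sketched in code. This is a minimal illustration, assuming the standard min/max t-norm and s-norm adopted later in the paper; all function names are illustrative, not from the paper:

```python
from functools import reduce

# Standard t-norm (min), s-norm (max), and complement negation of
# fuzzy logic; illustrative names, not from the paper.
def t_norm(a, b):
    return min(a, b)

def s_norm(a, b):
    return max(a, b)

def negation(a):
    return 1.0 - a

# Cumulative forms over n component propositions, as in (5) and (6).
def t_norm_n(memberships):
    return reduce(t_norm, memberships)

def s_norm_n(memberships):
    return reduce(s_norm, memberships)
```

For example, `t_norm_n([0.8, 0.4, 0.6])` evaluates to 0.4, the conjunctive degree of compatibility of three component propositions, while `s_norm_n` of the same list gives 0.8.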

Given the rule “if X is A, then Y is B,” where X and Y take values in the universes U and V respectively, the rule expresses a fuzzy relation R on U \times V. This fuzzy relation R, which restricts the possible values of Y for each given value of X, can be represented by a conditional possibility distribution function \pi_{Y|X} [14]. The joint possibility distribution function \pi_{X,Y} is derived by combining \pi_X and \pi_{Y|X} in

\pi_{X,Y}(x, y) = t(\pi_X(x), \pi_{Y|X}(x, y))   (7)

The possibility distribution function \pi_Y is derived by restricting the possible values of Y using the fuzzy relation R of U \times V as shown in (8). The possibility distribution function of the fuzzy relation R is defined in

\pi_Y(y) = \sup_{x \in U} t(\pi_X(x), \pi_R(x, y))   (8)

\pi_R(x, y) = I_t(\mu_A(x), \mu_B(y))   (9)

where sup is the supremum or least upper bound, I_t is the fuzzy implication function associated with a triangular norm t, and R is a fuzzy relation on U \times V. Various types of fuzzy implication functions are available [11]. The fuzzy implications that satisfy (8) are the Gaines–Rescher implication [7], the Gödel implication [11], and the Wu implication [26].

B. Compositional Rule of Inference

Given the rule “if X is A, then Y is B” and knowing “X is A*,” where A* is not exactly equal to A, the possibility distribution function \pi_Y of Y given in (10) is derived from (8) and (9)

\pi_Y(y) = \sup_{x \in U} t(\mu_{A*}(x), \pi_R(x, y))   (10)

Equation (10) is called the compositional rule of inference (CRI) [28]. The equivalent notation in (11) is adopted for convenience

B* = A* \circ R   (11)

The general schema of multiconditional approximate reasoning with multiple inputs is shown in (12). The index c_{ik} denotes a subset of the fuzzy set A_i that represents the linguistic label used as an antecedent for the i-th input to the k-th fuzzy rule. Given R if-then rules “if X_1 is A_{1,c_{1k}} and … and X_n is A_{n,c_{nk}}, then Y is B_{c_k}” for k = 1, …, R, and facts “X_i is A_i*” for i = 1, …, n, the conclusions “Y is B_k*” for k = 1, …, R are derivable as shown in (12). To derive the conclusions in (12), (5) is used to represent

  X_1 is A_1*, X_2 is A_2*, …, X_n is A_n*
  if X_1 is A_{1,c_{11}} and … and X_n is A_{n,c_{n1}}, then Y is B_{c_1}
  ⋮
  if X_1 is A_{1,c_{1R}} and … and X_n is A_{n,c_{nR}}, then Y is B_{c_R}
  ------------------------------------------------------------------
  Y is B*   (12)


Fig. 1. Structure of POPFNN-CRI(S).

the composed propositions and antecedents for the k-th rule in (13) and (14), respectively

\pi_{A*}(x_1, …, x_n) = T_{i=1}^{n} \mu_{A_i*}(x_i)   (13)

\pi_{A_k}(x_1, …, x_n) = T_{i=1}^{n} \mu_{A_{i,c_{ik}}}(x_i)   (14)

where A_{i,c_{ik}} is the c_{ik}-th linguistic label used as an antecedent for the i-th input to the k-th fuzzy rule.

Under the CRI method, (11) is used to obtain the consequent B_k* for each rule in (15); all consequents are then combined together as shown in (16), and finally the conclusion is derived from B* in (17)

B_k* = A* \circ R_k   (15)

B* = \bigcup_{k=1}^{R} B_k*   (16)

\mu_{B*}(y) = S_{k=1}^{R} \mu_{B_k*}(y)   (17)

C. Architecture of POPFNN-CRI(S)

The proposed POPFNN-CRI(S) architecture for a multi-input multi-output (MIMO) system is a five-layer neural network, as shown in Fig. 1. For simplicity, only the interconnections for the output are shown.

Each layer in POPFNN-CRI(S) performs a specific fuzzy operation. The inputs and outputs of the POPFNN-CRI(S) are represented as a nonfuzzy input vector x and a nonfuzzy output vector y, respectively. Fuzzification of the input data and defuzzification of the output data are respectively performed by the input and output linguistic layers, while the fuzzy inference is collectively performed by the rule-base and the consequence layers. The numbers of neurons in the condition, rule-base, and consequence layers are defined in (18)–(20), respectively. A detailed description of the functionality of each layer is given below.

C = \sum_{i=1}^{n} L_i   (18)

R = \prod_{i=1}^{n} L_i   (19)

D = \sum_{j=1}^{m} M_j   (20)

where
  L_i  number of linguistic labels for the i-th input;
  M_j  number of linguistic labels for the j-th output;
  n    number of inputs;
  C    number of neurons in the condition layer;
  R    number of rules or rule-base neurons;
  D    number of neurons in the consequence layer;
  m    number of outputs.

1) Input Linguistic Layer: Neurons in the input linguistic layer are called input linguistic nodes. Each input linguistic node represents an input linguistic variable of the corresponding nonfuzzy input x_i. Each node transmits the nonfuzzy input directly to the condition layer. The net input and output of an input linguistic node are defined in

net input: f_i^{(1)} = x_i and net output: o_i^{(1)} = f_i^{(1)}   (21)

where x_i is the value of the i-th input and the superscript denotes the layer number.

2) Condition Layer: Neurons in the condition layer are called input-label nodes. Each input-label node represents the j-th linguistic label of the i-th linguistic node from the input layer. The input-label nodes constitute the antecedents


Fig. 2. Trapezoidal-shaped membership function.

of the fuzzy rules. Each node is represented by a trapezoidal membership function described by a fuzzy interval formed by four parameters (\alpha, \beta, \gamma, \delta) and a centroid c, as shown in Fig. 2. This fuzzy interval is also known as a trapezoidal fuzzy number [9].

The subinterval [\beta, \gamma], where \alpha \le \beta \le \gamma \le \delta, is called the kernel of the fuzzy interval, and the subinterval [\alpha, \delta] is called the support. When the kernel reduces to a single point, the fuzzy interval becomes a triangular fuzzy number [9]. The semantic interpretation of this representation is as follows.

1) The linguistic term is totally compatible with values of x between \beta and \gamma.

2) The linguistic term is incompatible with values smaller than \alpha or greater than \delta.

3) Between \alpha and \beta, the degree of compatibility increases linearly.

4) Between \gamma and \delta, the degree of compatibility decreases linearly.

The net input and net output of an input-label node are given in

net input: f_{i,j}^{(2)} = o_i^{(1)} and

net output:

o_{i,j}^{(2)} =
  0, if o_i^{(1)} \le \alpha_{i,j} or o_i^{(1)} \ge \delta_{i,j}
  1, if \beta_{i,j} \le o_i^{(1)} \le \gamma_{i,j}
  (o_i^{(1)} - \alpha_{i,j}) / (\beta_{i,j} - \alpha_{i,j}), if \alpha_{i,j} < o_i^{(1)} < \beta_{i,j}
  (\delta_{i,j} - o_i^{(1)}) / (\delta_{i,j} - \gamma_{i,j}), if \gamma_{i,j} < o_i^{(1)} < \delta_{i,j}
   (22)

where [\beta_{i,j}, \gamma_{i,j}] is the kernel of the fuzzy interval for the j-th linguistic label of the i-th input, [\alpha_{i,j}, \delta_{i,j}] is the support of the fuzzy interval for the j-th linguistic label of the i-th input, and o_i^{(1)} is the output of the i-th input node.
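The trapezoidal activation in (22) can be sketched as follows; a minimal illustration assuming the four parameters order the support [\alpha, \delta] around the kernel [\beta, \gamma], with illustrative names:

```python
def trapezoid_membership(x, alpha, beta, gamma, delta):
    """Trapezoidal membership of an input-label node, per (22):
    support [alpha, delta], kernel [beta, gamma] where membership is 1."""
    if x <= alpha or x >= delta:
        return 0.0                              # outside the support
    if beta <= x <= gamma:
        return 1.0                              # inside the kernel
    if x < beta:
        return (x - alpha) / (beta - alpha)     # rising edge
    return (delta - x) / (delta - gamma)        # falling edge
```

When `beta == gamma` the kernel degenerates to a single point and the function becomes a triangular fuzzy number, matching the remark above.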

3) Rule-Base Layer: Neurons in the rule-base layer are called rule nodes. Each rule node R_k represents a fuzzy if-then rule. The net input and output of a rule node are given in

net input: f_k^{(3)} = \min_{i=1}^{n} o_{i,c_{ik}}^{(2)} and net output: o_k^{(3)} = f_k^{(3)}   (23)

where o_{i,c_{ik}}^{(2)} is the output of the input-label node that forms the antecedent condition for the i-th input to the k-th fuzzy rule R_k.

4) Consequence Layer: Neurons in the consequence layer are called output-label nodes. The output-label node OL_{j,l} represents the l-th linguistic label of the j-th output y_j. The net input and output of the output-label node are given in

net input: f_{j,l}^{(4)} = \max_{k} o_k^{(3)} over the rules k whose consequence is OL_{j,l}, and net output: o_{j,l}^{(4)} = f_{j,l}^{(4)}   (24)

where o_k^{(3)} is the output of the rule node R_k whose consequence is OL_{j,l}.

5) Output Linguistic Layer: The neurons in the output linguistic layer are called output linguistic nodes and perform defuzzification. The output linguistic node represents the output linguistic variable of the output y_j. Its net input and output are given in

net input: f_j^{(5)} = \sum_{l} c_{j,l} w_{j,l} o_{j,l}^{(4)} and

net output: y_j = o_j^{(5)} = f_j^{(5)} / \sum_{l} w_{j,l} o_{j,l}^{(4)} if \sum_{l} w_{j,l} o_{j,l}^{(4)} \neq 0, and 0 otherwise   (25)

where c_{j,l} is the centroid of the output-label node OL_{j,l} and w_{j,l} is the width of the membership function for output-label node OL_{j,l}.

D. Correspondence Between POPFNN-CRI(S) and CRI

The fuzzy operations performed by each layer of the POPFNN-CRI(S) have very close correspondence with the inference steps in the compositional rule of inference (CRI) method. This section explains the mapping of the POPFNN-CRI(S) onto the CRI method.

1) Input Linguistic Layer: Equation (21) describes the function performed by the input linguistic layer of the POPFNN-CRI(S). The input vector x = (x_1, x_2, …, x_n) at the input linguistic layer is nonfuzzy. Under the CRI method, each individual proposition is of the form “X_i is A_i*,” where A_i* is a fuzzy set. Hence each element x_i of the nonfuzzy input vector x is fuzzified using

\mu_{A_i*}(x) = 1 if x = x_i; 0 otherwise, for i = 1, …, n   (26)


Fig. 3. Singleton fuzzy sets derived from the nonfuzzy input.

The fuzzified sets A_1*, A_2*, …, A_n* from (26) correspond to fuzzy singletons as shown in Fig. 3 [3]. Each fuzzy singleton includes only one element. Once the element x_i is defined, the corresponding fuzzy set A_i* is accordingly defined. Comparing (26) and (21), it can be observed that the input linguistic layer actually performs the fuzzification of the nonfuzzy input vector x into its corresponding fuzzy vector before passing it to the condition layer.

2) Condition Layer: Equation (22) describes the function performed by the condition layer of POPFNN-CRI(S). Under the CRI method, the operations of the t-norm and the fuzzy implication operator have to be defined. A common way of computing the consequences B_k*, given facts “X_i is A_i*,” under the general schema described in (12) is to use the standard t-norm and fuzzy relation described in (27) and (28) [11]. This type of reasoning, known as fuzzy interpolation [2], is typical in fuzzy logic control

t(a, b) = \min(a, b)   (27)

\pi_{R_k}(x_1, …, x_n, y) = \min( T_{i=1}^{n} \mu_{A_{i,c_{ik}}}(x_i), \mu_{B_{c_k}}(y) )   (28)

where \mu_{A_{i,c_{ik}}}(x_i) is the membership degree of x_i with the subset A_{i,c_{ik}} of the fuzzy set A_i that forms the antecedent of the i-th input to the k-th fuzzy rule and \mu_{B_{c_k}}(y) is the membership degree of y with the subset B_{c_k} of the fuzzy set B that forms the conclusion of the k-th fuzzy rule. From (9), the fuzzy implication function directly corresponds to the definition of the fuzzy relation R. The fuzzy implication function based on the standard fuzzy relation described in (28) is described in

I_t(a, b) = \min(a, b)   (29)

Based on the t-norm defined in (27) and the fuzzy implication function defined in (29), the CRI method described in (10) is defined in

\mu_{B_k*}(y) = \sup_{x} \min( \mu_{A*}(x), \pi_{R_k}(x, y) )   (30)

Under the CRI method defined in (30), the input fuzzy sets from each individual proposition obtained from (26) are matched against each antecedent of the fuzzy rules in the rule base. Each individual antecedent, represented by the trapezoidal membership of an input-label node in the POPFNN-CRI(S), has an associated fuzzy set as the semantic of a linguistic label for a particular linguistic node. This associated fuzzy set is represented as A_{i,1}, A_{i,2}, …, A_{i,L_i} for the i-th input linguistic node, which corresponds to the set of fuzzy membership functions \mu_{A_{i,1}}, \mu_{A_{i,2}}, …, \mu_{A_{i,L_i}} in the condition layer. Similarly, the fuzzy set B_{j,1}, B_{j,2}, …, B_{j,M_j} that forms the output linguistic labels for the j-th output linguistic node is associated with the set of membership functions \mu_{B_{j,1}}, \mu_{B_{j,2}}, …, \mu_{B_{j,M_j}} in the consequence layer. Given the input propositions “X_i is A_i*,” where A_i* is not exactly equal to any A_{i,j}, and the k-th if-then rule “if X_i is A_{i,c_{ik}}, then Y_j is B_{j,c_{jk}}” for i = 1, …, n and j = 1, …, m, the degree of compatibility of the given facts and the antecedents of the k-th if-then rule is determined using (31), independently of (30)

\lambda_{i,j} = \sup_{x} \min( \mu_{A_i*}(x), \mu_{A_{i,j}}(x) )   (31)

where
  \lambda_{i,j}   degree of compatibility between the proposition “X_i is A_i*” and the j-th linguistic label of the i-th input linguistic variable;
  \mu_{A_{i,j}}   membership function of input element x_i with the fuzzy subset A_{i,j}.

From (26), the fuzzy set for the input proposition is a fuzzy singleton A_i*. The membership function of A_i* for an input element x is determined using

\mu_{A_i*}(x) = 1 if x = x_i; 0 if x \neq x_i.   (32)

Equation (31) is reduced to (33) using (32)

\lambda_{i,j} = \mu_{A_{i,j}}(x_i)   (33)

Equation (22), which represents the trapezoidal membership function stored in each input-label node, is expressed in

o_{i,j}^{(2)} = \mu_{A_{i,j}}(x_i)   (34)

where \mu_{A_{i,j}} is the membership function of the j-th antecedent for the i-th input and x_i is the i-th element of the input vector x. Comparing (34) and (33), it can be observed that the condition layer actually measures the compatibility of each input fuzzy set with each fuzzy subset that semantically represents a linguistic label used as an antecedent in the fuzzy rules in the rule-base layer.
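The reduction from (31) to (33) can be checked numerically: with a singleton input fuzzy set, the sup-min compatibility collapses to a point evaluation of the antecedent membership. A small sketch under the assumption of a discretized universe (all names and parameter values are illustrative):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: support [a, d], kernel [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def sup_min_compatibility(x0, antecedent, universe):
    """Eq. (31): sup over x of min(singleton_{x0}(x), antecedent(x))."""
    singleton = lambda x: 1.0 if x == x0 else 0.0
    return max(min(singleton(x), antecedent(x)) for x in universe)

universe = [i / 10 for i in range(0, 51)]        # discretized 0.0 .. 5.0
mu = lambda x: trapezoid(x, 1.0, 2.0, 3.0, 4.0)  # an antecedent label
x0 = 1.5                                         # singleton input
```

Here `sup_min_compatibility(x0, mu, universe)` and the direct evaluation `mu(x0)` agree, which is exactly the simplification stated in (33).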

3) Rule-Base Layer: Equation (23) describes the function performed by the rule-base layer of POPFNN-CRI(S). The rule-base and consequence layers collectively perform the mapping of the inference step in the CRI method; the mappings of the two layers are therefore closely related. From the general schema of multiconditional approximate reasoning described in (12), the index of the linguistic label that represents the antecedent of the i-th input for the k-th rule is denoted as c_{ik}. Therefore, the degree of compatibility between the i-th input fuzzy set and its corresponding antecedent for the k-th rule is represented as \lambda_{i,c_{ik}}. Under the CRI method, the conclusion given in (17) can be derived using (35) by truncating the subset B_{c_k} of the fuzzy set B by the conjunctive values of \lambda_{i,c_{ik}} for i = 1, …, n and k = 1, …, R and then taking the union of all the truncated subsets from

\mu_{B*}(y) = \max_{k=1}^{R} \min( T_{i=1}^{n} \lambda_{i,c_{ik}}, \mu_{B_{c_k}}(y) )   (35)

where R is the number of fuzzy if-then rules. The standard t-norm operation for multiple fuzzy sets is given the notation in

T_{i=1}^{n} \lambda_{i,c_{ik}} = \min( \lambda_{1,c_{1k}}, \lambda_{2,c_{2k}}, …, \lambda_{n,c_{nk}} )   (36)

Using (30), the membership function of the conclusion B_k* from (35) is given in

\mu_{B_k*}(y) = \min( T_{i=1}^{n} \lambda_{i,c_{ik}}, \mu_{B_{c_k}}(y) )   (37)

where \mu_{B_{c_k}} is the membership function of the consequence of the k-th fuzzy rule. Using the CRI method as defined in (30), as well as (33) and (36), the correspondence of the inference of B_k* described in (37) to the CRI method defined in (17) is shown in

\mu_{B*}(y) = S_{k=1}^{R} \mu_{B_k*}(y) = \max_{k=1}^{R} \mu_{B_k*}(y)   (38)

Using (33), (37) is equivalently given in (39) and (40), where (40) computes the firing strength of the k-th rule

\mu_{B_k*}(y) = \min( f_k, \mu_{B_{c_k}}(y) )   (39)

f_k = \min_{i=1}^{n} \mu_{A_{i,c_{ik}}}(x_i)   (40)

where f_k is the total degree of compatibility between the input fuzzy sets and the antecedents of the k-th fuzzy rule. Comparing (40) and (23), it can be observed that the rule-base layer actually computes the firing strength of the k-th fuzzy rule using the standard t-norm operator, which is also the total degree of compatibility between the input fuzzy sets and the antecedents of the k-th fuzzy rule.

4) Consequence Layer: Equation (24) describes the function performed by the consequence layer of POPFNN-CRI(S). Expanding (39) gives

\mu_{B*}(y) = \max_{k=1}^{R} \min( f_k, \mu_{B_{c_k}}(y) )   (41)

Equation (41) is further simplified as in (42) and (43)

\mu_{B*}(y) = \max_{l} \min( f_l^{max}, \mu_{B_l}(y) )   (42)

f_l^{max} = \max_{k : c_k = l} f_k   (43)

where f_l^{max} is the maximum firing strength for the consequence B_l. Comparing (43) and (24), it can be observed that the output-label nodes perform the disjunction operation and compute the maximum firing strength of the consequence B_l from the degrees of compatibility between the input fuzzy sets and the antecedents of all the rules.

5) Output Linguistic Layer: Equation (25) describes the function performed by the output linguistic layer of POPFNN-CRI(S). The previous sections described the mapping of the POPFNN-CRI(S) architecture in deriving a single consequence. However, the final output set B* is actually composed of a number of consequences that semantically represent linguistic labels for the output. These consequences are subsets of B, given as B_1, B_2, …, B_M.

From (42), the membership function of the output set B* given in (44) is derived from each individual consequence using the disjunction operation. This method coincides with the derivation of each individual consequence using the disjunction operation in (42)

\mu_{B*}(y) = \max_{l=1}^{M} \min( f_l^{max}, \mu_{B_l}(y) )   (44)

The final consequence B* is expressed in terms of a fuzzy set. Defuzzification of B* has to be performed to convert the conclusion to a real number. Several methods of defuzzification are available: center of gravity, center of maxima, mean of maxima, and weighted average mean of maxima [11]. Each method is based on some rationale, and they involve integration or summation over the continuous or quantized universe of discourse for the consequence fuzzy set B*. Equation (45), which simulates the center of maxima method [13], is employed to defuzzify B*

y = ( \sum_{l=1}^{M} c_l w_l f_l^{max} ) / ( \sum_{l=1}^{M} w_l f_l^{max} )   (45)


Fig. 4. Example on inference using POPFNN-CRI(S).

where
  c_l  center of the consequence fuzzy set B_l;
  w_l  width of the consequence fuzzy set B_l;
  y    final real output based on the consequence fuzzy set B*.

Comparing (45) and (25), it can be observed that the output linguistic layer integrates the consequences derived from the previous layers and performs defuzzification of these consequences.

The above discussion describes the direct mapping from CRI using the standard t-norm and its associated fuzzy relation to the proposed POPFNN-CRI(S) architecture. This gives the POPFNN-CRI(S) a strong theoretical foundation in fuzzy inference. As a result, fuzzy inference systems using the CRI method and singleton fuzzifiers can also be realized using the proposed architecture. In addition, the neural-network-like architecture provides highly parallel computational ability.
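The defuzzifier in (45) can be sketched as a width- and firing-strength-weighted average of the consequence centroids. This is a minimal sketch under that reading of (45); function and variable names are illustrative:

```python
def defuzzify(centroids, widths, strengths):
    """Center-of-maxima style defuzzification, per (45): each consequence
    label contributes its centroid weighted by its width and its maximum
    firing strength."""
    numerator = sum(c * w * s for c, w, s in zip(centroids, widths, strengths))
    denominator = sum(w * s for w, s in zip(widths, strengths))
    return numerator / denominator if denominator else 0.0
```

For example, two equally fired consequence labels centered at 1.0 and 3.0 with equal widths defuzzify to the midpoint 2.0.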

An illustrative example of the inference process for a two-input, single-output system is given in Fig. 4. The two fuzzy if-then rules used are given in (46)

  If X_1 is A_{1,1} and X_2 is A_{2,1}, then Y is B_1
  If X_1 is A_{1,2} and X_2 is A_{2,2}, then Y is B_2   (46)

When the values of x_1 and x_2 are presented to the singleton POPFNN-CRI, they are respectively fuzzified into fuzzy singletons A_1* and A_2* by the input linguistic layer as shown in Fig. 4(a). The condition layer then determines the degrees of compatibility of the given facts and the antecedents of the rules. Next, the rule-base layer determines the firing strengths f_1 and f_2 based on the total compatibility of the facts with the antecedents of each rule as shown in Fig. 4(b). The consequence layer then determines the firing strengths for the consequences B_1 and B_2, respectively, from f_1 and f_2 as shown in Fig. 4(c). Finally, the output linguistic layer integrates the consequences B_1* and B_2* into B* as shown in Fig. 4(d). The output layer also performs the defuzzification of B* to yield the real output value y.
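The walk-through above can be sketched end to end. The sketch below assumes two inputs, two rules, the min/max operators of (23)–(25), and a width-weighted defuzzifier; all membership parameters are made-up illustrative values, not those of Fig. 4:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: support [a, d], kernel [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Antecedent labels keyed by (input index, label index); illustrative.
A = {
    (1, 1): lambda x: trapezoid(x, 0, 1, 2, 3),
    (1, 2): lambda x: trapezoid(x, 2, 3, 4, 5),
    (2, 1): lambda x: trapezoid(x, 0, 1, 2, 3),
    (2, 2): lambda x: trapezoid(x, 2, 3, 4, 5),
}
# Two rules: antecedent label per input, and a consequence label index.
rules = [
    {"antecedents": {1: 1, 2: 1}, "consequence": 0},
    {"antecedents": {1: 2, 2: 2}, "consequence": 1},
]
# Consequence labels as (centroid, width) pairs; illustrative.
B = [(1.0, 1.0), (4.0, 1.0)]

def infer(x1, x2):
    x = {1: x1, 2: x2}
    # Rule-base layer: firing strength = min over antecedent memberships.
    firing = [min(A[(i, r["antecedents"][i])](x[i]) for i in (1, 2))
              for r in rules]
    # Consequence layer: max firing strength per consequence label.
    strengths = [0.0] * len(B)
    for r, f in zip(rules, firing):
        strengths[r["consequence"]] = max(strengths[r["consequence"]], f)
    # Output linguistic layer: weighted-centroid defuzzification.
    num = sum(c * w * s for (c, w), s in zip(B, strengths))
    den = sum(w * s for (c, w), s in zip(B, strengths))
    return num / den if den else 0.0
```

With the values above, an input deep inside the first rule's antecedents fires only that rule and the output sits at the first consequence centroid; an input equidistant between the two rules fires both equally and the output lands between the centroids.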

III. LEARNING PROCESS OF POPFNN-CRI(S)

The learning process of POPFNN-CRI(S) consists of only two phases; namely, the fuzzy membership learning and the POP learning [21]. Similar to the POPFNN-TVR architecture [18], a self-organizing type of learning algorithm [KOHO88] is employed in the first phase to determine the membership functions of the condition and consequence layers. The difference between the learning processes of POPFNN-CRI(S) and POPFNN-TVR is that the former requires only two phases of learning while the latter requires an additional supervised learning phase to adjust the membership functions. The former uses two novel membership learning algorithms, known as pseudo fuzzy Kohonen partition (PFKP) and fuzzy Kohonen partition (FKP) [20], to determine the membership functions of the condition and consequence layers. A detailed description of the two-phase learning process is presented as follows.

A. Fuzzy Membership Learning Algorithms

Fuzzy membership information is stored in the condition and consequence layers of the POPFNN-CRI(S). These membership functions can be identified and stored in the POPFNN-CRI(S) from relevant information contained in the training data. The fuzzy Kohonen partition (FKP) and pseudo fuzzy Kohonen partition (PFKP) algorithms [20] can be employed in POPFNN-CRI(S) to train its condition and consequence layers to derive the membership functions of the input and output variables.

The difference between FKP and PFKP is that the latter produces pseudo fuzzy partitions while the former produces only fuzzy partitions. The former is a supervised learning algorithm, while the latter is unsupervised. Since they are similar, the FKP and PFKP algorithms are described hereafter in a single listing, with the differences between the two approaches highlighted via FKP/PFKP labels:

Fuzzy/Pseudo Fuzzy Kohonen Partition Algorithm:

Step 1: Define K as the number of classes, \eta as the learning constant, \sigma as the learning width, and a small positive number \epsilon as a stopping criterion; where n_k = number of data vectors in cluster k and N = total number of data vectors.

Step 2: Initialize the training iteration t and the weights w_k, for k = 1, …, K.

Step 3: Initialize

   (47)

for k = 1, …, K.

Step 4: For each training vector x:
   FKP: Determine the k-th cluster the data belongs to from the training data.
   PFKP: Find the winner using

   k* = \arg\min_{k} ||x - w_k||, for k = 1, …, K   (48)

   Update the weights of
   FKP: the k-th cluster
   PFKP: the winner
   with

   w_k(t + 1) = w_k(t) + \eta(t)(x - w_k(t))   (49)

Step 5: Compute \eta(t + 1) using

   (50)

Step 6: Compare the measures at iterations t and t - 1, where t \ge 1, using

   (51)

Step 7: If the stopping criterion \epsilon is satisfied, stop; else repeat Steps 3–7 for t = t + 1.

Step 8: Initialize the pseudo weights for k = 1, …, K.

Step 9: For each training vector x:
   FKP: Determine the k-th cluster the data belongs to from the training data.
   PFKP: Find the winner using

   k* = \arg\min_{k} ||x - w_k||, for k = 1, …, K   (52)

   Update the pseudo weights of
   FKP: the k-th cluster
   PFKP: the winner
   using

   (53)

   Update the four points (\alpha, \beta, \gamma, \delta) of the trapezoidal fuzzy number with

   FKP: …
   PFKP: …, for k = 1, …, K   (54)
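The competitive core of the listing, steps (48) and (49) — pick a cluster (given by the class label in FKP, or the nearest weight in PFKP) and move its weight toward the data point — can be sketched as follows. This is a simplification: the learning-rate schedule and the trapezoid derivation of the full algorithm are omitted, and names are illustrative:

```python
def find_winner(weights, x):
    """PFKP winner selection, per (48): index of the weight nearest to x."""
    return min(range(len(weights)), key=lambda k: abs(weights[k] - x))

def kohonen_update(weights, x, eta, k=None):
    """Kohonen-style update, per (49): move cluster k (FKP, supplied by
    the class label) or the winner (PFKP, k=None) toward data point x."""
    if k is None:
        k = find_winner(weights, x)          # PFKP: unsupervised
    weights[k] += eta * (x - weights[k])     # move the centre toward x
    return k
```

Repeated over the training set with a decaying learning constant, the weights settle on cluster centres from which the kernel and support of each trapezoidal fuzzy number are then derived.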

B. Pseudo Outer Product Learning Algorithm

After the membership functions of the condition and consequence layers have been identified, the pseudo outer-product (POP) learning algorithm [21] is employed to identify the fuzzy rules. The POP learning algorithm is a simple one-pass learning algorithm. The algorithm is easy to comprehend, as it coincides with intuitive ways of identifying relevant rules.

In POPFNN-CRI(S), each node in the condition and conse-quence layers represents a linguistic label once the member-ship functions have been identified. Under the POP learning al-gorithm, the set of training data , , where is theinput vector and is the output vector, is simultaneously fedinto both the input linguistic and output linguistic layers. Themembership values of each input-label node are then de-termined. These values are subsequently used to compute thefiring strength of the rule nodes in the rule-base layer. Sim-ilarly, the membership values of each output-label node are de-termined by feeding the output value back from the output layerto the consequence layer. The weights of the consequence layerlinking the rule-based layer are then determined using

(55)

where the weight connects a rule node to a linguistic label of an output, the firing strength is that of the rule node when presented with the input vector, and the membership value is that of the corresponding output with respect to the fuzzy subset that semantically represents the linguistic label of that output. The weights in (55) are initially set to zero. After performing POP learning, these weights represent the strength of the fuzzy rules having the corresponding output-label nodes as their consequences. Among the links between a rule node and the output-label nodes, the link with the highest weight is chosen and the rest are deleted. Links with zero weights to all output-label nodes are also deleted. The rule nodes remaining after this link-selection process represent the rules used in the POPFNN-CRI(S) [21].
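The one-pass procedure above can be sketched as follows. Since (55) itself is elided, the accumulation rule used here, summing min(firing strength, output membership) over the training set, is an assumption standing in for the paper's formula; the zero initialization and the strongest-link selection follow the text.

```python
import numpy as np

def pop_learning(firing_strengths, output_memberships):
    """One-pass POP rule identification (sketch).

    firing_strengths:   (P, J) array, firing strength of rule node j
                        for training pattern p.
    output_memberships: (P, K) array, membership of pattern p's desired
                        output in output label k.
    """
    P, J = firing_strengths.shape
    _, K = output_memberships.shape
    w = np.zeros((J, K))                       # weights start at zero
    for p in range(P):                         # single pass over the data
        # Accumulate min(f_j, mu_k) for every (rule, output-label) pair.
        w += np.minimum.outer(firing_strengths[p], output_memberships[p])
    # Link selection: keep only the strongest consequence per rule node,
    # and drop rule nodes whose links are all zero.
    consequents = {}
    for j in range(J):
        if w[j].max() > 0.0:
            consequents[j] = int(np.argmax(w[j]))
    return w, consequents
```

A rule node absent from `consequents` corresponds to a deleted rule; the surviving (rule, label) pairs are the identified fuzzy rules.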


Fig. 5. Membership function identified using FKP algorithm.

IV. EXPERIMENTAL RESULTS USING POPFNN-CRI(S)

In this section, the Anderson's Iris data [6] was used as the experimental data set to validate the proposed POPFNN-CRI(S). This data set contains 50 four-dimensional vectors for each of the three classes of Iris subspecies and has been used extensively to illustrate various clustering and classifier designs. The properties of this data set are extensively discussed in [4], [15]. This data set was also used to extract fuzzy rules for clustering data in other fuzzy neural networks [8], [16].

A. Membership Function Identified Using FKP Algorithm

Fig. 5 shows the membership functions of the condition and consequence layers of POPFNN-CRI(S) identified using the FKP algorithm with the chosen parameter settings. Triangular membership functions are obtained when the pseudo learning rate is set to zero. The FKP learning took 12, 7, 8, 12, and 14 iterations to identify the membership functions of the four inputs and the output linguistic labels, respectively. As these five membership functions are trained in parallel, the first phase of POPFNN-CRI(S) learning took only 14 iterations.

Table I shows some of the typical weights of the consequence layer links from the rule-base layer, determined using the POP learning algorithm.

TABLE I
FUZZY RULES IDENTIFIED WITH FKP AND POP LEARNING

Linguistic labels are used to represent each of the four inputs from the data set, and three further labels represent the three clusters of the data set, namely Setosa, Versicolor, and Virginica. The first four columns describe the conditions for a particular rule node. At the beginning of POP learning, each rule is connected to all the output-label nodes with the weights initialized to zero. The next three columns give the weights of these links after applying (55) from POP learning. The last column gives the consequence derived for each rule.

Fig. 6. Membership function identified using PFKP/FKP algorithm.

In the rule-selection phase of POP learning, the consequence whose link carries greater strength than the others is chosen as the correct consequence for a rule node. Rules with zero weights to all consequences are subsequently deleted. An example of a derived rule, the first row of Table I, is interpreted as shown in

If x1 is A1 and x2 is A2 and x3 is A3 and x4 is A4

then y is B (56)

where this is the first rule in POPFNN-CRI(S), Ai is the ith input fuzzy set, and B is the output fuzzy set. Together with POP learning, the POPFNN-CRI(S) took only 15 training iterations. A total of 42 fuzzy rules are derived; seven out of 150 patterns are classified wrongly, and 23 out of 150 patterns are unclassified.
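A derived rule of this form is evaluated under the singleton fuzzifier: each crisp input is fuzzified to a singleton, so its match with an antecedent label is simply the membership value, and the conjunctive "and" is the min t-norm. The sketch below illustrates this with hypothetical triangular labels; the parameters are illustrative only, not the trained ones from Fig. 5.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def rule_firing(inputs, antecedents):
    """Firing strength of one rule: min t-norm over the memberships
    of the crisp (singleton-fuzzified) inputs in the antecedent labels."""
    return min(triangular(x, *abc) for x, abc in zip(inputs, antecedents))

# Hypothetical antecedent labels (a, b, c) for a four-input rule:
rule = [(0.0, 1.0, 2.0), (1.0, 2.0, 3.0), (0.0, 0.5, 1.0), (2.0, 3.0, 4.0)]
strength = rule_firing([1.0, 2.0, 0.5, 3.0], rule)  # all inputs at peaks
```

The rule node with the maximum firing strength then drives the consequence layer through the links retained by POP learning.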

B. Membership Function Identified Using PFKP/FKP Algorithm

Fig. 6 shows the membership functions of the condition layer of POPFNN-CRI(S) identified using the PFKP algorithm and of the consequence layer identified using the FKP algorithm, with the chosen parameter settings. Trapezoidal membership functions are obtained when the pseudo learning rate is not set to zero. The PFKP learning took 12, 7, 8, and 12 iterations to identify the membership functions of the four inputs, respectively, and the FKP algorithm took 14 iterations to identify the membership functions of the output linguistic labels.

TABLE II
FUZZY RULES IDENTIFIED WITH PFKP/FKP AND POP LEARNING

Table II shows some of the typical weights of the consequence layer links from the rule-base layer, determined using the POP learning algorithm. As in the previous experiment, the training of the POPFNN-CRI(S) took only 15 iterations. Classification experiments were performed with the trained POPFNN-CRI(S) using the 150 patterns from the Anderson's Iris data set. A total of 47 fuzzy rules are derived; three out of 150 patterns are classified wrongly, and none of the 150 patterns are unclassified.

C. Benchmark Against Other Techniques

TABLE III
BENCHMARK OF POPFNN-CRI(S) AGAINST OTHER CLASSIFIERS

Table III compares the POPFNN-CRI(S) against other existing techniques in classifying the Anderson's Iris data set. The performance of NEFCLASS is described in [16], and the performances of ILQV and FCM are described in [22]. The membership functions derived using the FKP algorithm are comparable to the ones derived in the NEFCLASS system [16].

Compared against the other techniques, the POPFNN-CRI(S) with membership functions trained using the FKP algorithm is only comparable with POPFNN-TVR. In contrast, the POPFNN-CRI(S) with the membership functions of the condition layer trained using the PFKP algorithm and of the consequence layer trained using the FKP algorithm incorrectly classified only three out of all 150 patterns with only 15 training iterations. These results are superior to the other classification techniques in terms of training iterations and misclassification rates.

V. CONCLUSION

A pseudo-outer-product based fuzzy neural network using the compositional rule of inference and singleton fuzzifier [POPFNN-CRI(S)] is proposed in this paper. The functions performed by each layer in the proposed POPFNN-CRI(S) correspond strictly to the inference steps in the compositional rule of inference method using the standard t-norm and fuzzy relation. This gives the proposed POPFNN-CRI(S) a strong theoretical foundation. The learning process of the proposed POPFNN-CRI(S) consists of only two phases. In the first phase, two novel fuzzy membership identification algorithms called the fuzzy Kohonen partition (FKP) and pseudo fuzzy Kohonen partition (PFKP) [20] are used to identify the fuzzy membership functions of the input and output linguistic labels. In the second phase, a novel one-pass rule-identification algorithm called POP learning [21] is used to identify the fuzzy rules. The proposed two-phase learning algorithm can effectively construct membership functions and identify the fuzzy rules without the need for any supervised training to tune the membership functions. Experiments were conducted using the POPFNN-CRI(S) with FKP and the POPFNN-CRI(S) with PFKP/FKP to cluster the Anderson's Iris data [6]. Extensive experimental results on the training of POPFNN-CRI(S) are presented. Results of the latter experiment show that the POPFNN-CRI(S) took only 15 training iterations and misclassified only three out of all 150 patterns of the Anderson's Iris data.

POPFNN-CRI(S) is a member of a class of fuzzy neural networks based on strong fuzzy logical inference (TVR, AARS, Yager) [14], [18], [19], [23]. The POP learning algorithm [21] is used in this class of fuzzy neural networks to objectively derive the fuzzy rules describing a problem domain. These networks are self-organizing but suffer from the inherent limitations of the offline clustering techniques used to derive the fuzzy labels, which makes them unsuitable for online modeling. GenSoFNN [25], developed at the Intelligent Systems Laboratory, Nanyang Technological University, is an attempt to provide a generalized framework that supports online modeling of the decision-making rules independently of the fuzzy inference engine employed.

REFERENCES

[1] J. J. Buckley and Y. Hayashi, "Neural nets for fuzzy systems," Fuzzy Sets Syst., vol. 71, pp. 265–276, 1995.

[2] V. Cross and T. Sudkamp, "Patterns of fuzzy rule-based inference," Int. J. Approx. Reas., vol. 11, pp. 235–255, 1994.

[3] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications. New York: Academic, 1980.

[4] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.

[5] F. Esteva, P. Garcia-Calves, and L. Godo, "Relating and extending semantical approaches to possibilistic reasoning," Int. J. Approx. Reas., vol. 10, pp. 311–344, 1994.

[6] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Ann. Eugenics, vol. 7, pp. 179–188, 1936.

[7] B. R. Gaines, "Foundations of fuzzy reasoning," Int. J. Man-Mach. Stud., vol. 8, no. 6, pp. 623–688, 1976.

[8] N. K. Kasabov, "Learning fuzzy rules and approximate reasoning in fuzzy neural networks and hybrid systems," Fuzzy Sets Syst., vol. 82, pp. 135–149, 1996.

[9] A. Kaufmann and M. M. Gupta, Introduction to Fuzzy Arithmetic: Theory and Applications. New York: Van Nostrand Reinhold, 1985.

[10] J. M. Keller, R. R. Yager, and H. Tahani, "Neural network implementation of fuzzy logic," Fuzzy Sets Syst., vol. 45, pp. 1–12, 1992.

[11] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1995.

[12] T. Kohonen, Self-Organizing Maps. New York: Springer, 1995.

[13] C. T. Lin, "A neural fuzzy control system with structure and parameter learning," Fuzzy Sets Syst., vol. 70, pp. 183–212, 1995.

[14] R. L. Mantaras, Approximate Reasoning Models. New York: Ellis Horwood, 1990.

[15] M. Nadler and E. P. Smith, Pattern Recognition Engineering. New York: Wiley, 1993.

[16] D. Nauck and R. Kruse, "A neuro-fuzzy method to learn fuzzy classification rules from data," Fuzzy Sets Syst., vol. 89, no. 3, pp. 277–288, 1997.

[17] H. C. Quek, G. S. Ng, and P. W. Ng, "Fuzzy integrated process supervision of neural network control regimes," in Proc. 2nd Singapore Int. Conf. Intell. Syst. (SPICIS 1994), Singapore, 1994, pp. B152–B158.

[18] H. C. Quek and R. W. Zhou, "POPFNN: A pseudo outer-product based fuzzy neural network," Neural Networks, vol. 9, pp. 1569–1581, 1996.

[19] C. Quek and R. W. Zhou, "POPFNN-AARS: A pseudo outer-product based fuzzy neural network," IEEE Trans. Syst., Man, Cybern. B, vol. 29, pp. 859–870, June 1999.

[20] K. K. Ang and H. C. Quek, "MLVQ: A modified learning vector quantization algorithm for identifying centroids of fuzzy membership functions," Pattern Recognit. Lett., 2000, to be published.

[21] C. Quek and R. W. Zhou, "The POP learning algorithms: Reducing work in identifying fuzzy rules," Neural Networks, vol. 14, pp. 1431–1445, 2001.

[22] C. Quek and K. K. Ang, "Determination of fuzzy membership function using fuzzy Kohonen partition and pseudo fuzzy Kohonen partition algorithms," Pattern Recognit., 2001, to be published.

[23] C. Quek, A. Wahel, and A. Singh, "POP-Yager: A self-organizing fuzzy neural network," Proc. SPIE, pp. 14–25, 2000.

[24] J. J. Shann and H. C. Fu, "A fuzzy neural network for rule acquiring on fuzzy control systems," Fuzzy Sets Syst., vol. 71, pp. 345–357, 1995.

[25] W. L. Tung and C. Quek, "GenSoFNN: A generic self-organizing fuzzy neural network," IEEE Trans. Neural Networks, vol. 13, pp. 1075–1086, Sept. 2002.

[26] W. M. Wu, "Fuzzy reasoning and fuzzy relational equations," Fuzzy Sets Syst., vol. 20, no. 1, pp. 67–78, 1986.

[27] R. R. Yager, "Modeling and formulating fuzzy knowledge bases using neural networks," Neural Networks, vol. 7, no. 8, pp. 1273–1283, 1994.

[28] L. A. Zadeh, "A theory of approximate reasoning," in Fuzzy Sets and Applications, L. A. Zadeh, Ed. New York: Wiley, 1979, pp. 367–412.

Kai Keng Ang received the B.A.Sc. degree (with first class honors) in computer engineering and the M.Phil. degree from Nanyang Technological University, Singapore, in 1997 and 1999, respectively.

His research area is soft computing, mainly neural networks, fuzzy systems, and fuzzy neural systems. He has conducted research on neural network control for continuously variable transmissions, the modified cerebellar articulation controller, and the automatic derivation of fuzzy membership functions to support fuzzy neural networks. Since 1999, he has been a Lead Software Engineer with Delphi Automotive Systems, Singapore Pte. Ltd., working on software for automotive engine controllers.

Chai Quek (M'96) received the B.Sc. degree in electrical and electronics engineering and the Ph.D. degree in intelligent control from Heriot-Watt University, Edinburgh, U.K.

He is an Associate Professor and a member of the Intelligent Systems Laboratory, School of Computer Engineering, Nanyang Technological University, Singapore. His research interests include intelligent control, intelligent architectures, AI in education, neural networks, fuzzy neural systems, and genetic algorithms.

Michel Pasquier received the M.S. degree in electrical engineering and the Ph.D. degree in computer science from the Institut National Polytechnique de Grenoble (INPG), France, in 1985 and 1988, respectively.

From 1989 to 1994, he worked in Tsukuba, Japan, first as a Visiting Researcher at the ElectroTechnical Laboratory (ETL), then as a Research Engineer for Sanyo Electric. In 1994, he joined Nanyang Technological University, Singapore, where he has since been teaching courses in artificial intelligence and software engineering. His research interests focus on fundamental AI, especially search, planning, and decision systems, with applications to intelligent transportation, robotics, and automation.