578 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 13, NO. 3, MAY 2002

Subsethood-Product Fuzzy Neural Inference System (SuPFuNIS)

Sandeep Paul and Satish Kumar, Member, IEEE

Abstract—A new subsethood-product fuzzy neural inference system (SuPFuNIS) is presented in this paper. It has the flexibility to handle both numeric and linguistic inputs simultaneously. Numeric inputs are fuzzified by input nodes which act as tunable feature fuzzifiers. Rule-based knowledge is easily translated directly into a network architecture. Connections in the network are represented by Gaussian fuzzy sets. The novelty of the model lies in a combination of tunable input feature fuzzifiers; fuzzy mutual subsethood-based activation spread in the network; use of the product operator to compute the extent of firing of a rule; and a volume-defuzzification process to produce a numeric output. Supervised gradient descent is employed to train the centers and spreads of individual fuzzy connections. A subsethood-based method for rule generation from the trained network is also suggested. SuPFuNIS can be applied in a variety of application domains. The model has been tested on Mackey–Glass time series prediction, Iris data classification, hepatitis medical diagnosis, and function approximation benchmark problems. We also use a standard truck backer-upper control problem to demonstrate how expert knowledge can be used to augment the network. The performance of SuPFuNIS compares excellently with various other existing models.

Index Terms—Fuzzy mutual subsethood, fuzzy neural network, gradient descent learning, product conjunction, volume defuzzification.

I. INTRODUCTION

INTEGRATED fuzzy neural models exploit parallel computation and demonstrate the ability to operate and adapt in both numeric as well as linguistic environments. Numerous examples of such synergistic models have been proposed in the literature [1]–[5]. These include models for: approximate reasoning, inferencing, and control [6]–[15]; classification [16]–[18]; diagnosis [19], [14]; rule extraction from numerical training data [20]–[23]; and rule simplification and pruning [24]–[26]. Other fuzzy-neural networks that fuzzify standard neural network architectures include: the fuzzy multilayer perceptron [27], [28]; models that utilize fuzzy teaching inputs with fuzzy weights in neural networks [29]–[32]; and evolvable neuro-fuzzy systems [33]–[37].

The development of fuzzy neural models has a common thread that derives from the desire to:

Manuscript received January 10, 2001; revised July 11, 2001. This work was supported by the Department of Science and Technology, Ministry of Science and Technology, New Delhi, under Research Grant III.5(142)-ET.

S. Paul is with the Department of Electrical Engineering, D.E.I. Technical College, Dayalbagh Educational Institute, Dayalbagh, Agra 282005, India (e-mail: [email protected]).

S. Kumar is with the Department of Physics and Computer Science, Faculty of Science, Dayalbagh Educational Institute, Dayalbagh, Agra 282005, India (e-mail: [email protected]).

Publisher Item Identifier S 1045-9227(02)04432-6.

1) embed data-driven knowledge into a network architecture to facilitate fast learning;

2) design an appropriate composition and evidence aggregation mechanism that can simultaneously handle numeric and linguistic features in order to generate outputs or derive conclusions;

3) incorporate a mechanism for fine-tuning rules by learning from numeric data;

4) extract and interpret the learned knowledge as a rule base.

Let us consider each of these points in greater detail.

Embedding Data-Driven Knowledge: Most hybrid models embed data-driven or expert-derived knowledge in the form of fuzzy if–then rules, which are ultimately represented in a neural network framework [7], [9], [14]. This embedding of knowledge is often done by assuming that antecedent and consequent labels of standard fuzzy if–then rules are represented as connection weights of the network, as in [14], [30], [29]. It has been shown formally that knowledge-based networks require a relatively smaller training set size for better generalization [38]. When such rule-based knowledge is extracted from numeric data, a common approach is to use either clustering or partitioning to derive the rules. Using clustering, the centers of fuzzy rules are initialized as cluster vectors extracted from the input data set [11], [16], [39], [40]. Subsequently, a learning algorithm fine-tunes these rules based on the available training data that describes the problem. Partitioning techniques recursively divide the input–output cross space into finer regions depending upon a local mean-squared error estimate. Each partition leads to an if–then rule [21, ch. 5], [41]. In each of these techniques, the selection of the number of rules to solve a problem is still more or less based on a heuristic approach.

Composition and Evidence Aggregation: The issue of composition of input information with the embedded rule base depends on whether the input feature is numeric or linguistic. With numeric inputs the usual way is to work with membership values computed from fuzzy membership functions that represent network weights [16], [22]. In order to handle fuzzy inputs, a given universe of discourse is generally quantized into prespecified fuzzy sets. A fuzzy input is then simply one of these prespecified fuzzy sets [10], [27], [28]. Alternatively, there are models where interval arithmetic has been employed to handle such situations [29], [30], [42].

Learning: The third issue is commonly dealt with by using supervised gradient descent and its variants [9], [13], [16], [27]; unsupervised learning; reinforcement learning [8], [34], [43], [44]; heuristic methods [22]; or genetic-algorithm-based search [23], [35], [36].


Rule Interpretation: The final issue of extracting and interpreting the tuned fuzzy weights is addressed by assigning each fuzzy weight a linguistic label, chosen on the basis of comparison with a set of fixed fuzzy sets using a similarity measure [24], [26], [45]. This helps in generating a rule base that is easily comprehensible.

In this paper we present the design of a fuzzy-neural network model that specifically addresses the following objectives:

1) to incorporate a mechanism that can handle numeric and linguistic inputs seamlessly;

2) to stress the economy of the number of parameters that a model employs to solve a particular problem;

3) to be able to easily incorporate data-driven as well as expert knowledge in the generation of an initial set of if–then rules;

4) to attempt to have the system learn data-driven knowledge to fine-tune the set of if–then rules;

5) to be able to interpret a trained fuzzy-neural system.

The resulting Subsethood-Product Fuzzy Neural Inference System (SuPFuNIS) adequately addresses each of these issues.

SuPFuNIS uses a standard fuzzy-neural network architecture that embeds fuzzy if–then rules as hidden nodes, rule antecedents as input-to-hidden connections, and rule consequents as hidden-to-output connections [16], [22]. Knowledge in the form of if–then rules derived from clustering numeric data is used to initialize the rules embedded in the network [11], [12], [39], [46]. However, SuPFuNIS differs from other fuzzy-neural network models on various counts.

1) It uses a tunable input fuzzifier that is responsible for fuzzification of numeric data. In other words, numeric inputs are fuzzified using a feature-specific Gaussian spread.

2) All information that propagates from the input layer is fuzzy. The model therefore uses a composition mechanism that employs a fuzzy mutual subsethood measure to define the activation that propagates to a rule node along a fuzzy connection.

3) The model aggregates activities at a rule node using a fuzzy inner product: a product of mutual subsethoods. This is different from the more common approach of using a fuzzy conjunction operator for activity aggregation.

4) Outputs are generated using volume defuzzification, which is a variant of the commonly employed centroidal defuzzification procedure.

As demonstrated in this paper, it is the combination of the above four mechanisms that lends the model its uniformly high performance and its high level of parameter economy.

Earlier variants of the proposed model, with applications in function approximation, inference, and classification, have been presented elsewhere [47]–[49]. In [47] a combination of weighted subsethood and a soft-minimum conjunction operator was employed. The model used a triangular approximation instead of Gaussian fuzzy weights for subsethood computation. It addressed the applications of function approximation and inference. In [48], which extended [47] by increasing the number of free parameters, a simple heuristic to derive the number of rules using clustering was introduced. A combination of mutual subsethood and the product conjunction operator with a nontunable feature fuzzifier has been presented in [49]. The network in [49] uses Gaussian fuzzy weights and targets the classification problem domain.

SuPFuNIS also has a diversity of application domains. In support of our claims, SuPFuNIS is tested on five different applications: approximation of the nonlinear Mackey–Glass time series; Iris data classification; hepatitis medical diagnosis; function approximation; and a truck backer-upper control problem. For the Mackey–Glass time series approximation problem we also show the efficacy of employing cluster-based initialization, and subsethood-based interpretation of the learnt knowledge in the form of rules. The idea of seamlessly presenting mixed linguistic–numeric inputs to the model is exemplified in the hepatitis diagnostic problem. The ease with which expert knowledge can be incorporated into SuPFuNIS is demonstrated on the truck backer-upper application, where we show the effect of using a network trained using only numeric data, and compare it with a network trained on a reduced training set but augmented with expert knowledge. All the applications demonstrate the high performance of the SuPFuNIS model.

The organization of the paper is as follows. Section II provides the operational details of SuPFuNIS; Section III details the supervised learning of the model; Section IV presents four applications of SuPFuNIS: time series prediction, classification, diagnosis, and function approximation. In Section V we discuss the issue of rule interpretation, and Section VI shows the effectiveness of SuPFuNIS in working with a numerically trained network augmented with linguistic knowledge. Finally, Section VII concludes the paper.

II. ARCHITECTURE AND OPERATIONAL DETAILS

The proposed SuPFuNIS model directly embeds fuzzy rules of the form

If $x_1$ is LOW and $x_2$ is HIGH then $y$ is MEDIUM (1)

where LOW, MEDIUM, and HIGH are fuzzy sets defined, respectively, on input or output universes of discourse (UODs). Input nodes represent domain variables or features, and output nodes represent target variables or classes. Each hidden node represents a rule, and input-hidden node connections represent fuzzy rule antecedents. Each hidden-output node connection represents a fuzzy-rule consequent. Fuzzy sets corresponding to linguistic labels of fuzzy if–then rules (such as LOW, MEDIUM, and HIGH) are defined on input and output UODs and are represented by symmetric Gaussian membership functions specified by a center and a spread. Fuzzy weights from input node $i$ to rule node $j$ are thus modeled by the center $c_{ij}$ and spread $\sigma_{ij}$ of a Gaussian fuzzy set, and denoted $w_{ij} = (c_{ij}, \sigma_{ij})$. In a similar fashion, consequent fuzzy weights from rule node $j$ to output node $k$ are denoted $v_{jk} = (c_{jk}, \sigma_{jk})$. Data-driven knowledge in the form of fuzzy if–then rules is translated directly into a network architecture as shown in Fig. 1.

SuPFuNIS can simultaneously admit numeric as well as fuzzy inputs. Numeric inputs are first fuzzified so that all inputs to the network are uniformly fuzzy. Now, since the antecedent weights are also fuzzy, this requires the adoption of


Fig. 1. Architecture of the SuPFuNIS model.


Fig. 2. (a) An example of prespecified fuzzy sets for fuzzy inputs. (b) An example of fuzzification of numeric input by a tunable fuzzy set.

a method to transmit a fuzzy signal along a fuzzy weight. In conventional neural networks numeric inputs are scaled by the weight directly, and these scaled values are aggregated as the activation of a node using simple summation. In the SuPFuNIS model, signal transmission along a fuzzy weight is handled by calculating the mutual subsethood (detailed in Section II-B). We now proceed to discuss these issues in detail.

A. Signal Transmission at Input Nodes

Since the input feature vector $\mathbf{x} = (x_1, \dots, x_n)$ can comprise either numeric or linguistic values, there are two kinds of nodes in the input layer. Linguistic nodes accept a linguistic input represented by a fuzzy set with a Gaussian membership function modeled by a center $c_i$ and spread $\sigma_i$. These linguistic inputs can be drawn from prespecified fuzzy sets as shown in Fig. 2(a), where three Gaussian fuzzy sets have been defined on the UOD [−1, 1]. Thus, a linguistic input is represented by the pair $(c_i, \sigma_i)$. This is also the signal transmitted out of the linguistic node, since no transformation of inputs takes place at these nodes in the input layer.

Numeric nodes are tunable feature-specific fuzzifiers. They accept numeric inputs and fuzzify them using Gaussian fuzzy sets. The numeric input $x_i$ is fuzzified by treating it as the center $c_i = x_i$ of a Gaussian membership function with a tunable spread $\sigma_i$. This is shown in Fig. 2(b), where a numeric feature value of 0.25 has been fuzzified into a Gaussian membership function centered at 0.25 with spread 0.5. The Gaussian shape is chosen to match the Gaussian shape of the weight fuzzy sets, since this facilitates the subsethood calculations detailed in Section II-B. Therefore, the signal transmitted from a numeric node of the input layer is also represented by the pair $(c_i, \sigma_i) = (x_i, \sigma_i)$. These fuzzy signals from numeric or linguistic inputs are transmitted to hidden rule nodes through fuzzy weights $w_{ij}$ that correspond to rule antecedents.
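To make the fuzzifier concrete, the following sketch (in Python, using the illustrative values from Fig. 2(b); the function name is ours) evaluates the Gaussian membership function that a numeric input node builds around its input:

```python
import math

def gaussian_mf(x, center, spread):
    """Symmetric Gaussian membership function used throughout SuPFuNIS."""
    return math.exp(-((x - center) / spread) ** 2)

# A numeric feature value 0.25 becomes the center of a Gaussian fuzzy set
# with a tunable spread, here 0.5, so the node emits the pair (0.25, 0.5).
center, spread = 0.25, 0.5
for x in (0.0, 0.25, 0.5, 1.0):
    print(f"membership({x}) = {gaussian_mf(x, center, spread):.4f}")
```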

B. Mutual Subsethood

Since both the signal and the weight are fuzzy sets represented by Gaussian membership functions, we intuitively seek to quantify the net value of the signal transmitted along the weight by the extent of overlap between the two fuzzy sets. This is measured by their mutual subsethood, which is introduced next.

Consider two fuzzy sets $A$ and $B$ described by Gaussian membership functions with centers $c_1$, $c_2$ and spreads $\sigma_1$, $\sigma_2$, respectively:

$$a(x) = e^{-\left(\frac{x - c_1}{\sigma_1}\right)^2} \qquad (2)$$

$$b(x) = e^{-\left(\frac{x - c_2}{\sigma_2}\right)^2}. \qquad (3)$$

The cardinality $C(A)$ of fuzzy set $A$ is then defined by

$$C(A) = \int_{-\infty}^{\infty} a(x)\,dx = \sqrt{\pi}\,\sigma_1. \qquad (4)$$


Fig. 3. Four cases depending upon the relative values of $c_1$, $c_2$, $\sigma_1$, and $\sigma_2$. Case 1: (a) $c_1 = c_2$ and $\sigma_1 > \sigma_2$; (b) $c_1 = c_2$ and $\sigma_1 = \sigma_2$. Case 2: (c) $c_1 > c_2$ and $\sigma_1 = \sigma_2$; (d) $c_1 < c_2$ and $\sigma_1 = \sigma_2$. Case 3: (e) $c_1 > c_2$ and $\sigma_1 > \sigma_2$; (f) $c_1 < c_2$ and $\sigma_1 > \sigma_2$. Case 4: (g) $c_1 > c_2$ and $\sigma_1 < \sigma_2$; (h) $c_1 < c_2$ and $\sigma_1 < \sigma_2$.

Then the mutual subsethood $\mathcal{E}(A, B)$ [21, ch. 13] measures the degree to which fuzzy set $A$ equals fuzzy set $B$:

$$\mathcal{E}(A, B) = \text{Degree}(A = B) = \text{Degree}(A \subseteq B \text{ and } B \subseteq A) \qquad (5)$$

and can be formulated as

$$\mathcal{E}(A, B) = \frac{C(A \cap B)}{C(A) + C(B) - C(A \cap B)}. \qquad (6)$$

The mutual subsethood measure has values in the interval [0, 1] that depend on the relative values of the centers and spreads of fuzzy sets $A$ and $B$. Four different cases of overlap can arise.

• Case 1: $c_1 = c_2$, with any values of $\sigma_1$ and $\sigma_2$.
• Case 2: $c_1 \ne c_2$ and $\sigma_1 = \sigma_2$.
• Case 3: $c_1 \ne c_2$ and $\sigma_1 > \sigma_2$.
• Case 4: $c_1 \ne c_2$ and $\sigma_1 < \sigma_2$.

These four cases depend on the relative values of $c_1$, $c_2$, $\sigma_1$, and $\sigma_2$, as portrayed in Fig. 3. Notice that in Case 1 the two fuzzy sets do not cross over: either one fuzzy set belongs completely to the other, or the two fuzzy sets are identical. In Case 2 there is exactly one crossover point, whereas in Cases 3 and 4 there are exactly two crossover points.

To calculate the crossover points we set $a(x) = b(x)$ to obtain the equal-valued points. This yields the two crossover points $h_1$ and $h_2$:

$$h_1 = \frac{c_1\sigma_2 + c_2\sigma_1}{\sigma_1 + \sigma_2} \qquad (7)$$

$$h_2 = \frac{c_2\sigma_1 - c_1\sigma_2}{\sigma_1 - \sigma_2}. \qquad (8)$$

$\mathcal{E}(A, B)$ can then be evaluated in terms of $C(A \cap B)$ as in (6). In subsequent sections, to facilitate evaluation of the cardinality, we express it in terms of the standard error function

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt \qquad (9)$$

which has limiting values $\operatorname{erf}(\infty) = 1$ and $\operatorname{erf}(-\infty) = -1$.
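Because the closed-form, case-wise expressions for $C(A \cap B)$ are only derived in Section II-C, a brute-force numerical check of (6) is often useful. The sketch below (illustrative values; helper names are ours) integrates the pointwise minimum of the two Gaussians on a grid:

```python
import math

def gaussian(x, c, s):
    return math.exp(-((x - c) / s) ** 2)

def mutual_subsethood(c1, s1, c2, s2, lo=-10.0, hi=10.0, n=20000):
    """E(A, B) = C(A n B) / (C(A) + C(B) - C(A n B)); the intersection
    cardinality is approximated by integrating min(a(x), b(x))."""
    dx = (hi - lo) / n
    c_inter = sum(min(gaussian(lo + i * dx, c1, s1),
                      gaussian(lo + i * dx, c2, s2)) for i in range(n)) * dx
    c_a = math.sqrt(math.pi) * s1          # exact cardinality of A, per (4)
    c_b = math.sqrt(math.pi) * s2          # exact cardinality of B
    return c_inter / (c_a + c_b - c_inter)

print(mutual_subsethood(0.0, 0.5, 0.0, 0.5))   # identical sets -> ~1.0
print(mutual_subsethood(0.0, 0.5, 3.0, 0.5))   # far-apart sets -> ~0.0
```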


Fig. 4. Fuzzy signal transmission.

C. Mutual Subsethood Based Signal Transmission

As shown schematically in Fig. 4, SuPFuNIS transmits a fuzzy signal from an input node along a fuzzy weight that represents an antecedent connection. The transmitted signal is quantified by $E_{ij}$, which denotes the mutual subsethood between the fuzzy signal and the fuzzy weight, and is computed using (6). Symbolically, for a signal $s_i = (c_i, \sigma_i)$ (generated from either a numeric or linguistic input node) and a fuzzy weight $w_{ij} = (c_{ij}, \sigma_{ij})$, the mutual subsethood is defined as

$$E_{ij} = \mathcal{E}(s_i, w_{ij}) = \frac{C(s_i \cap w_{ij})}{C(s_i) + C(w_{ij}) - C(s_i \cap w_{ij})}. \qquad (10)$$

The derivations of the expressions for $C(s_i \cap w_{ij})$ for each of the four cases identified above are given below.

Case 1: $c_i = c_{ij}$: If $\sigma_i < \sigma_{ij}$, the signal fuzzy set $s_i$ completely belongs to the weight fuzzy set $w_{ij}$ [as portrayed in Fig. 3(a)] and the cardinality is

$$C(s_i \cap w_{ij}) = \int_{-\infty}^{\infty} e^{-\left(\frac{x - c_i}{\sigma_i}\right)^2}\,dx = \sqrt{\pi}\,\sigma_i. \qquad (11)$$

Similarly, if $\sigma_i > \sigma_{ij}$, then $C(s_i \cap w_{ij}) = \sqrt{\pi}\,\sigma_{ij}$. If $\sigma_i = \sigma_{ij}$, the two fuzzy sets are identical [as portrayed in Fig. 3(b)]. Summarizing these three subcases

$$C(s_i \cap w_{ij}) = \begin{cases} \sqrt{\pi}\,\sigma_i & \text{if } \sigma_i < \sigma_{ij} \\ \sqrt{\pi}\,\sigma_{ij} & \text{if } \sigma_i > \sigma_{ij} \\ \sqrt{\pi}\,\sigma_i = \sqrt{\pi}\,\sigma_{ij} & \text{if } \sigma_i = \sigma_{ij}. \end{cases} \qquad (12)$$

Case 2: $c_i \ne c_{ij}$ and $\sigma_i = \sigma_{ij}$: In this case there will be exactly one crossover point $h = (c_i + c_{ij})/2$, as shown in Fig. 3(c) and (d). Assuming $c_i > c_{ij}$ [Fig. 3(c)], the cardinality $C(s_i \cap w_{ij})$ can be evaluated as

$$C(s_i \cap w_{ij}) = \frac{\sqrt{\pi}\,\sigma_i}{2}\left[1 + \operatorname{erf}\!\left(\frac{h - c_i}{\sigma_i}\right)\right] + \frac{\sqrt{\pi}\,\sigma_{ij}}{2}\left[1 - \operatorname{erf}\!\left(\frac{h - c_{ij}}{\sigma_{ij}}\right)\right]. \qquad (13)$$

If $c_i < c_{ij}$ [Fig. 3(d)], the expression for the cardinality is

$$C(s_i \cap w_{ij}) = \frac{\sqrt{\pi}\,\sigma_{ij}}{2}\left[1 + \operatorname{erf}\!\left(\frac{h - c_{ij}}{\sigma_{ij}}\right)\right] + \frac{\sqrt{\pi}\,\sigma_i}{2}\left[1 - \operatorname{erf}\!\left(\frac{h - c_i}{\sigma_i}\right)\right]. \qquad (14)$$

Case 3: $c_i \ne c_{ij}$ and $\sigma_i > \sigma_{ij}$: In this case there will be two crossover points $h_1$ and $h_2$, as calculated in (7) and (8) [see Fig. 3(e) and (f)]. Assuming $c_i > c_{ij}$ and $h_1 < h_2$ [Fig. 3(e)], the cardinality can be evaluated as

$$C(s_i \cap w_{ij}) = \frac{\sqrt{\pi}\,\sigma_{ij}}{2}\left[1 + \operatorname{erf}\!\left(\frac{h_1 - c_{ij}}{\sigma_{ij}}\right)\right] + \frac{\sqrt{\pi}\,\sigma_i}{2}\left[\operatorname{erf}\!\left(\frac{h_2 - c_i}{\sigma_i}\right) - \operatorname{erf}\!\left(\frac{h_1 - c_i}{\sigma_i}\right)\right] + \frac{\sqrt{\pi}\,\sigma_{ij}}{2}\left[1 - \operatorname{erf}\!\left(\frac{h_2 - c_{ij}}{\sigma_{ij}}\right)\right]. \qquad (15)$$

If $c_i < c_{ij}$ [Fig. 3(f)], the expression for $C(s_i \cap w_{ij})$ is identical to (15).

Case 4: $c_i \ne c_{ij}$ and $\sigma_i < \sigma_{ij}$: This case is similar to Case 3, and once again there will be two crossover points $h_1$ and $h_2$ as calculated in (7) and (8). Assuming $c_i > c_{ij}$ and $h_1 < h_2$ [Fig. 3(g)], the cardinality can be evaluated as

$$C(s_i \cap w_{ij}) = \frac{\sqrt{\pi}\,\sigma_i}{2}\left[1 + \operatorname{erf}\!\left(\frac{h_1 - c_i}{\sigma_i}\right)\right] + \frac{\sqrt{\pi}\,\sigma_{ij}}{2}\left[\operatorname{erf}\!\left(\frac{h_2 - c_{ij}}{\sigma_{ij}}\right) - \operatorname{erf}\!\left(\frac{h_1 - c_{ij}}{\sigma_{ij}}\right)\right] + \frac{\sqrt{\pi}\,\sigma_i}{2}\left[1 - \operatorname{erf}\!\left(\frac{h_2 - c_i}{\sigma_i}\right)\right]. \qquad (16)$$

If $c_i < c_{ij}$ [Fig. 3(h)], the expression for the cardinality is identical to (16).

The corresponding expressions for $E_{ij}$ are obtained by substituting $C(s_i \cap w_{ij})$ from (12)–(16) into (10).

D. Activity Aggregation at Rule Nodes With a Fuzzy Inner Product

By measuring all the values of mutual subsethoods $E_{ij}$ for a rule node $j$ we are in essence assessing the compatibility between the linguistic signal vector $(s_1, \dots, s_n)$ (transmitted from the input layer) and the fuzzy weight vector $(w_{1j}, \dots, w_{nj})$ that fans in to the rule node $j$. Each rule node is expected to somehow aggregate this vector in such a way that the resulting node activation reflects this compatibility. In other words, the extent of rule firing, as represented by the rule node activation, measures the extent to which the corresponding linguistic input matches the antecedent of the rule in question. We replace the standard min operator commonly used in fuzzy systems with the product operator to aggregate activities at a rule node. The activation $z_j$ of rule node $j$ is thus a mutual subsethood-based product, the differentiability of which allows the model to employ gradient descent based learning in a straightforward way.

In summary, the net activation $z_j$ of the rule node $j$ is a product of all mutual subsethoods, the fuzzy inner product

$$z_j = \prod_{i=1}^{n} E_{ij} = \prod_{i=1}^{n} \mathcal{E}(s_i, w_{ij}). \qquad (17)$$

Notice that the inner product in (17) (and thus the rule node activation function) exhibits the following properties: it is bounded between zero and one; monotonic increasing; continuous; symmetric; and nonidempotent. The use of such a fuzzy inner product of subsethoods lends novelty to SuPFuNIS.

The behavior of the product aggregation operator has been discussed at length in [49], where we pointed out that the product operator does not ignore information regarding the dimension of the input, as the min operator does [21]. It provides a better estimate of the joint strength of the various inputs. Also, over a wide range of spreads, the product operator is able to clearly differentiate between inputs that are similar to the weight vector and inputs that are dissimilar from the weight vector. In other words, the product operator is capable of better discrimination than the min operator. We believe that this is an important contributing factor to the high performance and economy of SuPFuNIS networks.

The signal function for a rule node $j$ is linear:

$$S(z_j) = z_j. \qquad (18)$$

Numeric activation values are transmitted unchanged to consequent connections.
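A rule node's activation is then just the product of the mutual subsethoods on its fan-in connections, per (17). A minimal sketch, reusing the mutual_subsethood helper from the earlier listing:

```python
def rule_activation(signals, weights):
    """z_j = prod_i E(s_i, w_ij), where signals and weights are lists of
    (center, spread) pairs, one entry per input feature."""
    z = 1.0
    for (cs, ss), (cw, sw) in zip(signals, weights):
        z *= mutual_subsethood(cs, ss, cw, sw)
    return z

# Two-input rule: the closer the fuzzified inputs are to the antecedent
# sets, the closer z_j is to 1.
print(rule_activation([(0.25, 0.5), (0.8, 0.4)],
                      [(0.30, 0.5), (0.7, 0.4)]))
```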

E. Output Layer Signal Computation

The signal of each output node is determined using standard volume-based centroid defuzzification [21]. The term volume is used in a general sense, so as to include multidimensional functions. For two-dimensional functions the volume reduces to area.

If the activation of output node $k$ is denoted $y_k$, the $V_{jk}$'s denote consequent set volumes, and the $w_{jk}$'s are the weights that scale them, then the general expression of defuzzification is

$$y_k = \frac{\sum_{j=1}^{q} z_j\, w_{jk}\, V_{jk}\, c_{jk}}{\sum_{j=1}^{q} z_j\, w_{jk}\, V_{jk}} \qquad (19)$$

where $q$ is the number of rule nodes and $c_{jk}$ is the center of the consequent set. The volume $V_{jk}$, in our case, is simply the area of the consequent weight fuzzy set, which is Gaussian: $V_{jk} = \sqrt{\pi}\,\sigma_{jk}$. If the weights $w_{jk}$ are considered to be unity, as we do in this paper, then

$$y_k = \frac{\sum_{j=1}^{q} z_j\,\sigma_{jk}\, c_{jk}}{\sum_{j=1}^{q} z_j\,\sigma_{jk}}. \qquad (20)$$

The signal of output node $k$ is $S(y_k) = y_k$. Note that with the substitutions $a_{jk} = z_j\,\sigma_{jk}$ and $\bar{a}_{jk} = a_{jk}/\sum_{l=1}^{q} a_{lk}$, (20) can be simplified to

$$y_k = \sum_{j=1}^{q} c_{jk}\,\frac{z_j\,\sigma_{jk}}{\sum_{l=1}^{q} z_l\,\sigma_{lk}} \qquad (21)$$

$$= \sum_{j=1}^{q} \bar{a}_{jk}\, c_{jk} \qquad (22)$$

where the coefficients $\bar{a}_{jk}$ are normalized and sum to one. The defuzzifier (20) thus essentially computes a convex sum of consequent set centers.
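As a sketch of (20)–(22) (function name ours), the defuzzifier below computes the spread-weighted convex sum of consequent centers; the $\sqrt{\pi}$ volume factor cancels between numerator and denominator:

```python
def defuzzify(z, centers, spreads):
    """y_k per (20): z are rule activations; centers/spreads describe the
    Gaussian consequents feeding output node k."""
    den = sum(zj * sj for zj, sj in zip(z, spreads))
    return sum(zj * sj * cj for zj, cj, sj in zip(z, centers, spreads)) / den

# Three rules: the output is pulled toward centers of strongly firing rules.
print(defuzzify([0.9, 0.2, 0.05], [0.3, 0.8, 1.2], [0.5, 0.4, 0.6]))
```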

This completes our discussion of how inputs are mapped to outputs in SuPFuNIS.

III. SUPERVISED LEARNING

The SuPFuNIS network is trained by supervised learning. This involves repeated presentation of a set of input patterns drawn from the training set. The output of the network is compared with the desired value to obtain the error, and network weights are changed on the basis of an error-minimizing criterion. Once the network is trained to the desired level of error, it is tested by presenting a new set of input patterns drawn from the test set.

A. Iterative Update Equations

Learning is incorporated into the SuPFuNIS model using the gradient descent method. A squared-error criterion is used as the training performance measure. The squared error $e(t)$ at iteration $t$ is computed in the standard way:

$$e(t) = \frac{1}{2}\sum_{k=1}^{p}\left(d_k(t) - y_k(t)\right)^2 \qquad (23)$$


where $d_k$ is the desired value at output node $k$, and the error is evaluated over all $p$ outputs for a specific pattern. For a one-of-$p$ class classification the desired outputs will then be zero or one. Both the centers $c_{ij}$, $c_{jk}$ and spreads $\sigma_{ij}$, $\sigma_{jk}$ of antecedent and consequent connections, and the spreads $\sigma_i$ of the input features, are modified on the basis of update equations that take on the form

$$c_{ij}(t+1) = c_{ij}(t) - \eta\,\frac{\partial e(t)}{\partial c_{ij}} + \alpha\,\Delta c_{ij}(t) \qquad (24)$$

$$\sigma_{ij}(t+1) = \sigma_{ij}(t) - \eta\,\frac{\partial e(t)}{\partial \sigma_{ij}} + \alpha\,\Delta \sigma_{ij}(t) \qquad (25)$$

$$c_{jk}(t+1) = c_{jk}(t) - \eta\,\frac{\partial e(t)}{\partial c_{jk}} + \alpha\,\Delta c_{jk}(t) \qquad (26)$$

$$\sigma_{jk}(t+1) = \sigma_{jk}(t) - \eta\,\frac{\partial e(t)}{\partial \sigma_{jk}} + \alpha\,\Delta \sigma_{jk}(t) \qquad (27)$$

$$\sigma_{i}(t+1) = \sigma_{i}(t) - \eta\,\frac{\partial e(t)}{\partial \sigma_{i}} + \alpha\,\Delta \sigma_{i}(t) \qquad (28)$$

where $\eta$ is the learning rate, $\alpha$ is the momentum parameter, and the momentum increments are the most recent parameter changes

$$\Delta c_{ij}(t) = c_{ij}(t) - c_{ij}(t-1) \qquad (29)$$

$$\Delta \sigma_{ij}(t) = \sigma_{ij}(t) - \sigma_{ij}(t-1) \qquad (30)$$

$$\Delta c_{jk}(t) = c_{jk}(t) - c_{jk}(t-1) \qquad (31)$$

$$\Delta \sigma_{jk}(t) = \sigma_{jk}(t) - \sigma_{jk}(t-1) \qquad (32)$$

$$\Delta \sigma_{i}(t) = \sigma_{i}(t) - \sigma_{i}(t-1). \qquad (33)$$
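The updates (24)–(28) are ordinary gradient descent with momentum applied independently to every center and spread. The sketch below shows the update pattern for one parameter; the central-difference gradient is purely an illustrative stand-in for the analytic derivatives of Section III-B:

```python
def momentum_step(param, grad, prev_delta, lr=0.1, alpha=0.1):
    """One update of the form p(t+1) = p(t) - lr * de/dp + alpha * dp(t)."""
    delta = -lr * grad + alpha * prev_delta
    return param + delta, delta

def numeric_grad(loss_fn, param, eps=1e-6):
    # Central-difference estimate of de/dp; the paper instead uses the
    # closed-form expressions (34)-(43).
    return (loss_fn(param + eps) - loss_fn(param - eps)) / (2 * eps)
```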

B. Evaluation of Partial Derivatives

The expressions for the partial derivatives required in these update equations are derived as follows. For the error derivative with respect to consequent centers

$$\frac{\partial e}{\partial c_{jk}} = -(d_k - y_k)\,\frac{\partial y_k}{\partial c_{jk}} = -(d_k - y_k)\,\frac{z_j\,\sigma_{jk}}{\sum_{l=1}^{q} z_l\,\sigma_{lk}} \qquad (34)$$

and the error derivative with respect to the consequent spreads is

$$\frac{\partial e}{\partial \sigma_{jk}} = -(d_k - y_k)\,\frac{\partial y_k}{\partial \sigma_{jk}} = -(d_k - y_k)\,\frac{z_j\,(c_{jk} - y_k)}{\sum_{l=1}^{q} z_l\,\sigma_{lk}}. \qquad (35)$$

The error derivatives with respect to antecedent centers and spreads involve subsethood derivatives in the chain and are somewhat more involved to evaluate. Specifically, the error derivative chains with respect to antecedent centers and spreads are, respectively,

$$\frac{\partial e}{\partial c_{ij}} = \sum_{k=1}^{p} \frac{\partial e}{\partial y_k}\,\frac{\partial y_k}{\partial z_j}\,\frac{\partial z_j}{\partial E_{ij}}\,\frac{\partial E_{ij}}{\partial c_{ij}} \qquad (36)$$

and

$$\frac{\partial e}{\partial \sigma_{ij}} = \sum_{k=1}^{p} \frac{\partial e}{\partial y_k}\,\frac{\partial y_k}{\partial z_j}\,\frac{\partial z_j}{\partial E_{ij}}\,\frac{\partial E_{ij}}{\partial \sigma_{ij}} \qquad (37)$$

and the error derivative chain with respect to input feature spreads is

$$\frac{\partial e}{\partial \sigma_{i}} = \sum_{j=1}^{q}\sum_{k=1}^{p} \frac{\partial e}{\partial y_k}\,\frac{\partial y_k}{\partial z_j}\,\frac{\partial z_j}{\partial E_{ij}}\,\frac{\partial E_{ij}}{\partial \sigma_{i}} \qquad (38)$$

where

$$\frac{\partial y_k}{\partial z_j} = \frac{\sigma_{jk}\,(c_{jk} - y_k)}{\sum_{l=1}^{q} z_l\,\sigma_{lk}} \qquad (39)$$

and

$$\frac{\partial z_j}{\partial E_{ij}} = \prod_{l=1,\, l\ne i}^{n} E_{lj}. \qquad (40)$$

The expressions for the antecedent connection mutual subsethood partial derivatives $\partial E_{ij}/\partial c_{ij}$, $\partial E_{ij}/\partial \sigma_{ij}$, and $\partial E_{ij}/\partial \sigma_{i}$ are obtained by differentiating (10) with respect to $c_{ij}$, $\sigma_{ij}$, and $\sigma_{i}$, which yields (writing $C$ for $C(s_i \cap w_{ij})$)

$$\frac{\partial E_{ij}}{\partial c_{ij}} = \frac{\sqrt{\pi}\,(\sigma_i + \sigma_{ij})\,\dfrac{\partial C}{\partial c_{ij}}}{\left[\sqrt{\pi}\,(\sigma_i + \sigma_{ij}) - C\right]^2} \qquad (41)$$

$$\frac{\partial E_{ij}}{\partial \sigma_{ij}} = \frac{\sqrt{\pi}\,(\sigma_i + \sigma_{ij})\,\dfrac{\partial C}{\partial \sigma_{ij}} - \sqrt{\pi}\,C}{\left[\sqrt{\pi}\,(\sigma_i + \sigma_{ij}) - C\right]^2} \qquad (42)$$

and

$$\frac{\partial E_{ij}}{\partial \sigma_{i}} = \frac{\sqrt{\pi}\,(\sigma_i + \sigma_{ij})\,\dfrac{\partial C}{\partial \sigma_{i}} - \sqrt{\pi}\,C}{\left[\sqrt{\pi}\,(\sigma_i + \sigma_{ij}) - C\right]^2}. \qquad (43)$$

In (41)–(43), $\partial C/\partial c_{ij}$, $\partial C/\partial \sigma_{ij}$, and $\partial C/\partial \sigma_{i}$ depend on the nature of the overlap of the input feature fuzzy set and the weight fuzzy set, i.e., upon the values of $c_i$, $c_{ij}$, $\sigma_i$, and $\sigma_{ij}$. Case-wise expressions therefore need to be derived as follows.

Case 1: $c_i = c_{ij}$: As is evident from (12), $C(s_i \cap w_{ij})$ is independent of $c_{ij}$, and therefore

$$\frac{\partial C(s_i \cap w_{ij})}{\partial c_{ij}} = 0. \qquad (44)$$

Differentiating (12) with respect to $\sigma_{ij}$ we have

$$\frac{\partial C(s_i \cap w_{ij})}{\partial \sigma_{ij}} = \begin{cases} 0 & \text{if } \sigma_i < \sigma_{ij} \\ \sqrt{\pi} & \text{if } \sigma_i \ge \sigma_{ij} \end{cases} \qquad (45)$$

and differentiating (12) with respect to $\sigma_i$ we have

$$\frac{\partial C(s_i \cap w_{ij})}{\partial \sigma_{i}} = \begin{cases} \sqrt{\pi} & \text{if } \sigma_i < \sigma_{ij} \\ 0 & \text{if } \sigma_i \ge \sigma_{ij}. \end{cases} \qquad (46)$$


Case 2: $c_i \ne c_{ij}$ and $\sigma_i = \sigma_{ij}$: When $c_i > c_{ij}$, the derivatives $\partial C(s_i \cap w_{ij})/\partial c_{ij}$, $\partial C(s_i \cap w_{ij})/\partial \sigma_{ij}$, and $\partial C(s_i \cap w_{ij})/\partial \sigma_{i}$ are derived by differentiating (13) as follows:

(47)

(48)

(49)

When $c_i < c_{ij}$, the corresponding derivatives $\partial C(s_i \cap w_{ij})/\partial c_{ij}$, $\partial C(s_i \cap w_{ij})/\partial \sigma_{ij}$, and $\partial C(s_i \cap w_{ij})/\partial \sigma_{i}$ are derived by differentiating (14) as follows:

(50)

(51)

(52)

Case 3: $c_i \ne c_{ij}$ and $\sigma_i > \sigma_{ij}$: Once again, two subcases arise, as in Case 2. When $c_i > c_{ij}$, the derivatives $\partial C(s_i \cap w_{ij})/\partial c_{ij}$, $\partial C(s_i \cap w_{ij})/\partial \sigma_{ij}$, and $\partial C(s_i \cap w_{ij})/\partial \sigma_{i}$ are derived by differentiating (15):

(53)



(54)

(55)

Similarly, if $c_i < c_{ij}$,

(56)

Thus, whether $c_i > c_{ij}$ or $c_i < c_{ij}$, identical expressions for $\partial C(s_i \cap w_{ij})/\partial c_{ij}$ are obtained. Similarly, the expressions for $\partial C(s_i \cap w_{ij})/\partial \sigma_{ij}$ and $\partial C(s_i \cap w_{ij})/\partial \sigma_{i}$ remain the same as (54) and (55), respectively, whether $c_i > c_{ij}$ or $c_i < c_{ij}$.

Case 4: $c_i \ne c_{ij}$ and $\sigma_i < \sigma_{ij}$: When $c_i > c_{ij}$, the derivatives $\partial C(s_i \cap w_{ij})/\partial c_{ij}$, $\partial C(s_i \cap w_{ij})/\partial \sigma_{ij}$, and $\partial C(s_i \cap w_{ij})/\partial \sigma_{i}$ are derived by differentiating (16):

(57)

(58)

(59)

If $c_i < c_{ij}$, the expressions for the three derivatives are the same as (57)–(59), respectively.

IV. APPLICATIONS

The SuPFuNIS model finds application in a variety of domains. In this section, we compare and contrast the performance of SuPFuNIS with other models on four applications: Mackey–Glass time series approximation; Iris data classification; hepatitis disease diagnosis; and a function approximation problem. We deal with the Mackey–Glass series in greater detail than the others to highlight important behavioral properties of the model that carry over to the other problems, where we report only final results.

A. Mackey–Glass Time Series Prediction

Nonlinear dynamical time series modeling is a central problem in different disciplines such as economics, forecasting, planning, and control. In this paper we consider a benchmark chaotic time series first investigated by Mackey and Glass [50], which is a widely investigated problem in the fuzzy-neural domain [9], [13], [36], [39], [51]–[53]. The series is generated by the following delay differential equation:

$$\frac{dx(t)}{dt} = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t). \qquad (60)$$

As $\tau$ in (60) is varied, the system can exhibit either fixed-point, limit-cycle, or chaotic behavior. For $\tau = 17$ the system exhibits chaotic behavior, and we attempt the problem of approximating the time series function of (60) for this value of $\tau$.

In the present context, the time series prediction problem involves predicting a future value $x(t+P)$ ($P$ being the prediction time step) based on a set of values of $x$ at certain times less than $t$. The standard method for this type of prediction is to create a mapping from $D$ points of the time series, $x(t), x(t-\Delta), \dots, x(t-(D-1)\Delta)$, spaced $\Delta$ apart, to predict a future value $x(t+P)$. To facilitate comparison with earlier work we use $D = 4$ and $\Delta = P = 6$. The goal then is to use a fuzzy neural model to construct a function $f$ as follows:

$$x(t+6) = f\left(x(t),\, x(t-6),\, x(t-12),\, x(t-18)\right). \qquad (61)$$

For the purpose of training and testing the generalization ability of the model, a data set was generated using the Runge–Kutta procedure applied to (60) with time step 0.1 and initial condition $x(0) = 1.2$. From the Mackey–Glass time series generated by the above procedure, we extracted 1000 input–output data pairs of the following format, for $t$ ranging from 118 to 1117:

$$\left[x(t-18),\, x(t-12),\, x(t-6),\, x(t);\; x(t+6)\right] \qquad (62)$$

where the first four values are the inputs to the system and the last value is the desired output. The first 500 pairs were used as training data, and the second 500 pairs were employed as the test set. Training involves sequential presentation of data pairs and standard application of the batch-mode gradient descent learning procedure.
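A data set of this kind can be reproduced by integrating (60) directly. The sketch below uses a fourth-order Runge–Kutta step with dt = 0.1 and x(0) = 1.2, holding the delayed term fixed within each step and assuming x(t) = 1.2 for t <= 0 (the history convention is our assumption; the paper does not state it):

```python
def mackey_glass(n_steps, tau=17.0, dt=0.1, x0=1.2):
    """Integrate dx/dt = 0.2*x(t-tau)/(1 + x(t-tau)**10) - 0.1*x(t)."""
    lag = int(round(tau / dt))
    xs = [x0] * (lag + 1)                  # xs[lag] corresponds to t = 0
    f = lambda x, xd: 0.2 * xd / (1.0 + xd ** 10) - 0.1 * x
    for _ in range(n_steps):
        x, xd = xs[-1], xs[-1 - lag]
        k1 = f(x, xd)
        k2 = f(x + 0.5 * dt * k1, xd)
        k3 = f(x + 0.5 * dt * k2, xd)
        k4 = f(x + dt * k3, xd)
        xs.append(x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)
    return xs, lag

xs, lag = mackey_glass(11300)
x = lambda t: xs[lag + 10 * t]             # sample at integer times t
pairs = [([x(t - 18), x(t - 12), x(t - 6), x(t)], x(t + 6))
         for t in range(118, 1118)]        # the 1000 pairs of (62)
```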

The number of free parameters that SuPFuNIS employs is straightforward to calculate: one spread for each numeric input, and a center and a spread for each antecedent and consequent connection of a rule. For the Mackey–Glass time series application SuPFuNIS employs a 4-$r$-1 network architecture, where $r$ is the number of rule nodes. Therefore, since each rule has four antecedents and one consequent, an $r$-rule SuPFuNIS system will have $10r + 4$ free parameters.

Experiments were conducted on SuPFuNIS to test its performance with and without data-driven initialization, steadily increasing the number of rules from three to ten. For the simulation results of Table I, the rule base was initialized in two ways:

1) randomizing weights (centers in the interval [0, 1.5] and spreads in the interval [0.2, 0.9]);

2) using fuzzy $c$-means (FCM) clustering in conjunction with the Xie–Beni cluster validity measure. Here, given $r$ rules, 1000 randomly generated sets of $r$ clusters each were evaluated using the Xie–Beni cluster validity measure. The best cluster set was then selected for initialization. Details of this procedure are provided in Appendix I.

TABLE I
rmse OF SuPFuNIS FOR MACKEY–GLASS TIME SERIES OBTAINED AFTER 500 EPOCHS FOR DIFFERENT RULE COUNTS

Fig. 5. Error-epoch trajectories for training of SuPFuNIS on Mackey–Glass time series data.

The training and testing root mean square errors (rmses) after 500 epochs of training, for different numbers of rules and for both randomized initial weight values and weight values initialized using the FCM procedure with the Xie–Beni index (see Appendix I), are shown in Table I. During training the learning rate and momentum were initialized to 0.1 and decayed linearly to 0.01 in 500 steps. Notice that the final results obtained after cluster-based initialization are better than those obtained by simple randomization, and this difference is more pronounced at lower rule counts. The difference, however, reduces as the number of rules increases, a result that is intuitively expected: at a higher rule count the system has many more rules to cover the data, and the initial placement of these rules is not as critical as it is when the rule count is low. Finally, as the number of rules increases the rmse decreases.

As is to be expected, if the SuPFuNIS model is trained for a larger number of epochs, a lower rmse is obtained at the cost of computation time. For example, after 5000 epochs using ten rules, the training rmse is 0.00370 and the testing rmse is 0.00374. These values are to be compared with those reported for 500 epochs in Table I.

An important aspect of the training behavior of SuPFuNIS is that most learning is complete during the first few tens of epochs. Error-epoch training plots for different numbers of rules with FCM-based initialization are shown in Fig. 5(a). It is clear that most of the learning is complete within 50 epochs, after


Fig. 6. Approximation performance of SuPFuNIS for Mackey–Glass time series using three and ten rules, respectively.

which the network goes through a fine-tuning phase. Fig. 5(b) compares error-epoch training plots for random initialization and for FCM-based initialization with the Xie–Beni index. Clearly, correct initialization not only improves the final performance but accelerates learning as well.

The approximation performance for the cases of three rules (34 parameters) and ten rules (104 parameters) after 500 epochs of training with FCM-based initialization is shown in Fig. 6(a) and (b), respectively. The plots show a zoomed portion, from data points 1 to 100 of the time series, so as to visualize the difference in approximation quality. Solid lines indicate the desired function, and dash-dot lines indicate the predicted function. The prediction error plots for three rules and ten rules are portrayed graphically in Fig. 6(c) and (d), respectively.

We now summarize the above observations.

1) As the number of rules increases, the approximation tends to improve. Notice from Table I that random initialization does not necessarily yield this improvement.

2) For low rule counts, considerably lower rmse values in fewer epochs are obtained when FCM is employed to initialize the weights, as against randomization. This indicates that initial knowledge can help improve the performance of the model.

3) The use of the Xie–Beni index with FCM for initialization can give gains in the speed of learning at the cost of increased preprocessing computation time. Observing the decreasing trend of the error-epoch trajectory in Fig. 5(b) justifies this statement. Specifically, for the case of ten rules, random initialization of fuzzy weights gives training and testing rmses after 30 epochs of 0.011608 and 0.011691, respectively, while initialization using FCM with the Xie–Beni index yields the values 0.009838 and 0.009792, respectively.

The Mackey–Glass time series prediction problem has been attempted with various classical models, neural networks, and fuzzy neural networks in the literature. A comparison of SuPFuNIS with a selection of these models is shown in Table II, based on the normalized root mean square error (Nrmse). The

TABLE II
COMPARISON OF SuPFuNIS NrmseS WITH OTHER MODELS FOR MACKEY–GLASS TIME SERIES APPLICATION. †: RESULTS ADAPTED FROM [52]

Nrmse is defined as the rmse divided by the standard deviation of the target series [13]. The other results in Table II are adapted for comparison from [9], [37], [52], [54], [55]. Both ANFIS [9] and GEFREX [37] outperform other models in terms of Nrmse. However, ANFIS has the drawback that its learned information is less interpretable, and the implementation of GEFREX is difficult. As can be seen, excluding GEFREX and ANFIS, SuPFuNIS performs the best, with an Nrmse of 0.016 with just ten rules or 104 parameters. By way of example, note that ANFIS employs 104 parameters, cascade-correlation learning uses 693 connections, the backpropagation-trained neural network uses 540 connections, and EPNet employs 103 (average) parameters. Clearly SuPFuNIS has a combination of architectural economy and high performance.

Importantly, as we discuss in Section V, SuPFuNIS also provides easy interpretability of learned information, since the rule base structure remains intact after learning completes. We report numeric values of rules for a ten-rule SuPFuNIS in Appendix II.

B. Iris Data Classification

Iris data involves classification of three subspecies of the Iris flower, namely Iris setosa, Iris versicolor, and Iris virginica, on the basis of four feature measurements of the Iris flower: sepal length, sepal width, petal length, and petal width [56]. There are 50 patterns (of four features) for each of the three subspecies of Iris flower. The input pattern set thus comprises 150 four-dimensional patterns. This data can be obtained from the UCI repository of machine learning databases at http://www.ics.uci.edu/~mlearn/MLRepository.html.

For this classification problem, SuPFuNIS employs a 4-$r$-3 network architecture: the input layer consists of four numeric nodes, the output layer comprises three class nodes, and there are $r$ rule nodes in the hidden layer. To train the network, initially the centers of antecedent weight fuzzy sets were randomized in the range of the minimum and maximum values of the respective input features of Iris data. Feature-wise, these ranges are (4.3, 7.9), (2.0, 4.4), (1.0, 6.9), and (0.1, 2.5). The centers of hidden-output weight fuzzy sets were randomized in the range (0, 1), and the spreads of all fuzzy weights and feature spreads were randomized in the range (0.2, 0.9). All 150 patterns of the Iris data were presented sequentially to the input layer of the network for training. The learning rate and momentum were both taken


TABLE III
NUMBER OF RESUBSTITUTION ERRORS FOR IRIS DATA FOR STANDARD ALGORITHMS WITH DIFFERENT NUMBERS OF PROTOTYPES/RULES. †: RESULTS ADAPTED FROM [58]

TABLE IV
BEST RESUBSTITUTION ACCURACY FOR IRIS DATA FOR DIFFERENT SOFT COMPUTING ALGORITHMS

as 0.0001 and kept constant during the training period. Once the network was trained, the test patterns (which again comprised all 150 patterns of Iris data) were presented to the trained network and the resubstitution error computed.

Simulation experiments were conducted with different numbers of rule nodes to illustrate the performance of the classifier with a variation in the number of rules. Notice that for $r$ rules, the number of connections in the 4-$r$-3 architecture for Iris data will be $7r$. Once again, since the representation of a fuzzy weight requires two parameters (center and spread), the total number of free parameters to be trained will be $14r$, plus the four tunable input feature spreads.

Attempts to solve the same problem using other techniques, like genetic algorithms (GA), learning vector quantization (LVQ) and its family of generalized fuzzy algorithms (GLVQ-F) [57], [58], and random search (RS), have been reported in the literature. Table III compares the resubstitution error with these techniques. The results obtained from GA, RS, LVQ, and GLVQ-F have been adapted from [58] for the purpose of comparison. In the GA and random search techniques, two resubstitution errors for three prototypes are reported. For four prototypes the GA performed poorly with four errors, in comparison to two errors in random search. Results from Table III show that the SuPFuNIS model has only one resubstitution error for three and four rules, and zero resubstitution errors for any number of rules greater than or equal to five. Note that for three rules, 46 parameters require specification in our model. The 84th pattern (6.0 2.7 5.1 1.6), which belongs to Iris versicolor, is misclassified as belonging to Iris virginica.

In Table IV, SuPFuNIS is compared with other soft computing models [16], [35], [36], [59], [60] in terms of the number of rules and % resubstitution accuracy. The performance of the SuPFuNIS model is better than all the other techniques and is on par with FuGeNeSys [36]. The fuzzy weights of the trained network with five rules that produces zero resubstitution errors are illustrated in the scatter plot of Iris data in Fig. 7, and the corresponding numeric values of weight parameters are given in Appendix II.

Apart from the above techniques, we mention that the Iris classification problem has also been solved using a multilayer perceptron (MLP) with a 4-6-6-3 architecture (four nodes in the input layer and three nodes in the output layer, with two hidden layers consisting of six nodes each) [4]. The MLP can achieve a zero resubstitution error with 93 connections (parameters). Zero resubstitution error was obtained with SuPFuNIS using five rules, or 74 free parameters. The SuPFuNIS model clearly performs very well as a classifier.

C. Medical Diagnosis

The next benchmark problem deals with hepatitis diagnosis, which requires classifying patients into two classes, Die or Live, on the basis of features which are both numeric and linguistic (symbolic). The data can be obtained from http://www.ics.uci.edu/~mlearn/MLRepository.html. The purpose of including this example is to show how easily SuPFuNIS can handle both numeric and symbolic data. In addition, we show that SuPFuNIS is robust against variations in training data. The hepatitis data set has 155 patterns of 19 input features, with a number of missing values. An example pattern having all 19 feature values defined, of class Live, is given in Table V. There are six numeric features, namely Age, Bilirubin, Alk Phosphate, SGOT, Albumin, and Protime; the remaining 13 features are linguistic in nature.

As there is a substantial amount of missing data, preprocessing of the data is required. The data set contains 75 patterns that have one or more features unspecified. A new set of data was formed by filling in some of the missing numeric values. Twenty patterns which had either a missing symbolic feature value, or more than two missing numeric feature values, were first discarded. The missing numeric values in the remaining 55 incomplete cases were filled with the average value of the missing feature, calculated on a class-wise basis from the 80 originally complete data [61]. In this way we were able to reconstruct a data set of 135 patterns. The numeric features of these 135 patterns were normalized feature-wise in the range [0, 1]. Symbolic features (yes/no or male/female) were represented by constructing two fuzzy sets: the symbolic value "no" represented by a fuzzy set with a Gaussian membership function having center zero and spread 0.5, and "yes" represented by a Gaussian membership function centered at one with spread 0.5. The spreads were assumed to be trainable during the learning procedure.
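In code, this encoding simply maps each symbolic value to a Gaussian input pair, exactly like a numeric feature fuzzifier (a minimal sketch; the function name is ours):

```python
def encode_symbolic(value, spread=0.5):
    """'no' -> (0.0, spread), 'yes' -> (1.0, spread); the spread remains a
    trainable parameter, just as for numeric feature fuzzifiers."""
    center = {"no": 0.0, "yes": 1.0}[value.lower()]
    return (center, spread)

print(encode_symbolic("yes"))   # (1.0, 0.5)
```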

Experiments were conducted using two data sets: Data Set 1, comprising only the 80 of 155 patterns that were originally complete in all respects; and Data Set 2, comprising 135 patterns (80 originally complete and 55 reconstructed). For training, 70% of the patterns were randomly chosen, and the remaining 30% were used for testing. Five such 70% (train)–30% (test) combinations were randomly generated separately for Data Set 1 and Data Set 2. Experiments were then conducted on each of these individual data set combinations using a 19-3-2 SuPFuNIS architecture. During training both the learning rate and momentum were kept constant at 0.0001. These results are reported in Table VI. In all the experiments SuPFuNIS has a


Fig. 7. SuPFuNIS rule patches for Iris data for the case of five rules.

TABLE V
A CASE FROM THE HEPATITIS DATA

high classification accuracy, ranging from 87.5% to 100%, with only three rules. In addition to its architectural economy, this experiment demonstrates an important aspect of the model: it is robust against random variations in data sets.

Table VII shows the average classification accuracy obtained using SuPFuNIS for both data sets, compared with other approaches like CN2 [63], Bayes [64], Assistant-86 [64], k-NN, LVQ, the multilayer perceptron, and the Wang and Tseng GA-based approach [23] to solving the same problem. The results of the other approaches are adapted from [23] and [62].

Once again, we stress the economy of the number of rules that is able to yield a high classification accuracy with SuPFuNIS. At the same time, we have shown the model to be robust against data set variations. Above all, mutual subsethood products allow seamless integration of numeric and linguistic information, while being amenable to gradient descent learning. Although in this example linguistic inputs were only of the yes/no type, truly graded linguistic inputs would only prove the worth of the model to a greater extent.

D. Function Approximation

A single-input function, given in (63) and frequently used in the literature [36], [37], [65], [66] to test the learning capacities of proposed models, was used to test the performance of SuPFuNIS:

$$y(x) = 0.2 + 0.8\left(x + 0.7\sin(2\pi x)\right), \qquad 0 \le x \le 1. \qquad (63)$$

Twenty-one training patterns were generated at intervals of 0.05. Thus, the training patterns are of the form $(x, y(x))$, $x = 0, 0.05, \dots, 1$.

TABLE VI
TESTING ACCURACY IN % USING THREE RULES FOR HEPATITIS DATA

TABLE VII
COMPARISON OF SuPFuNIS WITH OTHER METHODS FOR HEPATITIS DATA

The evaluation was done using 101 test data taken at intervals of 0.01. For the purpose of comparison, the performance indexes $J_1$ and $J_2$ as defined in [65] were also used in this paper. Table VIII compares the test accuracy performance index for different models, along with the number of rules and tunable parameters used in achieving it. With three rules SuPFuNIS obtained $J_1$ and $J_2$ values better than all the others [36], [65], [66] except GEFREX [37]. With five rules SuPFuNIS obtained $J_1$ and $J_2$ values comparable to those of GEFREX.
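Assuming the functional form reconstructed in (63), the 21 training points and 101 test points can be generated as follows (a sketch):

```python
import math

def narazaki_ralescu(x):
    # y(x) = 0.2 + 0.8 * (x + 0.7 * sin(2*pi*x)); this form of (63) is
    # taken from the cited literature, not from the garbled original text.
    return 0.2 + 0.8 * (x + 0.7 * math.sin(2.0 * math.pi * x))

train = [(i * 0.05, narazaki_ralescu(i * 0.05)) for i in range(21)]
test = [(i * 0.01, narazaki_ralescu(i * 0.01)) for i in range(101)]
```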


TABLE VIII
COMPARISON OF SuPFuNIS WITH OTHER METHODS FOR NARAZAKI–RALESCU'S FUNCTION

Fig. 8. Fuzzy interpretation sets.

V. INFERRING RULES FROM A TRAINED NETWORK

In this section we expand upon the ease of interpretation of the knowledge embedded in SuPFuNIS, which we demonstrate for the case of the Mackey–Glass time series prediction problem. A similar approach can be applied to other applications as well.

The trained rules obtained after 500 epochs from the experiment using ten rules with FCM-based initialization are shown in Fig. 9, and their numeric values are given in Appendix II. To interpret these rules, we consider a fuzzy interpretation set which provides an exhaustive linguistic interpretation of the interval representing a UOD. In the present example, linguistic labels of a fuzzy interpretation set are represented by normalized symmetric Gaussian membership functions with identical spreads and centers fixed at equal intervals. Exemplar fuzzy interpretation sets of three and five linguistic labels defined on a UOD of [0, 1.5] are shown in Fig. 8.

In order to interpret a trained rule in terms of linguistic labels of a selected fuzzy interpretation set, the fuzzy subsethood is measured between each antecedent set and every fuzzy set of the fuzzy interpretation set, and between the consequent set and every fuzzy set of the fuzzy interpretation set. A rule antecedent or consequent is then associated with the label of the fuzzy interpretation set for which the maximum fuzzy subsethood measure is obtained. With three linguistic labels SMALL (S), MEDIUM (M), and LARGE (L) in a fuzzy interpretation set [Fig. 8(a)], the if–then rules generated from the ten-rule model (obtained after 500 epochs of training, as shown in Fig. 9) are summarized in Table IX. It can be observed from Table IX that rules 1, 2, 4, 5, and 10 are identical. Thus, instead of ten rules, the system can be represented broadly by six rules using three linguistic labels.

However, observe that at this level of set granularity, rules 1, 2, 4, 5, and 10 are inconsistent with rule 3. This calls for an increase in the number of sets in the fuzzy interpretation set employed for interpretation. Specifically, if the rule interpretation procedure is carried out using five linguistic labels [Fig. 8(b)], namely VERY SMALL (VS), SMALL (S), MEDIUM (M), LARGE (L), and VERY LARGE (VL), defined on the same UOD [0, 1.5], the system can be represented by ten distinct rules, as shown in Table X. Note that the aforesaid inconsistency is now eliminated. The minimum level of fuzzy interpretation set granularity at which no inconsistencies exist is the appropriate level at which to interpret the embedded knowledge of SuPFuNIS. Thus, in SuPFuNIS, rule interpretation and pruning are facilitated in a straightforward fashion by employing fuzzy subsethood in conjunction with a fuzzy interpretation set with a specified number of labels.
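The labeling step sketched below computes, for a trained Gaussian set, the fuzzy subsethood $C(A \cap B)/C(A)$ against every label of an interpretation set and keeps the best match (helper names and the label spread of 0.25 are our assumptions; the paper fixes identical label spreads but does not quote this value):

```python
import math

def gaussian(x, c, s):
    return math.exp(-((x - c) / s) ** 2)

def subsethood(c1, s1, c2, s2, lo=-2.0, hi=4.0, n=20000):
    """Fuzzy subsethood S(A, B) = C(A n B) / C(A), by numeric integration."""
    dx = (hi - lo) / n
    inter = sum(min(gaussian(lo + i * dx, c1, s1),
                    gaussian(lo + i * dx, c2, s2)) for i in range(n)) * dx
    return inter / (math.sqrt(math.pi) * s1)

def label_of(trained_set, labels):
    """Associate a trained (center, spread) with its best-matching label."""
    return max(labels, key=lambda name: subsethood(*trained_set, *labels[name]))

# Three labels on the UOD [0, 1.5], centers equally spaced.
labels = {"S": (0.0, 0.25), "M": (0.75, 0.25), "L": (1.5, 0.25)}
print(label_of((0.7, 0.2), labels))   # -> 'M'
```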

VI. AUGMENTING SuPFuNIS WITH EXPERT KNOWLEDGE

Finally, we show that the SuPFuNIS model is also suitable in situations where a small set of numeric data is to be augmented by expert linguistic knowledge. This is demonstrated on the truck backer-upper control problem. By employing this application example we also show that the model can be applied to a control problem with excellent results.

The problem at hand deals with backing up a truck to a loading dock. The truck corresponds to the cab part of the truck in the Nguyen–Widrow neural truck backer-upper system [67]. The truck position is exactly determined by three state variables $\phi$, $x$, and $y$, where $\phi$ is the angle of the truck with the horizontal and $(x, y)$ are the coordinates in the plane, as depicted in Fig. 10. The control of the truck is the steering angle $\theta$. The truck moves backward by a fixed unit distance every stage. We also assume enough clearance between the truck and the loading dock such that the coordinate $y$ does not have to be considered as an input. (For validation of this assumption refer to [68].) We design a control system whose inputs are $\phi(t)$ and $x(t)$


Fig. 9. Plots of antecedent and consequent sets for the Mackey–Glass time series for the ten-rule SuPFuNIS. Numeric values are given in Appendix II.

TABLE IX
FUZZY RULES GENERATED WITH THREE LABELS FOR THE MACKEY–GLASS TIME SERIES

TABLE X
FUZZY RULES GENERATED WITH FIVE LABELS FOR THE MACKEY–GLASS TIME SERIES


Fig. 10. Diagram of simulated truck and loading zone.


Fig. 11. Truck trajectories from three testing points: (a) using three rules and (b) using five rules, obtained from the complete set of numeric data.

, and whose output is $\theta(t)$, such that the final state will be $(x, \phi) = (10, 90°)$.

The following kinematic equations are used to simulate the control system [20]:

$$x(t+1) = x(t) + \cos[\phi(t) + \theta(t)] + \sin[\theta(t)]\sin[\phi(t)] \qquad (64)$$

$$y(t+1) = y(t) + \sin[\phi(t) + \theta(t)] - \sin[\theta(t)]\cos[\phi(t)] \qquad (65)$$

$$\phi(t+1) = \phi(t) - \sin^{-1}\left[\frac{2\sin(\theta(t))}{b}\right] \qquad (66)$$

where $b$ is the length of the truck, assumed to be four in the present simulation.
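One backing-up stage under (64)–(66), as reconstructed above, can then be simulated as below (angles handled in degrees, b = 4; the starting y value in the example is arbitrary):

```python
import math

def truck_step(x, y, phi_deg, theta_deg, b=4.0):
    """One backward stage of (64)-(66); phi is the truck angle and theta
    the steering angle."""
    phi, theta = math.radians(phi_deg), math.radians(theta_deg)
    x_new = x + math.cos(phi + theta) + math.sin(theta) * math.sin(phi)
    y_new = y + math.sin(phi + theta) - math.sin(theta) * math.cos(phi)
    phi_new_deg = phi_deg - math.degrees(math.asin(2.0 * math.sin(theta) / b))
    return x_new, y_new, phi_new_deg

# From the test state (x, phi) = (3, 30 deg), with a 10 deg steering angle.
print(truck_step(3.0, 10.0, 30.0, 10.0))
```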

We used a normalized variant of the docking error [which essentially measures the Euclidean distance from the actual final position $(x, \phi)$ to the desired final position $(10, 90°)$], as well as the trajectory error (the ratio of the actual length of the trajectory to the straight-line distance from the initial point to the loading dock) as performance measures (derived from [68]):

Normalized Docking Error (67)

TABLE XI
DOCKING ERRORS WHEN ONLY NUMERIC DATA IS USED

$$\text{Trajectory Error} = \frac{\text{length of truck trajectory}}{\text{distance(initial position, desired final position)}}. \qquad (68)$$

A. Simulation Results Using Only Numeric Data

The training data (adapted from [20]) comprise 238 pairs which are accumulated from 14 sequences of desired $(x, \phi; \theta)$ values. The data were linearly normalized to the range [0, 1] and used to train SuPFuNIS for different numbers of rules. The learning rate and momentum were kept at 0.0001 throughout the training period. The number of free parameters for this application is $6r + 2$: each of the $r$ rules has two antecedents and one consequent, contributing $6r$ centers and spreads, plus one tunable spread for each of the two numeric inputs. Three initial states, $(x, \phi) = (3, 30°)$, $(10, 220°)$, and $(13, 30°)$, were used to test the performance of the controller. The



Fig. 12. Fuzzy sets for linguistic labels of x, φ, and θ.

Fig. 13. Truck trajectories from three test points: (a) using five rules obtained from 42 numeric data; (b) using five rules obtained from reduced numeric data and five linguistic rules; (c) using five rules obtained from reduced numeric data and nine linguistic rules.

TABLE XII: THE FUZZY SET LABELS FOR THE MEMBERSHIP FUNCTIONS OF FIG. 12

The docking errors for the three test points, for three and for five rules, are reported in Table XI. The results show that SuPFuNIS is able to perform very well (high docking accuracy) with just five rules. This should be compared with Kosko and Kong's fuzzy controller for backing up the truck to the dock, which uses 35 linguistic rules [68], and with the Wang–Mendel controller [20], which uses 27 rules that are either linguistic or a mixture of linguistic rules and rules obtained from numeric data. The truck trajectories from the three initial states are shown in Fig. 11.

B. Simulation Results Using Numeric and Linguistic Data

Next we trained SuPFuNIS with 42 data pairs obtained by considering the first three pairs of data from each of the 14 sequences. These pairs train the system for initial path control of the truck. The finer control of the trajectory toward the dock was implemented using nine linguistic rules constructed from expert knowledge [20]. In the present simulation the controller thus consists of five rules obtained by learning from numeric data plus nine linguistic rules. The nine linguistic rules are derived from [20] by suitably modifying the membership functions to be Gaussian:

Rule 1: If x is … and φ is VE, then θ is ZE.
Rule 2: If x is … and φ is LV, then θ is PM.
Rule 3: If x is … and φ is RV, then θ is NM.




Rule 4: If x is … and φ is VE, then θ is PM.
Rule 5: If x is … and φ is VE, then θ is NM.
Rule 6: If x is … and φ is RV, then θ is NS.
Rule 7: If x is … and φ is RV, then θ is NB.
Rule 8: If x is … and φ is LV, then θ is PB.
Rule 9: If x is … and φ is LV, then θ is PS.

The linguistic labels are defined in Table XII and are represented by fuzzy sets with Gaussian membership functions, as shown in Fig. 12. Truck trajectory simulation results for the 14-rule hybrid controller are reported in Table XIII, and the truck trajectories shown in Fig. 13 illustrate the effect of the linguistic rules.

TABLE XIII: DOCKING ERRORS FOR FIVE NUMERIC RULES (OBTAINED FROM 42 NUMERIC PAIRS) AND NINE LINGUISTIC RULES
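A linguistic rule of this kind enters the network simply as a set of Gaussian (center, spread) pairs, one per antecedent and consequent connection. The sketch below shows this encoding; every numeric value in it is a hypothetical placeholder, not a membership function from Table XII.

# Each label is one Gaussian fuzzy set: (center, spread).
# All values are illustrative only (angles in degrees).
PHI_LABELS = {"VE": (90.0, 15.0), "LV": (135.0, 15.0), "RV": (45.0, 15.0)}
THETA_LABELS = {"ZE": (0.0, 5.0), "PM": (15.0, 5.0), "NM": (-15.0, 5.0),
                "PS": (7.5, 5.0), "NS": (-7.5, 5.0),
                "PB": (30.0, 5.0), "NB": (-30.0, 5.0)}

def encode_rule(phi_label, theta_label):
    # Returns the (center, spread) pairs that initialize the fuzzy
    # weights fanning into and out of one rule node
    return PHI_LABELS[phi_label], THETA_LABELS[theta_label]

# Rule 1 of the text (phi is VE -> theta is ZE), x-antecedent omitted:
rule1_weights = encode_rule("VE", "ZE")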

Clearly, SuPFuNIS is able to successfully generate low-error trajectories from each of the initial test points. From Table XIII we observe that the overall average normalized docking errors are lower with the incorporation of expert knowledge than in the case of five rules trained directly on the entire numeric data. In addition, we were able to incorporate the expert knowledge easily and seamlessly into the network. Once again, notice the economy of the rule base; this kind of economy has been consistently observed in all the applications presented.

VII. CONCLUSION

In this paper we proposed SuPFuNIS, which employs a novel combination of tunable feature fuzzifiers that convert numeric inputs to Gaussian fuzzy sets; mutual subsethood-based activation spread; a fuzzy inner product conjunction operator; and a volume defuzzification technique. SuPFuNIS embeds rule-based knowledge directly into its architecture. This facilitates not only easy data-driven cluster-based initialization of the network, but also the read-back of rules from a trained network. The mutual subsethood measures the similarity between a fuzzy input and a fuzzy weight to decide the extent of activation transfer from input nodes to rule nodes. The extent of rule firing is computed by a product operator, which lends good discriminatory power to the model. The network generates outputs using a volume defuzzification technique. Gradient descent learning is used to adjust the centers and spreads of the fuzzy weights of the network, as well as the spreads of the input fuzzifiers.

The application potential of SuPFuNIS is demonstrated on various benchmark problems, each bringing out different strengths of the model. In the Mackey–Glass time series prediction problem, high performance is achieved with an economical network architecture. In addition, the significance of data-driven FCM cluster-based weight initialization is justified by simulation results. The Iris data classification problem highlights the network economy further: SuPFuNIS achieves a zero resubstitution error with a mere five rules. In the hepatitis medical diagnosis problem we demonstrate the ease with which both numeric and linguistic input features can be seamlessly integrated by the network to achieve high performance. This application also highlights the robustness of SuPFuNIS against data set variations. SuPFuNIS also compares well with other models on a function approximation application. Finally, in the truck backer-upper control problem we not only show the capability of SuPFuNIS to deal with control problems, but also demonstrate how expert rule-based knowledge can be easily integrated into networks that are trained on partial numeric data. The paper compares the performance of the model with various other classical and soft-computing techniques.

The mutual subsethood measure used for activation spread in the network also provides a natural measure for identifying the minimal number of rules that can help characterize the knowledge embedded in a trained network. This makes the interpretation of embedded knowledge quite straightforward, as demonstrated for the Mackey–Glass time series prediction problem.

We reiterate that the major strengths of the model are its consistently high performance on a wide variety of applications, its economy of parameters, fast learning, ease of integration of expert knowledge, and transparency of fine-tuned knowledge.

However, the model suffers from certain drawbacks. These include the use of a heuristic approach to select the number of rule nodes needed to solve a particular problem. Also, in the present version of the model, rule formats that use disjunctions of conjunctive antecedents cannot be accommodated. These limitations are presently being investigated, and the network is currently being extended to a genetic-algorithm-based evolvable SuPFuNIS. This will be reported as part of future work.

APPENDIX I
INITIALIZATION OF RULE BASE

One of the methods to extract initial knowledge from the training data set is to cluster the data using a clustering technique [11], [16], [39], [40]. Cluster-based initialization is known to improve the rate of learning as well as the performance of the model. The number of clusters decides the number of rules. If the clustering is done in the input–output cross space, then the centroids and boundaries of the clusters can be employed to initialize the values of the centers and spreads of the fuzzy



weights that fan in and out of a rule node. In this paper we employ the fuzzy c-means (FCM) clustering algorithm [69] in conjunction with the Xie–Beni index, to cluster the given data and to choose the best cluster structure, respectively.

FCM Clustering: In clustering using FCM, the following objective function is minimized:

$$J_m(U, V) = \sum_{i=1}^{c}\sum_{k=1}^{n}(u_{ik})^m \,\lVert x_k - v_i\rVert^2 \quad (69)$$

where $\{x_1, \ldots, x_n\}$ represents the training data, $\{v_1, \ldots, v_c\}$ represents the cluster centroids, and $U = [u_{ik}]$ is a $c \times n$ matrix, where $u_k$ denotes the $k$th column of $U$ and $u_{ik}$ is the membership of data point $x_k$ in cluster $i$. $v_i$ and $u_{ik}$ are given by

$$v_i = \frac{\sum_{k=1}^{n}(u_{ik})^m x_k}{\sum_{k=1}^{n}(u_{ik})^m} \quad (70)$$

$$u_{ik} = \left[\sum_{j=1}^{c}\left(\frac{\lVert x_k - v_i\rVert}{\lVert x_k - v_j\rVert}\right)^{2/(m-1)}\right]^{-1} \quad (71)$$

where $m \in (1, \infty)$ is a constant, taken here with a value of 2. The approximate solution of (69) is obtained by an alternating optimization (AO) approach in which $V$ is initialized and $U$ is calculated from (71). The new value of $U$ so obtained is used to calculate $V$ from (70). $U$ and $V$ are alternately updated and the process is iterated till a termination criterion is met.
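A minimal NumPy sketch of this alternating optimization, with m = 2 as in the text, is given below. The termination test on centroid movement is an assumption, since the exact criterion is not specified here.

import numpy as np

def fcm(X, c, m=2.0, tol=1e-5, max_iter=300, seed=None):
    # Fuzzy c-means by alternating optimization of (69)-(71)
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]  # initial centroids
    for _ in range(max_iter):
        # d[k, i] = ||x_k - v_i||, guarded against division by zero
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        # memberships u[k, i] from (71)
        U = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))).sum(axis=2)
        # centroid update from (70)
        W = U ** m
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]
        if np.linalg.norm(V_new - V) < tol:
            V = V_new
            break
        V = V_new
    return U, V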

Xie–Beni Cluster Validity Index: Given the number of clusters desired, an issue that arises is how to ensure that the cluster structure used for initialization is good enough. Since FCM itself depends on the initial values of the cluster centers $V$, we employ a cluster validity index to identify the optimal clusters from among those obtained using different initial values. The Xie–Beni index [70] is one such cluster validity measure. A cluster structure having the least value of this index is considered the best, and is used to initialize the free parameters (centers and spreads) of SuPFuNIS. The Xie–Beni index $S$ is computed as follows:

$$S = \frac{\sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^2\,\lVert x_k - v_i\rVert^2}{n \cdot \min_{i \neq j}\lVert v_i - v_j\rVert^2} \quad (72)$$

Since the best cluster structure is obtained for the least value of this index, for an optimal clustering the numerator of (72) (which indicates the compactness of the clusters) should be as small as possible, while the denominator of (72) (which measures the separation between clusters) should be as large as possible. The index $S$ thus searches for compact and separable clusters.

The way initialization is performed is:

1) run FCM on a set of randomly chosen centroids;
2) compute the Xie–Beni index for the resulting cluster structure;
3) repeat 1) and 2) a large number of times (say 1000);
4) select the cluster structure with the minimum Xie–Beni index.

In this way, FCM coupled with the cluster validity index provides a convenient and near-optimal initialization of the centers of the fuzzy weights of SuPFuNIS, as sketched below.
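The recipe reads directly as code. This sketch reuses the fcm function from the previous sketch and keeps the partition with the smallest Xie–Beni index over the random restarts; the 1000 restarts follow the text.

import numpy as np

def xie_beni(X, U, V):
    # Numerator of (72): compactness; denominator: separation
    d2 = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) ** 2
    compactness = ((U ** 2) * d2).sum()
    separation = min(np.linalg.norm(V[i] - V[j]) ** 2
                     for i in range(len(V)) for j in range(len(V)) if i != j)
    return compactness / (len(X) * separation)

def best_clustering(X, c, restarts=1000):
    best_s, best_UV = np.inf, None
    for seed in range(restarts):
        U, V = fcm(X, c, seed=seed)   # fcm as sketched above
        s = xie_beni(X, U, V)
        if s < best_s:
            best_s, best_UV = s, (U, V)
    return best_UV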

Projection of Cluster Centers and Boundaries: Given a set of clusters of the training data, we next derive the initial rule base. We do so in two steps.

First, since the FCM procedure outlined above yields cluster centroids, these are used to directly initialize the centers of the Gaussian membership functions defined on the input and output universes of discourse. Second, to derive the spreads of each of the sets, we use the covariance matrices of the individual clusters, since the covariance matrix of a data cluster defines an ellipsoidal patch centered at the centroid [21, ch. 5], which in our case is obtained from the FCM algorithm. An ellipsoid corresponding to the jth cluster is defined by

$$(x - v_j)^T K_j^{-1} (x - v_j) = \gamma^2 \quad (73)$$

where $\gamma$ is a positive real number, $v_j$ is the center of the $j$th ellipsoid, and $K_j$ is the covariance matrix of the $j$th cluster. $K_j$ is computed from the data points that belong to the cluster as follows:

$$K_j = E\!\left[(x - v_j)(x - v_j)^T\right] \quad (74)$$

where $E$ is the standard expectation operator. In the present case, $K_j$ is computed by averaging over the data points that are members of the $j$th cluster.

For ease of calculation, the ellipsoids described by (73) are inscribed in rectangles [21], which are projected onto the axes of the input–output space to derive the fuzzy sets. Although this means that some correlation information is lost, we believe this does not present a very serious problem, since each of the spreads is subsequently fine-tuned during training.

The projected length $l_{ij}$ of the $j$th rectangle onto the $i$th dimension is defined as

$$l_{ij} = 2\gamma\sqrt{\sum_{k}\lambda_k\cos^2\theta_{ki}} \quad (75)$$

where $\lambda_k$ are the eigenvalues of $K_j$ and $\theta_{ki}$ is the angle between the $k$th eigenvector and the $i$th dimension for the $j$th ellipsoid. A triangular fuzzy set can thus be generated having unity height and $l_{ij}$ as base length. The area of this triangular fuzzy set is $l_{ij}/2$. The area of the Gaussian fuzzy sets used in SuPFuNIS is $\sigma\sqrt{2\pi}$, where $\sigma$ is the spread. Therefore, if we consider a triangular fuzzy set and a Gaussian fuzzy set having equal areas, the resulting spread is

$$\sigma_{ij} = \frac{l_{ij}}{2\sqrt{2\pi}} \quad (76)$$

Thus, given a cluster centroid $v_j$, the centers and spreads of all weights that fan in and out of the $j$th rule node are initialized.
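Under the reconstruction of (75) given above, the spread initialization for one cluster can be sketched as follows. Note that the sum over eigenvalues weighted by squared direction cosines is simply the ith diagonal entry of K_j, which the eigendecomposition makes explicit.

import numpy as np

def init_spreads(K, gamma):
    # K: covariance matrix of one cluster; returns one Gaussian spread
    # per input-output dimension
    lam, vecs = np.linalg.eigh(K)     # eigenvalues and eigenvectors of K
    cos2 = vecs ** 2                  # cos2[i, k] = cos^2(theta_ki)
    l = 2.0 * gamma * np.sqrt(cos2 @ lam)    # projected lengths, eq. (75)
    return l / (2.0 * np.sqrt(2.0 * np.pi))  # equal-area spreads, eq. (76)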

Note that the overlap of the projected fuzzy sets is proportional to γ. In the simulations presented in the text we used a value of γ that gives an overlap ranging from 30–45% for cluster counts ranging from three to ten for the Mackey–Glass time series data. Note that this middle-of-the-ground value is chosen solely for the purpose of initialization; supervised learning will finally tune the Gaussian set spreads. In a similar fashion, the value of γ for a data set in general can be selected in a way that ensures an overlap of approximately 30–40% averaged over all input features. This computation is straightforward and is facilitated by the subsethood measure.

APPENDIX II

TABLE XIV: THE FINAL VALUES OF FUZZY WEIGHTS w AND v FOR THE MACKEY–GLASS TIME SERIES APPLICATION USING 10 RULES, PLOTTED IN FIG. 9
TABLE XV: THE FINAL VALUES OF THE TRAINED FEATURE SPREADS OF THE INPUT FUZZIFIER FOR THE MACKEY–GLASS TIME SERIES APPLICATION FOR THE CASE OF TEN RULES AND 500 EPOCHS
TABLE XVI: THE FINAL VALUES OF FUZZY WEIGHTS w AND v FOR IRIS DATA CLASSIFICATION
TABLE XVII: THE FINAL VALUES OF THE TRAINED FEATURE SPREADS OF THE INPUT FUZZIFIER FOR IRIS DATA CLASSIFICATION

ACKNOWLEDGMENT

The authors wish to thank the Editor, Associate Editor, and referees for their detailed comments, suggestions, and encouragement that helped strengthen this paper and mould it into the present form.

REFERENCES

[1] C. T. Lin and C. S. G. Lee, Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems. Upper Saddle River, NJ: Prentice-Hall, 1996.
[2] S. Pal and S. Mitra, Neuro-Fuzzy Pattern Recognition: Methods in Soft Computing. Wiley, 1999.
[3] J. Buckley and T. Feuring, "Fuzzy and neural: Interactions and applications," in Studies in Fuzziness and Soft Computing. Heidelberg, Germany: Physica-Verlag, 1999.
[4] J. C. Bezdek, J. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and Image Processing. Boston, MA: Kluwer, 1999.
[5] S. Mitra and Y. Hayashi, "Neuro-fuzzy rule generation: Survey in soft computing framework," IEEE Trans. Neural Networks, vol. 11, pp. 748–768, May 2000.
[6] H. Takagi and I. Hayashi, "Artificial neural network driven fuzzy reasoning," Int. J. Approximate Reasoning, vol. 5, pp. 191–212, 1991.
[7] C. T. Lin and C. S. G. Lee, "Neural-network-based fuzzy logic control and decision system," IEEE Trans. Comput., vol. 40, pp. 1320–1336, Dec. 1991.
[8] H. Berenji and P. Khedkar, "Learning and tuning fuzzy logic controllers through reinforcements," IEEE Trans. Neural Networks, vol. 3, pp. 724–740, 1992.
[9] J.-S. R. Jang, "ANFIS: Adaptive-network-based fuzzy inference system," IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 665–685, May 1993.
[10] S. Mitra and S. Pal, "Fuzzy multi-layer perceptron, inferencing and rule generation," IEEE Trans. Neural Networks, vol. 6, pp. 51–63, Jan. 1995.
[11] J. Chen and Y. Xi, "Nonlinear system modeling by competitive learning and adaptive fuzzy inference system," IEEE Trans. Syst., Man, Cybern., vol. 28, pp. 231–238, May 1998.
[12] C. Juang and C. Lin, "An on-line self-constructing neural fuzzy inference network and its applications," IEEE Trans. Fuzzy Syst., vol. 6, pp. 12–32, 1998.
[13] J. Kim and N. Kasabov, "HyFIS: Adaptive neuro-fuzzy inference systems and their application to nonlinear dynamical systems," Neural Networks, vol. 12, no. 9, pp. 1301–1321, 1999.
[14] D. Nauck and R. Kruse, "Obtaining interpretable fuzzy classification rules from data," Artificial Intell. Med., vol. 16, no. 2, pp. 149–169, 1999.
[15] A. Wu and P. K. S. Tam, "A fuzzy neural network based on fuzzy hierarchy error approach," IEEE Trans. Fuzzy Syst., vol. 8, pp. 808–816, Dec. 2000.
[16] D. Nauck and R. Kruse, "A neuro-fuzzy method to learn fuzzy classification rules from data," Fuzzy Sets Syst., vol. 89, pp. 277–288, 1997.



[17] L. Cai and H. Kwan, "Fuzzy classifications using fuzzy inference networks," IEEE Trans. Syst., Man, Cybern. B, vol. 28, pp. 334–347, June 1998.
[18] H. Ishibuchi, K. Nozaki, N. Yamamoto, and H. Tanaka, "Selecting fuzzy if–then rules for classification problems using genetic algorithms," IEEE Trans. Fuzzy Syst., vol. 3, pp. 260–270, Aug. 1995.
[19] S. Mitra and S. Pal, "Logical operation based MLP for classification and rule generation," Neural Networks, vol. 7, no. 2, pp. 353–373, 1994.
[20] L. X. Wang and J. M. Mendel, "Generating fuzzy rules from numerical data, with applications," Univ. Southern California, Los Angeles, Tech. Rep. 169, USC SIPI, Jan. 1991.
[21] B. Kosko, Fuzzy Engineering. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[22] D. Nauck and R. Kruse, "A neuro-fuzzy approach to obtain interpretable fuzzy systems for function approximation," in Proc. IEEE Int. Conf. Fuzzy Syst. (FUZZ-IEEE'98), May 1998, pp. 1106–1111.
[23] C.-H. Wang, T.-P. Hong, and S.-S. Tseng, "Integrating fuzzy knowledge by genetic algorithms," IEEE Trans. Evol. Comput., vol. 2, pp. 138–148, Nov. 1998.
[24] C. Chao, Y. Chen, and C. Teng, "Simplification of fuzzy-neural systems using similarity analysis," IEEE Trans. Syst., Man, Cybern. B, vol. 26, pp. 344–354, Apr. 1996.
[25] N. R. Pal and T. Pal, "On rule pruning using fuzzy neural networks," Fuzzy Sets Syst., vol. 106, pp. 335–347, 1999.
[26] Y. Jin, "Fuzzy modeling of high-dimensional systems: Complexity reduction and interpretability improvement," IEEE Trans. Fuzzy Syst., vol. 8, pp. 212–221, Apr. 2000.
[27] S. Pal and S. Mitra, "Multilayer perceptron, fuzzy sets, and classification," IEEE Trans. Neural Networks, vol. 3, pp. 683–697, Sept. 1992.
[28] S. Mitra, R. K. De, and S. K. Pal, "Knowledge-based fuzzy MLP for classification and rule generation," IEEE Trans. Neural Networks, vol. 8, pp. 1338–1350, Nov. 1997.
[29] H. Ishibuchi, "Neural networks that learn from fuzzy if–then rules," IEEE Trans. Fuzzy Syst., vol. 1, pp. 85–97, May 1993.
[30] Y. Hayashi, J. J. Buckley, and E. Czogala, "Fuzzy neural network with fuzzy signals and weights," Int. J. Intell. Syst., vol. 8, no. 4, pp. 527–537, 1993.
[31] H. Ishibuchi and Y. Hayashi, "A learning algorithm of fuzzy neural networks with triangular fuzzy weights," Fuzzy Sets Syst., vol. 71, pp. 277–293, 1995.
[32] T. Feuring, J. Buckley, and Y. Hayashi, "A gradient descent learning algorithm for fuzzy neural networks," in Proc. IEEE Int. Conf. Fuzzy Syst. (FUZZ-IEEE'98), Anchorage, AK, May 1998, pp. 1136–1141.
[33] L. Wang and J. Yen, "Extracting fuzzy rules for system modeling using a hybrid of genetic algorithms and Kalman filter," Fuzzy Sets Syst., vol. 101, pp. 353–362, 1999.
[34] N. Kasabov, Neuro-Fuzzy Techniques for Intelligent Information Processing, ser. Studies in Fuzziness and Soft Computing. Heidelberg, Germany: Physica-Verlag, 1999, vol. 30.
[35] N. Kasabov and B. Woodford, "Rule insertion and rule extraction from evolving fuzzy neural networks: Algorithms and applications for building adaptive, intelligent expert systems," in Proc. IEEE Int. Conf. Fuzzy Syst. (FUZZ-IEEE'99), vol. 3, Seoul, Korea, Aug. 1999, pp. 1406–1411.
[36] M. Russo, "FuGeNeSys—A fuzzy genetic neural system for fuzzy modeling," IEEE Trans. Fuzzy Syst., vol. 6, pp. 373–388, Aug. 1998.
[37] ——, "Genetic fuzzy learning," IEEE Trans. Evol. Comput., vol. 4, pp. 259–273, Sept. 2000.
[38] L. M. Fu, "Learning capacity and sample complexity on expert networks," IEEE Trans. Neural Networks, vol. 7, pp. 1517–1520, 1996.
[39] J. Leski and E. Czogala, "A new artificial neural network based fuzzy inference system with moving consequents in if–then rules and selected applications," Fuzzy Sets Syst., vol. 108, pp. 289–297, 1999.
[40] N. R. Pal, K. Pal, and J. C. Bezdek, "Some issues in system identification using clustering," in Proc. IEEE Int. Conf. Neural Networks, Piscataway, NJ, 1997, pp. 2524–2529.

[41] Y. Lin, G. A. Cunningham, III, and S. V. Coggeshall, "Using fuzzy partitions to create fuzzy systems from input–output data and set the initial weights in a fuzzy neural network," IEEE Trans. Fuzzy Syst., vol. 5, pp. 614–621, Nov. 1997.
[42] J.-L. Chen and J. Y. Chang, "Fuzzy perceptron neural networks for classifiers with numerical data and linguistic rules as inputs," IEEE Trans. Fuzzy Syst., vol. 8, pp. 730–745, Dec. 2000.
[43] D. Nauck and R. Kruse, "NEFCON-I: An X-window based simulator for neural fuzzy controllers," in Proc. IEEE Int. Conf. Neural Networks, Orlando, FL, June 1994, pp. 1638–1643.
[44] S. Mitra and S. Pal, "Fuzzy self organization, inferencing and rule generation," IEEE Trans. Syst., Man, Cybern., vol. 26, pp. 608–620, 1996.
[45] Y. Jin, W. von Seelen, and B. Sendhoff, "On generating FC3 fuzzy rule systems from data using evolution strategies," IEEE Trans. Syst., Man, Cybern. B, vol. 29, pp. 829–845, Dec. 1999.
[46] F. Klawonn and R. Kruse, "Constructing a fuzzy controller from data," Fuzzy Sets Syst., vol. 85, pp. 177–193, 1997.
[47] S. Paul and S. Kumar, "Rule based neuro-fuzzy linguistic networks for inference and function approximation," in Knowledge Based Computer Systems, M. Sasikumar, D. D. Rao, P. R. Prakash, and S. Ramani, Eds. Mumbai, India: NCST, Dec. 1998, pp. 287–298.
[48] ——, "Adaptive rule-based linguistic networks for function approximation," in Advances in Pattern Recognition and Digital Techniques, N. R. Pal, A. K. De, and J. Das, Eds. Calcutta, India: Narosa, Dec. 1999, pp. 246–250.
[49] ——, "Subsethood based adaptive linguistic networks for pattern classification," IEEE Trans. Syst., Man, Cybern. C, 2002, to be published.
[50] M. Mackey and L. Glass, "Oscillation and chaos in physiological control systems," Science, vol. 197, pp. 287–289, 1977.
[51] L.-X. Wang and J. M. Mendel, "Generating fuzzy rules by learning from examples," IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 1414–1427, Nov./Dec. 1992.
[52] D. Kim and C. Kim, "Forecasting time series with genetic fuzzy predictor ensemble," IEEE Trans. Fuzzy Syst., vol. 5, pp. 523–535, Nov. 1997.
[53] S. Wu and M. J. Er, "Dynamic fuzzy neural networks—A novel approach to function approximation," IEEE Trans. Syst., Man, Cybern. B, vol. 30, pp. 358–364, Apr. 2000.
[54] N. Kasabov and Q. Song, "Dynamic evolving fuzzy neural networks with 'm-out-of-n' activation nodes for on-line adaptive systems," Dep. Inform. Sci., Univ. Otago, Dunedin, New Zealand, Tech. Rep. TR99-04, 1999.
[55] X. Yao and Y. Liu, "A new evolutionary system for evolving artificial neural networks," IEEE Trans. Neural Networks, vol. 8, pp. 694–713, May 1997.
[56] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Ann. Eugenics, vol. 7, no. 2, pp. 179–188, 1936.
[57] N. B. Karayiannis, J. C. Bezdek, N. R. Pal, R. J. Hathaway, and P.-I. Pai, "Repairs to GLVQ: A new family of competitive learning schemes," IEEE Trans. Neural Networks, vol. 7, pp. 1062–1071, Sept. 1996.
[58] L. I. Kuncheva and J. C. Bezdek, "Nearest prototype classification: Clustering, genetic algorithms, or random search?," IEEE Trans. Syst., Man, Cybern. C, vol. 28, pp. 160–164, Feb. 1998.
[59] N. Kasabov, "Learning fuzzy rules and approximate reasoning in fuzzy neural networks and hybrid systems," Fuzzy Sets Syst., vol. 82, pp. 135–149, 1996.
[60] S. Halgamuge and M. Glesner, "Neural networks in designing fuzzy systems for real world applications," Fuzzy Sets Syst., vol. 65, pp. 1–12, 1994.
[61] C. Bishop, Neural Networks for Pattern Recognition. Oxford, U.K.: Clarendon, 1995.
[62] W. Duch, R. Adamczak, and K. Grabczewski, "A new methodology of extraction, optimization and application of crisp and fuzzy logical rules," IEEE Trans. Neural Networks, vol. 12, pp. 277–306, Mar. 2001.
[63] P. Clark and T. Niblett, "The CN2 induction algorithm," Machine Learning, vol. 3, pp. 261–283, 1989.
[64] G. Cestnik, I. Kononenko, and I. Bratko, "Assistant-86: A knowledge elicitation tool for sophisticated users," in Machine Learning, Bratko and Lavrac, Eds. South Bound Brook, NJ: Sigma, 1987, pp. 31–45.
[65] H. Narazaki and A. Ralescu, "An improved synthesis method for multilayered neural networks using qualitative knowledge," IEEE Trans. Fuzzy Syst., vol. 1, pp. 125–137, May 1993.
[66] Y. Lin and G. A. Cunningham, III, "A new approach to fuzzy-neural system modeling," IEEE Trans. Fuzzy Syst., vol. 3, pp. 190–198, May 1995.
[67] D. Nguyen and B. Widrow, "The truck backer-upper: An example of self-learning in neural networks," IEEE Contr. Syst. Mag., vol. 10, pp. 18–23, 1990.



[68] S.-G. Kong and B. Kosko, "Adaptive fuzzy systems for backing up a truck-and-trailer," IEEE Trans. Neural Networks, vol. 3, pp. 211–223, Mar. 1992.
[69] J. C. Bezdek, Pattern Recognition With Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
[70] X. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, pp. 841–847, Aug. 1991.

Sandeep Paul received the B.Sc. degree in electrical engineering from the Aligarh Muslim University, Aligarh, India, in 1992 and the M.Tech. degree in engineering systems from the Dayalbagh Educational Institute, Agra, India, in December 1994. He is currently pursuing the Ph.D. degree in fuzzy-neural systems at the Dayalbagh Educational Institute.

He is presently working as a Lecturer in the Department of Electrical Engineering, D.E.I. Technical College, Dayalbagh Educational Institute, Dayalbagh, Agra. His current interests include hybrid fuzzy neural systems and their applications to pattern recognition, classification, and function approximation.

Mr. Paul was the recipient of the Director's Medal for achieving the highest marks in the M.Tech. course.

Satish Kumar (M'87) received the B.Sc. degree in electrical engineering from the Dayalbagh Educational Institute, Dayalbagh, Agra, India, in 1985, the M.Tech. degree in integrated electronics and circuits from the Indian Institute of Technology, Delhi, in December 1986, and the Ph.D. degree in physics and computer science from the Dayalbagh Educational Institute in 1992. In the course of his doctoral studies, he worked on structured models for software engineering, system dynamics, and neural networks.

He was a Senior Research Assistant at the Center for Applied Research in Electronics, Indian Institute of Technology, Delhi, where he was involved in research on CAD applications from January to July 1987. He then joined the Department of Physics and Computer Science at the Dayalbagh Educational Institute as a Lecturer, where he has been teaching ever since. He is presently a Reader in computer science and applications in the department. His current research interests are in the area of fuzzy-neural systems, evolvable systems, theoretical aspects of neural networks, and pulsed neuron models.
