ASAFES: adaptive stochastic algorithm for fuzzy computing/function estimation


A.S.A.F.ES.: Adaptive Stochastic Algorithm for Fuzzy computing / function EStimation.

Asafes means "fuzzy" in Greek (neuter gender adjective).

Athanasios V. Vasilakos, Konstantinos C. Zikidis. Hellenic Air Force Academy, Dekelia TGA 1010, ATHENS, GREECE.

Mailing address: A.V. Vasilakos, P.O. Box 65251, 15410 Psychiko, ATHENS, GREECE.

Abstract. In this paper, we present ASAFES, a novel architecture for fuzzy computing, featuring a different approach to the function approximation problem. ASAFES is a reinforcement learning algorithm which is able to learn a multi-variable function on line, from examples, and create the set of fuzzy rules and their corresponding weights (significances) which express the function, without requiring prior knowledge of the exact output value for each combination of input values. It can be initialized with explicit human knowledge or start from complete ignorance; it can learn from noisy data, generalize, and automatically adapt on the way if needed, using just a reinforcement signal, which approximately indicates how correct its output was at every iteration. Our scheme employs a stochastic search for the right consequence and corresponding weight for each possible fuzzy rule, using the Stochastic Estimator Learning Algorithm (SELA [11]) and regression analysis. It is a fuzzy computer, integrating the advantages of neural networks with the appeal of fuzzy logic.

1. Introduction. The theory of fuzzy sets provides the mathematical background for processing and dealing with inexact data. This enables us to cope with the vagueness, imprecision and fuzziness which are intrinsic in many aspects of our world. The first great step in this direction was made in 1965 by Lotfi Zadeh, with his paper "Fuzzy sets" [1]. Zadeh went on with the introduction of Fuzzy Logic and Approximate Reasoning [2], in 1973, as an alternative way of modelling and working on real-life problems, based on fuzzy set theory. Later, research on fuzzy control (Mamdani [3], etc.) initialized a new era in automatic control. Today, fuzzy logic control has found many applications, from small cameras to large industrial processes (Yamakawa [10]). The fuzzy logic controller (FLC) has replaced the PID (proportional - integral - derivative) controller wherever there is need for a fast-reacting, simple-to-build, and inexpensive, model-free controller, or where there are qualitative input-output data. With the use of the FLC it is not necessary to obtain the exact mathematical equations which model the system under control - if there are any!

During the last years, the main interest was in matching fuzzy logic with artificial neural networks (ANNs), which are also model-free estimators, in order to combine the simplicity and effectiveness of the FLC with the generalization, noise immunity and learning ability of the ANNs. Many different configurations were proposed (e.g. [7], [8]). Supervised learning algorithms (e.g. back-propagation [12]) were mainly used in training, and the NN usually had to learn a given set of fuzzy rules, off-line of course.

Our attempt began with the purpose to design an intelligent FLC which could be trained with reinforcement learning (a term taken from ANNs [13]). Reinforcement learning algorithms do not receive the desired output pattern with every input pattern (while supervised learning algorithms do). The only information they require is a single number, called "reinforcement" (usually, but not always, in [0,1]), whose expected value indicates how "good" their "guess" for the output pattern was. The proposed scheme is a reinforcement learning algorithm that can create on line the set of fuzzy rules, along with their (significance) weights, that best control an unknown system.

Seeing the ASAFES scheme from another point of view, we can say that it is an algorithm that discovers the function that fits best to a set of data points (examples), which is the desired function provided that the examples are enough and the fuzzy sets for each variable are enough to represent it with the required precision (they shouldn't be more than enough, either). Furthermore, it expresses this function with a fuzzy rule base. The algorithm does not require, of course, prior knowledge of the system behaviour, nor exact input-output data of the form "input vector - output value", but only a scalar number, an approximate indication of the correctness of its "guess" of the output value corresponding to the given example input vector. Multi-variable function estimation can also be made with the use of a feedforward multilayer neural network, but the training would take much longer and, if it succeeded and hadn't got stuck in a local minimum, then we would have a "black box", which just responds with the right answers, without any chance of expressing its internal representations in any understandable form.

0-7803-1896-X/94 $4.00 ©1994 IEEE

ASAFES is a "neurally inspired" scheme, whose main comprising ideas are the use of a learning automaton (SELA [11]) and regression analysis, for selecting the right consequence and the calculation of the appropriate weight of each fuzzy rule, respectively. The paper is divided as follows: Section 2 states some basic definitions, Section 3 is a short presentation of SELA, Section 4 is the presentation of ASAFES, Section 5 the simulation results and finally Section 6, the conclusion.

2. Definitions and basic concepts of fuzzy logic and FLC. In fig. 1 we can see the structure of the FLC. In a few words, the controller uses certain measurements of the controlled system as input variables, fuzzifies them into fuzzy numbers, "thinks", that is, calculates a fuzzy output according to the fuzzy rule base, and finally defuzzifies the output and produces a crisp result, which is then fed as a control variable into the system under control. But before we proceed further we should give some basic definitions: A fuzzy set A in a universe of discourse U is a set whose elements are characterized by a membership function μA: U → [0,1]. So, if u ∈ U, A is a set of ordered pairs: A = { (u, μA(u)) | u ∈ U }. If μA(u) > 0, u is called a support value. If U is a continuous universe (e.g. ℜ) and A is normalized (i.e. max over u ∈ U of μA(u) = 1) and convex (e.g. a bell-shaped function), then A is a fuzzy number. A linguistic variable is a variable whose values are fuzzy numbers. Formally, a linguistic variable with label x, in a universe of discourse U, is characterized by a term set T(x) = { Tx^1, Tx^2, ..., Tx^k } and the set M(x) = { Mx^1, Mx^2, ..., Mx^k } of the corresponding membership functions [7].
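As a concrete illustration of these definitions, here is a minimal sketch in Python. The triangular shape and all numeric values are illustrative assumptions only (the paper itself uses a double logistic curve, see Section 4):

```python
# Sketch of a fuzzy number as a membership function mu: R -> [0, 1].
# A triangular shape is used purely for illustration.

def triangular(center, half_width):
    """Return a normalized, convex membership function (a fuzzy number)."""
    def mu(u):
        return max(0.0, 1.0 - abs(u - center) / half_width)
    return mu

A = triangular(center=0.0, half_width=2.0)

print(A(0.0))   # peak: membership 1.0, so A is normalized
print(A(1.0))   # membership > 0, so u = 1.0 is a support value
print(A(3.0))   # outside the support: membership 0.0
```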

Figure 1: Basic structure of a fuzzy logic controller (input X → fuzzifier → fuzzy rule base / inference → defuzzifier → output Y → plant, with an updating-learning path from a teacher).

3. Brief presentation of the SELA automaton. A learning automaton interacts iteratively with a stochastic environment and tries to learn to respond with the optimal action offered by the environment, via a learning process. The learning process, in a few words, is the following: The algorithm chooses one of the offered actions according to a probability vector, which at every instant contains the probability of choosing each action. The chosen action triggers the environment, which answers with a feedback, depending on the stochastic characteristics of the chosen action. The automaton, taking into account this answer, modifies its probability vector, trying to learn the action that offers the maximum mean reward.

SELA is an estimator automaton: the probability vector modification is not made according to the received feedback directly, but via the use of the estimated mean reward of each action. SELA weights the real estimates with a Gaussian random variable, with a variance proportional to the time that has elapsed since the last time this action was selected. The SELA theorem is stated below; its proof and the detailed description can be found in [11].

The SELA automaton is ε-optimal in every stochastic environment that offers symmetrically distributed noise. Thus, if action am is the optimal one and Pm is the corresponding probability of choosing am, then for every value N > N0 (N0 > 0) of the resolution parameter there is a time instant t0 < ∞, such that for every t > t0 it holds that E[Pm(t)] = 1.
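The interaction loop described above can be sketched as follows. This is only an illustrative approximation of an estimator automaton in the spirit of SELA: the exact update rules, the resolution parameter N and the convergence proof are given in [11], and every constant below (step size, noise scale, payoffs) is an assumption.

```python
import math
import random

# Illustrative estimator-automaton sketch: choose an action from a
# probability vector, keep a running mean-reward estimate per action,
# and compare estimates perturbed by Gaussian noise whose variance grows
# with the time elapsed since each action was last selected.

class SelaSketch:
    def __init__(self, n_actions, sigma0=0.1):
        self.n = n_actions
        self.probs = [1.0 / n_actions] * n_actions   # action probability vector
        self.mean_reward = [0.0] * n_actions          # estimated mean rewards
        self.count = [0] * n_actions
        self.last_used = [0] * n_actions
        self.t = 0
        self.sigma0 = sigma0

    def choose(self):
        self.t += 1
        r, acc = random.random(), 0.0
        for a, p in enumerate(self.probs):
            acc += p
            if r <= acc:
                return a
        return self.n - 1

    def update(self, action, reward, step=0.05):
        # Update the running mean-reward estimate of the chosen action.
        self.count[action] += 1
        self.mean_reward[action] += (reward - self.mean_reward[action]) / self.count[action]
        self.last_used[action] = self.t
        # Perturb each estimate with zero-mean Gaussian noise; variance is
        # proportional to the time since that action was last selected.
        noisy = [m + random.gauss(0.0, self.sigma0 * math.sqrt(self.t - lu))
                 for m, lu in zip(self.mean_reward, self.last_used)]
        best = noisy.index(max(noisy))
        # Shift probability mass toward the (noisily) best-looking action.
        for a in range(self.n):
            if a == best:
                self.probs[a] += step
            else:
                self.probs[a] = max(0.0, self.probs[a] - step / (self.n - 1))
        total = sum(self.probs)
        self.probs = [p / total for p in self.probs]

# Toy stationary environment: action 2 has the highest mean reward.
random.seed(0)
auto = SelaSketch(n_actions=3)
payoff = [0.2, 0.5, 0.8]
for _ in range(2000):
    a = auto.choose()
    reward = 1.0 if random.random() < payoff[a] else 0.0
    auto.update(a, reward)
print(auto.probs)
```

After enough iterations the estimate of the optimal action dominates and the probability vector concentrates on it, which is the behaviour the theorem above guarantees for the real SELA.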

4. Presentation of ASAFES. In order to describe the ASAFES algorithm we shall use a simple example of learning a function of two input variables and one output variable. Let us assume we have to control a system, taking measurements of two of its characteristics as inputs x1, x2 to ASAFES, and feeding the system with a control variable y, which is the output variable of ASAFES. A problem like this is the classical control problem of stabilizing an inverted pendulum. In fig. 3 is the ASAFES scheme architecture.

The first step is to define the appropriate fuzzy numbers for each variable. The number of fuzzy sets for every variable affects the representation resolution of its value range and depends on the criticalness of its accuracy. So, let us "divide" the ranges of the three variables x1, x2, and y into l, m, and n fuzzy sets, respectively. Our scheme does not change the "position" on the ℜ axis of the fuzzy sets nor their membership function curves on line. We have to define their exact membership functions from the start. In our simulations we used a double logistic function, that is, two symmetric logistic functions (see fig. 2). The resulting curve can safely be considered as convex and the corresponding fuzzy set as a fuzzy number. In fig. 4 we can see a possible "division" of the ℜ axis into l=7 fuzzy numbers for one of the three variables. We should mention that the exact shapes of the fuzzy set curves of the output variable do not matter, as far as they are convex and symmetrical.
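A double logistic membership curve of the kind used here can be sketched as two symmetric logistic functions joined at a center; the slope, center and width values below are illustrative assumptions, not the paper's actual parameters:

```python
import math

# Sketch of a "double logistic" membership function: two symmetric
# logistic curves meeting at the center, giving a convex, bell-like
# fuzzy number. All parameter values are illustrative.

def double_logistic(center, slope, half_width):
    def mu(u):
        # Distance from the center; the logistic falls off symmetrically
        # on both sides, crossing 0.5 at distance half_width.
        d = abs(u - center)
        return 1.0 / (1.0 + math.exp(slope * (d - half_width)))
    return mu

mu = double_logistic(center=0.0, slope=4.0, half_width=1.0)
print(mu(0.0), mu(1.0), mu(3.0))
```

The curve is symmetric about its center, close to 1 there, and decays smoothly to 0, so it can safely be treated as a fuzzy number, as the text notes.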

Figure 2: Double logistic function.

We have accepted two input variables x1, x2, with l and m fuzzy numbers defined for each, respectively, and an output variable y with n (hypothetical) fuzzy numbers. So, there are l×m possible fuzzy rules (FRs) with n possible consequences, of the form:

Rule ij: IF x1 is Mx1^i AND x2 is Mx2^j, THEN y is My^k, i=1,2,...,l, j=1,2,...,m, k=1,2,...,n,

where Ma^i is the ith fuzzy number for the variable a. We add one more possible consequence for each possible FR: the "no action" or "don't care" consequence, so there are l×m possible FRs with n+1 possible consequences.

Figure 3: ASAFES architecture (fuzzifier level → fuzzy rule level → defuzzifier; inputs x1, x2, output variable y, with the reinforcement fed back from the system).

ASAFES interacts iteratively with the system under control. The system presents to ASAFES the specific values for the two input variables, the algorithm calculates the control variable y, feeds it to the system, and the system returns a reinforcement whose expected value lies in [0,1] and informs the algorithm how "good" its output value was.

The main idea of our algorithm is the "breaking" of the hyperspace of the input variables into smaller "fuzzy parts", each corresponding to a possible FR, and working in each "fuzzy part" separately. ASAFES uses two very powerful tools for this task. The first is SELA, which is a very fast, adaptive, reliable and robust-to-noise learning automaton, and is used for deciding which is the best consequence for each activated FR, in other words, which of the n+1 consequences offers the greatest reinforcement. So, there is a separate SELA running for every one of the l×m possible FRs. Apart from that, not all the rules have the same significance or strength (attention: we are not referring to the firing strength of a FR, namely the degree to which the precedents of the FR are met). There are rules which, when they are "activated", that is, their precedents are met, affect the inference process more than others. This fact leads us to the use of "weights", which affect the significance of each possible FR, just like in neural networks. The estimation of the optimum weight is done with the aid of regression. The weights Wij, i=1,2,...,l, j=1,2,...,m, are real numbers, while what is actually used for the computation of the output is the significance coefficients. The significance coefficients are positive real numbers and are equal to e^Wij.

ASAFES uses correlation-product inference (Larsen) and the power or firing strength of the ij FR is: p = Mx1^i(x1) · Mx2^j(x2), where Ma^i is the ith fuzzy number of the variable a. p is always positive. We ignore all the rules that have p < t, where t is a fixed threshold. In every iteration, the rules with p ≥ t are activated; the others remain idle. There can be four, at maximum, different FRs activated at any time (when there are two input variables).
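The firing-strength computation and threshold test can be sketched as follows. The triangular membership functions, the 3-set partitions and the threshold value are placeholders for illustration (the paper uses double logistic curves and l = m = 7):

```python
# Correlation-product (Larsen) firing strength p for each rule (i, j),
# and the fixed-threshold activation test. Membership functions and the
# threshold value are illustrative placeholders.

def tri(center, width):
    return lambda u: max(0.0, 1.0 - abs(u - center) / width)

# Hypothetical fuzzy partitions for x1 and x2 (l = m = 3 here).
M1 = [tri(-1.0, 1.0), tri(0.0, 1.0), tri(1.0, 1.0)]
M2 = [tri(-1.0, 1.0), tri(0.0, 1.0), tri(1.0, 1.0)]
t = 0.05   # fixed activation threshold

def activated_rules(x1, x2):
    """Return {(i, j): p} for all rules with firing strength p >= t."""
    rules = {}
    for i, mi in enumerate(M1):
        for j, mj in enumerate(M2):
            p = mi(x1) * mj(x2)        # correlation-product firing strength
            if p >= t:
                rules[(i, j)] = p
    return rules

print(activated_rules(0.3, -0.2))
```

With two input variables and overlapping sets, at most four rules come out activated for any crisp input pair, matching the observation in the text.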

If the algorithm is at the beginning of its search, then the weights remain unchanged and equal to zero and the coefficients are equal to e^0 = 1. The firing strength p is multiplied with the corresponding coefficient, which now is 1, so the total strength s remains equal to the firing strength: s = e^0 · p. Then, the SELA running for each activated FR chooses one of the n+1 possible actions, i.e. consequences (we should remind that the (n+1)th possible action is the "don't care" action). We can calculate the crisp output y from only n sample points, if we assume that the output fuzzy sets are symmetric and unimodal [9]. If furthermore the respective areas of the fuzzy numbers are equal, we can compute the output y using the formula:

y = (Σ_{i=1}^{n} cs_i · c_i) / (Σ_{i=1}^{n} cs_i),

where cs_i is the total strength of the ith consequence and c_i is the point on the real axis that corresponds to the ith consequence. If the denominator is equal to 0, an output cannot be produced and ASAFES proceeds to the next iteration, without updating.
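The output formula above is a weighted average over the n consequence points, with the "no denominator, no output" rule included; the consequence points and strengths below are illustrative values:

```python
# Defuzzification: y = sum(cs_i * c_i) / sum(cs_i), where cs_i is the
# total strength accumulated by the i-th output consequence and c_i its
# point on the real axis. Values below are illustrative.

def defuzzify(cs, c):
    """Weighted-average defuzzification; returns None when no rule fired."""
    denom = sum(cs)
    if denom == 0:
        return None                     # no output; skip this iteration
    return sum(s * ci for s, ci in zip(cs, c)) / denom

c  = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # n = 7 consequence points
cs = [0.0, 0.0, 0.0, 0.56, 0.24, 0.0, 0.0]    # total strengths this iteration

print(defuzzify(cs, c))
```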

Figure 4: Possible division of one of the three variables into seven fuzzy sets (showing the lower and upper limits of the variable range, the threshold t, the slope of the logistic function, and the overlap).

The system under control receives the value of y and returns the reinforcement. The internal arrays are updated. Also, the SELAs of the activated FRs update their estimator matrices and probability vectors. A certain modification has been made to the updating of SELA, which ensures correct convergence and increases speed considerably. This was the presentation of an iteration of the first stage of learning, where the SELAs are trying to find the appropriate consequence for each possible FR, if it is activated, and where the weights remain unchanged and equal to zero.

Suppose now that the SELA of the ij FR has almost converged. This means that there is an action whose selection probability approaches 1. Then, regression starts automatically. The weight Wij starts to fluctuate. A zero-mean, symmetrical noise is added to Wij every time this FR is activated. The "noisy" weight is nWij = Wij + N(0, σ²). The significance coefficient is e^nWij and the total strength for the ij FR is: s = e^nWij · p. ASAFES registers these "noisy" weights and, at every time instant, holds the last W weights and the corresponding reinforcements of the ij FR. Here is the critical part: from the data pairs of weights-reinforcements, ASAFES can discover the hidden linear relationship between these two variables, in order to find the optimum weight which is related to the greatest reinforcement. In other words, if we consider the ij reinforcement rij (received when the ij FR is activated) as the independent and Wij as the dependent variable (even though the opposite holds, in fact), then, on the basis of sample data, we shall try to estimate the value of the variable Wij that corresponds to rij = 1. Of course, the relationship generally can be other than linear. However, if the fluctuations are small, i.e. the sample data are concentrated in a relatively "small" region, then the least-squares line would be a good approximation of the sample points. So, we shall estimate Wij from the least-squares curve that fits the sample data, the regression curve of Wij on r.
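One way to realize this regression stage is sketched below. The update value follows the rule stated in the text: proportional to the regression-line slope of w on r, the squared correlation coefficient, and the difference between 1 and the average reinforcement. The window size W = 20, the noise level, the toy reinforcement function and the gain are all illustrative assumptions:

```python
import random
import statistics

def weight_update(pairs, gain=1.0):
    """pairs: list of (noisy weight, reinforcement) samples; returns delta w."""
    w = [p[0] for p in pairs]
    r = [p[1] for p in pairs]
    mw, mr = statistics.fmean(w), statistics.fmean(r)
    cov = sum((wi - mw) * (ri - mr) for wi, ri in zip(w, r))
    var_r = sum((ri - mr) ** 2 for ri in r)
    var_w = sum((wi - mw) ** 2 for wi in w)
    if var_r == 0 or var_w == 0:
        return 0.0
    slope = cov / var_r                    # least-squares slope of w on r
    rho2 = cov * cov / (var_r * var_w)     # squared correlation coefficient
    # Step toward the weight the regression line predicts at r = 1,
    # damped by how linear the sample cloud actually is (rho^2).
    return gain * slope * rho2 * (1.0 - mr)

# Toy check: reinforcement peaks at the (unknown to the algorithm) weight 2.0.
random.seed(1)
w = 0.0
for _ in range(200):
    pairs = []
    for _ in range(20):                    # keep the last W = 20 samples
        nw = w + random.gauss(0.0, 0.2)    # noisy weight nWij = Wij + N(0, s^2)
        pairs.append((nw, 1.0 - abs(nw - 2.0) / 4.0))
    w += weight_update(pairs)
print(w)
```

Near the optimum the samples straddle the peak, the correlation drops to zero and the update vanishes, so the weight settles near the value that yields the greatest reinforcement.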

When updating the ij weight, the value added to it is proportional to the slope of the regression line, to the correlation coefficient raised to the second power, and to the difference between 1 and the current average reinforcement. After the weight updating of the ij FR, the ij SELA updates its matrices and vectors. This was the description of a complete iteration.

ASAFES can be implemented in hardware if speed is the essential factor. The interesting thing is that it can very easily be used on a conventional computer as software, working at a high speed, despite the fact that there is a need for a considerable number of separately running SELAs and other processes for each possible FR. The key is that, in every iteration, updating of the internal arrays is made only for the activated rules, which are very few (maybe one or two). All the other rules and corresponding processes remain idle. We cannot claim that the computational overhead or the need for variable value storage is negligible, but they can be minimized, depending on the specific problem we are facing. ASAFES simulation results showed that these requirements are by no means prohibitive; on the contrary, if we consider that the required precision for the stored variable values can be very low, the total amount of required memory bytes decreases tremendously, and so does the computational effort.

Figures 7 & 8: original function surface and ASAFES output surface at successive iterations.

5. Simulation results. In the example task, ASAFES is trying to learn the two-variable function shown in fig. 5, from 1000 examples. In fact, this is the classical control surface for systems with a first-order delay (like the inverted pendulum stabilization problem), but with the significance of the center rule (i=4, j=4) 10 times larger. The examples are data points of the form (x1, x2, y). We defined seven fuzzy numbers for each variable, namely l=m=n=7. The overlapping of the fuzzy sets and the slope of the logistic functions, in conjunction with the threshold t, affect the controller stability. The fuzzy numbers depicted in fig. 4 give very smooth curves in the output surface, without "steps" and discontinuities. The reinforcement given to ASAFES in our simulations was one minus the absolute value of the difference between the actual and the desired output. Finally, we considered that ASAFES had learned the desired function when tre > 0.99, where tre is the total reinforcement estimator, a global performance index.


The average number of iterations that ASAFES needed to reach tre > 0.99, in over 200 runs for the first task, was 20915 iterations. Very few times did it exceed 30000 iterations. In fig. 5 is the original function surface, which ASAFES tries to learn. In order to observe the process of learning we have three figures: in fig. 6 is the ASAFES output surface after 10000 iterations, in fig. 7 after 20000 iterations and in fig. 8 after 30000 iterations. A very good approximation of the surface was also learned from only 100 examples. We can see that ASAFES "learns more" than the original function, where the examples came from: the learned surface "covers" a greater region in the x1-x2 plane than the original surface. This reveals the generalization ability of the algorithm.

Trying to simulate a stochastic environment, we added a zero-mean noise to the given reinforcement. The "noisy" reinforcement we used was: nr = r + N(0, σ²), with σ = 0.1. nr might exceed 1 in some cases. ASAFES performed just as well, reaching tre > 0.99 after 33031 iterations on average, in 50 runs. Notice that tre > 0.99 is a difficult goal for such a noisy environment.

6. Conclusion. We presented, briefly because of space restrictions, ASAFES, a reinforcement learning algorithm which is able to learn, on line, a multi-variable function from examples, noisy or not, and express it with a set of fuzzy rules. ASAFES has the advantages of a neural network - generalization, noise immunity and learning ability - and furthermore expresses the learned function in an understandable form. We are currently working on the use of ASAFES in BISDN-ATM network routing, economical models and industrial control.

This algorithm can certainly be implemented on specially designed hardware but performs very well on a conventional, serial computer, without any need for special equipment or software. So, it can very easily be used as a controller, for example in the chemical industry, where non-linear, multi-variable problems with high noise levels are met. ASAFES would be the ideal solution for such a task, since it is a model-free estimator that can learn to control a system, interacting with a stochastic environment, needing minimal information.

7. References.
1. L.A. Zadeh, "Fuzzy sets", Information and Control, Vol. 8, pp. 338-353, 1965.
2. L.A. Zadeh, "The concept of a linguistic variable and its applications to approximate reasoning", Memorandum ERL-M 411, Berkeley, 1973.
3. E.H. Mamdani, "Application of fuzzy logic to approximate reasoning", IEEE Trans. Comput., Vol. 26, pp. 1182-1191, 1977.
4. H.-J. Zimmermann, "Fuzzy set theory and its applications", Kluwer Academic Publishers, 2nd ed., 1991.
5. M. Mizumoto, "Fuzzy controls under various fuzzy reasoning methods", Information Sciences, Vol. 45, pp. 129-151, 1988.
6. Chuen Chien Lee, "Fuzzy logic in control systems: Fuzzy logic controller, pt. I & II", IEEE Trans. on SMC, Vol. 20, No. 2, pp. 404-432, 1990.
7. Chin-Teng Lin et al., "Neural-network-based fuzzy logic control and decision system", IEEE Trans. on Comp., Vol. 40, No. 12, pp. 1320-1336, 1991.
8. Okada et al., "Initializing multilayer neural networks with fuzzy logic", IJCNN'92, Vol. 1, pp. 239-244, 1992.
9. B. Kosko, "Neural networks and fuzzy systems: a dynamical systems approach to machine intelligence", Prentice-Hall, 1992.
10. T. Yamakawa, "A fuzzy inference engine in nonlinear analog mode and its application to a fuzzy logic control", IEEE Trans. on N.N., Vol. 4, No. 3, pp. 498-522, 1993.
11. A. Vasilakos et al., "A new approach to the design of reinforcement scheme for Learning Automata: Stochastic Estimator Learning Algorithms", to appear in Neurocomputing. Preliminary results presented at IEEE Int. Conf. on SMC, Univ. of Virginia, USA, 1991, pp. 1387-1392.
12. D. Rumelhart et al., "Parallel distributed processing: explorations in the microstructure of cognition", Vol. 1, MIT Press, Cambridge, Massachusetts, 1986.
13. A. Barto et al., "Pattern recognizing stochastic LA", IEEE Trans. SMC, Vol. 15, pp. 360-375, 1985.
