Improved MCMAC with momentum, neighborhood, and averaged trapezoidal output


IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 30, NO. 3, JUNE 2000 491

Improved MCMAC with Momentum, Neighborhood, and Averaged Trapezoidal Output

Kai Keng Ang and Chai Quek

Abstract—An improved modified cerebellar articulation controller (MCMAC) [14] neural control algorithm with better learning and recall processes using momentum, neighborhood learning, and averaged trapezoidal output is proposed in this paper. The learning and recall processes of MCMAC are investigated using the characteristic surface of MCMAC and the control action exerted in controlling a continuously variable transmission (CVT). Extensive experimental results demonstrate a significant improvement with reduced training time and an extended range of trained MCMAC cells. The improvement in the recall process using the averaged trapezoidal output (MCMAC-ATO) is contrasted against the original MCMAC using the square of the Pearson product moment correlation coefficient. Experimental results show that the new recall process significantly reduces the fluctuations in the control action of the MCMAC and partially addresses the problem associated with the resolution of the MCMAC memory array.

Index Terms—Continuous variable transmission (CVT) control, improved modified cerebellar articulation controller (MCMAC), improved recall, memory resolution, momentum, neighborhood learning, training paths.

I. INTRODUCTION

Artificial neural networks (ANN's) are simplified models of the central nervous system that consist of interconnected neural computing elements. The intelligence within an ANN lies in its ability to learn and generalize. These two factors inspired the development of the cerebellar model articulation controller (CMAC) in [1]. It is a lattice associative memory network (AMN) that offers an efficient implementation as well as a model for the functionality of the cerebellum. The basic structure of CMAC is very similar to the Perceptron [15], and it is fundamentally a look-up table in which the basis functions generalize locally [2]. It has been used for modeling and controlling high-dimensional, nonlinear plants such as robotic manipulators [3], [18], [12]. Many different schemes have also been proposed to improve the basic algorithm [7], [14], [2], [10].

Control and optimization problems are among the more difficult applications for ANN's. The mapping functions that must be learned are generally very complex, and the problem constraints that must be satisfied are often conflicting, for example in controlling the continuously variable transmission (CVT) for optimum engine speed and vehicle acceleration. Look-up tables are often used in Delphi Automotive Systems for transmission control, and they require extensive calibration for optimal performance. The CMAC was proposed for closed-loop control of complex dynamic systems [3], [18], and the learning convergence of the CMAC was established in [11], [20]. Hence, the CMAC offers an adaptive look-up table design in the area of transmission control, since its associated learning algorithm is simple, it is temporally stable due to the lack of generalization, and its convergence to a global minimum is guaranteed even in the presence of random noise.

In this work, the Modified Cerebellar Articulation Controller (MCMAC) architecture [14] is proposed as the neural controller for

Manuscript received July 10, 1998; revised January 16, 2000. This paper was recommended by Associate Editor S. Lakshmivarahan.

K. K. Ang is with Delphi Automotive Systems, Singapore Private Limited, Singapore 569621.

C. Quek is with the Intelligence System Laboratory, Nanyang Technological University, Singapore 639798.

Publisher Item Identifier S 1083-4419(00)04126-1.

Fig. 1. CMAC block diagram.

Fig. 2. Architecture of CMAC memory.

controlling the CVT. The results on the control performance of the MCMAC during training are presented. An on-line learning rule using both momentum and neighborhood is proposed to train the MCMAC. The effects of momentum and neighborhood are illustrated by the characteristic surfaces of the MCMAC during training. Finally, the effects of modifying the recall process in the MCMAC using the averaged trapezoidal output are discussed.

II. CEREBELLAR MODEL ARTICULATION CONTROL

Fig. 1 shows the CMAC block diagram. The CMAC memory consists of a two-dimensional (2-D) array that stores the value of $x_n(kT)$ as the content of an element in the array with coordinates $i, j$ [9]. The coordinates $i, j$ are derived by quantizing the reference input $y_{ref}(kT)$ and plant output $y_p(kT - T)$. The quantization process is described in (1), where $k$ represents a discrete step and $T$ represents the sampling period. During the initial operation, the plant derives almost all its control input from the classical controller, while the CMAC memory is initialized to zero. During each subsequent control step, the classical control actuation signal $x_c(kT)$ is used to build the CMAC characteristic surface

$$Q(y(kT)) = \frac{n\,(y(kT) - y_{\min})}{y_{\max} - y_{\min}} \tag{1}$$

where
  $y_{\max}$  maximum value of $y$;
  $y_{\min}$  minimum value of $y$;
  $n$  resolution of the CMAC memory.

The CMAC memory can also be visualized as a neural network consisting of a cluster of 2-D self-organizing feature maps (SOFM). However, instead of a random initialization of the neural net weights, they are fixed such that they form a 2-D grid, as shown in Fig. 2. The winning neuron in the CMAC memory at time step $k$ is identified as the neuron with weights $Q(y_{ref}(kT))$ and $Q(y_p(kT - T))$ given the inputs $y_{ref}(kT)$ and $y_p(kT - T)$.
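As an illustration, the quantization of (1) and the winning-cell lookup can be sketched as follows. This is a minimal sketch, not code from the paper: the names `quantize` and `CMACMemory` are invented here, and for brevity a single input range is assumed for both axes.

```python
import numpy as np

def quantize(y, y_min, y_max, n):
    """Quantization function Q of (1): map y onto one of n discrete indices."""
    q = int(n * (y - y_min) / (y_max - y_min))
    return min(max(q, 0), n - 1)  # clamp to the valid index range

class CMACMemory:
    """2-D CMAC memory: each cell holds one control value, initialized to zero."""
    def __init__(self, n, y_min, y_max):
        self.n, self.y_min, self.y_max = n, y_min, y_max
        self.w = np.zeros((n, n))  # cell contents w[i, j]

    def winning_cell(self, y_ref, y_p_prev):
        """Indices i, j of the winning neuron given the reference input
        y_ref(kT) and the previous plant output y_p(kT - T)."""
        i = quantize(y_ref, self.y_min, self.y_max, self.n)
        j = quantize(y_p_prev, self.y_min, self.y_max, self.n)
        return i, j
```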



Fig. 3. MCMAC block diagram.

Fig. 4. Control results during training using $\alpha = 0.01$.

The weights are effectively the coordinates $i, j$ of the location of the neuron in the SOFM. The output of the winning neuron can be obtained directly from the weight $w_{i,j}$ of the output neuron.

CMAC learning is a competitive learning process that is similar to the Kohonen and the SOFM learning rules. However, since the weights of the cluster of neurons that represent indices to the CMAC memory are fixed, learning occurs only in the output neuron. The CMAC learning rule is based on the Grossberg competitive learning rule and is applied only to the output layer; no competitive Kohonen learning rule is applied to the input layer. The CMAC learning rule can therefore be represented by (2) [21].

$$i = Q(y_{ref}(kT)), \quad j = Q(y_p(kT - T)), \quad i, j \in \mathbb{N}$$
$$w_{i,j}^{(k+1)} = w_{i,j}^{(k)} + \alpha\left(x(kT) - w_{i,j}^{(k)}\right) \tag{2}$$

where
  $\alpha$  learning constant;
  $x(kT)$  plant input at discrete step $k$;
  $y_{ref}(kT)$  reference input at discrete step $k$;
  $y_p(kT - T)$  plant output at discrete step $k - 1$;
  $w_{i,j}^{(k)}$  contents of the CMAC cell with coordinates $i, j$ at discrete step $k$;
  $Q(\cdot)$  the quantization function defined in (1).

A. Modified CMAC

It is difficult to train the CMAC memory because the plant characteristic surface has to be learned while a classical model controller controls the plant. For a particular control setting, the plant output typically follows a certain trajectory.

Fig. 5. MCMAC contour surface after training using $\alpha = 0.01$.

Hence, only the weights of the output neurons (array cells) visited by the path of this trajectory are updated, and the training of the CMAC has to be carefully planned. This poses a problem in many control applications that require on-line learning and in which the control rules are not readily available. The Modified CMAC (MCMAC) architecture was proposed in [14] to overcome this limitation by using the plant closed loop error $e_c(kT)$ and plant output $y_p(kT)$ as the models in the training. This allows on-line training as well as ease in the planning of the training trajectory. The training path can now be more directly controlled using the reference plant input $y_{ref}(kT)$.

Fig. 3 shows the MCMAC block diagram. This architecture is similar to the CMAC, except that the quantized closed loop error $e_c(kT) = y_{ref}(kT) - y_p(kT)$ and plant output $y_p(kT)$ are the indices to the 2-D MCMAC memory array, instead of the original $y_{ref}(kT)$ and $y_p(kT - T)$ in the CMAC. The MCMAC learning rule is given in (3)

$$i = Q(e_c(kT)), \quad j = Q(y_p(kT)), \quad i, j \in \mathbb{N}$$
$$w_{i,j}^{(k+1)} = w_{i,j}^{(k)} + \alpha\, e_c(kT) \tag{3}$$


Fig. 6. Control results during training using $\beta = 0.25$, $\alpha = 0.01$.

where
  $\alpha$  learning constant;
  $y_{ref}(kT)$  reference input at discrete step $k$;
  $y_p(kT)$  plant output at discrete step $k$;
  $e_c(kT)$  closed loop error $y_{ref}(kT) - y_p(kT)$ at discrete step $k$;
  $w_{i,j}^{(k)}$  contents of the MCMAC cell with coordinates $i, j$ at discrete step $k$;
  $Q(\cdot)$  the quantization function defined in (1).
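A hedged sketch of the two update rules (2) and (3) on the hypothetical `CMACMemory` above; `alpha` stands for the learning constant $\alpha$, and one shared input range per axis is again assumed for brevity.

```python
def cmac_update(mem, y_ref, y_p_prev, x, alpha):
    """CMAC rule (2): nudge the winning cell toward the classical
    control actuation signal x(kT)."""
    i, j = mem.winning_cell(y_ref, y_p_prev)
    mem.w[i, j] += alpha * (x - mem.w[i, j])

def mcmac_update(mem, e_c, y_p, alpha):
    """MCMAC rule (3): index by the closed loop error and the current
    plant output, and learn directly from the error signal."""
    i = quantize(e_c, mem.y_min, mem.y_max, mem.n)
    j = quantize(y_p, mem.y_min, mem.y_max, mem.n)
    mem.w[i, j] += alpha * e_c
```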

Both CMAC and MCMAC attempt to model the plant's characteristics on the basis of the input indices. The difference between the two is that the former learns using the plant input $x(kT)$ and output $y_p(kT)$, while the latter learns using the closed loop error $e_c(kT)$ and plant output $y_p(kT)$. Hence, MCMAC does not require a classical controller to be operational during the learning phase; it uses the closed loop error and the plant output as the model. This has an added advantage in the determination of the training trajectory through the control of the reference input $y_{ref}(kT)$.

III. IMPROVED MCMAC LEARNING RULE

Although the modification to CMAC has removed the need for a classical controller, the training of MCMAC still requires careful planning such that all the cells in the MCMAC memory are visited. The contents of the MCMAC memory represent the characteristic of the plant to be controlled by the neurocontroller. The MCMAC memory can be visualized using a three-dimensional (3-D) characteristic surface. The axes of this MCMAC contour surface consist of the cell indices $i, j$ [namely, $Q(e_c(kT))$ and $Q(y_p(kT))$] and the content value of each cell. This surface is subsequently used to analyze the training rate of both the original MCMAC learning rule and the improved learning rule.

A. Results Using Original MCMAC Learning Rule

An experiment was conducted to analyze the training process of a 20 × 20 MCMAC memory as a neurocontroller for CVT gear ratio control using the MCMAC learning rule given in (3). The engine is modeled using a mapping of torque against throttle position (TPS) and engine speed for a 3000 cc engine provided by Delphi Automotive Systems. The cells were initialized to zero prior to training, and a learning constant of $\alpha = 0.01$ was used. During training, a unit pulse cycle of 100% TPS and a reference engine speed of 4800 rpm were applied until the vehicle speed reached 100 km/h. Brakes were then applied to slow the vehicle to 20 km/h, and another pulse was applied until the vehicle reached a speed of 80 km/h.
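The training protocol can be pictured as an on-line loop of the following shape. This is purely illustrative: `vehicle` is a hypothetical engine/CVT simulator invented here (its attributes and `step` method are not from the paper), and the sampling period value is a placeholder.

```python
def training_cycle(mem, vehicle, alpha, ref_rpm=4800.0, target_kmh=100.0, T=0.04):
    """Sketch of one training cycle: hold 100% TPS and train the MCMAC
    on-line at every sampling period until the target speed is reached.
    Recall here is the standard one: read the single active cell."""
    while vehicle.speed_kmh < target_kmh:
        e_c = ref_rpm - vehicle.engine_rpm                 # closed loop error e_c(kT)
        mcmac_update(mem, e_c, vehicle.engine_rpm, alpha)  # learning rule (3)
        i = quantize(e_c, mem.y_min, mem.y_max, mem.n)     # standard recall
        j = quantize(vehicle.engine_rpm, mem.y_min, mem.y_max, mem.n)
        vehicle.step(tps=1.0, gear_ratio=mem.w[i, j], dt=T)  # advance the plant
```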

Fig. 7. The MCMAC contour surfaces after training with and without the momentum term ($\beta = 0.25$, $\alpha = 0.01$).

Fig. 4 shows the results of the experiment, where eng speed represents the sampled plant output $y_p(kT)$, ref rpm represents the sampled reference input $y_{ref}(kT)$, gear represents the sampled actuator output $x(kT)$, and ctrl gear represents the sampled controller output $x_n(kT)$. At time $t_1$, a unit pulse of 100% TPS was applied. This pulse was maintained until the vehicle speed reached 100 km/h at time $t_2$, ending the first training cycle. The second pulse was applied from time $t_3$ to $t_4$.

Heavy fluctuations in the neurocontroller output were observed during the training. A long cycle from time $t_1$ to $t_2$ was needed to accelerate the simulated vehicle to 100 km/h, owing to the slow increase in the engine speed. Although the results showed that the plant could be controlled by the freshly initialized MCMAC during training, a large rise time was observed, together with a large closed loop error at the end of the training cycle at time $t_2$.

Fig. 5 shows a 3-D visualization of the characteristic surface of the MCMAC after the two training cycles have been applied, where $y_p$ is the plant output, $e_c$ is the closed loop error, and $x_n$ is the output of a specific neuron. There are only a few peaks on the contour surface. This shows that only a few cells in the MCMAC were visited during the training cycles, since only these cells have been trained.

B. Momentum Term

Although increasing the learning constant $\alpha$ in the MCMAC learning rule may improve the training process, results have confirmed that a large learning constant produces oscillation [19], [5]. Therefore, a small learning constant is generally preferred, even though this produces slow convergence during training. An alternative approach to improving the rate of training in neural networks is the inclusion of an inertia or momentum term in the gradient descent expression. This is achieved by adding a fraction of the previous weight change to the current weight change.


Fig. 8. Control results during training using $\beta = 0.75$, $\alpha = 0.01$.

An update rule introduced by Rumelhart et al. [16] that includes such a momentum term is shown in (4)

$$\Delta w_m^{(k+1)} = -\alpha \frac{\partial E}{\partial w_m^{(k)}} + \beta\, \Delta w_m^{(k)} \tag{4}$$

where
  $\Delta w_m^{(k)}$  change in the weight of neuron $m$ at discrete step $k$;
  $w_m$  weight of neuron $m$ at discrete step $k$;
  $\alpha$  learning constant;
  $\beta$  momentum constant;
  $E$  error.

The addition of the momentum term smoothens the gradient descent by preventing extreme changes in the gradient due to local anomalies. It produces an averaging effect on the trajectory of the gradient as it moves downhill [13]. Numerous experiments conducted by researchers [6], [19], [17] have shown that the inclusion of the momentum factor improves the speed of convergence while maintaining stability. The MCMAC learning rule using a momentum term is given in (5) and (6)

$$w_{i,j}^{(k+1)} = w_{i,j}^{(k)} + \alpha (1 - \beta)\, e_c(kT) + \beta\, \Delta w_{i,j}^{(k)} \tag{5}$$
$$\Delta w_{i,j}^{(k+1)} = w_{i,j}^{(k+1)} - w_{i,j}^{(k)} \tag{6}$$

where
  $w_{i,j}^{(k)}$  contents of the MCMAC cell with coordinates $i, j$ at discrete step $k$;
  $\Delta w_{i,j}^{(k)}$  change in the weight of the MCMAC cell with coordinates $i, j$ at discrete step $k$;
  $e_c(kT)$  closed loop error $y_{ref}(kT) - y_p(kT)$ at discrete step $k$;
  $\alpha$  learning constant;
  $\beta$  momentum constant.

C. Results Using Momentum Term

The same set of experiments in Section III-A was repeated using the improved MCMAC learning rule with a momentum term. Fig. 6 shows the experimental results using a learning constant of $\alpha = 0.01$ and a momentum constant of $\beta = 0.25$.

At time $t_1$, a unit pulse of 100% TPS was applied. This pulse was maintained until the vehicle speed reached 100 km/h at time $t_2$, ending the first training cycle. The second pulse was applied from time $t_3$ to $t_4$. Comparing the results in Figs. 4 and 6, the duration of the two training cycles was reduced from 22 s (without the momentum term) to 17 s (with the momentum term).

Fig. 9. The MCMAC contour surfaces after training using $\beta = 0.75$ and $\beta = 0.25$, $\alpha = 0.01$.

There was a significant reduction in the fluctuations of the controller output ctrl gear, although some localized ripple remained.

Fig. 7 shows the 3-D contour surfaces of the MCMAC after training with and without the momentum term, plotted with $y_p$ as the plant output, $e_c$ as the closed loop error, and $x_n$ as the output of each neuron. More peaks were observed on the contour surface when momentum was included in the learning process. The use of momentum allows more cells in the MCMAC to be trained within the same training cycles, which significantly improves the speed of learning.

Since significant improvement was achieved using a momentum constant of $\beta = 0.25$, another experiment was conducted to observe the training process with a larger momentum constant. Fig. 8 shows the simulation results, in which the MCMAC memory was trained with a momentum constant of $\beta = 0.75$ and a learning constant of $\alpha = 0.01$.

At time $t_1$, a unit pulse of 100% TPS was applied. This pulse was maintained until the vehicle speed reached 100 km/h at time $t_2$, ending the first training cycle. The second pulse was applied from time $t_3$ to $t_4$. Comparing the results in Figs. 6 and 8, the duration of the two training cycles was further reduced to 13 s, against 17 s when $\beta$ was set at 0.25. However, the plant output reached an engine speed of 4800 rpm at $t = 3.2$ s and continued to overshoot to 6000 rpm, the maximum engine speed, at time $t_{overshoot}$. This significantly overshot the reference input of ref rpm = 4800. In addition, very large fluctuations occurred in the controller output.

Fig. 9 shows the 3-D contour surfaces of the MCMAC after training using large and small momentum terms of $\beta_1 = 0.75$ and $\beta_2 = 0.25$, respectively. The peaks of some cells trained with a large momentum had saturated. The training of the MCMAC under large momentum resulted in cells having either the minimum output value (untrained cells) or the maximum output value (trained cells).


Fig. 10. Control results during training for $\beta = 0.25$, $\alpha = 0.01$, $N = 1$.

Under a large momentum term, MCMAC training produces control that degenerates into simple ON-OFF schedules.

Although a large momentum term achieved a short rise time in the training of the MCMAC, it produced a large amount of overshoot. In addition, the trained MCMAC degenerates into straightforward ON-OFF control, similar to the windup of a PID controller. Therefore, a general guideline in the choice of a value for the momentum term $\beta$ is that it should not produce large regions of saturation in the controller effort during the MCMAC learning process.

D. Neighborhood Learning

Instead of using a sharp neighborhood boundary, the Gaussian neighborhood function [13] can be used, as in the SOFM, to distribute the learned information to neurons surrounding the winning neuron during the learning process. The difference between the SOFM and the MCMAC learning rule is that the former uses the Kohonen learning rule [8], while the latter uses the Grossberg learning rule [4]. Likewise, the use of the Gaussian function to distribute the learned information of the winning neuron to neighboring neurons can also be incorporated into the MCMAC learning rule. The neighborhood-based MCMAC learning rule with momentum is described in (7)–(9)

$$w_{m,n}^{(k+1)} = w_{m,n}^{(k)} + h_{m,n}\left(\alpha (1 - \beta)\, e_c(kT) + \beta\, \Delta w_{m,n}^{(k)}\right)$$
$$\text{for } |m - i| \le N,\; |n - j| \le N;\quad i, j \in \mathbb{N} \tag{7}$$
$$h_{m,n} = \exp\left(-\frac{|r_{m,n} - r_{i,j}|^2}{2\sigma^2}\right) \tag{8}$$
$$\Delta w_{m,n}^{(k+1)} = w_{m,n}^{(k+1)} - w_{m,n}^{(k)} \tag{9}$$

where
  $N$  neighborhood constant;
  $|r_{m,n} - r_{i,j}|$  distance between the cell with coordinates $m, n$ and the cell with coordinates $i, j$;
  $\sigma$  radius of the Gaussian distribution;
  $h_{m,n}$  neighborhood function for the neuron with coordinates $m, n$;
  $w_{m,n}$  contents of the MCMAC cell with coordinates $m, n$ at discrete step $k$.
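A sketch of (7)–(9) under the same assumptions as the earlier snippets. The distance in (8) is taken here as the Euclidean distance between cell coordinates with unit cell spacing, i.e. $|r_{m,n} - r_{i,j}|^2 = (m - i)^2 + (n - j)^2$; this is one plausible reading rather than a detail spelled out in the paper.

```python
import math

def mcmac_update_neighborhood(mem, dw, e_c, y_p, alpha, beta, N, sigma):
    """Neighborhood-based MCMAC rule with momentum, (7)-(9): the update of
    the winning cell (i, j) is distributed to all cells within N steps,
    weighted by the Gaussian neighborhood function h of (8)."""
    i = quantize(e_c, mem.y_min, mem.y_max, mem.n)
    j = quantize(y_p, mem.y_min, mem.y_max, mem.n)
    for m in range(max(i - N, 0), min(i + N, mem.n - 1) + 1):
        for n2 in range(max(j - N, 0), min(j + N, mem.n - 1) + 1):
            h = math.exp(-((m - i) ** 2 + (n2 - j) ** 2) / (2 * sigma ** 2))  # (8)
            w_old = mem.w[m, n2]
            mem.w[m, n2] = w_old + h * (alpha * (1 - beta) * e_c + beta * dw[m, n2])  # (7)
            dw[m, n2] = mem.w[m, n2] - w_old                                          # (9)
```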

E. Results Using Neighborhood and Momentum Term

The same set of experiments conducted in Section III-C was repeated using the neighborhood-based MCMAC learning rule with momentum. The MCMAC was trained with a learning constant of $\alpha = 0.01$, momentum constant $\beta = 0.25$, neighborhood constant $N = 1$, and a neighborhood function with $\sigma = 2.0$. The results are shown in Fig. 10.

Fig. 11. MCMAC contour surfaces after training using $N = 1$ and $N = 0$; $\beta = 0.25$, $\alpha = 0.01$.

Fig. 12. MCMAC contour surfaces after training using $N = 3$ and $N = 1$; $\beta = 0.25$, $\alpha = 0.01$.

At time $t_1$, a unit pulse of 100% TPS was applied. This pulse was maintained until the vehicle speed reached 100 km/h at time $t_2$, ending the first training cycle. The second pulse was applied from time $t_3$ to $t_4$. The results were similar to those produced with only the momentum term. The plant output reached an engine speed of 3800 rpm at time $t_2$, again close to the reference input of ref rpm = 4800. However, the localized ripples in the controller output of Fig. 6 were absent with the introduction of the neighborhood-based training.

Fig. 11 shows the 3-D contour surfaces of the MCMAC after training using neighborhood constants of $N_1 = 1$ and $N_2 = 0$ (the latter does not utilize neighborhood information). Significantly more cells were visited during the two training cycles when neighborhood learning was applied. The addition of neighborhood training allows more cells to be trained, further improving the speed of learning.

Employing neighborhood training in the MCMAC allows a greater number of cells to be trained during the learning process. This reduced the localized ripples in the control effort shown in Fig. 6.


Fig. 13. MCMAC contour surface after training using $\beta = 0.25$, $\alpha = 0.01$, $N = 3$.

Indirect training of neurons through their activated neighbors reduces the difficulty and deficiency in planning the MCMAC training path. Hence, the neighborhood-trained MCMAC is able to provide control in regions adjacent to the training trajectory.

The neighborhood constant $N$ was further increased to 3 to investigate the effect of neighborhood size. Fig. 12 shows the 3-D contour surfaces of the MCMAC after training using neighborhoods of $N_1 = 3$ and $N_2 = 1$.

Fig. 12 shows that the number of cells trained in the MCMAC increased. Since the Gaussian neighborhood function $h_{m,n}$ attenuates the distribution of learned information from the active cell to distant cells, a high neighborhood constant produces a smoother surface. With the same Gaussian neighborhood function, a high neighborhood value requires more computation than smaller neighborhood constants. In addition, it loses the advantage of specialized training for individual cells, making it difficult to identify untrained regions during the design of the training sequence. Therefore, the neighborhood value should be chosen such that the neighborhood under training is not too large in relation to the total surface area. This allows easy identification of specific untrained regions directly from the MCMAC contour surface.

F. Typical Training Parameters

Further experiments were carried out to provide some guidelines for setting the training parameters of the MCMAC. These experiments were conducted using a 50 × 50 MCMAC memory as a neurocontroller for the CVT gear ratio control, using the neighborhood-based MCMAC learning rule with momentum. Figs. 13–16 show the characteristic surfaces under the same training sequence using different training parameters and numbers of iterations.

Fig. 13 shows the 3-D contour surface of the MCMAC after it had been trained for one iteration with the training parameters $\beta = 0.25$, $\alpha = 0.01$, $N = 3$, where $y_p$ is the plant output, $e_c$ is the closed loop error, and $x_n$ is the output of each cell. Figs. 14–16 show the 3-D contour surfaces of the MCMAC after it had been similarly trained for one, two, and three iterations, respectively, using the training parameters $\beta = 0.25$, $\alpha = 0.025$, $N = 5$.

Both the learning constant $\alpha$ and the momentum constant $\beta$ should be chosen to allow more cells in the MCMAC to be trained within the same training cycles. However, the choice of the momentum constant should not produce large regions of saturation in the controller effort during the MCMAC learning process. The neighborhood constant should be chosen such that the neighborhood under training is not too large in relation to the total surface area; this allows easier identification of specific untrained regions directly from the MCMAC contour surface.

Fig. 14. MCMAC contour surface after training using $\beta = 0.25$, $\alpha = 0.025$, $N = 5$.

Fig. 15. MCMAC contour surface after two training iterations using $\beta = 0.25$, $\alpha = 0.025$, $N = 5$.

Fig. 16. MCMAC contour surface after three training iterations using $\beta = 0.25$, $\alpha = 0.025$, $N = 5$.



Fig. 17. Overall results using MCMAC for CVT gear ratio control.

Fig. 18. Results using MCMAC for CVT gear ratio control.

IV. PROPOSED USE OF AVERAGED TRAPEZOIDAL OUTPUT

The proposed improvements to the MCMAC training rule address only the training rate of the MCMAC. An alternative way to improve the control performance of the MCMAC is to directly increase the number of cells in the MCMAC memory. Overlays of CMAC memory were suggested in [2] to smoothen the CMAC output, which indirectly increases the CMAC memory. Here, an improved method for cell recall using the averaged trapezoidal output (MCMAC-ATO) is proposed to improve the control performance of the MCMAC without further increasing the number of cells. A quantitative measure of the fluctuations in the control action of a trained MCMAC, based on the Pearson product moment correlation, is presented for both the proposed ATO recall and the standard recall techniques.

A. Performance of MCMAC

Fig. 17 shows the overall results of the vehicle performance obtained using the MCMAC in controlling the CVT gear ratio. The 3-D contour surface of the trained MCMAC is shown in Fig. 16. Fig. 18 shows the interval from $t_3$ to $t_4$ in greater detail.

At time $t_3$, a step input of ref rpm = 4800 was applied by increasing the throttle position to 100%. The engine speed changed with changes in the gear ratio to allow the vehicle to accelerate. As the engine speed approached the reference 4800 rpm, the MCMAC reduced the CVT gear ratio such that the engine rpm was maintained at 4800 rpm with minimum overshoot. At time $t = 14$ s, the ref rpm was reduced to 2000 rpm, and the throttle position was decreased to 30%. The MCMAC then controlled the CVT gear ratio such that the engine rpm was maintained at 2000 rpm. However, numerous controller oscillations can be observed while the MCMAC was controlling the CVT gear ratio to maintain the desired engine rpm.

These controller fluctuations are inevitable using the standard recall rule, because the active cell changes frequently on the slope of the MCMAC characteristic surface. With the 50 × 50 MCMAC configuration of Fig. 16, approximately six to eight cells form the "characteristic slope of the hill" that controls the CVT gear ratio during stable engine speed. Unless the number of cells on the slope is increased, controller fluctuations will occur as the controller switches from one active cell to another.

Increasing the size of the MCMAC memory is a possible solution to reduce this fluctuation. However, based on the 3-D contour surface of the MCMAC shown in Fig. 16, the outputs of most MCMAC cells are stuck at either the maximum or the minimum control value. Thus, increasing the size of the MCMAC memory may not be cost effective.


Fig. 19. Averaged trapezoidal output from four MCMAC elements.

Fig. 20. Overall results using MCMAC-ATO for CVT gear ratio control.

B. Proposed MCMAC with ATO

The output control uses only one active MCMAC cell from the MCMAC memory to control the plant at any instant. This single active cell, which corresponds to a winning neuron, is selected by discretizing $e_c(kT)$ and $y_p(kT)$ as the indices to the location of the cell in the MCMAC memory. However, when $e_c(kT)$ and $y_p(kT)$ change slightly, this discretization process maps the indices to a new MCMAC cell and subsequently introduces a change in the output. The magnitude of this change, which is of the order of the difference between the outputs of the previously active and currently active MCMAC cells, is significant when the MCMAC memory size is small. This sudden change in the output produces control fluctuations in the output of the MCMAC.

Fig. 19 shows a section of four MCMAC cells from the characteristic surface of a trained MCMAC, where $de$ and $dy$ denote the changes in $e_c$ and $y_p$, respectively, for the corresponding changes in the indices of the selected MCMAC element. The quantization error is $\frac{1}{2}de$ and $\frac{1}{2}dy$ for $e_c$ and $y_p$, respectively. Assuming $y_p$ is constant, a sudden change in the output of magnitude $|w_{i+1,j+1} - w_{i,j+1}|$ occurs when $e_c$ increases due to quantization error.

Therefore, to ensure a smooth change in the output of the MCMAC, a new method of recall that computes the output using the averaged trapezoidal output (ATO), termed MCMAC-ATO, is proposed in this paper. In this method, the outputs of all four neighboring MCMAC elements that bound the discretization of $y_p$ and $e_c$ ($w_{i,j}$, $w_{i+1,j}$, $w_{i,j+1}$, and $w_{i+1,j+1}$, as shown in Fig. 19) are used to determine the final MCMAC output. The final output $w$ can be obtained using (10)–(12)

$$w_a = \frac{\partial y_p}{dy_p}\,(w_{i,j} - w_{i,j+1}) + w_{i,j+1} \tag{10}$$
$$w_b = \frac{\partial y_p}{dy_p}\,(w_{i+1,j} - w_{i+1,j+1}) + w_{i+1,j+1} \tag{11}$$
$$w = \frac{\partial e_c}{de_c}\,(w_b - w_a) + w_a \tag{12}$$


Fig. 21. Results using MCMAC-ATO for CVT gear ratio control.

where
  $e_c^{(i)}$  value of $e_c$ at $e_c$ coordinate $i$;
  $y_p^{(j)}$  value of $y_p$ at $y_p$ coordinate $j$;
  $\partial e_c$  $e_c(kT) - e_c^{(i)}$;
  $\partial y_p$  $y_p(kT) - y_p^{(j)}$;
  $de_c$  $e_c^{(i+1)} - e_c^{(i)}$;
  $dy_p$  $y_p^{(j+1)} - y_p^{(j)}$;
  $w_{i,j}$  output weight of the MCMAC cell with coordinates $i, j$.
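Equations (10)–(12) amount to bilinear interpolation over the four bounding cells. A sketch, again assuming a shared input range per the earlier snippets and following the sign convention of (10)–(12) as written:

```python
def ato_recall(mem, e_c, y_p):
    """MCMAC-ATO recall, (10)-(12): interpolate the outputs of the four
    cells bounding (e_c, y_p) instead of reading one winning cell."""
    # unclamped grid coordinates of the two inputs
    u = mem.n * (e_c - mem.y_min) / (mem.y_max - mem.y_min)
    v = mem.n * (y_p - mem.y_min) / (mem.y_max - mem.y_min)
    i = min(max(int(u), 0), mem.n - 2)  # lower cell indices, kept one
    j = min(max(int(v), 0), mem.n - 2)  # row/column away from the edge
    fe = u - i  # fractional offset, the ratio (partial e_c) / de_c
    fy = v - j  # fractional offset, the ratio (partial y_p) / dy_p
    w_a = fy * (mem.w[i, j] - mem.w[i, j + 1]) + mem.w[i, j + 1]              # (10)
    w_b = fy * (mem.w[i + 1, j] - mem.w[i + 1, j + 1]) + mem.w[i + 1, j + 1]  # (11)
    return fe * (w_b - w_a) + w_a                                             # (12)
```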

C. Performance of MCMAC-ATO

Fig. 20 shows the overall results of the vehicle performance obtained using the MCMAC-ATO in controlling the CVT gear ratio. The 3-D contour surface of the trained MCMAC is shown in Fig. 16. Fig. 21 shows the interval from time $t_3$ to $t_4$ in greater detail.

The controller oscillations are greatly reduced while the MCMAC controls the CVT gear ratio to maintain the engine rpm. The square of the Pearson product moment correlation between the controller and the actuator outputs is given in (13). This is used to compare the amount of control fluctuation produced by the MCMAC using the standard recall and the MCMAC-ATO recall techniques

$$r = \frac{\left(n \sum XY - \sum X \sum Y\right)^2}{\left(n \sum X^2 - \left(\sum X\right)^2\right)\left(n \sum Y^2 - \left(\sum Y\right)^2\right)} \tag{13}$$

where
  $n$  total number of samples;
  $X$  control out (controller output);
  $Y$  gear ratio (actuator output).
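Taking (13) as the squared coefficient, consistent with the text's "square of the Pearson product moment correlation," the measure could be computed as in this sketch:

```python
import numpy as np

def pearson_r2(x, y):
    """Square of the Pearson product moment correlation, per (13):
    x is the controller output series, y the actuator output series."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    num = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    den = (n * np.sum(x**2) - np.sum(x)**2) * (n * np.sum(y**2) - np.sum(y)**2)
    return num**2 / den
```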

For the experiment from $t = 0$ s to $t = 17.96$ s, the following Pearson coefficients were obtained:
$$r_{\text{MCMAC}} = 0.2394$$
$$r_{\text{MCMAC-ATO}} = 0.6082.$$

The $r_{\text{MCMAC-ATO}}$ value shows greater correlation between the controller output of the MCMAC-ATO and the actuator output. The MCMAC-ATO has demonstrated its ability to reduce the control fluctuations caused by quantization. It improves the MCMAC recall process without incurring the additional overhead that would normally be required by increased MCMAC resolution.

V. CONCLUSIONS

This paper proposed improvements to the learning and recall processes of the modified cerebellar articulation controller (MCMAC) neural controller. A detailed discussion of improving the MCMAC learning process with a momentum term and neighborhood training was given. The characteristic surface of an MCMAC trained using the original learning rule for simulated control of the CVT gear ratio was contrasted against one trained using the improved learning rule. Results showed that the improved learning rule significantly decreased the time required for training and increased the number of trained MCMAC cells. An improvement to the recall process using the averaged trapezoidal output (MCMAC-ATO) was also proposed. The control performance of the original MCMAC recall process was compared against the proposed MCMAC-ATO using the square of the Pearson product moment correlation between the controller and the actuator outputs. Results showed that the proposed MCMAC-ATO significantly reduced the fluctuations in the control action of the MCMAC and partially addressed the issue of fidelity in the MCMAC memory resolution.

REFERENCES

[1] J. S. Albus, "A new approach to manipulator control: The Cerebellar Model Articulation Controller (CMAC)," Trans. ASME, J. Dyn. Syst., Meas., Contr., vol. 63, no. 3, pp. 220–227, 1975.

[2] M. Brown and C. Harris, Neurofuzzy Adaptive Modeling and Control. Englewood Cliffs, NJ: Prentice-Hall, 1994.

[3] S. Commuri, S. Jagannathan, and F. L. Lewis, "CMAC neural network control of robot manipulators," J. Robot. Syst., vol. 14, no. 6, pp. 465–482, 1997.

[4] S. Grossberg, "Nonlinear neural networks: Principles, mechanisms, and architectures," Neural Networks, vol. 1, no. 1, pp. 17–61, 1988.

[5] Y. Hirose, K. Yamashita, and S. Hijiya, "Backpropagation algorithm which varies the number of hidden units," Neural Networks, vol. 4, no. 1, pp. 61–66, 1991.

[6] R. A. Jacobs, "Increased rates of convergence through learning rate adaptation," Neural Networks, vol. 1, no. 4, pp. 295–307, 1988.

[7] J. S. Ker, Y. H. Kuo, R. C. Wen, and B. D. Liu, "Hardware implementation of CMAC neural network with reduced storage requirements," IEEE Trans. Neural Networks, vol. 8, no. 6, pp. 1545–1556, 1997.

[8] T. Kohonen, Self-Organization and Associative Memory, 3rd ed. New York: Springer-Verlag, 1989.


[9] L. G. Kraft and D. S. Campagna, "A comparison between CMAC neural network control and two traditional adaptive control systems," IEEE Contr. Syst. Mag., pp. 36–43, Apr. 1992.

[10] S. H. Lane, D. A. Handelman, and J. J. Gelfand, "Theory and development of higher-order CMAC neural networks," IEEE Contr. Syst. Mag., pp. 23–30, Apr. 1992.

[11] C. S. Lin and C. T. Chiang, "Learning convergence of CMAC technique," IEEE Trans. Neural Networks, vol. 8, no. 6, pp. 1281–1292, 1997.

[12] W. T. Miller, G. H. Glanz, and L. G. Kraft, "CMAC: An associative neural network alternative to backpropagation," Proc. IEEE, vol. 78, no. 10, pp. 1561–1567, 1990.

[13] W. D. Patterson, Artificial Neural Networks: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1996.

[14] H. C. Quek and P. W. Ng, "Realization of neural network controllers in integrated process supervision," Int. J. Artif. Intell. Eng., vol. 10, no. 2, pp. 135–142, 1996.

[15] F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington, DC: Spartan, 1961.

[16] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing: Explorations in the Microstructures of Cognition, vol. 1, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, 1986.

[17] A. Sato, "An analytical study of the momentum term in a backpropagation algorithm," in Proc. ICANN, Espoo, Finland, 1991, pp. 617–622.

[18] S. M. Song and Y. Lin, "Learning hybrid position/force control of a quadruped walking machine using a CMAC neural network," J. Robot. Syst., vol. 14, no. 6, pp. 483–499, 1997.

[19] T. Tollenaere, "SuperSAB: Fast adaptive backpropagation with good scaling properties," Neural Networks, vol. 3, pp. 561–573, 1990.

[20] Y. Wong and A. Sideris, "Learning convergence in the cerebellar model articulation controller," IEEE Trans. Neural Networks, vol. 3, pp. 115–121, 1992.

[21] J. M. Zurada, Introduction to Artificial Neural Systems. Singapore: Info Access Distribution Pte Ltd.