
Organization of Receptive Fields in Networks with Hebbian Learning: The Connection Between Synaptic and Phenomenological Models*

Harel Shouval and Leon N Cooper
Department of Physics, The Department of Neuroscience and
The Institute for Brain and Neural Systems
Box 1843, Brown University
Providence, R.I., 02912
[email protected]
28, 1996

* To appear in Biological Cybernetics, vol. 73.

Abstract

In this paper we address the question of how interactions affect the formation and organization of receptive fields in a network composed of interacting neurons with Hebbian-type learning. We show how to partially decouple single-cell effects from network effects, and how some phenomenological models can be seen as approximations to these learning networks. We show that the interaction affects the structure of receptive fields. We also demonstrate how the organization of different receptive fields across the cortex is influenced by the interaction term, and that the type of singularities depends on the symmetries of the receptive fields.

1 Introduction: Synaptic Plasticity and the Structure of the Visual Cortex

Recently Erwin et al. (94) performed a comparison between the predictions of many models which attempt to explain the organization of receptive fields in the visual cortex and cortical maps obtained by optical imaging (Blasdel, 1992; Bonhoeffer and Grinvald, 1991). In this paper we will consider two classes of such models: detailed synaptic models (Miller, 1992; Miller, 1994; Linsker, 1986a) and iterative phenomenological models (Swindale, 1982; Cowan and Friedman, 1991). Erwin et al. (94) have examined these classes of models and claim that the phenomenological models fit the experimental observations much better than the synaptic models.

Detailed synaptic models are fairly complex systems with a large number of parameters. The experience of the last couple of decades in the physics of complex systems has shown that phenomenological models are extremely important in explaining the properties of complex systems.

For complex systems of many interacting particles it is typically hopeless to try to solve the exact microscopic theory of the interacting system; however, in many cases coarse-grained phenomenological models, which do not take into account most of the microscopic detail, do manage to predict the properties of such systems. It seems that if this is the case for the relatively simple complex systems of the physical sciences, then it might also be the case for the much more complex systems of neurobiology.

In this paper we show how the detailed synaptic models can be approximated by much simpler models that are related to some of the phenomenological models. These simpler models can be used to gain a better understanding of the properties of the complex synaptic models, and possibly to understand where the reported differences in their properties (Erwin et al., 1994) arise from.

We analyze a cortical network of learning neurons; each of the neurons in this network forms plastic bonds with the sensory projections and fixed bonds with other cortical neurons. The learning rule assumed herein is a normalized Hebbian rule (Oja, 1982; Miller and MacKay, 1994). The type of network architecture assumed here is similar to that used in many other synaptic models of cortical plasticity (von der Malsburg, 1973; Nass and Cooper, 1975; Linsker, 1986b; Miller et al., 1989; Miller, 1994, etc.). Rather than presenting a new model, we have attempted in this paper to analyze the behavior of these existing models and to map them onto much simpler models that we might be able to understand better. When this analysis is performed, it becomes clear that in order to understand the behavior of such networks it is first necessary to understand how single neurons using the same learning rule behave.

The purpose of this paper is to start giving answers to two related questions about such networks.

(a) How do the lateral interactions affect the receptive fields of single cortical neurons? In particular, can cortical interactions produce oriented receptive fields where the single-cell model produces radially symmetric receptive fields?

(b) How do the cortical interactions affect the organization of different receptive fields across the cortex?

A similar analysis has been used by Rubinstein (94) to answer the second question. His analysis differs from ours in that it concentrates on a periodic homogeneous network and does not examine the effect of the lateral interactions on receptive field formation.

2 A self-organizing network model

We assume a network of interconnected neurons which are connected with modifiable synapses to the input layer. The learning rule examined is a normalized Hebbian rule sensitive only to the second-order statistics. The output c(r) of each neuron in this model is defined as

c(r) = \sum_{x,\alpha} I(r-x)\, A(\alpha-\beta(x))\, d(\alpha)\, m(x,\alpha),

where \beta(x) is the center of the symmetric arbor function A and is a function of x; I is the cortical interaction function; m are the weights and d the input vectors; \alpha denotes points in input space, while r and x denote points in output space.
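To make the notation concrete, here is a minimal numerical sketch of the output equation on a one-dimensional toy geometry. The sizes, the Gaussian interaction, and the random weights and input are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Toy evaluation of c(r) = sum_{x,alpha} I(r-x) A(alpha-beta(x)) d(alpha) m(x,alpha)
# on a 1-D geometry. All sizes and parameter values below are assumptions
# made only for this illustration.

n_out, n_in, alpha_max, sigma = 20, 30, 5.0, 2.5

xs = np.arange(n_out)                    # cortical positions x (and r)
alphas = np.arange(n_in)                 # input-space positions alpha
beta = np.linspace(0, n_in - 1, n_out)   # arbor centers beta(x)

# Step arbor: A(alpha - beta(x)) = 1 inside the arbor radius, 0 outside.
A = (np.abs(alphas[None, :] - beta[:, None]) <= alpha_max).astype(float)

rng = np.random.default_rng(0)
m = rng.normal(size=(n_out, n_in)) * A   # plastic weights m(x, alpha)
d = rng.normal(size=n_in)                # one input vector d(alpha)

def output(r):
    """c(r): interaction-weighted sum of the neurons' feedforward responses."""
    feedforward = (A * m) @ d                            # sum over alpha
    interaction = np.exp(-(r - xs)**2 / (2 * sigma**2))  # Gaussian I(r - x)
    return np.sum(interaction * feedforward)             # sum over x

print([round(float(output(r)), 3) for r in range(3)])
```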

We will now present an energy function H that is minimized by a Hebbian form of learning. Thus the fixed points of a network with Hebbian learning correspond to minima of the energy function.

The energy function H we will try to minimize is

H = -\sum_r c^2(r) - \gamma(m)
  = -\sum_{r,x,x'} I(r-x)\, I(r-x') \sum_{\alpha,\alpha'} A(\alpha-\beta(x))\, A(\alpha'-\beta(x'))\, m(x,\alpha)\, m(x',\alpha')\, d(\alpha)\, d(\alpha') - \gamma(m),   (1)

where \gamma is an abstract constraint on the allowed values of m.

We now perform the standard procedure of turning this into a correlational model by taking the ensemble average over the input environment, and also taking the sum over r. Thus

H = -\sum_{x,x'} \tilde{I}(x-x') \sum_{\alpha,\alpha'} A(\alpha-\beta(x))\, A(\alpha'-\beta(x'))\, m(x,\alpha)\, m(x',\alpha')\, Q(\alpha-\alpha') - \gamma(m),   (2)

where \tilde{I}(x-x') = \sum_r I(r-x) I(r-x') is the new effective interaction term, and Q(\alpha-\alpha') = \langle d(\alpha) d(\alpha') \rangle is the correlation function.

From now on, in order to simplify matters, we assume a step arbor function of the form

A(x) = \begin{cases} 1 & |x| \le \alpha_{max} \\ 0 & |x| > \alpha_{max} \end{cases}   (3)

Using this function is justified since it can produce oriented receptive fields in both network models (Miller, 1994) and single-cell models (Liu, 1994).

The gradient descent dynamics implied by this Hamiltonian are

\frac{dm(x,\alpha)}{dt} = \begin{cases} \sum_{x'} \sum_{|\alpha'-\beta(x')| < \alpha_{max}} \tilde{I}(x-x')\, m(x',\alpha')\, Q(\alpha-\alpha') & |\alpha-\beta(x)| < \alpha_{max} \\ 0 & |\alpha-\beta(x)| > \alpha_{max} \end{cases}   (4)

This equation is the same as Miller's (1994) synaptic plasticity equation for the step arbor function. It is important to note that only sequential or random dynamics are guaranteed to minimize H.¹

The method we use now is to represent the weights m in terms of a complete set, that is, m(\alpha;x) = \sum_{ln} a_{ln}(x) m_{ln}(\alpha;x). We have chosen this complete set to be the solutions of the non-interacting eigenvalue problem

\sum_{|\alpha'-\beta| \le \alpha_{max}} m_{ln}(\alpha';x)\, Q(|\alpha-\alpha'|) = \lambda_{ln}\, m_{ln}(\alpha-\beta;x).   (5)

¹ A simple example which demonstrates that parallel dynamics may not minimize H is the oscillation between staggered anti-ferromagnetic solutions in a nearest-neighbor ferromagnet.
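A minimal simulation sketch of the dynamics of equation (4) follows, using random sequential site updates and an explicit square normalization in place of the abstract constraint \gamma(m). The one-dimensional geometry, the exponential correlation function Q, and the step size are assumptions made only for this example.

```python
import numpy as np

# Random sequential gradient dynamics for equation (4):
#   dm(x,alpha)/dt = sum_{x'} I~(x-x') sum_{alpha'} m(x',alpha') Q(alpha-alpha'),
# restricted to synapses inside the arbors, followed by square
# normalization of the updated weight vector (an Oja-like constraint).

n_out, n_in, alpha_max, eta = 20, 30, 5.0, 0.05
xs, alphas = np.arange(n_out), np.arange(n_in)
beta = np.linspace(0, n_in - 1, n_out)

A = (np.abs(alphas[None, :] - beta[:, None]) <= alpha_max).astype(float)
I_eff = np.exp(-(xs[:, None] - xs[None, :])**2 / (2 * 2.5**2))  # I~(x - x')
Q = np.exp(-np.abs(alphas[:, None] - alphas[None, :]) / 4.0)    # toy Q(alpha - alpha')

rng = np.random.default_rng(0)
m = rng.normal(size=(n_out, n_in)) * A

for step in range(2000):
    x = rng.integers(n_out)                 # random sequential choice of site
    dm = (I_eff[x] @ (m * A) @ Q) * A[x]    # right-hand side of equation (4)
    m[x] += eta * dm
    m[x] /= np.linalg.norm(m[x])            # square normalization of m(x, .)

# In this discrete setting, the non-interacting eigenfunctions of
# equation (5) are just the eigenvectors of Q restricted to one arbor:
lam, U = np.linalg.eigh(Q[:11, :11])        # an 11-sample arbor window
```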

In a radially symmetric environment the eigenfunctions take the form

m_{ln}(\alpha;x) = m_{ln}(\alpha; \phi_{ln}(x)) = e^{il\phi_{ln}(x)}\, U_{ln}(\alpha),

where \phi_{ln}(x) is an arbitrary phase which is undetermined, due to the radially symmetric nature of the correlation function.

Using this expansion, and taking the sum over \alpha, the Hamiltonian takes the form

H = -\sum_{x,x'} \sum_{ln,l'n'} \tilde{I}(x-x')\, \lambda_{ln}\, a_{ln}(x)\, a_{l'n'}(x')\, O(l,n,l',n'; \phi_{ln}(x), \phi_{l'n'}(x'); \beta(x), \beta(x')) - \gamma(a),   (6)

where O, the overlap between two receptive fields, is

O(l,n,l',n'; \phi_{ln}(x), \phi_{l'n'}(x'); \beta(x), \beta(x')) = \sum_{\alpha'} m_{ln}(\alpha'-\beta(x); \phi_{ln}(x))\, m_{l'n'}(\alpha'-\beta(x'); \phi_{l'n'}(x')).   (7)

It is important to notice that, in general, O depends on the distance between the two arbor functions, as well as on the shapes and phases of the two eigenstates.

3 Analysis of the Common Input model

We will now make the simplification \beta(x) = \beta = 0, that is, that all neurons have an arbor function with the same center point. This simplification makes the analysis easier, and is a good approximation when the number of output neurons is much greater than the dimensionality of the input.

It is important to notice, though, that the conclusions we deduce from this common input model may not be applicable to more complex models. In particular, in other models O may change sign as a function of |x-x'|, and thus turn an excitatory model, i.e., one in which I \ge 0, into one in which there is effectively an inhibitory portion to the interaction.

For the above choice of arbor function the overlap O takes a very simple form,

O(l,n,l',n'; \phi_{ln}(x), \phi_{ln}(x')) = \delta_{ll'}\, \delta_{nn'}\, e^{il[\phi_{ln}(x)-\phi_{ln}(x')]}.

Thus for the common input model the Hamiltonian is

H = -\sum_{x,x',l,n} \tilde{I}(x-x')\, \lambda_{ln}\, a_{ln}(x)\, a_{ln}(x')\, e^{il[\phi_{ln}(x)-\phi_{ln}(x')]} - \gamma(a).   (8)

When observing equation 8 it is easy to see that the eigenvalues of the correlation function play a major role in the resulting Hamiltonian. They can serve as an importance parameter by which approximations can be made. Hence the approximation proposed here is to neglect all terms except those multiplying the largest eigenvalues; the resulting models will be termed herein simplified models.

In order to make things concrete, we assume that \lambda_{01} = \lambda_0 and \lambda_{11} = \lambda_1 are the largest eigenvalues and are considerably larger than the other eigenvalues.

This is not an artificial choice, since in many circumstances this indeed seems to be the case (MacKay and Miller, 1990; Liu and Shouval, 1994; Shouval and Liu, 1996), although in other regimes other eigenstates will dominate. With this approximation the Hamiltonian takes the form

H = -\sum_{r,r'} \tilde{I}(r-r') \left\{ \lambda_0\, a(r)\, a(r') + \lambda_1\, b(r)\, b(r') \cos[\phi(r)-\phi(r')] \right\} - \gamma(a,b),

where a = a_{01} and b = a_{11}. Different types of constraints can be considered. Given the constraint that the weights are square normalized, i.e., that \sum_\alpha m(\alpha;x)^2 = 1,² which for the approximate model takes the form a^2(r) + b^2(r) = 1, the model is equivalent to the non-isotropic Heisenberg spin model embedded in two dimensions. This can be made clearer if it is rewritten in two equivalent forms. If we set a(r) = S_z(r) = \cos(\theta(r)), b(r) = \sin(\theta(r)), S_x(r) = b(r)\cos(\phi(r)) and S_y(r) = b(r)\sin(\phi(r)), then the Hamiltonian takes the form

H = -\sum_{r,r'} \tilde{I}(r-r') \left\{ \lambda_0\, S_z(r) S_z(r') + \lambda_1 [S_x(r) S_x(r') + S_y(r) S_y(r')] \right\},   (9)

subject to the constraint S_x^2 + S_y^2 + S_z^2 = 1.

This can also be put in the closed form

H = -\sum_{r,r'} \tilde{I}(r-r') \{ \lambda_0 \cos(\theta(r)) \cos(\theta(r')) + \lambda_1 \sin(\theta(r)) \sin(\theta(r')) [\cos(\phi(r)) \cos(\phi(r')) + \sin(\phi(r)) \sin(\phi(r'))] \},   (10)

which is exactly the non-isotropic Heisenberg model of statistical mechanics.

It can be seen that each neuron is associated with only two dynamical variables, \theta and \phi; thus this approximation has resulted in a major simplification of the model.

If \lambda_1 \gg \lambda_0, we can discard the terms multiplying \lambda_0;³ this becomes an XY model and is equivalent to the phenomenological model proposed by Cowan and Friedman (1991). In this case the Hamiltonian takes the simple form

H = -\sum_{r,r'} \tilde{I}(r-r') \cos[\phi(r)-\phi(r')],   (11)

and the dynamics are

\dot{\phi}(r) = -\sum_{r'} \tilde{I}(r-r') \sin[\phi(r)-\phi(r')].   (12)

Another well-known phenomenological model is the Swindale model (82) for the development of orientation selectivity; we will show that it is similar to the Cowan and Friedman model. Swindale defines the complex variable Z(r) = S_x(r) + i S_y(r), or Z(r) = b(r) e^{i\phi(r)}.⁴

² This is the constraint to which the Oja (82) rule converges.
³ This is the same type of approximation that leads from equation 8 to equation 10. Its validity in this case can be verified by examining figures 3 and 5.
⁴ This angle is different from the one Swindale chooses for the orientation angle; he defines the orientation angle \phi_s as \phi_s = \frac{1}{2}\phi.

For this variable he defines the dynamics

\dot{Z}(r) = \sum_{r'} I(r-r')\, Z(r')\, (1 - |Z(r)|).   (13)

Thus for the variables \phi(r) = \tan^{-1}(S_y/S_x) and b(r) = |Z(r)| the dynamics are

\dot{\phi}(r) = -(1 - b(r)) \sum_{r'} I(r-r') \sin[\phi(r)-\phi(r')]   (14)

and

\dot{b}(r) = (1 - b(r))\, b(r) \sum_{r'} I(r-r') \cos[\phi(r)-\phi(r')].   (15)

Comparing equations 12 and 14, we see that for b(r) < 1 the dynamics of the angle \phi in the Swindale model are parallel to those in the Cowan model. However, when b(r) = 1 the Swindale dynamics reach a fixed point, and when b(r) > 1 they are anti-parallel to the Cowan and Friedman dynamics. The dynamics for b(r) when b(r) < 1 are such that if h(r) = \sum_{r'} I(r-r') \cos[\phi(r)-\phi(r')] > 0, then b(r) \to 1. This is the common case, since the dynamics for \phi(r) try to maximize h(r); however, in some cases, such as near orientation singularities, this may not be possible, and in those cases b(r) \to 0.

Thus it can be expected that if the Swindale simulations are started at small values of b(r), and the learning rate is slow enough, the fixed points of the orientation maps will be similar to the Cowan and Friedman fixed points.

In synaptic space the constraint imposed in the Swindale model is equivalent to stopping the update of each neuron once its synaptic weights become normalized. This seems a non-local constraint, although it could possibly be made local by a procedure similar to the one Oja used to attain square normalization. The final state of a Swindale-type model can depend on the initial conditions and on the learning rate \eta; thus the Swindale model has extra (spurious) fixed points which the Cowan and Friedman model does not possess.

Another possible choice for the normalization is to set a + b = 1; then the approximate Hamiltonian takes the form

H = -\sum_{r,r'} \tilde{I}(r-r') \left\{ \lambda_0\, a(r)\, a(r') + \lambda_1 [1-a(r)][1-a(r')] \cos[\phi(r)-\phi(r')] \right\}.

It is important to note that this Hamiltonian is not bounded, as may be the case when the von der Malsburg (73) type of normalization is used. This choice of normalization is therefore not valid for our purpose.

In the following sections we will use the simplified model, as defined in equations 9 and 10, to answer the questions posed in the introduction.

4 The Excitatory case I \ge 0

In the excitatory case we know that the global minimum of the common input energy function (eq. 8), as well as of the approximation (eq. 10), occurs when all receptive fields are equivalent to the principal component and have the same orientation.
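As a concrete sketch of the simulations described next, the following code performs random sequential gradient descent on the spin form of the Hamiltonian (equation 10) with a purely excitatory Gaussian interaction. The 30 by 30 grid, \lambda_1/\lambda_0 = 1.3 and \sigma = 2.5 follow the figure 1 simulations; the learning rate and iteration count are illustrative assumptions.

```python
import numpy as np

# Random sequential gradient descent on equation (10):
#   H = -sum I~(r-r') { lam0 cos(th)cos(th') + lam1 sin(th)sin(th') cos(phi-phi') }
# with a purely excitatory Gaussian interaction (the figure 1 setting).

L, eta, lam0, lam1, sigma = 30, 0.1, 1.0, 1.3, 2.5
yy, xx = np.mgrid[0:L, 0:L]
rng = np.random.default_rng(0)
theta = rng.uniform(0, np.pi, (L, L))    # mixing angle: a = cos(theta), b = sin(theta)
phi = rng.uniform(0, 2 * np.pi, (L, L))  # orientation phase

def local_field(i, j):
    """Interaction-weighted sums h0, hx, hy felt by site (i, j)."""
    w = np.exp(-((xx - j)**2 + (yy - i)**2) / (2 * sigma**2))
    w[i, j] = 0.0                        # no self-coupling
    h0 = np.sum(w * np.cos(theta))
    hx = np.sum(w * np.sin(theta) * np.cos(phi))
    hy = np.sum(w * np.sin(theta) * np.sin(phi))
    return h0, hx, hy

# One "iteration" in the text's sense = L*L random single-site updates.
for step in range(50 * L * L):           # roughly 50 iterations
    i, j = rng.integers(L), rng.integers(L)
    h0, hx, hy = local_field(i, j)
    g_theta = (-lam0 * np.sin(theta[i, j]) * h0
               + lam1 * np.cos(theta[i, j])
               * (hx * np.cos(phi[i, j]) + hy * np.sin(phi[i, j])))
    g_phi = lam1 * np.sin(theta[i, j]) * (hy * np.cos(phi[i, j])
                                          - hx * np.sin(phi[i, j]))
    theta[i, j] += eta * g_theta         # -dH/dtheta at the chosen site
    phi[i, j] += eta * g_phi             # -dH/dphi at the chosen site
```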

Since we know what the global minimum is, it is interesting to see how simulations of the simplified model manage to reach this minimum. Doing this is a test of our procedures, but it also has interesting implications for simulations of detailed synaptic models of the Miller and Linsker type.

In figure 1 we show the time evolution of the excitatory simplified model (equation 10) when \lambda_1/\lambda_0 = 1.3, i.e., the oriented solution has the higher eigenvalue. In these simulations we used a Gaussian interaction function with standard deviation \sigma = 2.5. The dynamics used were random gradient descent dynamics. We define one iteration in this network as the number of random updates required to update each neuron once on average.

We can see that the global minimum is indeed reached; however, for the parameters we have used, which we have tried to optimize, it takes several hundred iterations. The neurons attain their maximal selectivity quickly, most of them within 30 iterations; in comparison, it takes a very long time for the neurons to become parallel.

Figure 1: Time evolution of orientation in an excitatory network. The images show the orientations in a 30 by 30 network of neurons, run for 5, 15, 30, 100, 400 and 600 iterations, displayed from left to right and top to bottom. The direction of each line codes orientation, and its intensity codes the degree of orientation selectivity. A great majority of the cells have already attained a high degree of selectivity at 100 iterations, but it takes several hundred more iterations for them to become completely parallel.

Simulations of the detailed synaptic models, which are much more complex, were typically run for a smaller number of iterations (Miller, 1994). In the Miller model the learning rule used was unstable; therefore saturation limits were set on the synapses. When synapses reached these limits they were frozen; these types of bounds we term sticky bounds. When a high fraction of the synapses become saturated, the simulations are terminated.

Since in our simulations neurons reach their maximal degree of orientation selectivity long before they become parallel, it follows that if the sticky bounds are not used, and simulations are run for more iterations, the resulting orientation maps may be quite different. Thus sticky boundaries are more than a computational convenience.

Another difference between our simulations and the detailed synaptic simulations is that we used random sequential dynamics, whereas parallel dynamics have been used in the simulations of the complete synaptic models (Miller, 1994; Linsker, 1986b); that is, all synapses in the network are updated synchronously, at once. Parallel dynamics are computationally convenient, but they may alter the final state the network reaches, and in particular they can prevent the model from converging to an energy minimum. If parallel dynamics are assumed, then rotating wave solutions may occur, as was shown by Paullet and Ermentrout (94). They analyzed equations similar to equations 10-14, which occur in several other model physical systems, and found that they have a stable rotating wave solution.

These types of continuous spin models, even with only excitatory connections, can attain vortex states as well if a finite amount of noise is introduced into the dynamics (Kosterlitz and Thouless, 73). These vortex states, however, are fluctuations in the statistical mechanics sense, and therefore move in time like the rotating phase solutions. In this sense they are quite different from the vortex states which, as we will show, can be attained in models that include inhibition.

5 Interactions with Inhibition

In the case where there is some inhibition in the network, the situation is more complex, and also more interesting. In this case we do not know what the global minimum is; we will therefore use simulations of the simplified models to examine this question numerically.

Using the simplified models has the great advantage of reduced complexity, and therefore requires much less computation; furthermore, quantities such as orientation selectivity and orientation angle have a transparent definition. On the other hand, there is the risk that the approximation is not valid. This might happen, for example, if when running the full model the final receptive fields had strong components of the less prominent eigenfunctions, which are not represented in the approximation.

In order to examine the amount of single-cell symmetry breaking, we define an order parameter rms(a) = \sqrt{\langle a^2 \rangle}, where the average is taken over all network sites. This order parameter shows how much the single-cell receptive fields in the network deviate from those of non-interacting single cells. Since in the all-excitatory case the values of rms(a) should be like those in the non-interacting case, we can assess the effect of the interaction by comparing the values of rms(a) for interactions with inhibition to those with only excitation. Strong effects of the interaction may also imply that the approximation breaks down.

In our simulations we used an interaction function of the form

\tilde{I}(r) = \exp[-r^2/(2\sigma^2)] - I_N \exp[-r^2/(2(2\sigma)^2)],

where we have fixed \sigma = 2.5; hence, for simplicity, we have varied only the value of I_N. Cross sections of the different interaction terms used are shown in figure 2.

The cases we have examined are Excitatory, I_N = 0.0; Balanced, I_N = 0.25, where the integral of the interaction function is zero; Net Inhibitory, I_N = 0.5, in which there is more inhibition than excitation; and Net Excitatory, I_N = 0.125, where there is more excitation than inhibition. We have run all of our simulations for at least 1500 iterations and tested the stability of the final state, in order to make sure that we have indeed reached a stable fixed point.

Figure 2: Cross sections of the different types of interaction functions of the form \tilde{I}(r) = \exp[-r^2/(2\sigma^2)] - I_N \exp[-r^2/(2(2\sigma)^2)]: Excitatory, I_N = 0.0, i.e., there is only excitation; Balanced, I_N = 0.25, i.e., inhibition and excitation are balanced; Net Inhibitory, I_N = 0.5, more inhibition than excitation; and Net Excitatory, I_N = 0.125, more excitation than inhibition.
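The four interaction profiles and the order parameter are simple to write out explicitly; a sketch follows, with \sigma = 2.5 and the I_N values from the text (the sample loop and print format are incidental).

```python
import numpy as np

# The interaction profiles of figure 2 and the rms(a) order parameter
# of figure 3. sigma = 2.5 and the I_N values follow the text.
# In two dimensions the integral of I~ vanishes at I_N = 0.25, since the
# surround Gaussian is twice as wide and so carries four times the area.

sigma = 2.5
cases = {"Excitatory": 0.0, "Net Excitatory": 0.125,
         "Balanced": 0.25, "Net Inhibitory": 0.5}

def I_eff(r, I_N):
    """Difference-of-Gaussians interaction I~(r)."""
    return (np.exp(-r**2 / (2 * sigma**2))
            - I_N * np.exp(-r**2 / (2 * (2 * sigma)**2)))

r = np.linspace(-10, 10, 201)
for name, I_N in cases.items():
    profile = I_eff(r, I_N)
    print(f"{name:15s} most negative value of I~: {profile.min(): .3f}")

def rms(a):
    """Order parameter rms(a) = sqrt(<a^2>), averaged over network sites."""
    return np.sqrt(np.mean(np.asarray(a)**2))
```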

Figure 3: rms(a), the rms value of a, as a function of \lambda_1/\lambda_0 for the different types of interactions defined above. Inhibitory interactions can create orientation-selective cells in regions in which single cells are radially symmetric. The interaction functions have the form \tilde{I}(r) = \exp[-r^2/(2\sigma^2)] - I_N \exp[-r^2/(2(2\sigma)^2)] and differ in the value of I_N: Excitatory, I_N = 0.0; Balanced, I_N = 0.25; Net Inhibitory, I_N = 0.5; and Net Excitatory, I_N = 0.125.

As can be seen in figure 3, the interaction can indeed break the symmetry, although only mildly, since under most of the conditions investigated most receptive fields remain dominated by the first principal component.

In figure 4 we display the network in an extreme case of symmetry breaking; even here most receptive fields have a strong radially symmetric component.

Figure 4: Simulation results for \lambda_1/\lambda_0 = 0.95 and a balanced interaction: the receptive fields (left) and the orientation directions (right). The single-cell solution is radially symmetric, but here some symmetry breaking does occur, although many of the cells are still dominated by the radially symmetric component (shown by the short length of many of the bars representing orientation). In order to represent the single receptive fields we have chosen arbitrary normalized functions, one radially symmetric and one with an angular dependence \propto \cos(\phi).

Figure 5: Simulation results for \lambda_1/\lambda_0 = 0.3 and a balanced interaction. The single-cell solution is radially symmetric, and so are the network solutions. The effect of the inhibitory term in the lateral interaction has been to form staggered ordering.

The effect of the interaction on the global ordering is more pronounced. In figure 5 we can see that when the radially symmetric principal component is dominant, the effect of the interaction is a staggered type of final state.

When the oriented solution dominates, the emerging structure has bands with similar orientations and some orientation singularities, as can be seen in figure 6.

Figure 6: When \lambda_1/\lambda_0 = 1.3 the final state shows a banded structure, and structures sometimes described in the biological literature as pinwheels. The map displayed here shows singularities of the S = 1 type.

These singularities, however, are of the S = 1 type, which means that around a singularity the orientation angle changes by 360 deg. It has been claimed (Erwin et al., 1994) that in the cortex, and in many models such as the Swindale (92) model and the Miller (94) model, the singularities are of the S = 1/2 type, i.e., the orientation angle changes by 180 deg. Examples of such singularities are illustrated in figure 7.

This stands in contradiction to the results of the simplified model, which raises the question of the source of these differences.

In the Swindale model the orientation angles are defined to be half the angle of the complex variable Z, i.e., \phi_s = \frac{1}{2}\phi, and thus when \phi has an S = 1 singularity, \phi_s has an S = 1/2 singularity. The differences between the simplified model and the Swindale model actually arise for a similar reason. We claim that when the dominant single-cell solution has the angular component \cos(m\phi), or in general when a rotation by (360/m) deg leaves the receptive field unchanged, the resulting singularities will be of the type S = 1/m. The heuristic reasoning for this claim is the following: suppose that we divide the interaction function into an inhibitory term and an excitatory term, I = I_E + I_I, such that I_E > 0 and I_I < 0. It is usually assumed that the excitatory term dominates the short-range interactions. The excitatory term attempts to enforce continuity and low curvature, i.e., as small a change as possible between neighboring orientations. Thus a closed path around a singularity should exhibit the minimal curvature which retains continuity. For receptive fields with an angular dependence \cos(m\phi), this implies a rotation around the curve by 360/m degrees, i.e., a singularity of type S = 1/m. More explicitly, we predict that when the single-cell solution has a \cos(2\phi) angular dependence, the singularities exhibited will be of the type S = 1/2.
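The singularity index of a simulated map can be measured directly by summing the wrapped changes of the angle around a closed path and dividing by 2\pi. The sketch below does this for a hypothetical ideal pinwheel (the test loop is an assumption made for illustration). For a \cos(2\phi) receptive field the orientation is defined only modulo 180 deg, so the same sum is taken over the doubled angle and halved, which is how an S = 1/2 singularity is detected.

```python
import numpy as np

# Winding number of an angle field around a closed path: the sum of the
# wrapped step-to-step changes, divided by 2*pi, is the singularity index S.

def winding(angles):
    """S for a closed sequence of angles (radians) sampled along a loop."""
    steps = np.diff(np.concatenate([angles, angles[:1]]))
    steps = (steps + np.pi) % (2 * np.pi) - np.pi   # wrap into (-pi, pi]
    return steps.sum() / (2 * np.pi)

# Hypothetical ideal pinwheel: phi rotates once around the loop -> S = 1.
t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
print(round(float(winding(t)), 6))                  # -> 1.0

# For an orientation angle defined modulo pi (a cos(2*phi) receptive field),
# track the doubled angle and halve the result: S = winding(2*phi_s) / 2.
phi_s = t / 2                                       # half-angle map of the same loop
print(round(float(winding(2 * phi_s) / 2), 6))      # -> 0.5
```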

Figure 7: A schematic drawing of different types of singularities: S = 1 above, S = 1/2 below.

In figure 8 we display an orientation map produced by the simplified model when we make this assumption. The singularities are indeed of the S = 1/2 type.

Figure 8: When the dominant eigenfunction has a \cos(2\phi) angular form, with \lambda_2/\lambda_0 = 1.3, the final state shows S = 1/2 type singularities.

The receptive fields obtained in the \cos(2\phi) model look far from biologically plausible. In a previous paper (Shouval and Liu, 1996) it was shown that this can be corrected by the asymmetric portion of the correlation function, which we have not taken into account here. When we simulate the degenerate case (\lambda_2 = \lambda_0), as shown in figure 9, we obtain more plausible receptive fields even in the absence of the non-symmetric portion of the correlation function. However, all the receptive fields obtained under the assumptions made in this paper have broader tuning curves than those typically found in the cortex.

Figure 9: In the degenerate case, when the eigenvalue \lambda_0 of the radially symmetric, center-surround solution is equal to that of the \cos(2\phi) solution, the resulting receptive field structure seems biologically plausible, and the final state shows S = 1/2 type singularities.

We believe that the types of singularities exhibited in synaptic types of models depend critically on the dominant eigenfunction of the single cell in the same parameter regime. Our belief is based on the simplifications obtained by the analysis performed in this paper; this is but one example of the power and usefulness of such analysis. Therefore claims about synaptic models (Erwin et al., 1994) should be restricted to the parameters examined.

6 Related Models

When we do not make the common input assumption, the overlap function O takes on a different form.

The opposite extreme from the common input case is the case in which inputs to neighboring cells have no spatial overlap. In this case the overlap function takes the simple form

O(l,n,l',n'; \beta(x), \beta(x')) = \delta(x-x')\, \delta_{ll'}\, \delta_{nn'},

and the Hamiltonian takes the form

H = -\sum_x \sum_{ln} \tilde{I}(0)\, \lambda_{ln}\, a_{ln}^2(x) - \gamma(a).

Thus the Hamiltonian becomes a non-interacting Hamiltonian,⁵ and no effective interaction exists between neighboring neurons. Therefore the final state will be such that each neuron settles into a single-cell receptive field, and the orientations will not be coupled. For this model no continuity of the orientation field should be expected. It should be noted that the outputs of the neurons c(r) are correlated due to the interactions; it is only the angles of the feedforward component of the receptive fields that become uncorrelated in this case.

⁵ An interaction term exists in this case; that H is effectively non-interacting becomes evident only when transforming to this basis.

A more complex case is the one examined in the Linsker and Miller models, in which \beta(x) = x; in this case the overlap function is complex and depends on the exact details of the receptive fields. However, we can obtain some understanding by looking qualitatively at the form of the overlap function. When \beta(x) = const, an overlap exists only between receptive fields of the same type. When the inputs are shifted, interactions exist between different types of receptive fields as well, for example between \cos(\phi) and \cos(2\phi) receptive fields. The interaction between receptive fields of the same type is also a function of the distance between their centers in the input plane; in particular, it can change sign as a function of |x-x'|. Some such abstract examples are shown in figure 10.

When this happens, a model which is purely excitatory can behave like a model with inhibition. To understand this, let us assume for simplicity that \lambda_{11} > \lambda_{ij} for every i and j, and that the cross-term overlaps are very small. In such a case the Hamiltonian will take the form

H \approx \sum_{x,x'} \tilde{I}(x-x')\, O(\phi(x), \phi(x'); \beta(x), \beta(x')),   (16)

where we have omitted the l = l' = 1 and n = n' = 1 indexes for simplicity. In a purely excitatory case, \tilde{I} > 0, the aim will be to minimize the overlap function O as a function of the angles \phi.
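As a toy numerical illustration of such sign changes (an assumption-laden stand-in for the overlaps sketched in figure 10, not the paper's eigenfunctions), the following code computes the overlap of two identical center-surround receptive fields as a function of the center separation Y. The overlap is positive at small Y and turns negative once one field's center sits on the other's surround.

```python
import numpy as np

# Overlap O(Y) = sum_alpha m(alpha) m(alpha - Y) between two identical
# center-surround receptive fields whose arbor centers are separated by Y.
# The difference-of-Gaussians profile and its widths are toy assumptions.

n = 64
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2].astype(float)

def rf(shift):
    """Normalized center-surround field centered at (shift, 0)."""
    r2 = (x - shift)**2 + y**2
    f = np.exp(-r2 / (2 * 2.0**2)) - 0.5 * np.exp(-r2 / (2 * 4.0**2))
    return f / np.linalg.norm(f)

for Y in range(0, 13, 2):
    O = np.sum(rf(0.0) * rf(float(Y)))
    print(f"Y = {Y:2d}   O = {O: .3f}")   # a sign change marks the zero crossing
```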

When observing figure 10, it is obvious that for small Y = \beta(x) - \beta(x') the two receptive fields would preferentially have the same phase angle \phi, whereas at larger distances the overlap is minimized when \phi(x) = \phi(x') + \pi. This observation enables us to understand that the common input approximation is sensible if the width of the interaction term \tilde{I} is smaller than the first zero crossing of the overlap term O.

Figure 10: A schematic drawing of some aspects of the overlap functions for shifted inputs, with Y = \beta(x) - \beta(x'). Above, overlaps between different types of receptive fields are displayed; below, overlaps between receptive fields of the same kind.

7 Discussion

We have investigated models of interacting Hebbian neurons and have shown that learning in such a network is equivalent to minimizing an energy function. We have further shown that under certain conditions this network can be approximated by continuous spin models, and that it is similar to some of the proposed phenomenological models (Cowan and Friedman, 1991; Swindale, 1982). In the common input case we have shown that for excitatory interactions there is no symmetry breaking, and that all receptive fields have the same orientation. We have also used this case to make the point that sticky boundary assumptions can drastically change the organization of receptive fields in the model, so that reported results depend heavily on when simulations are terminated. The addition of noise (Kosterlitz and Thouless, 73), or of parallel dynamics (Paullet and Ermentrout, 1994), can create vortex states that move in time.

Inhibition affects both the structure of the receptive fields and the organization of different receptive fields across the cortex. When inhibition is sufficiently strong, networks will have some orientation-selective receptive fields, even though the single-cell receptive fields are radially symmetric.

The global organization produced by models with inhibition shows iso-orientation patches and pinwheel types of singularities. The types of singularities depend critically on the symmetry of the receptive fields.

The major qualitative difference between the common input model and shifted input models is that, even in the absence of inhibition, shifted input models can behave like models with inhibition.

8 Acknowledgments

The authors thank Yong Liu, Igor Bukharev, Bob Pelcovits and all the members of the Institute for Brain and Neural Systems for many helpful discussions. Research was supported by grants from the Charles A. Dana Foundation, the Office of Naval Research and the National Science Foundation.

References

Blasdel, G. G. (1992). Orientation selectivity, preference, and continuity in monkey striate cortex. The Journal of Neuroscience, 12(8):3139-3161.

Bonhoeffer, T. and Grinvald, A. (1991). Iso-orientation domains in cat visual cortex are arranged in pinwheel-like patterns. Nature, 353:429-431.

Cowan, J. D. and Friedman, A. E. (1991). Simple spin models for the development of ocular dominance columns and iso-orientation patches. In Cowan, J. D., Tesauro, G., and Alspector, J., editors, Advances in Neural Information Processing Systems 3.

Erwin, E., Obermayer, K., and Schulten, K. (1994). A simple network model of unsupervised learning. Technical report, University of Illinois.

Kosterlitz, J. M. and Thouless, D. J. (1973). Ordering, metastability and phase transitions in two-dimensional systems. J. Phys., C6:1181.

Linsker, R. (1986a). From basic network principles to neural architecture. Proc. Natl. Acad. Sci. USA, 83:8779-83.

Linsker, R. (1986b). From basic network principles to neural architecture. Proc. Natl. Acad. Sci. USA, 83:7508-12, 8390-4, 8779-83.

Liu, Y. (1994). Synaptic Plasticity: From Single Cell to Cortical Network. PhD thesis, Brown University.

Liu, Y. and Shouval, H. (1994). Localized principal components of natural images - an analytic solution. Network, 5.2:317-325.

MacKay, D. J. and Miller, K. D. (1990). Analysis of Linsker's application of Hebbian rules to linear networks. Network, 1:257-297.

Miller, K. D. (1992). Development of orientation columns via competition between on- and off-center inputs. NeuroReport, 3:73-76.

Miller, K. D. (1994). A model for the development of simple cell receptive fields and the ordered arrangement of orientation columns through activity-dependent competition between on- and off-center inputs. J. of Neurosci., 14:409-441.

Miller, K. D., Keller, J. B., and Stryker, M. P. (1989). Ocular dominance column development: Analysis and simulation. Science, 245:605-615.

Miller, K. D. and MacKay, D. J. C. (1994). The role of constraints in Hebbian learning. Neural Computation, 6:98-124.

Nass, M. N. and Cooper, L. N. (1975). A theory for the development of feature detecting cells in visual cortex. Biol. Cyb., 19:1-18.

Oja, E. (1982). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15:267-273.

Paullet, J. E. and Ermentrout, B. (1994). Stable rotating waves in two-dimensional discrete active media. SIAM J. Appl. Math., 54(6):1720-1744.

Rubinstein, J. (1994). Pattern formation in associative neural networks with weak lateral interaction. Biological Cybernetics, 70.

Shouval, H. and Liu, Y. (1996). Principal component neurons in a realistic visual environment. Network. In press.

Swindale, N. (1982). A model for the formation of orientation columns. Proc. R. Soc. London, B208:243.

von der Malsburg, C. (1973). Self-organization of orientation sensitive cells in striate cortex. Kybernetik, 14:85-100.
