
Communicating neurons: a connectionist spiking neuron implementation of stochastic diffusion search

S.J. Nasuto, School of Systems Engineering, The University of Reading, Whiteknights, Reading, UK.

e-mail: [email protected]

J.M. Bishop ∗

Department of Computing, Goldsmiths College, New Cross, London, UK.

e-mail: [email protected]

K. De Meyer, Division of Engineering, King's College London.

Keywords: Spiking Neural Networks, Selective Attention, Stochastic Diffusion Search, Swarm Intelligence, Global Search

Abstract

An information-processing paradigm in the brain is proposed, instantiated in an artificial neural network using biologically motivated temporal encoding. The network will locate, within the external world stimulus, the target memory, defined by a specific pattern of micro-features.

The proposed network is robust and efficient. Akin in operation to the Swarm Intelligence paradigm, Stochastic Diffusion Search, it will find the best fit to the memory with linear time complexity. Information multiplexing enables neurons to process knowledge as ‘tokens’ rather than ‘types’. The network illustrates the possible emergence of cognitive processing, such as memory retrieval based on partial matching, from low level interactions.

1 Introduction

One of the important roles of metaphor in science is to facilitate understanding of complex phenomena. Metaphors should describe phenomena in an intuitively understandable way that captures their essential features. We argue that a description of single neurons as computational devices does not capture the information processing complexity of real neurons, and argue that describing them in terms of communication could provide a better alternative metaphor. These claims are supported by recent discoveries showing complex neuronal behaviour and by fundamental limitations of established connectionist cognitive models. We suggest that real neurons operate on richer information than provided by a single real number and therefore their operation cannot be adequately described in a standard Euclidean setting. Recent findings in neurobiology suggest that, instead of modelling the neuron as a logical or numerical function, it could be described as a communication device.

∗ corresponding author

The prevailing view in neuroscience is that neurons are simple computational devices, summing up their inputs and calculating a non-linear output function. Information is encoded in the mean firing rate of neurons, which exhibit narrow specialisation - they are devoted to processing a particular type of input information. Further, richly interconnected networks of such neurons learn via adjusting inter-connection weights. In the literature there exist numerous examples of learning rules and architectures, inspired to varying degrees by biological plausibility. Almost from the very beginning of connectionism, researchers were fascinated by the computational capabilities of such devices [42, 56]. The revival of connectionism in the mid-eighties featured increased interest in analysing the properties of such networks [55], as well as in applying them to numerous practical problems [26]. At the same time the same devices were proposed as models of cognition capable of explaining both higher level mental processes [58] and low level information processing in the brain [22]. However, these promises were based on the assumption that the computational model captures all the important characteristics of real biological neurons with respect to information processing. We will indicate in this article that very recent advances in neuroscience appear to invalidate this assumption. Neurons are much more complex than was originally thought and thus networks of oversimplified model neurons are orders of magnitude below the complexity of real neuronal systems. From this it follows that current neural network ‘technological solutions’ capture only superficial properties of biological networks and, further, that such networks may be incapable of providing a satisfactory explanation of our mental abilities.

We propose to complement the description of a single neuron as a computational device by an alternative, more ‘natural’ metaphor: we hypothesise that a neuron can be better and more naturally described in terms of communication rather than purely computation. We hope that shifting the paradigm will result in escaping from the local minimum caused by treating neurons and their networks merely as computational devices. This should allow us to build better models of the brain’s functionality and to build devices that reflect more accurately its characteristics. We will present a simple connectionist model, the NEural STochastic diffusion search netwORk (NESTOR), fitting well in this new paradigm, and will show that its properties make it interesting from both the technological and brain modelling perspectives.

Selman et al. posed some challenge problems for Artificial Intelligence [61]. In particular Rodney Brooks suggested revising the conventional McCulloch-Pitts neuron model and instead investigating the potential implications (with respect to our understanding of biological learning) of new neuron models based on recent biological data. Further, Selman claimed that the supremacy of standard heuristic, domain-specific search methods of Artificial Intelligence needs to be revised and suggested that recent investigation of fast general purpose search procedures has opened a promising alternative avenue. Furthermore, in the same paper Horvitz posed the development of richer models of attention as an important problem, as all cognitive tasks “... require costly resources” and “controlling the allocation of computational resources can be a critical issue in maximising the value of a situated system’s behaviour.” We claim that the new network presented herein addresses all three challenges posed in the above review paper [61], as it is isomorphic in operation to Stochastic Diffusion Search (SDS), a fast, generic probabilistic search procedure which automatically allocates information processing resources to search tasks.

2 Computational metaphor

The emergence of connectionism is based on the belief that neurons can be treated as simple computational devices [42]. Further, the assumption that information is encoded as the mean firing rate of neurons was a base assumption of all the sciences related to brain modelling. The initial Boolean McCulloch-Pitts model neuron was quickly extended to allow for analogue computations. The most commonly used framework for connectionist information representation and processing is a subspace of a Euclidean space. Learning in this framework is equivalent to extracting an appropriate mapping from the sets of existing data. Most learning algorithms perform computations which adjust neuron interconnection weights according to some rule, the adjustment in a given time step being a function of a training example. Weight updates are successively aggregated until the network reaches an equilibrium in which no adjustments are made (or alternatively stopping before the equilibrium, if designed to avoid overfitting). In any case knowledge about the whole training set is stored in the final weights. This means that the network does not possess any explicit internal representation of the (potentially complex) relationships between training examples other than that which implicitly exists as a distribution of weight values. We do not consider representations of arity zero predicates (e.g. those present in NETtalk [59]) as sufficient for representation of complex relationships. These limitations result in poor internal knowledge representation, making it difficult to interpret and analyse the network in terms of causal relationships. In particular it is difficult to imagine how such a system could develop symbolic representation and logical inference (cf. the symbolic/connectionist divide). Such deficiencies in the representation of complex knowledge by neural networks have long been recognised [19, 7, 54]. The way in which data are processed by a single model neuron is partially responsible for these difficulties. The algebraic operations that it performs on input vectors are perfectly admissible in Euclidean space but do not necessarily make sense in terms of the data represented by these vectors. Weighted sums of quantities, averages etc., may be undefined for objects and relations of the real world, which are nevertheless represented and learned by structures and mechanisms relying heavily on such operations. This is connected with a more fundamental problem missed by the connectionist community - the world (and relationships between objects in it) is fundamentally non-linear.

Classical neural networks are capable of discovering non-linear, continuous mappings between objects or events but nevertheless they are restricted by operating on representations embedded in linear, continuous structures (Euclidean space is by definition a finite-dimensional linear vector space equipped with the standard metric). Of course it is possible in principle that knowledge from some domain can be represented in terms of Euclidean space. Nevertheless it seems that only in extremely simple or artificial problems will the appropriate space be of small dimensionality. In real life problems spaces of very high dimensionality are more likely to be expected. Moreover, even if embedded in a Euclidean space, the actual set representing a particular domain need not be a linear subspace, or be a connected subset of it. Yet these are among the topological properties required for the correct operation of classical neural nets. There are no general methods of coping with such situations in connectionism.

Methods that appear to be of some use in such cases include freezing some weights (or restricting their range) or using ‘mixtures of experts’ or ‘gated networks’ [31]. However, there is no principled way of deciding how to perform the former. Mixture of experts models appear to be a better solution, as single experts could in principle explore different regions of a high dimensional space and thus their proper co-operation could result in satisfactory behaviour. However, such architectures need to be individually tailored to particular problems. Undoubtedly there is some degree of modularity in the brain; however, it is not clear that the brain’s operation is based solely on a rigid modularity principle. In fact we will argue in the next section that biological evidence seems to suggest that this view is at least incomplete and needs revision. We feel that many of the difficulties outlined above follow from the underlying interpretation of neuron functioning in computational terms, which results in entirely numerical manipulations of knowledge by neural networks. This seems too restrictive a scheme. Even in computational neuroscience, existing models of neurons describe them as geometric points, even though neglecting the geometric properties of neurons (treating dendrites and axons as merely passive transmission cables) makes such models very abstract and may strip them of some information processing properties. In most technical applications of neural networks the abstraction is even higher - axonic and dendritic arborisations are completely neglected - hence they cannot in principle model the complex information processing taking place in these arbors [62]. We think that brain functioning is best described in terms of non-linear dynamics, but this means that processing of information is equivalent to some form of temporal evolution of activity. The latter however may depend crucially on the geometric properties of neurons, as these properties obviously influence neuron activities and thus whole networks. Friston [21] stressed this point on a systemic level when he pointed to the importance of appropriate connections between and within regions - but this is exactly the geometric (or topological) property which affects the dynamics of the whole system. Qualitatively the same reasoning is valid for single neurons. Undoubtedly, model neurons which do not take into account geometrical effects perform some processing, but it is not clear what this processing has to do with the dynamics of real neurons. It follows that networks of such neurons perform their operations in some abstract time not related to the real time of biological networks (we are not even sure if time is an appropriate notion in this context; in the case of feedforward nets, ‘algorithmic steps’ would probably be more appropriate).

This concerns not only classical feedforward nets, which are closest to classical algorithmic processing, but also many other networks with more interesting dynamical behaviour (e.g. Hopfield or other attractor networks). Of course one can resort to compartmental models, but then it is apparent that the description of single neurons becomes so complex that we have to use numerical methods to determine their behaviour. If we want to perform any form of analytical investigation then we are bound to simpler models.

Relationships between real life objects or events are often far too complex for Euclidean spaces and smooth mappings between them to be the most appropriate representations. In reality it is usually the case that objects are comparable only to some objects in the world, but not to all. In other words one cannot equip them with a ‘natural’ ordering relation. Representing objects in a Euclidean space imposes a serious restriction, because vectors can be compared to each other by means of metrics; data can in this case be ordered and compared in spite of any real life constraints. Moreover, variables are often intrinsically discrete or qualitative in nature, and in this case again Euclidean space does not seem to be a particularly good choice. Networks implement parametrised mappings and they operate in a way implicitly based on the Euclidean space representation assumption - they extract information contained in distances and use it for updates of weight vectors. In other words, distances contained in data are translated into distances of consecutive weight vectors. This would be fine if the external world could be described in terms of Euclidean space; however, it is a problem if we need to choose a new definition of distance each time a new piece of information arrives.

Potentially new information can give a new context to previously learnt information, with the result that concepts which previously seemed to be unrelated now become close. Perhaps this means that our world model should be dynamic - changing each time we change the definition of a distance? However, weight space remains constant - with Euclidean distance and fixed dimensionality. Thus the overall performance of classical networks relies heavily on their underlying model of the external world. In other words, it is not the networks that are ‘smart’, it is the choice of the world model that matters. Networks need to obtain ‘appropriate’ data in order to ‘learn’, but this amounts to choosing a static model of the world, and in such a situation networks indeed can perform well. Our feeling is that, to a limited extent, a similar situation appears in very low level sensory processing in the brain, where only the statistical consistency of the external world matters. However, as soon as the top down information starts to interact with the bottom up processing, the semantic meaning of objects becomes significant and this can often violate the assumption of static world representations. It follows that classical neural networks are well equipped only for tasks in which they process numerical data whose relationships can be well reflected by Euclidean distance. In other words classical connectionism can be reasonably well applied to the same category of problems which could be dealt with by various regression methods from statistics. Moreover, as in fact classical neural nets offer the same explanatory power as regression, they can therefore be regarded as its non-linear counterparts. It is however doubtful whether non-linear regression constitutes a satisfactory (or the most general) model of fundamental information processing in natural neural systems.

Another problem follows from the rigidity of neurons’ actions in current connectionist models. The homogeneity of neurons and their responses is the rule rather than the exception. All neurons perform the same action regardless of individual conditions or context. In reality, as we argue in the next section, neurons may condition their response on the particular context, set by their immediate surroundings, past behaviour and current input etc. Thus, although in principle identical, they may behave as different individuals because their behaviour can be a function of both morphology and context. Hence, in a sense, the way conventional neural networks operate resembles symbolic systems - both have built-in rigid behaviour and operate in an a priori determined way. Taking different ‘histories’ into account would allow for the context sensitive behaviour of neurons - in effect for the existence of heterogeneous neuron populations. Standard nets are surprisingly close to classical symbolic systems although they operate in different domains: the latter operating on discrete, and the former on continuous spaces. The difference between the two paradigms in fact lies in the nature of the representations they act upon, and not so much in the mode of operation. Symbolic systems manipulate whole symbols at once, whereas neural nets usually employ sub-symbolic representations in their calculations. However, both execute programs, which in the case of neural networks simply prescribe how to update the interconnection weights in the network. Furthermore, in practice neural networks have very well defined input and output neurons, which, together with their training set, can be considered as a closed system relaxing to its steady state. In modular networks each of the ‘expert’ nets operates in a similar fashion, with well defined inputs and outputs and designed and restricted inter-communication between modules. Although many researchers have postulated a modular structure for the brain [20], with distinct functional areas being black boxes, others [44, 18] have realised that the brain operates rather like an open system which, due to the ever changing conditions, exhibits extensive connectivity between areas and no fixed input and output. The above taxonomy resembles a similar distinction between algorithmic and interactive systems in computer science, the latter possessing many interesting properties [70].

3 Biological evidence

Advances in neuroscience provide us with evidence that neurons are much more complex than previously thought [34]. In particular, it has been hypothesised that neurons can select input depending on its spatial location on the dendritic tree or on its temporal structure [34, 6, 23]. Some neurobiologists suggest that synapses can remember the history of their activation or, alternatively, that whole neurons discriminate spatial and/or temporal patterns of activity [23]. Various authors have postulated spike encoding of information in the brain [65, 60, 35]. The speed of information processing in some cortical areas, the small number of spikes emitted by many neurons in response to cognitive tasks [53, 57, 66], together with the very variable behaviour of neurons in vivo [63], suggest that neurons would not be able to reliably estimate mean firing rates in the time available. Some results suggest that firing events of single neurons are reproducible with very high reliability and that interspike intervals encode much more information than firing rates [8]. Others found that neurons in isolation can produce, under artificial stimulation, very regular firing with a high reproducibility rate, suggesting that the apparent irregularity of firing in vivo may follow from interneuronal interactions or may be stimulus dependent [39].

The use of interspike interval coding enables richer and more structured information to be transmitted and processed by neurons. The same mean firing rate corresponds to a combinatorial number of interspike interval arrangements in a spike train. What would previously be interpreted as a single number can carry much more information in temporal coding. Moreover, temporal coding enables the system to encode unambiguously more information than is possible with a simple mean firing rate. Different parts of a spike train can encode qualitatively different information. All these possibilities have been excluded in the classical view of neural information processing. Even though a McCulloch-Pitts neuron is sufficient for the production of spike trains, spike trains by themselves do not solve the binding problem (i.e. they do not explain the mechanism responsible for the integration of object features which are processed in a spatially and temporally distributed manner). However, nothing would be gained, except possibly processing speed, if mean firing rate encoding were merely replaced by temporal encoding while the underlying framework of knowledge representation and processing still mixes qualitatively different information by simple algebraic operations. The irregular pattern of neuron activity in vivo [63] is inconsistent with the temporal integration of excitatory post-synaptic potentials (EPSPs) assumed in classical model neurons. It also introduces huge amounts of noise, thus making any task to be performed by neurons, were they unable to differentially select their input, extremely difficult. On the other hand, perhaps there is a reason for this irregular neuronal behaviour. If neurons are coincidence detectors rather than temporal integrators [34, 65] then the randomness of neuron firing is an asset rather than a liability. One of the most difficult and as yet unresolved problems of computational neuroscience is that of binding distinct features of the same object into a coherent percept. However, in [49], Nelson postulates that it is the traditional view, ‘transmission first, processing later’, that introduces the binding problem. On this view processing cannot be separated from transmission and, when entangled with transmission performed by neural assemblies spanning multiple neuronal areas, it makes the binding problem non-existent [50].
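To put a number on this claim, the toy calculation below (our illustration in Python, not from the paper) counts how many distinct spike trains share a single mean firing rate once time is discretised into 1 ms bins.

```python
from math import comb

# With time discretised into 1 ms bins, any arrangement of n spikes within a
# window of B bins has the same mean rate n/B, yet the arrangements differ.
n_spikes = 5       # spikes in the window
window_ms = 100    # window length in 1 ms bins -> mean rate of 50 Hz

print(comb(window_ms, n_spikes))   # 75287520 distinct spike trains, one mean rate
```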


4 Communication metaphor

The brain’s computational capabilities have to be understood in a metaphorical sense only. All matter, from the simplest particles to the most complex living organisms, undergoes physical processes which, in most sciences, are not given any special interpretation. However, when it comes to nervous systems the situation changes abruptly. In neuroscience, and consequently in connectionism, it is assumed that neurons and their systems possess special computational capabilities, which are not attributed to other, even the most complex, biological substances (e.g. DNA). This is a very anthropomorphic viewpoint because, by definition, computation is an intentional notion and it assumes the existence of some ‘demon’ able to interpret it. Thus we claim that the very assumption of computational capabilities of real neurons leads to homuncular theories of mind. In our opinion to say that neurons perform computations is equivalent to saying that, e.g., a spring extended by a moderate force computes, according to Hooke’s law, how much it should deform. We need to stress that our stance does not imply that one should abandon using computational tools for modelling and analysing the brain. However, one should be aware of their limitations. On the other hand, although also metaphorical, treating neurons as communicating with each other captures their complex (and to us fundamental) capability of modifying behaviour depending on the context. Our claim is that communication as biological information processing could describe more compactly complex neuronal operations and provide us with an intuitive understanding of the meaning of these operations (albeit we do not impose that this meaning would be accessible to single neurons). Although interpreting neurons as simple numerical or logical functions greatly simplifies their description, it introduces problems at the higher levels of neural organisation. Moreover, as we argued earlier, recent neurobiological evidence supports the claim that the idea of neurons being simple computational devices has to be reconsidered. We argue that communication better describes neuron functionality than computation. In contrast to computation, communication is not merely an anthropomorphic projection on reality. Even relatively simple organisms, e.g. bacteria, communicate with each other or with the environment. This ability is essential for their survival and it seems indispensable for the more complex interactions and social behaviour of higher species. The role of communication in human development and in social interactions cannot be overestimated [14]. It seems therefore that communication is a common process used by living systems on all levels of their organisation.

In our opinion the most fundamental qualitative properties of neurons postulated recently are their capability to select different parts of converging signals and the capability of choosing which signals to consider in the first place. Thus neurons can be said to communicate to each other simple events and to select the information which they process or transmit further. The selection procedure could be based on some criteria dependent on the previous signals’ properties, such as where from and at what moment the information arrived. This would account for neurons’ spatio-temporal filtering capacity. Also it would explain the amount of noise observed in the brain and the apparent contrast between the reliability of neural firing in vitro and their random behaviour in vivo. What is meaningful information for one neuron can be just noise for another. Moreover, such noise would not deter the functionality of neurons that are capable of responding to selected information. One could object to our proposal using the parsimony principle - why introduce an extra level of complexity if it has been shown that networks of simple neurons can perform many of the tasks attributed to biological networks? However, we argue that such a position addresses a purely abstract problem, which may have nothing to do with brain modelling. What it is possible to compute with artificial neurons is, in principle, a mathematical problem; how the same functionality is achieved in the brain is another matter. The information processing capacity of dendritic trees is a scientific fact, not merely a conjecture. Instead of computational parsimony we propose an ‘economical’ one: the brain facilitates the survival of its owner and for that purpose uses all available resources to process information. Polychronisation, the emergence of time locked but not synchronised patterns of activity, and synfire chains, groups of mutually synchronised neurons which are out of synch with each other, are other suggestions proposed in the literature that propose a richer temporal processing capacity of neurons and their networks [30, 1]. In the next section we will outline a connectionist architecture in which neurons’ operation is based on communication more than on computation (the fact that communication is implemented using a form of computation should not detract us from recognition of the importance of communication as a level at which the operation of neurons can be meaningfully described and interpreted, versus computation as implementing such processes on an inherently Turing-like architecture).

5 NESTOR: a connectionist Spiking Neuron Stochastic Diffusion Network

The connectionist network architecture of NESTOR consists of three one-dimensional layers: the input retina, consisting of a layer of receptor neurons; a layer of matching neurons; and a layer of memory neurons. In operation the function of NESTOR is to label and locate the best fit of a target pattern (hereafter called a memory and defined by the activity of memory neurons) from a particular external stimulus on the retina (defined by the pattern of activity on receptor neurons). All neurons output spike trains in which information is encoded in Inter-Spike Intervals (ISIs). Such temporal coding allows for a much richer repertoire of information encoding than a simple mean firing rate. In particular, different parts of the spike train may encode different types of information, effectively leading to multiplexing. Moreover, only parts of the spike trains may encode information directly about the stimulus (or memory); the other parts provide a modulatory input internal to the network, setting the context in which the sensory (or memory) information is processed. It is such a hypothetical encoding scheme that is utilised in NESTOR.
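As a purely illustrative reading of this encoding scheme, the Python sketch below represents a spike train as a list of ISIs whose first interval is read as ‘where’ information and whose second as ‘what’ information; the field names and the explicit source label are our assumptions (the paper leaves source identification open).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SpikeTrain:
    """A multiplexed spike train: information is carried by inter-spike
    intervals (ISIs) rather than by a mean firing rate. Here the first ISI
    is read as 'where' information and the second as 'what' information."""
    isis: List[float]   # inter-spike intervals, in arbitrary time units
    source: str         # 'retina', 'memory' or 'matching' (assumed explicit here)

    @property
    def where(self) -> float:   # first ISI: position information
        return self.isis[0]

    @property
    def what(self) -> float:    # second ISI: micro-feature information
        return self.isis[1]
```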

Matching neurons are fully interconnected to all other matching neurons. They are also fully connected to both receptor neurons and memory neurons. A schematic diagram of network connectivity is given in Figure 1. The restriction to one-dimensional layers is for clarity of exposition only and the extension to two-dimensional layers is straightforward.

Figure 1: The architecture of NESTOR.

Matching neurons perform a set of operations on their input essentially akin to (nonlinear) filtering. In contrast to the operation of classical neuron models, matching neurons select the spike trains to process on the basis of information contained in the first part of the spike trains and on their own state. Subsequently, their output and state will be dependent on the information contained in the second part of the accepted spike trains. However, the output will always consist of (parts of) the accepted spike trains; no further processing of the inputs is needed. In this sense their operation amounts to filtering which information will be propagated throughout the network rather than calculating some numeric transformation of their overall input. The nonlinearity of the filtering stems from its contextual nature; which parts of a matching neuron’s input are propagated depends on its own state and on the properties of the other input components.

More specifically, matching neurons both maintain and output hypotheses - ‘where’ values - defining possible locations of the target pattern on the retina. Also for clarity of exposition we use the labels ‘active’ and ‘inactive’ to identify the internal ‘state’ of matching neurons; however, these labels are neither used nor known by other matching neurons. An active matching neuron will output its current hypothesis as a spike train; an inactive matching neuron will adopt a new hypothesis from the first spike train it receives and output this. This spike train can come either from another matching neuron or from memory and retina neurons.

Matching neurons evaluate their hypothesis - ‘where’ value - by comparing randomly selected micro-feature(s) - ‘what’ information - from the memory and the corresponding ‘what’ information from the retina. If the micro-features are the same (the corresponding ‘what’ values are the same), this provides evidence that the hypothesis is good and so the matching neuron becomes active.

An active matching neuron initiates a spike train and retains its current hypothesis (its ‘where’ value). Thus a successful matching neuron will not switch to evaluate other locations in the search space (retina), but will continue to randomly evaluate its current hypothesis - ‘where’ value - by testing for the presence of other (randomly selected) micro-features from the target.

On the contrary, a neuron failing to discover the same micro-feature will remain inactive and will adopt a new hypothesis encoded in a spike train arriving from other matching neurons. As these spike trains are accepted on the basis of their arrival time, an inactive matching neuron can either exploit a potentially correct position signalled by an active matching neuron or explore a completely new position encoded in the spike train generated by another inactive neuron.

However, if the first spike train arrives from memory or retina, a matching neuron will simply output a spike train encoding the position defined by the input trains, without further processing or changes to its [notional] state.

It is apparent that, over consecutive cycles of operation, a matching neuron evaluating the spike trains encoding the retinal region with the highest overlap of micro-features with the target memory is more likely to retain its memory at the next cycle of activity than it is when processing spike trains encoding other locations. This is because, by definition, the chance of matching a micro-feature from the retina and the memory is highest here. Effectively matching neurons operate by filtering the incoming spike trains on the basis of their past activity.

The location on the retina of the best fit to the memory located by the network is encoded in the dominant spike train output of matching neurons, i.e. it is the position corresponding to the mode of the distribution of matching neuron ‘where’ signals.

As the process of memory location by the network is statistical in nature, the assembly of neurons encoding the best-fit solution can fluctuate dynamically both in number and identity. Even though single neurons can change their ‘where’ information with relatively high probability, and their activity can appear random when considered in isolation, they nevertheless collectively produce stable behaviour in a quasi-deterministic manner. This form of dynamic information decoding has to be contrasted with conventional neural network operation, where the readout from the network is possible due to the deterministic and fixed functionality of individual neurons.

As the time jitter of matching neurons’ efferent activity is small compared to the length of the information encoding, an assembly of neurons finding the best fit may produce time locked or near oscillatory behaviour. Hence in this model oscillatory behaviour may be a result of, rather than a cause of, the binding of features belonging to the same object.

5.1 Retina and memory

The function of each receptor neuron is to communicate its position on the retina and to encode the micro-feature that stimulated it. The first ISI sent by a receptor neuron encodes information about its position on the retina (the ‘where’ information, ∆t^w_ret) and the second ISI encodes the micro-feature that stimulated this receptor (the ‘what’ information, ∆t^m_ret). For modelling purposes we assume the following relationship between a receptor’s retinal position and its encoding via the ISI:

∆t^w_{j+1} ∝ ∆t^w_j + ∆t^w    (1)

where j and j+1 are two adjacent retinal positions and ∆t^w is the minimal increment of the ISI (see Figure 2). The above encoding is akin to the principle of preservation of the receptors’ topographic organisation, although its particular form is arbitrary and assumed here to simplify this exposition - any relation enabling determination of the length of the first ISIs of neighbouring receptor neurons would be equally suitable.
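To illustrate equation (1), the sketch below (Python; the constant names and values are hypothetical) builds the first two ISIs of a receptor spike train. The base interval and the use of equality rather than proportionality are simplifying assumptions.

```python
# Minimal sketch of the receptor encoding of equation (1); the base interval
# and the equality (rather than proportionality) are simplifying assumptions.
BASE_ISI = 1.0    # first ISI of the receptor at position j = 0 (assumed)
DELTA_W = 0.1     # minimal ISI increment, Delta t^w

def encode_receptor(position_j, micro_feature_isi):
    """Return the ISIs of a receptor spike train.

    The first ISI encodes 'where' (retinal position j); the second ISI
    encodes 'what' (the micro-feature that stimulated the receptor).
    """
    where_isi = BASE_ISI + position_j * DELTA_W   # Delta t^w_j grows by DELTA_W per position
    what_isi = micro_feature_isi
    return [where_isi, what_isi]
```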

In operation memory neurons are functionally identical to receptor neurons. Thus at the onset of activity their first ISI, ∆t^w_mem, encodes relative ‘where’ information and the second ISI, ∆t^m_mem, encodes memory ‘what’ information.

Figure 2: Encoding of information in the spike trains of receptor cells. The relative position of receptor cells is encoded in the first Inter-Spike Intervals of their spike trains. The difference ∆t^w between the first Inter-Spike Interval lengths of adjacent receptor cells is constant for any receptor cell j. The second ISIs encode the micro-features that activated the corresponding receptor cells.

Hence the only difference between receptor cells and memory neurons is that receptors respond to and encode external stimuli, whereas memory neurons encode and propagate internal representations of the external world. However, in operation matching neurons need to distinguish the ‘source’ of afferent information (matching, memory or receptor neuron). This requires either ‘labelled’ input lines or some form of source-encoding within the spike train. However, both methods provide information at the expense of an increase in network complexity; therefore, to aid the clarity of this short exposition, the process underlying this particular mechanism will be left open, with source information explicitly available to matching neurons as required.

5.2 Matching layer neurons

A matching neuron only stores the ‘where’ information (defined by the ISI encoding the value of ∆t^w_neur) of an accepted spike train. The delay between a matching neuron firing and the arrival of the resulting spike train at efferent neurons follows from the transmission delays introduced by efferent axons.

We assume that the sum of the maximal length of the spike train, T_l, and the maximal time delay, T_d, is smaller than the length of the time interval between consecutive updates of matching neurons, T:

T_l + T_d ≤ T    (2)

Further, it is assumed that the probability distribution of spike train arrival times at a given matching neuron is the same for all afferent axons.

The method by which a matching neuron processes incoming information depends on the information itself and on the past activity imprinted in the neuron’s memory (∆t^w_neur) and state (either active or inactive). If a matching neuron is in an active state, it will select for processing the first spike trains from the retina and memory that fulfil the condition:

∆t^w_ret + ∆t^w_mem = ∆t^w_neur    (3)

This condition ensures that each matching neuron selects a micro-feature from memory and compares it with the appropriate information from the retina (see Figure 3). Subsequently, the matching neuron will compare the second ISIs from the corresponding spike trains, ∆t^m_mem and ∆t^m_ret, defining the ‘what’ information from the memory and the retina.

Figure 3: (a) Relative positions of the micro-feature, C, selected from the target memory, [A1BB3C33G], and on the retina, [ADC33 .. A1BB3C33G .. XTGPS]; (b) their encoding in the length of Inter-Spike Intervals by a receptor cell and a memory neuron respectively, together with the resulting matching neuron target position encoding.

If the comparison is successful (i.e. the micro-feature propagated by the receptor neuron is identical to that propagated by the memory neuron), the matching neuron fires a spike train corresponding to the position defined by ∆t^w_neur. In the next cycle of activity it will explore other micro-features at that position, as it will remain in an active state and its internal memory will not change. Otherwise, the matching neuron will become inactive. In the inactive state the matching neuron will accept information from either the first arriving spike train from another matching neuron or spike trains from the memory and the retina (albeit with no constraints on ISIs). In the latter case the matching neuron will fire a spike train corresponding to the position defined jointly by the first inter-spike intervals arriving from the memory and the retina. In the former case it will modify its memory, ∆t^w_neur, to store the first ISI from the other neuron and will change its state to active. This mechanism allows a matching neuron to check the position pointed to by another successful matching neuron or, alternatively, a completely random position in the search space derived from the spike train generated by another inactive neuron. From the fact that arrival times of spike trains are independent and identically distributed random variables it follows that the choice of the spike trains based on their arrival times at a given neuron is unbiased. It means that a matching neuron can choose information arriving from other matching neurons with equal probability.
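The sketch below (Python, our names) summarises one update cycle of a matching neuron in this scheme. It abstracts away from spike timing and source identification: hypotheses are held directly as retinal positions rather than ISIs, and the ‘first arriving’ spike train is modelled as a uniformly random choice, in line with the unbiased arrival-time assumption above.

```python
import random

def update_matching_neuron(neuron, retina, memory, other_neurons):
    """One cycle of a matching neuron: test the current hypothesis, then
    either keep it (active) or adopt a new one (inactive).

    neuron        : dict with keys 'where' (retinal position) and 'active'
    retina        : list of micro-features (the external stimulus)
    memory        : list of micro-features (the target pattern)
    other_neurons : list of dicts with the same structure as `neuron`
    """
    # Test: pick one random micro-feature of the memory and compare it with
    # the corresponding micro-feature on the retina (partial evaluation).
    offset = random.randrange(len(memory))
    pos = neuron['where'] + offset
    neuron['active'] = pos < len(retina) and retina[pos] == memory[offset]

    if not neuron['active']:
        # Adopt the hypothesis carried by the 'first arriving' spike train,
        # modelled here as a uniformly random choice of another neuron.
        other = random.choice(other_neurons)
        if other['active']:
            neuron['where'] = other['where']        # exploit a signalled position
        else:
            # explore a new random position in the search space
            neuron['where'] = random.randrange(len(retina) - len(memory) + 1)
```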

The operation of NESTOR can be better understood if one realises that it is closely related to a distributed swarm intelligence system called Stochastic Diffusion Search, described in the next section.

6 Stochastic Diffusion Search

SDS is an efficient probabilistic swarm intelligence global search and optimisation technique that has been applied to diverse problems such as site selection for wireless networks [68], mobile robot self-localisation [10], object recognition [12] and text search [11]. Additionally, a hybrid SDS and n-tuple RAM [2] technique has been used to track facial features in video sequences [12, 24].

Previous analysis of SDS has investigated its global convergence [45], linear time complexity [46] and resource allocation [48] under a variety of search conditions.

SDS is based on distributed computation, in which the operations of simple computational units, or agents, are inherently probabilistic. Agents collectively construct the solution by performing independent searches followed by diffusion of information through the population. Positive feedback promotes better solutions by allocating to them more agents for their exploration. Limited resources induce strong competition from which the largest population of agents, corresponding to the best-fit solution, rapidly emerges.

SDS uses a population of agents. In many search problems the solution can be thought of as composed of many subparts, and SDS explicitly utilises such decomposition to increase the search efficiency of individual agents. In SDS each agent poses a hypothesis about the possible solution and evaluates it partially. Successful agents repeatedly test their hypothesis while recruiting unsuccessful agents by direct communication. This creates a positive feedback mechanism ensuring rapid convergence of agents onto promising solutions in the space of all solutions. Regions of the solution space labelled by the presence of agent clusters can be interpreted as good candidate solutions. A global solution is thus constructed from the interaction of many simple, locally operating agents forming the largest cluster. Such a cluster is dynamic in nature, yet stable, analogous to “a forest whose contours do not change but whose individual trees do” [3, 13]. The search mechanism can be illustrated with the following analogy.

6.1 The restaurant game

A group of delegates attends a long conference in an unfamiliar town. Each night they have to find somewhere to dine. There is a large choice of restaurants, each of which offers a large variety of meals. The problem the group faces is to find the best restaurant, that is the restaurant where the maximum number of delegates would enjoy dining (this simplistic model constructed to illustrate SDS assumes that all diners have the same preferences). Even a parallel exhaustive search through the restaurant and meal combinations would take too long to accomplish. To solve the problem the delegates decide to employ a Stochastic Diffusion Search.

Each delegate acts as an agent maintaining a hypothesis identifying the best restaurant in town. Each night each delegate partially evaluates his hypothesis by dining there and randomly selecting one of the meals from the menu. The next morning at breakfast every delegate who did not enjoy his meal the previous night asks one randomly selected colleague to share his dinner impressions. If the experience was good, the unsatisfied diner also adopts this restaurant as his choice. Otherwise he simply selects another restaurant at random from those listed in the ‘Yellow Pages’.

Using this strategy it is found that very rapidly a significant number of delegates congregate around the ‘best’ restaurant in town.

Initialisation phase
    Whereby all agents (delegates) generate an initial hypothesis (select a restaurant).
loop
    Test phase
        Each agent evaluates evidence for its hypothesis (meal degustation).
        Agents are classified as active (happy diners) or inactive (disgruntled diners).
    Diffusion phase
        Inactive agents adopt a new hypothesis by communication with another agent;
        if the selected agent is also inactive, there is no information flow between
        the agents and the selecting agent must adopt a new hypothesis (restaurant)
        at random.
endloop

Abstracting from the above algorithmic process: by iterating through test and diffusion phases, agents stochastically explore the whole solution space. However, since tests succeed more often on good candidate solutions than in regions with irrelevant information, an individual agent will spend more time examining good regions, at the same time recruiting other agents, which in turn, via a positive feedback cycle, recruit even more agents. Good candidate solutions are thus identified by concentrations of a substantial population of agents.

Central to the power of SDS is its ability to escape local minima. This is achieved by the probabilistic outcome of the partial hypothesis evaluation in combination with reallocation of resources (agents) via stochastic recruitment mechanisms. Partial hypothesis evaluation allows an agent to quickly form its opinion on the quality of the investigated solution without exhaustive testing (e.g. it can find the best restaurant in town without having to try all the meals available in each).
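As a concrete illustration of the test and diffusion phases, the sketch below (Python; parameter values and names are ours, not from the original SDS papers) applies SDS to a simple best-fit string search, the same kind of task NESTOR addresses: agents hold positions in the search space as hypotheses and evaluate one randomly chosen micro-feature per iteration.

```python
import random
from collections import Counter

def stochastic_diffusion_search(search_space, target, n_agents=100, n_iterations=200):
    """Minimal SDS sketch: find the best (possibly partial) match of `target`
    in `search_space` using partial hypothesis evaluation and recruitment."""
    max_pos = len(search_space) - len(target)
    hypotheses = [random.randint(0, max_pos) for _ in range(n_agents)]
    active = [False] * n_agents

    for _ in range(n_iterations):
        # Test phase: each agent checks one randomly chosen micro-feature.
        for i, pos in enumerate(hypotheses):
            j = random.randrange(len(target))
            active[i] = search_space[pos + j] == target[j]

        # Diffusion phase: inactive agents poll a random agent; they copy its
        # hypothesis if it is active, otherwise they pick a new one at random.
        for i in range(n_agents):
            if not active[i]:
                other = random.randrange(n_agents)
                if active[other]:
                    hypotheses[i] = hypotheses[other]
                else:
                    hypotheses[i] = random.randint(0, max_pos)

    # The mode of the hypothesis distribution labels the best-fit position.
    return Counter(hypotheses).most_common(1)[0][0]

# Example: locate a mutated copy of the target inside a noisy search space.
print(stochastic_diffusion_search("xxxyhelxoworldzzz", "helloworld"))
```

With the toy data above the largest cluster typically settles on position 4, the best partial match, even though no position matches the target exactly.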

7 Comparison of SDS and NESTOR

Under specific assumptions about the probability distribution of delays and the use of multiplexing of information encoded in the spike trains, NESTOR mimics the operation of the generic SDS. The correspondence between these architectures can be established by equating the matching neurons of NESTOR with agents. However, the important difference between the two is that in SDS the activity state of a given agent is accessible to other agents, whereas in NESTOR information about the internal state of a matching neuron is only used locally. Matching neurons do not have direct access to the internal states of other neurons. Moreover, the two algorithms differ in the way they encode information. All the information necessary for the operation of matching neurons is (in principle) encoded explicitly, whereas in SDS agents can only act on the information distributed by other agents, their activity constituting implicit information. Moreover, the information encoding in SDS is in an abstract, dimensionless form, typical for the vast majority of artificial neural networks and even for more faithful neural models utilising rate coding. On the contrary, all the information processed by matching neurons in NESTOR has the physical dimension of a time interval between consecutive spikes. This characteristic introduces time delays into NESTOR and thus bears on the dynamics of this network. Another difference between the systems follows from their different modes of operation. SDS operates in synchronous mode and all agents go through their cycles of activity in parallel. In contrast, matching neurons in NESTOR operate asynchronously, which prevents a one-to-one correspondence between the largest cluster of active agents in SDS and the strongest invariant spike train pattern encoding the solution found by NESTOR. Both of the above mentioned features affect the time count in the two systems - several hundreds of time steps in NESTOR correspond to a single iteration of SDS.

Figure 4 shows an example of the operation of both SDS, panel (a), and NESTOR, panel (b), illustrating these differences. Thus, the traces denote the number of agents in the largest active cluster in the case of SDS, and the number of active neurons firing the spike train encoding the solution in NESTOR. The differences discussed above influence the quantitative differences in the dynamics of the two systems (e.g. a much larger variance in the case of NESTOR). Nevertheless, in spite of these differences, the qualitative behaviour of SDS and NESTOR is similar (in both the time course and the quasi-steady state behaviour).

Figure 4: (a) Time evolution of the largest cluster of active agents pointing to the same position in the search space in SDS; (b) time evolution of the number of active matching neurons emitting the spike train encoding the solution found by NESTOR. The difference in the information encoding mode between the two algorithms affects the different time scales as well as quantitative measures of response such as variance (see text for details). However, the qualitative behaviour of both systems is similar in both time course and quasi-steady state.

8 Discussion and conclusions

The neural network outlined in this paper performs SDS and solves the best-fit matching problem. This functionality emerges from its ability to self-organise in response to incoming stimuli. The network effectively uses a tagged dynamic assembly encoding for the target, and finding this within the search space results in the onset of time locked activity of spike trains within the neural assembly.

Micro-features defining the target are tagged both by their retinal encoding and their relative position on the retina. A label-tag encodes ‘what’ the micro-feature is and a location-tag defines ‘where’ it is. Thus the NESTOR network processes knowledge as ‘tokens’ and not as ‘types’. This contrasts with information processing in most associative networks, where knowledge is represented as simple types (defined by vectors in Euclidean space). It has been suggested by Van de Velde [69] that ‘type’ representation schemes are a fundamental cause of many of the problems encountered when modelling symbolic processes by associative networks.

NESTOR constitutes a very simplified model qualitatively demonstrating how information may be processed in a neural architecture utilising temporal encoding and multiplexing. As such, some important issues, such as the encoding of memories, need future elaboration. Instead, this paper argues for the possibility that knowledge representation in networks using temporal encoding of information may be fundamentally different from that used in classical nets. In our model the activity of single neurons does not suggest a semantic interpretation, as neurons respond to all features. However, an assembly of such neurons, locked to a particular location in the search space, does acquire a semantic interpretation, as it supports a tagged-tokenised internal representation of the object it attends to. The neurons do not constitute the internal representation in themselves, as the assembly is dynamically fluctuating, but their pattern of activity does. Thus, although an assembly supporting a particular representation at different time instants can differ considerably in the number and identity of its constituent neurons, the representation is continuous and stable over time.

Further, the allocation of neurons to the best fit of the target is analogous to an attention mechanism switching computational resources between target objects. This suggests a possible solution to the classical parallel/serial divide problem of attention theory described by Treisman [64]. In our model both types of attention coexist. Single neurons process information from the search space in parallel, and serial attention emerges when an assembly of neurons that has been focused on one area of the retina ‘locks’ to another. Moreover, the network retains one of the most fundamental properties of SDS - automatic allocation of resources, whose characteristics have been extensively investigated in [48].


References

[1] M. Abeles, Synfire chains, in M.A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, pp. 1143-1146, MIT Press, Cambridge, MA, 2002.

[2] I. Aleksander and T.J. Stonham, Guide to pattern recognition using random access memories, Computers & Digital Techniques 2:1 (1979), 29-40.

[3] W.B. Arthur, Inductive reasoning and bounded rationality (The El Farol Problem), Amer. Econ. Rev. Papers and Proceedings 84 (1994), 406.

[4] G. Ascoli, Progress and perspectives in computational neuroanatomy, The Anatomical Record (New Anat.) 257 (1999), 195-207.

[5] T. Back, Evolutionary Algorithms in Theory and Practice, Oxford University Press, 1996.

[6] H. Barlow, Intraneuronal information processing, directional selectivity and memory for spatio-temporal sequences, Network: Computation in Neural Systems 7 (1996), 251-259.

[7] J. Barnden & J. Pollack (eds.), High-Level Connectionist Models, Ablex, Norwood, NJ, 1990.

[8] M.J. Berry et al., The structure and precision of retinal spike trains, Proc. Natl. Acad. Sci. USA 94 (1997), 5411-5416.

[9] J.M. Bishop, Anarchic Techniques for Pattern Classification, Chapter 5, PhD Thesis, University of Reading, 1989.

[10] P.D. Beattie & J.M. Bishop, Self-Localisation in the ‘SENARIO’ Autonomous Wheelchair, Journal of Intelligent and Robotic Systems 22 (1998), 255-267.

[11] J.M. Bishop, Stochastic Searching Networks, Proc. 1st IEE Int. Conf. ANNs, London, UK (1989), 329-331.

[12] J.M. Bishop & P. Torr, The Stochastic Search Network, in R. Linggard, D.J. Myers & C. Nightingale (eds.), Neural Networks for Images, Speech and Natural Language, Chapman & Hall, New York, 1992, pp. 370-387.

[13] J.M. Bishop, S.J. Nasuto & K. De Meyer, Knowledge Representation in Connectionist Systems, ICANN 2002, Madrid, Spain (2002).

[14] R. Brown, Social Psychology, Free Press, New York, 1965.

[15] P. Cariani, As if time really mattered: Temporal strategies for neural coding of sensory information, in K. Pribram (ed.), Origins: Brain and Self-Organization, Lawrence Erlbaum Assoc., 1994.

[16] D. Corne, M. Dorigo & F. Glover, New Ideas in Optimisation, McGraw-Hill, 1999.

[17] E. Bonabeau, M. Dorigo & G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.

[18] M. Farah, Neuropsychological inference with an interactive brain: A critique of the locality assumption, Behavioural and Brain Sciences (1993).

[19] J. Fodor & Z.W. Pylyshyn, Connectionism and Cognitive Architecture: A Critical Analysis, in M.A. Boden (ed.), The Philosophy of Artificial Intelligence, Oxford University Press, 1990.

[20] J.A. Fodor, The Modularity of Mind, MIT Press, 1983.

[21] K.J. Friston, Transients, Metastability, and Neuronal Dynamics, Neuroimage 5 (1997), 164-171.

[22] K. Fukushima, Neocognitron: A hierarchical neural network capable of visual pattern recognition, Neural Networks 1 (1988), 119-130.

[23] R. Granger et al., Non-Hebbian properties of long-term potentiation enable high-capacity encoding of temporal sequences, Proc. Natl. Acad. Sci. USA, Oct (1991), 10104-10108.

[24] H.J. Grech-Cini & G.T. McKee, Locating the Mouth Region in Images of Human Faces, in P.S. Schenker (Ed.), Proceedings of SPIE - The International Society for Optical Engineering, Sensor Fusion VI 2059, Massachusetts, 1993.

[25] [D. Goldberg, Genetic Algorithms in search,optimization and machine learning. AddisonWesley, Reading MA, 1989.

[26] [S. Haykin, Neural Networks: A Comprehen-sive Foundation. Macmillan, New York (1994).

[27] R. Hirsh, Modulatory integration: A conceptcapable of explaining cognitive learning and pur-posive behavior in physiological terms, Psychobi-ology, 18 (1), pp: 3-15, (1990).

[28] J.H. Holland, Adaptation in natural and arti-ficial systems. The University of Michigan PressAnn Arbor MIT, 1975.

[29] M. Iosifescu, Finite Markov processes andtheir applications. Wiley, Chichester, 1980.

[30] E. M. Izhikevich, Polychronization: Compu-tation With Spikes, Neural Computation (2006)18:245-282

[31] [M.I. Jordan & R.A. Jacobs, Hierarchical mix-tures of experts and the EM algorithm. MITComp. Cog. Sci. Tech. Report 9301 (1993).

[32] [E.N. Kamas, L.M. Reder, & M.S. Ayers, Par-tial matching in the Moses illusion: Responsebias not sensitivity, Memory & Cognition, 24:6,pp. 687-699, 1996.

[33] J. Kennedy & R.C. Eberhart, Swarm Intelli-gence, Morgan Kauffman, San Francisco, 2001.

[34] [Koch, C.: Computation and the single neu-ron. Nature 385 (1997) 207-210

[35] [Koenig, P., et al.: Integrator or coincidencedetector? The role of the cortical neuron revis-ited. Trends Neurosci. 19(4) (1996) 130-137.

[36] K. De Meyer, Explorations in Stochastic Diffusion Search: soft- and hardware implementations of biologically inspired Spiking Neuron Stochastic Diffusion Networks, University of Reading, UK, KDM/JMB/2000-1, 2000.

[37] K. De Meyer, J.M. Bishop, & S.J. Nasuto, Small World Network behaviour of Stochastic Diffusion Search. ICANN 2002, Madrid, Spain, 2002.

[38] K. De Meyer, J.M. Bishop, & S.J. Nasuto, Attention through Self-Synchronisation in the Spiking Neuron Stochastic Diffusion Network. Consc. and Cogn. 9:2 (2000), 81-81.

[39] Z.F. Mainen & T.J. Sejnowski, Reliability of spike timing in neocortical neurons. Science 268 (1995), 1503-1506.

[40] M. Moglich, U. Maschwitz & B. Holldobler, Tandem Calling: A New Kind of Signal in Ant Communication. Science, 186:4168 (1974), pp. 1046-1047.

[41] T. Morey, K. De Meyer, S.J. Nasuto & J.M. Bishop, Implementation of the Spiking Neuron Stochastic Diffusion Network on Parallel Hardware (abs). Consciousness and Cognition, 9:2 (2000), 97-97, Academic Press.

[42] W.S. McCulloch & W. Pitts, A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5 (1943), 115-133.

[43] C.F. Moukarzel, Spreading and Shortest Paths in Systems with Sparse Long-Range Connections. Physical Review E 60:6 (1999), 6263-6266.

[44] D. Mumford, Neural Architectures for Pattern-theoretic Problems. In: Ch. Koch, J.L. Davies (eds.), Large Scale Neuronal Theories of the Brain. The MIT Press, London, England (1994).

[45] S.J. Nasuto & J.M. Bishop, Convergence Analysis of Stochastic Diffusion Search. Journal of Parallel Algorithms and Applications 14:2, pp. 89-107, 1999.

[46] S.J. Nasuto, J.M. Bishop & S. Lauria, Time Complexity of Stochastic Diffusion Search. Neural Computation '98, Vienna, Austria, 1998.

[47] S.J. Nasuto, K. Dautenhahn & J.M. Bishop, Communication as an Emergent Metaphor for Neuronal Operation. Lecture Notes in Artificial Intelligence 1562 (1999), 365-380.

[48] S.J. Nasuto, Analysis of Resource Allocation of Stochastic Diffusion Search, PhD Thesis, University of Reading, UK, 1999.

[49] J.I. Nelson, Visual Scene Perception: Neurophysiology. In: M.A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge MA (1995).

[50] J.I. Nelson, Binding in the Visual System. In: M.A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks, MIT Press, Cambridge MA (1995).

[51] A. Neumaier, Complete search in continuous global optimization and constraint satisfaction. In: A. Iserles (ed.), Acta Numerica 2004, Cambridge University Press.

[52] M.E.J. Newman & D.J. Watts, Scaling and Percolation in the Small-World Network Model. Physical Review E 60:6 (1999), 7332-7342.

[53] D.I. Perret et al., Visual neurons responsive to faces in the monkey temporal cortex. Experimental Brain Research 47 (1982), 329-342.

[54] S. Pinker & A. Prince, On Language and Connectionism: Analysis of a Parallel Distributed Processing Model of Language Acquisition. In: S. Pinker, J. Mahler (eds.), Connections and Symbols, MIT Press, Cambridge MA, (1988).

[55] T. Poggio & F. Girosi, Networks for approximation and learning. Proceedings of the IEEE 78 (1990), 1481-1497.

[56] F. Rosenblatt, Principles of Neurodynamics. Spartan Books, Washington DC (1962).

[57] E.T. Rolls & M.J. Tovee, Processing speed in the cerebral cortex and the neurophysiology of visual backward masking. Proc. Roy. Soc. B 257 (1994), 9-15.

[58] D.E. Rumelhart & J.L. McClelland (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge MA (1986).

[59] T.J. Sejnowski & C.R. Rosenberg, Parallel networks that learn to pronounce English text. Complex Systems 1 (1987), 145-168.

[60] T.J. Sejnowski, Time for a new neural code?, Nature 376 (1995), 21-22.

[61] B. Selman et al., Challenge Problems for Artificial Intelligence. Proceedings of AAAI-96, National Conference on Artificial Intelligence, AAAI Press, 1996.

[62] G.M. Shepherd, The Synaptic Organisation of the Brain. Oxford University Press, London, Toronto, 2nd Ed., (2004).

[63] W.R. Softky & C. Koch, The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J. of Neurosci. 13 (1993), 334-350.

[64] A. Treisman, Features and Objects: The fourteenth Bartlett memorial lecture, The Quarterly Journal of Experimental Psychology, 40A:2, pp. 201-237, 1988.

[65] A.M. Thomson, More Than Just Frequency Detectors? Science 275, Jan (1997), 179-180.

[66] S.J. Thorpe & M. Imbert, Biological constraints on connectionist modelling. In: R. Pfeifer, et al. (eds.), Connectionism in Perspective, Elsevier (1989).

[67] D.J. Watts & S.H. Strogatz, Collective Dynamics of Small-World Networks. Nature 393 (1998), 440-442.

[68] R.M. Whitaker and S. Hurley, An agent based approach to site selection for wireless networks. Proc. ACM Symposium on Applied Computing (Madrid), (2002), 574-577.

[69] F. Van der Velde, On the use of computation in modelling behaviour, Network: Computation in Neural Systems, 8, pp. 1-32, 1997.

[70] P. Wegner, Why Interaction is More Powerful than Algorithms. CACM, May (1997).

[71] R. Yuste, Dendritic shock absorbers, Nature, 387, pp. 851-853, 1997.
