
FLEXIBLY PROGRAMMABLE OPTO-ELECTRONIC ANALOGIC CNN COMPUTER (POAC) IMPLEMENTATION APPLYING AN EFFICIENT,

UNCONVENTIONAL OPTICAL CORRELATOR ARCHITECTURE

SZABOLCS TŐKÉS, LÁSZLÓ ORZÓ, AHMED AYOUB AND TAMÁS ROSKA

Analogical and Neural Computing Laboratory, Computer and Automation Research Institute, Hungarian Academy of Sciences,

Kende u. 11, H-1111 Budapest, Hungary. Email: {Tokes, Orzo, Roska}@sztaki.hu

http://www.labanalogic.sztaki.hu/index.html

Received (received date) Revised (revised date)

Accepted (accepted date)

To overcome shortcomings of the currently available optical computer architectures, a programmable opto-electronic CNN array computer (POAC) is introduced. It combines the supreme resolution and complexity of optical correlators with the versatility of single-chip adaptive visual CNN-UM sensor and processor arrays. The system is grounded on an unconventional optical correlator architecture, a modified joint Fourier transform correlator. This architecture uses angular coding of the template pixels and applies the template not in the recording phase but in the reconstruction phase of the correlation. The feasibility of this concept has been successfully demonstrated by a breadboard model. It computes the correlation in two consecutive steps: first, a hologram of the input image is recorded in a dynamic holographic material (we used chemically modified bacteriorhodopsin film samples); next, angularly coded read-out beams reconstruct appropriately weighted and shifted replicas of the stored input image. The superposition of these copies produces the correlogram. A programmable adaptive sensor array, a special visual CNN-UM chip, detects the resulting correlogram, so both the sensing and the required post-processing steps are accomplished in a parallel way. The introduced opto-electronic architecture provides exceptional synergy and a good compromise between the applicable technologies and the system performance.

1. Introduction

Target recognition and tracking tasks, including image processing and understanding algorithms, require enormous computation power. Real-time applications in particular have proven to be impossible with currently available technologies or those expected in the near future. However, analog technologies, neglected so far, provide a solution for these tasks. A great number of algorithms use integral operations (convolution, correlation)1. Obviously, these local operations can be handled with high efficiency within the cellular neural/nonlinear network (CNN) paradigm2,3. The CNN concept can be implemented effectively by analog VLSI technology4,5,6,7,8. Uniquely designed, special-purpose CNN microcircuits grant extremely high performance9,10,11,12. Furthermore, stored programmability of the CNN circuits is also tractable13, and this has led to the concept of the CNN universal machine (CNN-UM)14, which radically extends the applicability of this paradigm10. Several VLSI CNN-UM implementations exist11,15,16,17,18 and confirm the extreme capabilities of this powerful principle19. Even digital CNN-UMs20 have been constructed. Due to the built-in local analog memories and the implemented feedback templates, these chips can be applied successfully as general-purpose image processing14 and complex

pattern generating units21. Using second-order CNN implementations22, even wave-based algorithms become efficiently solvable23,24. Notwithstanding, in VLSI CNN implementations the applicable template size and resolution are restricted. Due to chip surface limitations, a compromise has to be made between the size of the CNN cells and their complexity. This means that so far only nearest-neighbor templates have been implemented and a 128x128 pixel resolution has been achieved25. It has to be noted that, even with these restrictions, the VLSI-implemented CNN performs much better and possesses a computational power that is several orders of magnitude higher than that of its contemporary digital processing counterparts19. This is the consequence of its massively parallel architecture, where image flows can be handled. Local optical sensors provide parallel input to the visual CNN chip15,26. As these sensors can be controlled according to the state of the corresponding CNN cell, an adaptive sensor array is realizable27,28. Being a locally connected analog sensor array, the CNN appears to be an ideal modeling tool for visual systems29 and might provide an optimal framework for bionic eye implementations30. To overcome the above-mentioned shortcomings of the VLSI CNN implementations, namely the restricted size of both the input image and the templates, other types of implementations are also considered. Optical implementations offer much larger input image resolutions and template sizes. However, so far only a few attempts have been made to implement a CNN system optically31,32,33,34,35,36,37,38,39,40,41, and some of them were based mainly on numerical simulations35. To maintain the high degree of parallelism and to avoid the serial transfer bottleneck, all-optical implementations are favored. However, the feedback operations and the appropriate nonlinear functions are hard to implement optically42,43,39. Furthermore, the VanderLugt type correlators used so far are not flexibly programmable and are extremely sensitive to misalignments of the optical elements44. A hybrid opto-electronic implementation, based on a joint Fourier transform correlator (JTC)45,46,47, however, combines the outstanding parallelism and speed of the optical correlator with the stored programmability, built-in nonlinearities and local analog storage capabilities of a VLSI CNN-UM device. In this new conceptual system an optical correlator provides extremely high computational power, while a visual CNN-UM chip detects, measures and adaptively thresholds the optical system's output, and performs the necessary post-processing tasks in a parallel way. However, there is a contradiction between the high resolution of the input image (~1000x1000) and the moderate resolution (128x128) of the adaptive detector. An appropriate detector cell should have high processing complexity (as is the case for the visual CNN-UM chip), high resolution (as in CCD or CMOS cameras) and high speed. Since no currently available detector array possesses all of these features simultaneously, we have to develop an all-optical method to overcome this problem. The solution is a coherent post-processor applied to a t2-JTC, described in a paper to be published (A novel coherent/incoherent correlator for correlogram post-processing). In the present paper we introduce a new, unconventional optical correlator (a modified JTC)a architecture48,49,50 to implement the above-introduced programmable opto-electronic analogic CNN computer (POAC).
First the objectives of this task are defined. Next we compare the optical correlator architectures utilized so far with our correlator, which is the basis of the optical CNN implementation. We show that the recently

a This is the so called t2-JTC, where t2 means that the template is applied in the holographic reconstruction step of the correlation, while for the traditional JTC we introduce the t1-JTC notation. Here, t1 labels a template used in the recording step.

developed optical correlator architecture unifies the advantages of the earlier correlators while avoiding their drawbacks. Special advantages of this architecture are also discussed. Simulation and experimental results, validating the capabilities of the invented setup, are presented. The measurements on the proof-of-principle model of the POAC system are also shown. A few examples of applications illustrate that there are computationally intensive tasks that can be executed much more efficiently with a POAC than by any other means. Further development ideas for the system are proposed, and the computational power of this hybrid opto-electronic CNN implementation is estimated.

2. POAC: the Programmable Opto-electronic Analogic CNN Computer

Optical correlators provide extreme computational power due to their inherently massive parallelism (up to 10^7 parallel channels). Electronically or optically addressable spatial light modulatorsb (ESLM or OASLM) can be used to display input images and templates51,52. To detect the output, different CCD or CMOS sensors can be used. Unfortunately, these devices have only serial download and upload capabilities, which seriously restricts the entire system's performance. In spite of the fact that the processing is performed at the speed of light in a vastly parallel optical way, the serial nature of the input and output forms a data transfer bottleneck. To overcome this problem we have introduced the concept of the programmable opto-electronic analogic CNN computer (POAC), which is based on the successive application of a massively parallel optical processor (an optical correlator) and the use of a programmable CNN sensor and processor array, a CNN-UM chip. In this way the number of necessary serial data transfers is considerably reduced. Applying an OASLM as an incoherent-to-coherent converter to display input images, the system's parallelism can be further enhanced. Furthermore, the CNN paradigm provides a general programming frame for the optical correlator itself. Since the basic operation of the CNN is convolution or correlation, the optical correlator itself can be considered a special, feed-forward-only CNN machine. The difference between the correlation and convolution operations (on real-valued functions) can be handled simply by flipping the template around the x- and y-axes. So the whole system can be discussed and handled within the framework of the CNN paradigm.
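As a minimal numerical sketch of this last remark (illustrative arrays only, not part of the POAC hardware), the following Python fragment checks that cross-correlating a real-valued image with a template gives the same result as convolving it with the template flipped along both axes:

    import numpy as np
    from scipy.signal import convolve2d, correlate2d

    # For real-valued images, 2-D cross-correlation with a template equals
    # 2-D convolution with the template flipped along both axes.
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    template = rng.random((5, 5))

    corr = correlate2d(image, template, mode='same')
    conv_flipped = convolve2d(image, template[::-1, ::-1], mode='same')
    assert np.allclose(corr, conv_flipped)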

3. Optical Correlator based CNN-UM Implementations

Several optical CNN implementation architectures have emerged in the last decade34,37,32,38,53. All of them are based on some kind of optical correlator. Although these optical implementations have high resolution and a very high degree of parallelism54, they were not flexibly programmable. Most of these implementations apply the so-called VanderLugt scheme (see later). Their main limitation is programmability, since this scheme requires a series of prefabricated, complicated holographic filters. As nonlinearity is hard to implement optically, presently only feed-forward CNN operations are realized in all of the optical implementations. There are some proposals for the realization of optical nonlinear feedback operations, but these remained at the level of digital simulation53,55. However, in opto-electronic implementations, the embedded electronic sensor arrays can introduce the necessary nonlinearity. Feedback operations are frequently used in diverse CNN algorithms. However, if the applied templates do not

b We used a liquid crystal micro display and a two-dimensional acousto-optical deflector as input devices for the image to be processed and for the template, respectively. However, other types of SLMs, such as MEM mirrors or VCSEL microlaser arrays, can also be used.

push the cells into the nonlinear regime, as in different spatial filtering tasks, it is easy to replace them by big feed-forward templates. Otherwise, these operations (such as special pattern-generating, wave-based algorithms) do not really require large template sizes, and so the built-in VLSI CNN-UM can implement them efficiently within the POAC. To compare the different optical architectures, a short summary of them is given below.

4. Optical Correlator Architectures

We can differentiate between two fundamental families of correlators: incoherent and coherent correlators. However, the most advanced coherent correlators can be further classified and, as we have pointed out, even JTC correlators can work in fully coherent and in semi-coherent mode56,57.

4.1. Incoherent correlators

Unlike that of the coherent correlators, the operation of incoherent correlators (with a few exceptions58,59) is based mainly on shadow casting; diffraction is therefore not utilized but only degrades the system precision. This property severely confines the attainable space-bandwidth product and thus the overall performance of the incoherent architectures. It has to be noted, however, that incoherent correlators eliminate the otherwise inescapable coherent speckle noise.

4.2. VanderLugt correlator

Due to its simple structure, the majority of the commercially available coherent optical correlators60,61 are based on the so-called VanderLugt (VLC) 4f scheme44. In this architecture a previously calculated Fourier holographic filter extracts the correlogram of a reference object (template) from the input image into the back focal plane of a Fourier transforming lens (see Fig. 1.).

Fig. 1. The VanderLugt type of optical correlator scheme is a so-called 4f system.

The operation of the VLC can be described by the Fourier calculus:

Corr(x, t) = F⁻¹(F(x) ∗ F*(t))      (1)

where x and t denote the input image and the template; F and F⁻¹ the Fourier and inverse Fourier transformations; and ∗ and * multiplication and complex conjugation, respectively.
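For illustration, Eq. (1) can be evaluated numerically with discrete Fourier transforms; the sketch below uses illustrative array sizes and a zero-padded template, and checks one output sample against a directly computed shifted sum (all names and sizes are assumptions, not setup parameters):

    import numpy as np

    # Eq. (1) evaluated with discrete Fourier transforms (circular correlation).
    rng = np.random.default_rng(1)
    x = rng.random((128, 128))
    t = np.zeros_like(x)
    t[:8, :8] = rng.random((8, 8))        # template zero-padded to the image size

    corr = np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(t))).real

    # Spot check of one output sample against an explicit shifted sum.
    shift = (5, 7)
    direct = np.sum(x * np.roll(t, shift, axis=(0, 1)))
    assert np.isclose(corr[shift], direct)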

The main speed-limiting factor of the VanderLugt correlator is the restricted pace of the filter-hologram display. This includes the cumbersome calculation of the hologramc, which can take even more time than the direct calculation of the cross-correlation. Thus real-time generation of the holographic matched filter cannot be achieved. Consequently, the application of a VLC system seems useful only for tasks where a few, previously defined reference objects or templates are to be searched for. Furthermore, the VLC is extremely sensitive to the precise alignment of its elements.

4.3. Joint Fourier transform correlator (t1-JTC)

Another coherent optical correlator architecture is known, which is based on recording the joint power spectrum of the input image and the template (reference object); this joint power spectrum is a Fourier hologram. A square-law detector (photosensitive film, a sensor device such as a CCD or CMOS, an OASLM or some kind of dynamic holographic medium) can record this joint spectrum. There are two consecutive steps in the operation of the JTC62,46,47. In the recording step both the input image and the key template have to be displayed simultaneously and illuminated with mutually coherent beams. In the read-out (hologram reconstruction) step this hologram is illuminated with a coherent plane wave (see Fig. 2). The reconstructed wavefront is inverse Fourier transformed by the second Fourier optic lens. The off-axis correlogram and convolution terms are formed in the back focal plane of this lens, in addition to the on-axis autocorrelation terms.

Fig. 2. Scheme of the JTC optical correlator architecture. The hologram of the joint power spectrum of the input image and the template is fixed dynamically in a square-law detector (e.g. OASLM, bacteriorhodopsin). A parallel coherent laser beam reads out the hologram and an inverse Fourier transforming lens produces the correlogram.
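A rough numerical sketch of these two JTC steps (recording the joint power spectrum of a side-by-side input/reference scene, then reconstructing it with a plane wave) may help; the geometry, sizes and positions below are illustrative assumptions only:

    import numpy as np

    rng = np.random.default_rng(2)
    N = 256
    scene = np.zeros((N, N))
    patch = rng.random((32, 32))
    scene[96:128, 40:72] = patch          # input image, left part of the joint plane
    scene[96:128, 140:172] = patch        # reference (template), 100 pixels to the right

    # Step 1: square-law recording of the joint Fourier spectrum (the hologram).
    joint_power_spectrum = np.abs(np.fft.fft2(scene)) ** 2

    # Step 2: plane-wave read-out and inverse Fourier transform.
    output_plane = np.fft.fftshift(np.abs(np.fft.ifft2(joint_power_spectrum)))

    # On-axis autocorrelation term and off-axis cross-correlation peak
    # (displaced by the 100-pixel image/reference separation).
    center = N // 2
    print(output_plane[center, center], output_plane[center, center + 100])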

No pre-calculated holographic matched filter is needed in this architecture. Reprogramming is performed by recording a new hologram with the same or a new input image, but with a new t1 template. Unfortunately, this means that for every correlation operation a hologram-recording step is necessary, which, due to the inevitable photon-electron conversion, confines the speed of the system. However, the speed of the display and of the sensor usually sets a more severe limit on the system speed than the hologram

c This can be a computer-generated, high space-bandwidth Fourier-transform hologram of the template that has to be stored and displayed on an ESLM; in this case the majority of the resolution is lost to represent the carrier frequency. Or it can be prefabricated, which leads to inflexibility. If it is calculated optically, then it has to be pre-recorded. This latter version can be the fastest and most flexible solution; however, it requires a fast and sensitive dynamic holographic material.

recording. The speed of the common liquid crystal optically addressable spatial light modulators (LCD OASLM)63,64, which are frequently used as dynamic holographic media, is too low for high-speed applications. Furthermore, this type of OASLM device has too low a space-bandwidth product to store high-resolution joint spectra (in particular, the high spatial frequency carrier of the off-axis holograms consumes a large part of the SBWP). There are other types of dynamic holographic devices and materials that have the required resolution, but they demand much more laser power for the hologram recording44,65,66. A finely tuned balance has to be kept between the image and template illuminations to ensure the appropriate utilization of the holographic material's limited dynamic range in a JTC setup. This balance can be ensured more easily in a dual axis JTC system than in a single axis one67. Furthermore, a dual axis JTC arrangement optimizes the utilization of the input display (SLM) surface, but places further requirements on the resolution of the dynamic holographic medium. The next figure shows a dual axis JTC arrangement (Fig. 3).

Two complementary aspects of the JTC: the joint correlation can be discussed from two different viewpoints, either in the spatial domain or in the spatial-frequency domain. In the spatial (signal or image) domain view, the correlogram is produced by shifting and weighting the holographically reconstructed input image according to the position and intensity of the template pixels, respectively. To complete the correlation process, summing of these superposed images is required (a detailed analysis of the coherent and incoherent JTC will be given in one of our next papers). In the spatial-frequency domain, the joint correlation can be handled by the Fourier calculus, as is usual for linear systems. Fine distinctions have to be made among the mathematical models of the different optical implementations. It is important to mention that many articles have been published recently on hybrid JTCs. In these experiments, the joint hologram is usually picked up by a CCD camera, digitally processed and then displayed on an ESLM to form the correlogram. These correlators are useful for studying the nature of the JTC, but having a severe serial transfer bottleneck they lose their parallelism, so they can have little practical value. Our goal is to develop an all-optical or nearly all-optical correlator to fully exploit the advantages of optical array processing.
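The equivalence of the two views can be illustrated numerically: summing copies of the input image, each shifted according to a template pixel and weighted by its value, reproduces the Fourier-calculus correlation. The sketch below uses illustrative arrays; the sign of the shift decides whether the superposition yields the correlation or the convolution.

    import numpy as np

    rng = np.random.default_rng(3)
    image = rng.random((64, 64))
    template = np.zeros((64, 64))
    template[:3, :3] = rng.random((3, 3))           # small template, zero-padded

    # Spatial-domain view: one shifted, weighted copy of the image per template pixel.
    superposition = np.zeros_like(image)
    for (i, j), w in np.ndenumerate(template):
        if w != 0.0:
            superposition += w * np.roll(image, (-i, -j), axis=(0, 1))

    # Spatial-frequency (Fourier-calculus) view of the same correlation.
    corr = np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(template))).real
    assert np.allclose(superposition, corr)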

Fig. 3. Dual axis architecture for holographic recording

Comparing the VanderLugt and JTC types of systems, their advantages and drawbacks can be summarized as follows68:

Advantages of the JTC realization:
• Spatial-domain filtering is used (no previous calculations and no preprocessing are needed to synthesize the Fourier-domain filters required for the VLC).
• The two-axis JTC has a higher space-bandwidth product (SBWP).
• It can have a higher modulation depth.
• It is suitable for real-time applications.
• It is much more robust against vibrations and misalignment than the VLC, and its robustness is comparable with that of the incoherent correlators.

Drawbacks of the JTC:
• It suffers from moderate (lower) detection efficiency when applied to multiple-target recognition or to targets embedded in intense background noise.
• High lateral coherence is required, but a large coherence length is not needed.
• The JTC method is less efficient from an energetic point of view, as the first-order diffracted beam, which provides the desired correlation, carries only about 1/8 of the incident energy.

5. t2-JTC: Unconventional Optical Correlator Architecture (with angularly coded t2 templates)

To overcome the limitations of the above-mentioned optical correlators we have developed an architecture which combines their main advantages while avoiding their hindrances. Since a flexibly programmable optical CNN implementation is desired, a dual axis JTC based architecture has been chosen. Applying a new idea, we were able to radically increase the capabilities of this system. It has been recognized that the recorded joint power spectrum can be regarded as a coherent superposition of input image holograms, one for each template pixel acting as a reference. Furthermore, the read-out beam can be regarded as the Fourier transform of a single pixel. From this point of view we can reverse the operation of the whole system. If we use only a single-pixel 'reference object' (resulting in a parallel reference beam), then simply a hologram of the input image is recorded in the holographic medium. The read-out beams, however, are modulated according to the reference template pixels to obtain the correlation. The appropriate modulation can be achieved easily by a simple optical system: the proper angular coding of the template is ensured by the application of a relatively low-resolution display and a suitably positioned Fourier transform lens. The Fourier transform of any template pixel is an appropriately deflected parallel beam weighted by the corresponding pixel's gray-scale value. These beams reconstruct shifted and weighted copies of the input image, and their superposition results in a correlogram. This superposition can be either coherent or incoherent, depending on the mode of operation. If the t2 template pixel beams are individually coherent but mutually incoherent, then the holographically reconstructed images are summed incoherently at the photodetector pixels. In this incoherent mode of correlation, most of the coherent noise is eliminated. This is especially advantageous if different kinds of phase errors are present in the system (as is frequently the case in custom liquid crystal displays). If we display the t2 template pixels in a mutually coherent way, then the complex amplitudes of the reconstructed images are summed by interference. So, the

optical field explicitly contains the correlogram, which is superior to the incoherent correlogram. By spatial filtering we can further enhance the discrimination capability of the coherent correlogram.
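A toy comparison of the two read-out modes is sketched below, with illustrative complex fields carrying random phase errors standing in for the reconstructed copies; it is not a model of the actual optics.

    import numpy as np

    rng = np.random.default_rng(4)
    n_pixels, n_detector = 9, 1024
    # One complex reconstructed field per template pixel, with random phase errors.
    amp = rng.random((n_pixels, n_detector))
    phase = rng.uniform(0.0, 2.0 * np.pi, (n_pixels, n_detector))
    fields = amp * np.exp(1j * phase)

    coherent = np.abs(fields.sum(axis=0)) ** 2       # amplitudes interfere, then detection
    incoherent = (np.abs(fields) ** 2).sum(axis=0)   # each copy detected, intensities add

    # With random mutual phases the coherent sum fluctuates strongly (speckle-like),
    # while the incoherent sum is insensitive to the phase errors.
    print(coherent.std() / coherent.mean(), incoherent.std() / incoherent.mean())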

Fig. 4. Scheme of the unconventional optical correlator architecture ((t1+t2)-JTC). Here, the hologram of the input image, recorded with a t1 reference object, is dynamically fixed within the JTC scheme. Additional programming (by template t2) is carried out in the read-out step. In this second step there is no need for photoelectric conversion, so an extremely high correlation speed can be achieved.

5.1. (t1+t2)-JTC architecture

Within this scheme even combined, double-template operations can be implemented: first the joint power spectrum of the input image and a primary template t1 is recorded, then it is read out using a beam modulated according to the Fourier transform of a secondary template t2. In this way even more complicated operations, combining convolution and/or correlation, can be implemented (e.g. the correlation of t2 with the correlogram of t1 and the input image s(x,y)).
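A minimal numerical sketch of this double-template operation, reusing the Fourier-calculus correlation of Eq. (1) (array sizes are illustrative assumptions):

    import numpy as np

    def xcorr2(a, b):
        # Circular cross-correlation via the Fourier calculus of Eq. (1).
        return np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real

    rng = np.random.default_rng(5)
    s = rng.random((128, 128))
    t1 = np.zeros_like(s)
    t1[:5, :5] = rng.random((5, 5))
    t2 = np.zeros_like(s)
    t2[:7, :7] = rng.random((7, 7))

    # Recording step applies t1, read-out step applies t2:
    double = xcorr2(xcorr2(s, t1), t2)   # correlation of t2 with the correlogram of t1 and s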

6. Special Advantages

Although the t1-JTC architecture provides flexible programmability, it requires the recording of a hologram for each correlation. Since usually at least one of the recording medium parameters (speed, resolution, sensitivity) is severely confined, the hologram-recording step limits the whole system's performance (all of these requirements should be satisfied simultaneously). With the t2-JTC architecture this limitation has been alleviated. A great number of correlation computation steps can follow each hologram-recording step, and in this way a high-resolution but relatively less sensitive holographic medium can be used. (1000 correlations/sec can be executed in our present implementation, where a two-dimensional acousto-optical deflector displays the t2 templates in a mutually incoherent way. Presently the detector speed limits the system's performance.) The reconstruction steps do not require photon-electron conversions, therefore they can be completed at the speed of light. Due to the quasi-Fourier arrangement (the holographic medium is placed only in the neighborhood of the Fourier plane) the correlator can better utilize the restricted dynamic range. The whole system is more robust and less sensitive to the misalignment of its parts than any other type of correlator.

The use of "reconstructing template t2 " introduces the following possibilities: (i.) Only a single (not superimposed) hologram has to be recorded in the back focal

plane of the first Fourier optic lens using a point source reference. (ii.) The t2 templates can be applied in the reconstruction step, which is much more

simple to implement. This also helped us to maintain the ideal energy balance between the object beam and reference beam.

(iii.) The use of t2 template enables us to speed up the correlation “calculation” process. (iv.) Different t2 templates can be applied at high frequency on the same input scene

without the need of repeated holographic recording. With this method we can realize scale, distortion and rotation invariant pattern recognition.

(v.) Either fully coherent or mutually incoherent reconstructions are possible, involving further advantages. This makes it possible to fulfill coherent post-processing to improve the discrimination capabilities.

(vi.) Applying both t1 and t2 templates simultaneously on an object-scene or applying different t2 templates on the same JPS (recorded with the same object and t1 template pair) double correlation (correlation of t1 on the object and correlations and convolution of t2 on the first correlogram) can be implemented. Furthermore, this way multi-scale correlation can be implemented with novel pattern and class recognition capabilities.

(vii.) The use of preprocessed (e.g. differentiated, contoured) t2 templates enhances the discrimination capability.

7. The Proof-of-principle Model

To test the new optical correlator architecture a breadboard model has been built. Although not all the final elements of a POAC system are built in yet, its feasibility has been successfully demonstrated. The schematic structure of the optical correlator can be seen in the next figure (Fig. 5). It computes correlation in two consecutive steps: first a hologram of the input image is recorded; next, according to each template pixel, a shifted and weighted reconstruction of the input image is formed. The superposition of these shifted and weighted copies results in a correlogram. Several correlation steps can be performed after each hologram recording. The space-bandwidth product of the optically addressable spatial light modulator (LCD OASLM) applied as a square-law detector (holographic recording device) in the earlier experimental setup was too low. Therefore, alternative high-resolution dynamic holographic materials were considered to increase the resolution of the system. First a special photochromic polymer film was examined69,70,71, but due to this material's wavelength and power requirements, an alternative dynamic holographic medium, bacteriorhodopsin72, was chosen. It turned out to be an excellent choice (details of this material can be found later). A 2D acousto-optical deflector was used to display the read-out t2 template. With this device it was easy to set the required angular deflections, and its speed was superior to that of any contemporary micro-display device for relatively small-resolution templates. A telescopic system simultaneously transformed the spot size and the angle of the pixel beams impinging onto the hologram.

He-Ne lasers were applied both for hologram recording and for reconstruction (10 mW and 6 mW, respectively). Their coherence length (~10 cm) and power were suitable for the preliminary measurements.

[Fig. 5 layout: Laser #1 (write) and Laser #2 (read) with beam expanders; input image LCD (input); beam splitter, mirrors and semi-transparent mirror; 2D AO deflector (program); imaging and Fourier transforming lenses; BR dynamic holographic recording material; CNN-UM detector (output).]
Fig. 5. The POAC architecture consists of a new programmable optical correlator and a VLSI CNN-UM chip with special optical input and on-chip processing capabilities. Correlation is determined in two consecutive steps: first a hologram of the input image (600x800) is recorded (indicated by bright gray color in the figure) in a dynamic holographic medium (bacteriorhodopsin film). During the subsequent step(s), suitably modulated laser beams (dark gray color), corresponding to the angularly coded t2 template pixels, reconstruct appropriately shifted versions of the input image and produce the correlation by incoherent summation. A high-speed two-dimensional acousto-optical deflector accomplishes this angular coding. In this way a large number of CNN template operations (1000 frames/second with 32x32 templates) can be executed on the recorded high-resolution input images.

Reflection-type (Lippmann-Denisyuk) holograms73 were recorded in this setup, but transmission (Bragg) type holography has also been tested. In the preliminary measurements we applied a CCD camera with a frame grabber. Although this sensor was an adequate measuring device, it limited the speed of the whole setup to 30 frames/second. In the further development of the POAC this sensor will be replaced by a visual CNN sensor and processor array. Hence, not only will the sensor speed limitations be alleviated, but an adequate post-processing unit will also be included. A photo of the breadboard model can be seen in the next figure.

Fig. 6. Photograph of our semi-incoherent optical CNN setup.

8. Applied Materials and Devices

8.1. Bacteriorhodopsin

Bacteriorhodopsin (BR) is a fast, high-resolution dynamic holographic material74,75,76. BR is a stable transmembrane protein of an archaeon, Halobacterium salinarum (formerly Halobacterium halobium). This microorganism lives in salt marshes and prefers extremely high sodium chloride concentrations. BR itself is a photon-driven proton pump. BR molecules consist of seven α-helical sub-domains. These parts of the protein form a pocket, which is occupied by a co-factor, the retinal molecule. The structure of the BR molecule has been intensively studied and is relatively well known. BR molecules aggregate into a uni-axially oriented, hexagonal trimer structure. This hexagonal, crystal-like structure assembles into large clustered patches in the membrane. After absorption of a photon BR goes through a sequence of chemical and physical changes, including the retinal molecule's all-trans to 13-cis isomerization, shifts of charges within the protein, movement of protons (that is, protonation and deprotonation), and several conformational changes as well. These characteristic changes are reflected primarily in the changes of BR's absorption spectrum and in the related refractive index changes. Different states have different lifetimes and characteristic absorption spectra. After photon absorption and the proton transfer steps, sooner or later BR returns to its initial conformation. The whole sequence of transitions is called the photocycle of BR77,78,79,80,81. The quantum yield of the photocycle is relatively high (0.67). The above-mentioned characteristic changes in the absorption spectra can be used for analog optical storage. The maxima of the absorption spectra of the main intermediate states in the BR photocycle and the characteristic lifetimes of the different states in wild type, unmodified BR are depicted in the next figure (Fig. 7).

[Fig. 7 diagram: photocycle states bR568 → J625 → K600 → L550 → M412 → N520 → O640 → bR568, with transition times of roughly 500 fs, 10 ps, 2 µs, 40 µs, 7 ms and 4 ms along the cycle; intense blue light drives M412 back to bR568 in about 50 µs.]
Fig. 7. Transition times of the BR photocycle states. Subscripts denote the maxima of the absorption spectra. Intense blue illumination drives the BR from the M state to the bR state promptly, avoiding the slow thermal relaxation. In this way the whole photocycle can be completed within 100 µs. The big arrowheads indicate the main direction of the transitions, while the small ones denote that some fraction of the BR is in the photocycle's preceding state.

By knowing the BR’s structure and its detailed molecular dynamics, even its molecular engineered modification is possible. Different biologically, chemically and physically modified BR forms are available. For practical application several different types of utilization of BR have been suggested82,83. BR has extremely high spatial resolution: higher than 5000 lines/mm84. This resolution even exceeds the commonly applied optical system’s capabilities. This high spatial resolution assures the feasibility of BR based dual axis JTC implementation. Sensitivity of the material is relatively low (0.1-80 mJ/cm2), but it is even better than any other holographic-materials with comparable resolution (e.g. photo-polymers). Speed of BR writing can be remarkably high, but of course, it depends on which states or transitions of the photocycle are utilized. Usually the BR’s initial state (bR) and another relatively stable intermediate (M) states are used. The bR to M transformation takes about 40 µsec. This speed seems to be satisfactory for our transient holographic recording purposes, but using other states and different modified forms of BR even much higher speed can be achieved. One of the main problems with the utilization of BR as a temporary holographic material is the relatively long thermal recovery time. M to bR transition can take several hundreds of milliseconds and even seconds depending on the different physical and chemical conditions of modified BR samples. To overcome this obstacle: intensive blue light drives back the BR’s M state to bR state within 50 µsec85. Non-coherent, intense blue illumination is relatively easy to produce. BR can be written and read more than one million times without considerable degradation. It is stable for months or years and persists in harsh environmental conditions: it is resistant against heat and irradiation, protected against oxygen and stable in a really wide range of pH (from pH 1 to 10). To exploit the high-density information storage capability of holographic memories not only high-resolution but a large dynamic range is also needed. A fast two-level system, like BR can serve as an excellent dynamic holographic material 86,87. A total cycle of a dynamic holographic process in an optical processor consists of three steps: writing, reading and erasing. As Seitz and Hampp88 proved it, the bacteriorhodopsin can be regarded – with a good approximation – as an ideal two-state system. We use the bR⇒M transition for writing. Recording a hologram with a 568nm or 633nm laser takes about 50 µs depending on the applied intensity. This is followed by the reading (hologram reconstruction) step using a different wavelength. Using a properly chosen wavelength (e.g. 670nm) non-volatile read-out is possible. The absorption at this wavelength is negligible but the refractive index modulation, resulting in high diffraction efficiency, is still satisfactory. An intensive flash of blue light (λ<420nm) performs erasure by driving back the M states to the bR state. This way the total cycle takes about 100 µs, resulting in a frame rate of 10 kHz. However, in our system, due to the limitations of the available laser power and display speed BR is written only at video rate. To ensure the required massive parallelism high angular bandwidth is needed. Consequently, high spatial resolution holographic material is needed with low angular selectivity. This later requirement means that the maximal thickness of the holo-material must be restricted. 
Up to this thickness the diffraction efficiency must not fall below half of its maximum (η50), even for the largest angular Bragg mismatch due to the outermost skew rays from the object and template.
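A rough timing-budget sketch of the write/read/erase cycle quoted above; the write and erase durations are taken from the text, while the read-out duration is an assumed placeholder:

    # Rough timing budget for the BR write/read/erase cycle.
    write_s = 50e-6    # bR -> M recording at 568 nm or 633 nm, ~50 us (from the text)
    read_s = 1e-6      # non-volatile read-out near 670 nm (assumed duration)
    erase_s = 50e-6    # blue flash drives M back to bR, ~50 us (from the text)
    cycle_s = write_s + read_s + erase_s
    print(f"total cycle ~ {cycle_s * 1e6:.0f} us -> ~ {1.0 / cycle_s:.0f} holograms/s")
    # ~101 us per cycle, close to the ~100 us / 10 kHz frame rate quoted above.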

Fig. 8. Several different chemically modified (TEA) BR samples have been developed, fabricated and tested. The absorption spectra and optical density of the samples change with the concentration of the chemical dopant and with the films' thickness.

To test the applicability of BR as a holographic recording material in the POAC system, several measurements have been made. In the experiments several different chemically modified (TEA)89, dry BR films with 20-40 µm thickness were used. Figure 8 depicts the absorption spectra of the used samples depending on their thickness and optical density (controlled also by the chemical environment). It should be noted that the overall speed, and so the time-dependent sensitivity, of the photocycle changes in the different modified BR forms (wild type, genetically modified, chemically modified TEA) and under different physical conditions (humidity, temperature, etc.). The dynamic behavior (the process leading to the steady-state condition) was only qualitatively studied, while the resolution of the BR samples after reaching the steady-state condition was measuredd by self-diffraction. Self-diffraction measurements provide information only about the diffraction efficiency of the -1 order. The resolution, which was measured by the diffraction efficiency as a function of the angle of the interfering beams, showed a steep decrease with growing angle. Using shutters we measured the steady-state diffraction efficiency of both the -1 and the +1 orders. We found that the intensities of the positive and negative first-order diffraction were not symmetric functions of the angle. This was foreseeable, because with increasing angle we moved from the Raman-Nath region to the Bragg domain. Our measurements were in good agreement with Hampp's previous results, showing that the +1 order, which we can utilize in the correlator, couples out more and more light from the -1 order as the effective thickness of the hologram grows with increasing angle. However, these steady-state measurements have to be extended into the short-time dynamic regions to get data for our fast processing needs. We found that the diffraction efficiency of the positive first-order term was at least 0.3% and practically did not change with the angle up to 90°.

d The coherence length for the recording of the interference pattern was long enough (a 633 nm He-Ne laser was used and the path difference was carefully minimized), and for the steady-state measurements proper vibration isolation was needed, which can be omitted in fast dynamic measurements.

We used semi-thick bacteriorhodopsin samples for holographic recording, therefore the angular selectivity of such samples had to be measured. The angular selectivity limits the size of the applicable templates. A maximal angular difference of 90° between the reference and the read-out beams was measured, and the holographic reconstruction proved to be satisfactory. In our experiments we investigated the effects of the classical JTC architecture's illumination imbalance on the recording material. Highly unequal illumination causes bleaching of the BR film in some places while other areas are still underexposed. This inhomogeneous response is an obstacle to appropriate holographic recording. To avoid these hindrances a dual axis fractional joint Fourier correlator architecture was implemented.

8.2. Visual CNN-UM chip as correlation detector and postprocessor

A new generation CNN-UM chip has been designed and fabricated (ACE16K). It has a cell-wise sensor array for parallel optical input and it suits the requirements of the POAC. The hardware and software environment of this chip is being developed. The speed and sensitivity of this chip are appropriate for high-speed applications. Although the resolution of this sensor array is moderate, optical high-pass filtering of the correlation results makes it satisfactory as a detector and post-processor.

8.3. Acousto-optic deflector for implementing templates

For the angular coding of the template pixels we used a 2D acousto-optic deflector (AOD). It is a pair of TeO2 (tellurium dioxide) deflectors with 50-90 MHz (70-90 MHz) acoustic bandwidth. This provides 44 mrad (22 mrad) deflection bandwidth. Within this deflection range we used only 32x32 pixel templates, but bigger templates, up to 60x60, can be implemented. The physical speed of the deflector is limited by the speed of the acoustic waves within the material. The switching time for the applied one-millimeter diameter beams is about 1 µs in this material. The diffraction efficiency of the applied 2D AO deflector reached 40%. The viewing angles of the input and template pixel pitches have to agree in order to achieve correlation. The diameters of the reference and reconstructing beams had to be adjusted to cover the whole hologram surface. In a Fourier setup the input pixel pitch and the focal length of the Fourier lens determine this viewing angle. There is an inverse relationship between beam diameter and angular magnification, so the speed and angular bandwidth of the AOD are limited due to the required beam expansion. A high-speed AOD radio-frequency driver is required to reach the deflector's physical speed limit. Special attention was paid to forming the template pixel beam's wavefront to make it similar to that of the reference beam.
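A hedged estimate of the number of resolvable AOD spots per axis follows from the usual time-bandwidth product N ≈ τ·Δf, with τ = D/v; the acoustic velocity below is an assumed textbook value for the slow shear mode of TeO2, not a figure given in the text:

    # Time-bandwidth estimate of resolvable AOD spots per axis: N ~ tau * delta_f.
    beam_diameter_m = 1e-3      # 1 mm beam, as quoted above
    v_acoustic_m_s = 620.0      # assumed slow-shear acoustic velocity in TeO2
    delta_f_hz = 40e6           # 50-90 MHz acoustic bandwidth -> 40 MHz

    tau_s = beam_diameter_m / v_acoustic_m_s   # acoustic transit (switching) time
    n_spots = tau_s * delta_f_hz               # resolvable deflection positions per axis
    print(f"tau ~ {tau_s * 1e6:.1f} us, ~{n_spots:.0f} resolvable spots per axis")
    # ~65 spots per axis, of the same order as the 'up to 60x60' template size above.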

8.4. Liquid crystal display (LCD)

In our current setup a reflective LCD (Microdisplay MD800G6 SVGA) was used as an input device (ESLM). Its frame rate is 60 Hz, but as it codes colors in a sequential manner, even a 180 Hz frame rate is attainable. The resolution of this device is 600x800, and it provides relatively high contrast (200:1) and reflectivity (the overall throughput is claimed to reach ~40%). High-resolution LCDs (1024x1268 or even 3500x4500) are also commercially available. These are frequently used in common overhead projectors. Although their price is still relatively high, they can be used as the input element of a POAC system if a special

application requires high resolution. A suitable OASLM can be used as an alternative input device, which can pick up live images. This solution combines the functions of a high-resolution camera and a display, that is, it works as an incoherent-to-coherent converter. Although OASLM devices are expensive, their resolution can reach 4500x4500.

8.5. Laser

In the current setup continuous-wave He-Ne lasers were used. Their power was only 5-10 mW. A laser with a relatively long coherence length is necessary for hologram recording, but a considerably shorter coherence length is sufficient for the read-out. Diode lasers (638 nm, 658 nm) have also been tested in the optical correlator. Their poor wavefront quality has to be improved by pinhole filtering. To ensure higher speed, both for the hologram recording and the read-out steps, higher power lasers are required. Actively Q-switched, diode-pumped YAG lasers can provide an ideal solution, as they have considerably higher power (50-150 mW) than the gas lasers used so far. In addition, their absorption in BR is notably (5-8 times) higher. Although these types of lasers are relatively expensive, they seem to be indispensable for high-speed POAC applications. Due to the wavelength difference between the writing and reading lasers, holographic magnification occurs, which has to be balanced out by changing the template pixel pitch.

8.6. Optical constraints

Input images were recorded in two kinds of hologram setups: first by using a Fourier transform lens and second without a lens. The former can also be used for nonlinear processing (high-pass spatial filtering) utilizing the saturation effects of BR in the low spatial frequency domain. The lens-less version is simpler, but the input image size is more confined. The viewing angles of the input SLM's pixels and those of the template SLM's pixels have to agree in order to get correlation. By using a 2D AOD only a virtual template was realized, but it was much easier to arrange the system than with a real SLM. The size of the reference and reconstructing beams had to be adjusted to cover the whole hologram surface, which is determined by the size of the input image in the lens-less case. In the case of the Fourier setup, however, the input pixel pitch p and the focal length f of the Fourier transforming lens determine the size D of the area covered by the first-order diffraction pattern90 (D = 2λf/p). As can be seen in the next figure, there is an inverse relationship between the spot size and the angular magnification. So the speed of the AOD and its angular bandwidth are limited due to the required beam expansion.

[Fig. 9 diagram relations: lens separation e1,2 = f1 + f2; telescope magnification βtele = f2/f1; beam size wAOD'/wAOD = βtele; deflection angles φ1,2'/φ1,2 = 1/βtele.]

Fig. 9. t2 template generation by a 2D AOD. A telescopic (afocal) system enlarges the laser spot to match the hologram aperture while decreasing the deflection angle by βtele.

9. Results

9.1. Measurement results

Using the breadboard model, the correlation of 32x32 binary templates with 500x500 and 600x800 input images was measured. In the next figure some experimentally measured correlations are presented. Adequate thresholding provides near-perfect solutions (see Fig. 10).

Fig. 10. The computed (C) and experimentally measured (D) correlation of the input image (500x500) (A) and the template (32x32) (B) can be compared. The cross-correlation terms show a high degree of similarity. When applying an appropriate threshold (E), the correlation peaks become separable. The system performance can be estimated from the signal-to-noise ratio.

9.2. POAC Applications

The POAC has an extremely wide field of applications. It is superior to any other analog type of computer in cases where large images or data matrices have to be processed at high speed. The most promising application areas seem to be image understanding and processing tasks, where the use of a fast optical computer is reasonable. Optical computer implementations are superior to their digital signal processing counterparts especially in those tasks where great numbers of correlations have to be computed with large templates on high-resolution images91. VanderLugt type, optical filtering based correlators are fast, due to their inherent massive parallelism. However, because of the time-consuming process of matched filter design, they are not applicable when the task requires fast (re-)programming. As there is no need for any matched filter calculation in the advanced t2-JTC architecture, data gained from the previous image frames become applicable, so a series of feed-forward operations can build up complex algorithms. Since some pre- and post-processing is usually required, the parallel CNN-UM VLSI chip, which is an integral part of the POAC system, can fulfill these tasks efficiently. Optical realization of these pre- and post-processing steps is also possible with a programmable reconfiguration of the POAC architecture. Several applications have been suggested, like stereo image processing, motion estimation for image compression or diverse correlation based identification algorithms, where not only fast programmable optical correlation is necessary, but local parallel pre- and post-processing (CNN) is also required. A special correlation based image recognition and adaptive tracking algorithm is detailed in the next section.

9.3. Adaptive target recognition and tracking

Correlation based recognition and tracking of targets is frequently required, because alternative methods (based on special target features like heat emission, color, etc.) are relatively easy to deceive. However, the distance, coverage and view of the targets alter in a hectic way during the tracking procedure. Therefore, to accomplish the target-tracking task an enormous library of templates is required and all of them have to be correlated with each image frame. This feature makes the earlier implementations too slow and considerably constrains their practicality. Using the POAC framework an adaptive target-tracking algorithm is introduced. In this algorithm it is assumed that the target image changes continuously between consecutive image frames. An appropriately windowed section of the previous frame's target image is used in the algorithm as a template (reference object) for the next frame. The output of this algorithm can be used to control the tracking servo system. This algorithm can be implemented digitally. However, the computation of the correlation, even if fast Fourier transform based algorithms are applied, requires considerable computing power. Since the optical computation of the correlation is a highly parallel, two-step, fast procedure, it is much faster and simpler than its digital counterpart. Our adaptive algorithm uses the POAC system to determine the correlation of a template with the input image. The template is a high-pass filtered and windowed portion of the target that has been found in the previous frame. Global or local maxima of this correlation define the possible target positions in this frame. From the location of the

correlation peak a new templatee is extracted, which the algorithm will use to correlate with the following frame. In this way the target is adaptively tracked. The following figure demonstrates the operation of this algorithm with a numerical simulation.
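A minimal sketch of this adaptive tracking loop is given below; the correlation is computed here with FFTs, whereas in the POAC it is performed optically, and the high-pass filtering and windowing of the template are omitted (function names and sizes are illustrative):

    import numpy as np

    def xcorr2(image, template_patch):
        # Circular cross-correlation of a full-size image with a small template
        # patch, zero-padded to the image size (Fourier-calculus form of Eq. (1)).
        t = np.zeros_like(image)
        t[:template_patch.shape[0], :template_patch.shape[1]] = template_patch
        return np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(t))).real

    def track(frames, first_target_pos, k=32):
        # frames: list of 2-D arrays; first_target_pos: (row, col) of the target
        # in frames[0].  Boundary handling is ignored in this sketch.
        positions = []
        r, c = first_target_pos
        template = frames[0][r:r + k, c:c + k].copy()             # windowed target patch
        for frame in frames[1:]:
            corr = xcorr2(frame, template)
            r, c = np.unravel_index(np.argmax(corr), corr.shape)  # correlation peak
            positions.append((r, c))
            template = frame[r:r + k, c:c + k].copy()             # adaptive template update
        return positions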

Fig. 11. Simulation of the adaptive target detection and tracking algorithm. The upper frames depict the input image frames and the applied templates (in white boxes). The lower frames show the corresponding correlograms (white arrows mark the found correlation peaks). The adaptive algorithm updates the templates according to the determined correlation peak position (dotted arrows and box mark the extracted template). Although the templates change considerably during the tracking procedure (e.g. between the 100th, 125th and 150th frames), there is only a small modification between consecutive frames (e.g. the 125th and 126th frames).

e The extraction of the new template is extremely fast because its size is relatively small (32x32). Cutting out and downloading the required part of the input image can be achieved within 50 µs. All the required further processing of the templates (windowing) can be accomplished well within the time window of the template operations.

If several templates are applied and they overlap on the same target, then their relative positions provide data for algorithms that aim to determine the target's self-motion (rotation, time to collision, etc.). The outcomes of the adaptive target-detection and tracking algorithm can provide information for a more refined tracking algorithm, which considers the motion history of the target and approximates its future position (e.g. Kalman or other nonlinear filtering). These algorithms can also be efficiently implemented within the CNN-UM algorithmic frame. A simplified version of the adaptive target detection algorithm has been demonstrated on the experimental POAC system. In spite of the current setup's limitations, it has been demonstrated that a POAC system can perform the algorithm's basic operations. The input is a test image sequence of a flying and turning airplane. For the sake of demonstration in this printed document, the time sequence of the airplane images is displayed simultaneously on the input SLM of the POAC. The correlation peaks in the next figure show that the translating and rotating aircraft was successfully tracked along its path.

Fig. 12. Our measured and filtered results prove that adaptive target detection and tracking tasks can be accomplished robustly within the POAC framework. (A) depicts the time sequence of the input image (500x500), (B) the applied templates (32x32), and (C, D, E) the measured and processed (filtered and thresholded) correlograms. Short arrows indicate the detected sequence of correlation peaks.

The current POAC implementation is capable of finding and tracking up to sixteen targets (with 32x32 templates) in every frame (800x600) of a 60-frames/sec video sequence. Using more advanced building elements (and/or smaller templates), even this performance can be improved considerably.

10. Scopes of further developments

The POAC system's performance is limited primarily by the speed of the template displaying devices and that of the applied sensors. To overcome these constraints, several alternative displaying techniques were considered. Although acousto-optic deflectors provide very high speed for small templates, due to their basically serial deflection scheme they become insufficient as the template resolution increases. Vertical cavity surface emitting laser (VCSEL) arrays can be even faster and can display the templates in a parallel way. Unfortunately, these devices are not yet commercially available with the required wavelength range and resolution. There are volume holographic experimental setups where such VCSEL arrays provide the paging information75. Only small modifications seem to be necessary to apply such a VCSEL device for correlation purposes. Although the semi-incoherent implementation of the optical correlator reduces the coherent noise, because of the limited resolution of the high-speed sensor and processing units an alternative, coherent correlator architecture is preferred. In the case of coherent displaying techniques, optical high-pass filtering reduces the demands on the sensor resolution. Micro-electro-mechanical (MEMS) displays and high-speed micro LCD devices can provide the required coherent, parallel template display, but their speed is insufficient only in the case of large templates. High-speed OASLM devices are also applicable as input and template displaying devices. Although there have been announcements about high-speed, high-resolution multiple quantum well OASLM devices, these are not commercially available yet. A compromise has to be found between the speed and resolution of the available display devices. High-resolution LCDs (up to 3500x4500, and OASLMs with 4500x4500 pixel resolution) are commercially available at present. Although their prices are quite high, the POAC system's overall performance can be improved considerably by applying them. Utilizing the above-mentioned elements, even 10-80 TeraOps (Flops) (frame size in pixels / elementary reconstruction time) is attainable with an enhanced POAC system.

11. Conclusion

It has been shown that POAC is a fruitful and promising concept and that the introduced optical correlator based architecture is original and, from several points of view, favorable compared with other technologies. Due to its massive parallelism, POAC provides a new algorithmic framework and ensures extremely high-speed operation for image processing, target detection and tracking tasks. Appropriate key elements and devices are required to reach the final product stage. The appropriate architecture and working principles have been worked out; however, further improvements of the key elements and building blocks are necessary to exploit all the inherent possibilities of POAC. These include: high-power, actively Q-switched, frequency-doubled YAG lasers; improved LCD or OASLM devices for the input and template displays; a high-speed CMOS detector matrix and a CNN-UM sensor and processor array for detection and adaptive thresholding of the output; and improved parameters of the dynamic holographic material (e.g. bacteriorhodopsin). The final size of the system can be decreased to 30 by 40 by 20 cm. This way POAC will become a commercially attractive device. The performance of its kernel processor, the new t2-JTC type unconventional optical correlator, is superior to that of any known optical correlator. The computational performance of POAC can be summarized as follows. With the best currently available commercial devices, hologram writing proceeds at Mi × FRi = 10⁷ pixel × 10³ frame/sec = 10¹⁰ Byte/sec, while hologram reconstruction (elementary correlation) runs at FRr = 10⁶ frames/sec. A trade-off has to be made between the writing frequency FRi and the number of read-outs performed per hologram, so that their product, the overall reconstruction rate FRr = 10⁶ correlations/sec, stays constant. Hence, 10⁷ × 10⁶ = 10¹³ flops can be performed. In these calculations we have not considered the computational complexity, as it depends on the input image and template sizes, but for moderate template sizes it does not modify the results considerably. The optical part of the present POAC architecture implements feedforward-only operations, but it is possible to feed the nonlinearly processed results back to the optical input and process them with a new B template.
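The arithmetic above can be restated compactly; the values below are exactly those quoted in the text, and the short script merely reproduces the two products.

# Performance figures quoted above, restated.
M_i  = 1e7    # hologram (input image) size Mi in pixels
FR_i = 1e3    # hologram writing rate FRi, frames per second
FR_r = 1e6    # hologram read-out (elementary correlation) rate FRr, frames per second

write_bandwidth = M_i * FR_i   # 1e10 pixels (~Byte) per second written into the BR film
throughput      = M_i * FR_r   # 1e13 operations per second (one op per pixel per read-out)
print(f"writing: {write_bandwidth:.0e} Byte/s, throughput: {throughput:.0e} flops")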

Acknowledgement

This research was supported by the Computer and Automation Research Institute of the Hungarian Academy of Sciences (SZTAKI) and by ONR Grant No. N00014-0010429. The help of Levente Török in the engineering work is highly appreciated. The BR samples were prepared at the Biophysical Research Institute in Szeged.

References

1. L.O. Chua, and L. Yang, “Cellular Neural Networks: Theory” IEEE Trans. on Circuits and Systems, 35, (1988) pp. 1257-1272.

2. L.O. Chua, and T. Roska, “The CNN Paradigm” IEEE Trans. on Circuits and Systems, 40, (1993) pp.147-156.

3. L.O. Chua, CNN: A Paradigm for Complexity, World Scientific, Singapore, 1998.

4. L.O. Chua, and L. Yang, “Cellular Neural Networks: Applications” IEEE Trans. on Circuits and Systems, 35, (1988) pp. 1273-1290.

5. L.O. Chua and B.E. Shi, "Multiple layer cellular neural networks: A tutorial" Algorithms and Parallel VLSI Architectures, vol. A: Tutorials, E.F. Deprettere and A. van der Veen (eds), Elsevier, New York, 1991, pp. 137-168.

6. S. Espejo, R. Carmona, R. Domínguez-Castro and A. Rodríguez-Vázquez, “CNN Universal Chip in CMOS Technology” International Journal of Circuit Theory and Applications, Special Issue on CNN II: Part I, 24, (1996) pp. 93-111.

7. Á. Rodríguez-Vázquez, E. Roca, M. Delgado-Restituto, S. Espejo and R. Domínguez-Castro, “MOST-based Design and Scaling of Synaptic Interconnections in VLSI Analog Array Processing Chips” Journal of VLSI Signal Processing Systems, 23(2/3), (1999) pp. 239-266.

8. A. Rodriguez-Vázquez, S. Espejo, R. Dominguez-Castro, J.L. Huertas and E. Sánchez-Sinencio, “Current-Mode Techniques for Implementation of Continuous- and Discrete-Time Cellular Neural Networks” IEEE Trans. on Circuits and Systems, 40(3), (1993) pp. 132-146.

9. R. Carmona, S. Espejo, R. Dominguez-Castro, A. Rodríguez-Vázquez, T. Roska, T. Kozek, and L.O. Chua, “A 0.5 µm CMOS CNN Analog Random Access Memory Chip for Massive Image Processing” in Proc. of 5th International Workshop on Cellular Neural Networks and Their Applications CNNA’98 London, 1998, pp. 271-281.

10. J.M. Cruz, L.O. Chua, and T. Roska, “A Fast, Complex and Efficient Test Implementation of the CNN Universal Machine” in Proc. of the third IEEE Int. Workshop on Cellular Neural Networks and their Application (CNNA-94), Rome, 1994, pp. 61-66.

11. J.M. Cruz and L.O. Chua, “A 16x16 Cellular Neural Network Universal Chip: the First Complete Single-chip Dynamic Computer Array with Distributed Memory and with Gray-scale Input-output” Analog Integrated Circuits and Signal Processing, 15(3), (1998) pp. 227-238.

12. R. Dominguez-Castro, S. Espejo, A. Rodriguez-Vazquez, R. Carmona, P. Földesy, Á. Zarándy, P. Szolgay, T. Szirányi, and T. Roska, “A 0.8 µm CMOS 2-D Programmable Mixed-Signal Focal-Plane Array Processor with On-Chip Binary Imaging and Instructions Storage” IEEE J. Solid State Circuits, July 1997, pp. 1013-1026.

13. L.O. Chua and T. Roska, "The Cellular Neural Network (CNN) and the CNN Universal Machine: Concept, Architecture and Operation Modes" Towards the Visual Microprocessor - VLSI Design and Use of Cellular Network Universal Machines, T. Roska and A. Rodriguez-Vázquez (eds), J. Wiley, Chichester, 2000, pp. 1-28.

14. T. Roska, and L.O. Chua, “The CNN Universal Machine: An Analogic Array Computer” IEEE Transactions on Circuits and Systems-II, 40, (1993) pp. 163-173.

15. R. Domínguez-Castro, S. Espejo, Á. Rodríguez-Vázquez and R. Carmona, "A 20x22 CNN-UM Chip with On-chip Optical Sensors" Towards the Visual Microprocessor - VLSI Design and Use of Cellular Network Universal Machines, ed. Chichester, T.Roska and A.Rodriguez-Vázquez, J.Wiley, 2000, pp. 213-237.

16. A. Paasio, A. Kananen K. Halonen and V. Porra, “A 48 by 48 CNN Chip Operating with B/W Images” in Proc. 5th Int. Conf. on Electronics, Circuits and System (ICECS’98), Lisbon, 1998, pp.191-194.

17. A. Paasio, A. Kananen, K. Halonen and V. Porra, “A QCIF Resolution Binary I/O CNN-UM Chip” Journal of VLSI Signal Processing Systems, 23(2/3), (1999) pp. 281-290.

18. B.E. Shi, "A One-Dimensional CMOS Focal Plane Array for Gabor-Type Image Filtering", IEEE Trans. on Circuits and Systems I: Special Issue on Bio-Inspired Processors and Cellular Neural Networks for Vision, (CAS-I Special Issue), 46(2), (1999) pp. 323-326.

19. T. Roska, A. Rodríguez-Vázquez, "Review of CMOS Implementations of the CNN Universal Machine - Type Visual Microprocessors" Proceedings of IEEE International Symposium on Circuits and Systems, (ISCAS'2000), Geneva, 2000, 2, pp. 120-123.

20. Á. Zarándy, P. Keresztes, T. Roska, and P. Szolgay, “CASTLE: an Emulated Digital CNN Architecture; Design Issues, New Results” in Proc. of 5th Int. Conference on Electronics, Circuits and Systems (ICECS’98), Lisbon, 1998, pp.199-202.

21. P. Thiran, K.R. Crounse, L.O. Chua, and M. Hasler, "Pattern Formation Properties of Autonomous Cellular Neural Networks", IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications, (CAS-I), 42(10), (1995) pp. 757-774.

22. Cs. Rekeczky, T. Serrano-Gotarredona, T. Roska, A. Rodríguez-Vázquez, "A Stored Program 2nd Order/3-Layer Complex Cell CNN-UM", Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'2000), Catania, 2000, pp. 213-218.

23. I. Szatmári, Cs. Rekeczky and T. Roska, “A Nonlinear Wave Metric and its CNN Implementation for Object Classification” Journal of VLSI Signal Processing Systems, 23(2/3), (1999) pp. 437-448.

24. Cs. Rekeczky, and L.O. Chua, “Computing with Front Propagation: Active Contour and Skeleton Models in Continuous-time CNN” Journal of VLSI Signal Processing Systems, 23(2/3), (1999) pp. 373-402.

25. G. Liñán, R. Domínguez-Castro, S. Espejo, A. Rodríguez-Vázquez, "ACE16k: An Advanced Focal-Plane Analog Programmable Array Processor" European Conference on Circuit Theory and Design (ECCTD'2001), Helsinki University of Technology, 1, (2001) pp. 345-348.

26. S. Espejo, R. Domínguez–Castro, G. Liñán, A. Rodríguez-Vázquez, “A 64×64 CNN Universal Chip with Analog and Digital I/O” in Proc. of 5th Int. Conference on Electronics, Circuits and Systems (ICECS’98), Lisbon, September 1998, pp. 203-206.

27. T. Roska, “Computer-Sensors: Spatio-temporal Computers for Analog Array Signals, Dynamically Integrated with Sensors” Journal of VLSI Signal Processing Systems, 23(2/3), (1999) pp. 221-238.

28. M. Brendel, T. Roska, "Adaptive Image Sensing and enhancement using Adaptive Cellular Neural Network Universal Machine", Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'2000), Catania, 2000, pp. 93-98.

29. Á. Zarándy, L. Orzó, E. Grawes, and F. Werblin, "CNN Based Models for Color Vision and Visual Illusions" IEEE Trans. on Circuits and Systems I: Special Issue on Bio-Inspired Processors and Cellular Neural Networks for Vision, 46(2), (1999) pp. 229-238.

30. F. Werblin, T. Roska, and L.O. Chua “The Analogic CNN Universal Machine as a Bionic Eye” International Journal of Circuit Theory and Applications, 23, (1995) pp. 541-569.

31. S.I. Andersson, “Optical CNN for Ultra-fast Processing”, CECIL Technical Report, Göteborg, 1999.

32. R. Buczynski, H. Thienpont, S. Jankowski, T. Szoplik and I. Veretennicoff, "Programmable CNN based on Optical Thyristors for Early Image Processing", Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'98), London, 1998, pp. 259-26.

33. S. Jankowski, R. Buczynski, A. Wielgus, W. Pleskacz, T. Szoplik, I. Veretennicoff and H. Thienpont, “Digital CNN with Optical and Electronic Processing” in Proc. of 14th European Conference on Circuit Theory and Design, (ECCTD'99), Stresa, Italy, 2, (1999) pp. 1183-1186.

34. N. Frühauf, E. Lüder, and G. Bader, “Fourier Optical Realization of Cellular Neural Networks” IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, (CAS-II), 40(3), (1993) pp. 156-162.

35. N. Frühauf and E. Lüder, "Realization of CNNs by Optical Parallel Processing with Spatial Light Valves" Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'90), Budapest, 1990, pp. 281-290.

36. N. Frühauf, E. Lüder, M. Gaiada, and G. Bader, "An optical implementation of space invariant Cellular Neural Networks” Proceedings of 10th European Conference on Circuit Theory and Design, (ECCTD'91), Copenhagen, 1991, pp. 42-51.

37. E. Lüder, N. Frühauf, “Optical Signal Processing for CNNs” in Proc. of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'92), Munich, 1992, pp. 45-54.

38. K. Slot, T. Roska, and L.O. Chua, “Optically Realized Feedforward Only Cellular Neural Networks” International Journal of Electronics and Communications, 46(3), (1992) pp. 158-167.

39. K. Slot, "Optically Realized Feedback Cellular Neural Networks" Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'92), Munich, 1992, pp.175-180.

40. E. Gómez-Ramírez, V. Ramos-Viterbo, M. Alencastre-Miranda, A. Flores-Méndez, A. Sánchez-de-Tagle, "Implementation of a Cellular Neural Network Over an Optical Computer", Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'96), Sevilla, 1996, pp. 445-450.

41. I.A. Stig "Recent Progress on Logic and Algorithms for Optical Neural Networks" Proceedings of IEEE Int. Workshop on Cellular Neural Networks and Their Applications, (CNNA'98), London, 1998, pp. 50-51.

42. A. VanderLugt, Optical Signal Processing, John Wiley & Sons, Inc., 1992.

43. F.T.S. Yu and S. Jutamulia, Optical Signal Processing and Neural Networks, John Wiley and Sons, Inc. 1992.

44. A. VanderLugt, “Signal detection by complex spatial filtering.” IEEE Trans. Inf. Theory, 10, (1964) pp. 139-145.

45. D.A. Jared, K.M. Johnson, and G. Moddel “Joint transform correlator using optically addressed smectic liquid-crystal spatial light modulators” In OSA Technical Digest Series (Optical Society of America, Washington DC, 1989), paper THI5.

46. B. Javidi, J.L. Horner, “Single SLM joint transform optical correlator” Applied Optics, 28(5), (1989) pp. 1027-1032.

47. B. Javidi, J.L. Horner, Real-time optical information processing, Academic Press, Inc. London. 1994.

48. Sz. Tőkés, L.R. Orzó, Cs. Rekeczky, T. Roska, and Á. Zarándy, "An optical CNN implementation with Stored programmability." Proceedings of IEEE International Symposium on Circuits and Systems, (ISCAS'2000), Geneva, 2000, 2, pp. 136-139.

49. Sz. Tőkés, L.R. Orzó, Cs. Rekeczky, Á. Zarándy and T. Roska, "Dennis Gabor as initiator of Optical Computing: Importance and prospects of optical computing and an optical implementation of the CNN-UM computer" Proceedings of the Dennis Gabor Memorial Conference and Symposium on Holography and High Resolution Measurement Techniques, (Holography'2000), Budapest, 2000.

50. Sz. Tőkés, L.R. Orzó, L. Török, A. Ayoub and T. Roska, "An advanced joint Fourier transform correlator (JTC)", Proceedings of DO'01 Topical Meeting on DIFFRACTIVE OPTICS, Budapest, Hungary, 2001.

51. D. Armitage, J.I. Thackara and W.D. Eades, “Photoaddressed liquid-crystal spatial light modulators.” Applied Optics, 28, (1988) pp. 4763-4771.

52. C. Cornwell and R. Albert, “Liquid-crystal devices promise high performance. WDM SOLUTIONS” Supplement to Laser Focus World February 2000, pp. 35-39.

53. N.K. Gupta, H. Ogawa, “Optoelectronic architecture for cellular neural networks” Optics Communications, 138, (1997) pp. 11-15.

54. G. Sincerbox, Selected Papers in Holographic Storage, SPIE Press, Bellingham, WA, 1994.

55. S. Zhang and M.A. Karim, “Cellular neural network implementation using a phase-only joint transform correlator”, Optics Communications, 162(1-3), (1999) pp. 31-36.

56. M.A. Karim and A.S. Awwal, Optical Computing: An Introduction, John Wiley and Sons, Inc. 1992.

57. A.D. McAulay, Optical computer architectures, John Wiley. 1991.

58. P. Andres, V. Climent, J. Lancis, A. Lohmann, G. Minguez-Vega and E. Tajahuerce, “All-incoherent dispersion-compensated optical correlator”, Opt. Lett. (accepted)

59. P. Andres, J. Lancis and W.D. Furlan, “White-light Fourier transformer with low chromatic aberration”, Applied Optics, 31, (1992) pp. 4682-4687.

60. http://www.bnonlinear.com

61. http://www.ino.qc.ca/en/syst_et_compo/oc.asp

62. M.S. Alam, “Selected Papers on Optical Pattern Recognition Using Joint Transform Correlation”, SPIE Milestone Series vol. MS 157, SPIE-The International Society for Optical Engineering, Bellingham, Washington USA, 1999.

63. F.T.S. Yu, S. Jutamulia, Optical Pattern Recognition, Cambridge University Press. 1998.

64. J.C. Kirsch and D.A. Gregory, “Video rate optical correlation using a magneto-optic SLM.” Opt. Eng., 29, (1990) pp. 1122-1128.

65. N.N. Vsevolodov, T.V. Dyukova, “Retinal-protein complexes as optoelectronic components” Trends in Biotechnology, 12, (1994) pp. 81-88.

66. G.W. Stroke, An Introduction to Coherent Optics and Holography, Academic Press. 1966.

67. T.C. Lee, J. Rebholz and P. Tamura, “Dual-axis joint-Fourier transform correlator” Optics Letters, 4, (1979) pp. 121-123.

68. F.T.S. Yu, Q.W. Song, Y.S. Cheng and D.A. Gregory, “Comparison of detection efficiencies for VanderLugt and joint transform correlators” Applied Optics, 29, (1990) pp. 225-232.

69. R.H. Berg, S. Hvilsted and P.S. Ramanujam, Nature, 383, (1996) pp. 505.

70. N.C.R. Holme, P.S. Ramanujam and S. Hvilsted, Applied Optics, 35, (1996) pp. 4622.

71. S. Hvilsted, F. Andruzzi, C. Kulinna, H.W. Siesler, and P.S. Ramanujam, Macromolecules, 28, (1995), pp. 2172.

72. Sz. Tőkés, L.R. Orzó, G. Váró, A. Dér, P. Ormos and T. Roska, "Programmable Analogic Cellular Optical Computer using Bacteriorhodopsine as Analog Rewritable Image Memory", NATO Book: "Bioelectronic Applications of Photochromic Pigments" (Eds.: L. Keszthelyi, J. Stuart and A. Dér) IOS Press, Amsterdam, Netherlands, 2000, pp. 54-73.

73. J. Ashley, M.P. Bernal, G.W. Burr, H. Coufal, H. Guenther, J.A. Hoffnagle, C.M. Jefferson, B. Marcus, R.M. Macfarlane, R.M. Shelby and G.T. Sincerbox, “Holographic data storage” IBM Journal of Research and Development, 44(3), (2000) pp. 341-368.

74. N. Vsevolodov, Biomolecular Electronics: An introduction via Photosensitive Proteins, Birkhauser Boston, 1998.

75. J. Mumbru, G. Panotopoulos, D. Psaltis, X. An, F. Mok, S. Ay, S. Barna, E. Fossum. “Optically Programmable Gate Array” Proc. SPIE of Optics in Computing 2000, 4089, (2000) pp. 763-771.

76. D. Oesterhelt, C. Brauchle, N. Hampp, “Bacteriorhodopsin: a biological material for information processing” Quarterly Review of Biophysics, 24(4), (1991) pp. 425-478.

77. A. Dér, L. Oroszi, Á. Kulcsár, L. Zimányi, R. Tóth-Boconádi, L. Keszthelyi, W. Stoeckenius and P. Ormos, “Interpretation of the spatial charge displacements in bacteriorhodopsin in terms of structural changes during the photocycle” in Proc. Natl. Acad. Sci. USA, 96, (1999) pp. 2776-2781.

78. J.K. Lányi, Gy. Váró, “The Photocycles of bacteriorhodopsin” Israel Journal of Chemistry, 35, (1995) pp. 365-385.

79. K. Ludmann, Cs. Gergely and Gy. Váró, “Kinetic and Thermodynamic Study of the Bacteriorhodopsin Photocycle over a Wide pH Range” Biophys. J., 75, (1998) pp. 3110-3119.

80. K. Ludmann, Cs. Gergely, A. Dér, and Gy. Váró “Electric Signals during the Bacteriorhodopsin Photocycle, Determined over a Wide pH Range” Biophys. J., 75, (1998) pp. 3120-3126.

81. G.I. Groma, J. Hebling, C. Ludwig, J. Kuhl, “Charge Displacement in Bacteriorhodopsin During the Forward and Reverse bR-K Phototransition” Biophysical Journal, 69, (1995) pp. 2060-2065.

82. R.R. Birge, M.B. Gillespie, E.W. Izaguirre, A. Kuznetzow, A.F. Lawrence, D. Singh, Q. Wang Song, E. Schmidt, J.A. Stuart, S. Seetharaman, K.J. Wise, “Biomolecular Electronics: Protein-Based Associative Processors and Volumetric Memories” J. Phys. Chem. B, 103, (1999) pp. 10746-10766.

83. N. Hampp, A. Popp, C. Brauchle and D. Oesterhelt, “Diffraction efficiency of Bacteriorhodopsin films for holography containing bacteriorhodopsin wild type BRwt and its variants BrD85E and BRD96N” J. Phys. Chem., 96, (1992) pp. 4679-4685.

84. J.D. Downie and D.T. Smithey, “Measurement of holographic properties of bacteriorhodopsin films” Applied Optics, 35, (1996) pp. 5780-5789.

85. K. Ludmann, C. Ganea, Gy. Váró, “Back photoreaction from intermediate M of bacteriorhodopsin photocycle” J. Photochem. Photobiol. B, 49, (1999) pp. 23-28.

86. A. Zeisel, N. Hampp, “Spectral Relation of Light-induced Refractive Index and Absorption Changes in Bacteriorhodopsin Films Containing Wildtype BRWT and the Variant BRD96N” J. Phys. Chem., 96, (1992) pp. 7788-7792.

87. Yu-He. Zhang, Q.W. Song, C. Tseronis, “Real-time holographic imaging with a bacteriorhodopsin film” Optics Letters, 20(23), (1995)

88. A. Seitz, N. Hampp, “Kinetic Optimisation of Bacteriorhodopsin Films for Holographic Interferometry” J. Phys. Chem. B, 104, (2000) pp. 7183-7192.

89. Z. Batori-Tartsi, K. Ludmann, Gy. Váró, “The effect of chemical additives on bacteriorhodopsin photocycle” J. Photochem. Photobiol. B, 49, (1999) pp. 192-197.

90. Sz. Tőkés, L.R. Orzó, T. Roska, "Design Aspects of an Optical Correlator Based CNN Implementation", Proc. of IEEE International Symposium on Circuits and Systems, Helsinki. Finland, 2001, 1, pp 353-356.

91. Sz. Tőkés, L.R. Orzó, T. Roska, "Opto-electronic CNN implementation provides a new and powerful tool for image processing applications", KEPAF-2, Domaszék, Hungary, 2002, pp. 30-43.