Robot navigation inspired by principles of insect vision


Robotics and Autonomous Systems 26 (1999) 203-216

Robot navigation inspired by principles of insect vision

M.V. Srinivasan a,*, J.S. Chahl a, K. Weber b, S. Venkatesh b, M.G. Nagle a, S.W. Zhang a

a Centre for Visual Science, Research School of Biological Sciences, Australian National University, PO Box 475, Canberra, ACT 2601, Australia

b Department of Computer Science, School of Computing, Curtin University, GPO Box U 1987, Perth, WA 6001, Australia

Received 26 June 1998; accepted 30 August 1998

Abstract

Recent studies of insect visual behaviour and navigation reveal a number of elegant strategies that can be profitably applied to the design of autonomous robots. The peering behaviour of grasshoppers, for example, has inspired the design of new rangefinding systems. The centring response of bees flying through a tunnel has led to simple methods for navigating through corridors. Experimental investigation of the bee's "odometer" has led to the implementation of schemes for visually driven odometry. These and other visually mediated insect behaviours are described along with a number of applications to robot navigation. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Insect vision; Peering behaviour; Centring response; Range from motion; Egomotion; Obstacle avoidance; Odometry; Navigation

1. Introduction

Anyone who has tried to swat a fly, or watched it execute a flawless landing on the rim of a teacup, would have noticed that insects possess a visual system that is fast, precise and reliable. This is so despite the fact that the insect brain weighs less than a tenth of a milligram and possesses four orders of magnitude fewer neurons than does the human brain. Do simple nervous systems such as these use computational "short cuts" to achieve their goals? Observation of the visual behaviour of insects suggests that this may indeed be the case.

Unlike vertebrates, insects have immobile eyes with fixed-focus optics. Thus, they cannot infer the distance

* Corresponding author. Fax: 61-2-6249-3808; e-mail: m.srinivasan@anu.edu.au

of an object from the extent to which the directions of gaze must converge to view the object, or by monitoring the refractive power that is required to bring the image of the object into focus on the retina. Furthermore, compared with human eyes, the eyes of insects are positioned much closer together, and possess inferior spatial acuity. Therefore the precision with which insects could estimate the range of an object through binocular stereopsis would be much poorer and restricted to relatively small distances, even if they possessed the requisite neural apparatus [1]. Not surprisingly, therefore, insects have evolved alternative strategies for dealing with the problem of three-dimensional vision. Many of these strategies rely on using image motion as a significant cue. Some of them are outlined below, together with applications to robot navigation [2]. References to more complete accounts are also provided.



Fig. 1. Experiments investigating how locusts measure target range by peering. (a) Stationary target; the actual and perceived target locations are indicated. (b) Target moves in anti-phase with the head (target appears closer). (c) Target moves in phase with the head (target appears more distant).

2. From peering locusts to rangefinding algorithms

A hundred years ago Exner [3], pondering the eyestalk movements of crabs, speculated that invertebrates might use image motion to estimate object range. However, the first clear evidence to support this conjecture did not arrive until the late 1950s, when Wallace [4] made the astute observation that a locust sways its head from side to side before jumping on to a nearby object (Fig. 1(a)). Wallace hypothesized that this "peering" motion, typically 5-10 mm in amplitude, was a means of measuring object range. To test this hypothesis, he presented a locust with two objects subtending the same visual angle. One object was relatively small in size and was placed close to the locust, whilst the other was larger and situated further away. He found that the locust, after peering, jumped almost invariably to the nearer object. In a further series of elegant experiments, recently confirmed more quantitatively by Sobel [5], a target was oscillated from side to side, in synchrony with the insect's peering movements. When the target was oscillated out of phase with the movement of the head, thereby increasing the speed and amplitude of the object's image on the retina, the locust consistently underestimated the range of the target (Fig. 1(b)); when the target was oscillated in phase with the head, it consistently overestimated the range (Fig. 1(c)). This showed that the reduced image motion of the target caused the insect to overestimate the range of the target, whilst increased motion had the opposite effect. These findings demonstrated clearly that the peering locust was estimating the range of the target in terms of the speed of the image on the retina.

This technique of range estimation has inspired the design of a peering camera that moves perpendicular to its optical axis and computes the ranges of objects within its field of view by measuring local image velocities [6]. One version of such a device generates a two-dimensional map of the range measured along any direction within its field of view. A generalised-gradient algorithm [7,8] is used to measure image velocity. A more advanced version of this device obtains the same information by simultaneously capturing two paraxial views of the scene by means of a mirror arrangement [9]. This method of virtual peering dispenses with the need to move the camera (which can lead to erroneous results if the motion is imprecise). A hand-held version of this device, using a front-end consisting of an array of photodiodes and analog processing hardware, has also been built [10]. This device has a narrow field of view and delivers range instantaneously in any direction in which it is pointed. These devices can potentially be used in automobiles to signal impending collisions, or as mobility aids for the visually impaired [9].
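The relation underlying such rangefinders can be made concrete with a small sketch (our own notation and function names, not taken from the devices above): for a camera translating at speed V perpendicular to its optical axis, a stationary point near the axis moves across the image at roughly ω = V/R, so R ≈ V/ω once the local image velocity has been measured, e.g. by a gradient scheme.

```python
import numpy as np

def range_map_from_peering(angular_velocities_rad_s, camera_speed_m_s):
    """Convert measured image angular velocities (rad/s) into ranges (m).

    For a camera translating at speed V perpendicular to its optical axis,
    a stationary point near the axis moves across the image at roughly
    omega = V / R, so R = V / omega.  Near-zero velocities are masked to
    avoid dividing by noise (such points are reported as infinitely far).
    """
    omega = np.asarray(angular_velocities_rad_s, dtype=float)
    ranges = np.full_like(omega, np.inf)
    valid = np.abs(omega) > 1e-6          # ignore negligible (far or noisy) flow
    ranges[valid] = camera_speed_m_s / np.abs(omega[valid])
    return ranges

# Example: a 0.05 m/s sideways "peer" and a small grid of measured flow values
flow = np.array([[0.25, 0.10], [0.05, 0.01]])   # rad/s, hypothetical values
print(range_map_from_peering(flow, 0.05))        # ranges in metres
```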

The images captured by a peering camera can be used not only to measure the range to a surface, but also to determine the surface orientation. If the surface is oriented perpendicular to the camera's optical axis, as shown in Fig. 2(a), the resulting image velocity will be uniform over the entire field of view, and directed leftward if the camera is moving to the right. If, on the other hand, the surface is sloped relative to the frontal plane, as shown in Fig. 2(b), there will be an additional flow field (pattern of image velocities) superimposed upon the translational field. For a rightward-moving camera this additional flow field will be a compression if the slope α is positive, and an expansion if the slope is negative. If the surface is tilted relative to the frontal plane, as shown in Fig. 2(c), there will be a shearing flow field superimposed upon the translational flow field. For a rightward-moving camera the shear will be counterclockwise if the tilt β is positive, and clockwise if the tilt is negative. Thus, if the extent of camera motion is known and the mean translation of the image and the amount of compression and shear can be measured, the range and orientation of the surface can be computed.
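As a minimal first-order sketch of why slope and tilt appear as compression and shear (our own notation and small-angle assumptions, not the paper's derivation): for a camera translating sideways at speed V and viewing a planar surface, the image angular velocity at image direction (x, y), measured in radians from the optical axis, is approximately

$$\omega(x, y) \approx \frac{V}{Z(x, y)}, \qquad Z(x, y) \approx Z_0\,(1 + x\tan\alpha + y\tan\beta),$$

so that, to first order,

$$\omega(x, y) \approx \frac{V}{Z_0}\,(1 - x\tan\alpha - y\tan\beta).$$

The constant term gives the mean range Z_0 (translation), the term varying with x gives the slope α (compression or expansion), and the term varying with y gives the tilt β (shear).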


Fig. 2. Illustration of the image deformations caused by peering camera motion when a surface is (a) fronto-parallel, (b) sloped and (c) tilted. In (b) and (c) the translational component has been removed for clarity.

To measure complex image motions of this kind, an image-interpolation algorithm has been developed in our laboratory [11,12]. This algorithm does away with the need to first measure local image velocities at a number of different points in the image. Instead it measures the global deformation of the image in a single step, as follows. An image f0 is grabbed. The camera is then moved by a known amount, under epipolar constraint, and a second image f is captured. We then generate, in software, from the original image f0, a set of reference images that contain known amounts of the deformations that are to be measured, i.e. translation, shear, and compression. In this particular case, we generate reference images f1 and f2 corresponding to simple translation relative to f0 (leftward and rightward), f3 and f4 corresponding to compression and expansion, and f5 and f6 corresponding to clockwise and counterclockwise shear. We assume that, as the camera moves from its initial position to its final position, the image deforms linearly in translation, compression and shear. (Violations of this assumption are avoided by appropriate low-pass filtering [12].) The image f grabbed at the final position can then be approximated by f̂, a linear combination of f0 and the various reference images, as follows:

$$\hat{f} = f_0 + 0.5\,\frac{\Delta x}{\Delta x_{\mathrm{ref}}}\,(f_2 - f_1) + 0.5\,\frac{\Delta c}{\Delta c_{\mathrm{ref}}}\,(f_4 - f_3) + 0.5\,\frac{\Delta s}{\Delta s_{\mathrm{ref}}}\,(f_6 - f_5), \qquad (1)$$

where Δx, Δc and Δs are estimates of the actual image translation, compression and shear, and Δx_ref, Δc_ref and Δs_ref are the magnitudes of the corresponding reference deformations.

From this we can compute the components of image motion between f0 and f by determining the values of Δx, Δc and Δs that minimize the mean-square error between f and f̂. Analytic minimization yields a solution that is obtained by inverting a 3 × 3 matrix [12]. This solution is obtained in a single, non-iterative step. The mean range, slope and tilt of the surface can then be readily computed from the image-deformation parameters Δx, Δc and Δs through simple geometry. Figs. 3(a) and (b) show results for a surface at a mean range of 40 cm that was sloped at -60° and tilted at 0°. The depth profile (Fig. 3(a)), measured along the horizontal axis, shows a steady decrease from left to right, as expected, with a value of approximately 40 cm along the optical axis. The computed slope is close to -60° and the computed tilt is close to 0°, as expected. Figs. 3(c) and (d) show results for a surface at a mean range of 40 cm that is tilted at -60° and sloped at 0°. The depth profile (Fig. 3(c)), measured along the vertical axis, shows a steady decrease from the bottom to the top, as expected, with an on-axis value of approximately 40 cm. The computed tilt is close to -60° and the computed slope is close to 0°, as expected. In essence, this technique permits the range and pose of a surface to be determined without using stereo algorithms, without feature matching, and without even computing local image velocities.
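A minimal sketch of the single-step least-squares fit implied by Eq. (1) is given below. The helper that generates the six reference images is assumed (hypothetical `make_refs`), and no claim is made that this matches the published implementation; it only illustrates that the three deformation parameters drop out of one 3 × 3 linear solve.

```python
import numpy as np

def estimate_deformation(f0, f, make_refs, dx_ref, dc_ref, ds_ref):
    """Single-step estimate of global image translation, compression and shear.

    f0, f     : low-pass filtered images grabbed before and after the move
    make_refs : assumed helper returning (f1, f2, f3, f4, f5, f6), i.e. f0
                shifted left/right by dx_ref, compressed/expanded by dc_ref
                and sheared clockwise/counterclockwise by ds_ref
    Returns (dx, dc, ds) minimising the mean-square error between f and
    f_hat = f0 + 0.5*(dx/dx_ref)*(f2 - f1) + 0.5*(dc/dc_ref)*(f4 - f3)
               + 0.5*(ds/ds_ref)*(f6 - f5)     (cf. Eq. (1)).
    """
    f1, f2, f3, f4, f5, f6 = make_refs(f0, dx_ref, dc_ref, ds_ref)
    # Basis images multiplying the three unknown parameters
    g = [0.5 * (f2 - f1) / dx_ref,
         0.5 * (f4 - f3) / dc_ref,
         0.5 * (f6 - f5) / ds_ref]
    d = f - f0
    A = np.array([[np.sum(gi * gj) for gj in g] for gi in g])  # 3 x 3 normal matrix
    b = np.array([np.sum(gi * d) for gi in g])
    return np.linalg.solve(A, b)                               # one non-iterative solve
```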


Fig. 3. Performance of the image-interpolation algorithm in computing range, slope and tilt of a surface at a mean distance of 40 cm from the camera. In (a) and (b) the surface is sloped at -60° and has zero tilt. In (c) and (d) the surface is tilted at -60° and has zero slope. Slope measurements are indicated by stars (*) and tilt measurements by circles (o).

Fig. 4. Illustration of an experiment which demonstrates that flying bees infer range from apparent image speed. The shaded areas represent the means and standard deviations of the positions of the flight trajectories, analysed from video recordings of several hundred flights.

3. The centring response in bees, and its application to robot navigation

When a bee flies through a hole in a window, it tends to fly through its centre. How does the insect, lacking stereo vision, gauge and balance the distances to the sides of the opening? One possibility is that it simply balances the speeds of image motion on the two eyes. To investigate this hypothesis, bees were trained to enter an apparatus which offered a reward of sugar solution at the end of a tunnel [13]. Each side wall carried a pattern consisting of a vertical black-and-white grating (Fig. 4). The grating on one wall could be moved horizontally at any desired speed, either towards the reward or away from it. After the bees had received several rewards with the gratings stationary, they were filmed from above as they flew along the tunnel.

When both gratings were stationary, the bees tended to fly along the midline of the tunnel, i.e. equidistant from the two walls (Fig. 4(a)). But when one of the gratings was moved at a constant speed in the direction of the bees' flight, thereby reducing the speed of retinal image motion on that eye relative to the other eye, the bees' trajectories shifted towards the wall with the moving grating (Fig. 4(b)). When the grating moved in a direction opposite to that of the bees' flight, thereby increasing the speed of retinal image motion on that eye relative to the other, the bees' trajectories shifted away from the wall with the moving grating (Fig. 4(c)). These findings demonstrate that when the walls were stationary, the bees maintained equidistance by balancing the apparent angular speeds of the two walls, or, equivalently, the speeds of the retinal images in the two eyes. A lower image speed on one eye was evidently taken to mean that the grating on that side was further away, and caused the bee to fly along a trajectory closer to it. A higher image speed, on the other hand, had the opposite effect. Variation of the periods of the gratings on the two walls did not change the results (Figs. 4(d)-(f)), indicating that the bees were measuring the speeds of the images of the walls independently of their spatial structure.

This biological finding offers a simple strategy for autonomous visual navigation of robots along corridors. By balancing the speeds of the images of the two side walls, one can ensure that the robot progresses along the middle of the corridor without bumping into the walls. Computationally, this method is far more amenable to real-time implementation than methods that use stereo vision to calculate the distances to the walls. Four different laboratories have now built robots that negotiate corridors successfully using the bee-derived principle of balancing lateral image motion [14-17].

The design and performance of one of these robots [14] is shown in Fig. 5. The robot is approximately the size of a small skateboard, with a single video camera mounted facing upwards (Fig. 5(a)). This camera captures views of the side walls (one of each wall) through a mirror assembly positioned above the lens. Video information from the camera is transmitted to a desktop computer, where the image velocities of the two walls, induced by the motion of the robot, are measured using a simplified version of the image-interpolation algorithm described above. The computer then issues appropriate steering commands to the robot to ensure that it stays close to the midline of the tunnel. The tunnel-following performance of the robot is illustrated by the examples shown in Figs. 5(b)-(e). In all cases the robot reliably follows the axis of the corridor. The presence of an obstacle next to one of the walls causes the robot to go through the middle of the gap remaining between the obstacle and the other wall. Additional control algorithms have been developed for controlling the speed of the robot. Speed control is achieved by holding constant the sum of the image speeds from the two walls, thereby ensuring that the robot automatically slows down to a safe speed when the corridor narrows. This strategy, again, is based on observation of bees flying through tunnels of varying widths [18]. The mirror assembly also provides two frontal, paraxial views of the corridor that are used to determine the distance to frontal obstacles by using a range-from-virtual-peering algorithm similar to that described above.

Fig. 5. Upper panel: frontal view of the corridor-following robot. Lower panels: performance in variously shaped corridors of width approximately 80-100 cm.

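A toy sketch of the two control laws just described, steering by balancing the lateral image speeds and regulating forward speed by holding their sum near a set point, might look as follows; the gains, set point and sign conventions are assumptions, not the published controller.

```python
def corridor_control(omega_left, omega_right, target_flow_sum=0.6,
                     k_steer=1.0, k_speed=0.5):
    """Toy control laws for a bee-inspired corridor follower.

    omega_left, omega_right: angular image speeds of the two walls (rad/s).
    The steering command turns the robot away from the side with the larger
    image speed (the nearer wall); the speed correction drives the *sum* of
    the two speeds towards a set point, so the robot slows down as the
    corridor narrows.  Gains and set point are illustrative values only.
    """
    steer_cmd = k_steer * (omega_left - omega_right)   # sign convention: + turns right
    speed_correction = k_speed * (target_flow_sum - (omega_left + omega_right))
    return steer_cmd, speed_correction

# e.g. left wall closer than right wall: steer right and slow down slightly
print(corridor_control(omega_left=0.5, omega_right=0.2))
```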

4. Navigation in an environment cluttered with obstacles

How do insects navigate in a cluttered environment, and can they distinguish between objects at various distances? This question was investigated by presenting bees with a collection of objects at different distances, and examining whether they could learn to choose an object at a specific distance [19,20]. Bees flying over an artificial "meadow" were trained to distinguish between artificial "flowers" by associating a food reward with a flower at a particular height. The sizes and shapes of all the flowers were varied randomly and frequently to ensure that the bees were trained to associate only the height of the flower (or, more accurately, its distance from the eye) and not its position, or angular subtense, with the reward. Using this approach, it was possible to train bees to choose the highest flower, the lowest flower, or even one at an intermediate height.


" . . . . . . . . . . . . . . . . . 0 . . . . . : "

a b

Fig. 6. Autonomous robot (a) that navigates in a cluttered environment (b).

A series of additional experiments [20] demonstrated that the bees were gauging the distances of the objects in terms of the apparent speeds of the objects' images on the eye. Thus, insects can indeed distinguish between objects at different distances, and they do so by using cues based on image motion.

These findings have inspired at least two laboratories to build insect-like robots that navigate through cluttered environments. The distinguishing feature of these robots is that they use image motion, rather than stereo vision, to compute object range. When the robot moves along a straight line, the image of an object at a given bearing moves faster when the object is nearer than when it is further away [21,22]. Thus, object range can be gauged in terms of the speed of image motion. One prototype, carrying a compound-eye-like sensor with a 360° azimuthal field of view, estimates the range to obstacles in various directions by moving in short, straight-line segments [23]. Another prototype, developed in our laboratory in collaboration with Fujitsu, uses a single camera and computes the three-dimensional layout of the world by moving in a zig-zag fashion, rather like a blowfly flying along a corridor [24]. An illustration of this robot, which is approximately 25 cm high, is shown in Fig. 6(a). To infer the three-dimensional layout of the environment the robot moves in a straight line for a specified distance, during which 16 frames are captured and stored in a frame buffer. The image motion in these frames is then computed using a generalised-gradient algorithm, details of which are given in [7,8]. With knowledge of the position and geometry of the camera, and the speed of the robot, the measured image velocities can be related to object ranges. Despite the camera's limited field of view, the robot obtains a near-panoramic representation of the surrounding environment by turning its camera from one side to the other and moving in a zig-zag manner.
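The underlying geometry is the standard range-from-translation relation: for a robot translating at speed V, a stationary object at range R and bearing θ from the direction of travel produces an image angular speed ω = V sin θ / R. Below is a hedged sketch of inverting this relation (our own names; the robots described above additionally combine many such estimates over the 16 stored frames).

```python
import math

def range_from_translation(omega_rad_s, bearing_rad, robot_speed_m_s):
    """Range of a stationary object from the image motion it produces.

    For pure translation at speed V, an object at range R and bearing theta
    (angle between the viewing direction and the direction of travel) moves
    across the image at omega = V * sin(theta) / R.  Inverting gives R.
    Objects close to the direction of travel (sin(theta) ~ 0) generate
    little flow, so their range is poorly constrained and reported as inf.
    """
    numerator = robot_speed_m_s * abs(math.sin(bearing_rad))
    if abs(omega_rad_s) < 1e-6 or numerator < 1e-9:
        return float("inf")
    return numerator / abs(omega_rad_s)

# e.g. an object 45 degrees off the travel direction, 0.1 rad/s flow at 0.2 m/s
print(range_from_translation(0.1, math.radians(45), 0.2))   # ~1.41 m
```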

The performance of the robot in moving towards a goal in a cluttered environment is illustrated in Fig. 6(b). The robot is indicated by the grey circle (with four radiating lines emanating from it) located at the centre of the grid. These radiating lines define the two sectors viewed by the camera on successive moves. The goal, depicted by the cogged circle near the robot, is specified either in terms of x, y co-ordinates or by the position of a light source. The rectangles and straight black lines represent obstacles and wall surfaces, respectively.


The open circles are the locations of obstacles determined in previous moves. Their size is proportional to the estimated distance of the obstacle, and their darkness indicates the degree of confidence of the estimate. The filled circles indicate the computed safe paths in the direction of each obstacle. The darker the filled circle, the more strongly the robot is attracted to that path. The actual direction in which the robot moves at any instant is then determined by the direction of the goal and the directions of the permissible movements. The path of the robot up to this point is indicated by the connected black circles. It is evident that the robot has navigated successfully to the goal, whilst avoiding the intervening obstacles.
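The paper does not spell out the rule that combines the goal direction with the permissible movements, so the following is purely an illustrative sketch of one plausible weighting, not the robot's actual decision logic.

```python
import math

def choose_heading(goal_bearing, safe_paths):
    """Pick a movement direction from candidate obstacle-free paths.

    safe_paths : list of (bearing_rad, attractiveness) pairs, where the
                 attractiveness plays the role of the darkness of the filled
                 circles in Fig. 6(b).  The score trades off attractiveness
                 against angular deviation from the goal; the weighting is
                 purely illustrative.
    """
    def score(path):
        bearing, attractiveness = path
        deviation = abs(math.atan2(math.sin(bearing - goal_bearing),
                                   math.cos(bearing - goal_bearing)))
        return attractiveness - 0.5 * deviation     # hypothetical trade-off
    best_bearing, _ = max(safe_paths, key=score)
    return best_bearing
```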

The range-computing strategy used by these insect-like robots is simpler than that used in traditional stereo vision. This is because algorithms for measurement of local optic flow [25] can readily be implemented in real or quasi-real time (e.g. [26]), whereas classical stereo algorithms that solve the "correspondence problem" require more computational effort.

5. Visual odometry: From bees to robots

Although it is well established that foraging honeybees can gauge the distances to food sources, the mechanisms by which they do so have remained enigmatic. To elucidate these mechanisms, we recently investigated whether bees can estimate and learn short distances flown under controlled laboratory conditions [18,27]. Bees were trained to enter a tunnel and collect a reward of sugar solution at a feeder placed in the tunnel at a fixed distance from the entrance. The walls and floor of the tunnel were lined with black-and-white stripes, usually perpendicular to the tunnel's axis (Fig. 7(a)). During training, the position and orientation of the tunnel were changed frequently to prevent the bees from using any external landmarks to gauge their position relative to the tunnel entrance. The bees were then tested by recording their searching behaviour in a fresh tunnel which carried no reward, and was devoid of any scent cues. The training and test tunnels were covered by a transparent sheet of perspex, and subdivided into 16 sections for the purposes of analysis. In the tests, the bees' behaviour whilst searching for the reward was recorded by noting the locations of their first, second, third and fourth U-turns. From these data it was possible to estimate the mean searching location, and the extent to which the search was distributed about this mean (Fig. 7(b)).

Bees trained in this way showed a clear ability to search for the reward at the correct distance, indicated by the thick curve in Fig. 7(b). How were the bees gauging the distance flown? A number of hypotheses were examined, as described below.

Were the bees counting landmarks en route to the goal? To examine this possibility, bees were trained in a tunnel lined with stripes of a particular spatial period and tested in a tunnel lined with stripes of a different period. The test bees searched at the correct distance from the tunnel entrance, regardless of stripe period (Fig. 7(c)). Therefore, distance is not gauged by counting the number of stripes or other features passed whilst flying through the tunnel.

Were the bees measuring distance flown in terms of the time required to reach the goal? To examine this possibility, bees were trained as above and tested in a tunnel that presented a headwind or a tailwind, generated by a fan at the far end of the tunnel. In a headwind, bees flew slower and took longer to reach the estimated location of the reward. The opposite was true in a tailwind [18]. Therefore, distance is not estimated in terms of time of flight, or other correlated parameters such as number of wingbeats. In a headwind, bees overshot the location of the reward; in a tailwind, they undershot it. Therefore, distance flown is not measured in terms of energy consumption.

Were the bees measuring distance flown by integrating the motion of the image of the surrounding panorama as they flew to the goal? To examine this possibility, bees were trained in a tunnel of a certain width (14 cm) and then tested in a tunnel that was narrower (width 7 cm) or wider (22 cm). In the narrower tunnel, the bees searched at a shorter distance from the entrance; in the wider tunnel, they searched farther into the tunnel (Fig. 7(d)). These results suggest that distance flown is gauged by integrating the speed of the images of the walls and floor on the eyes whilst flying through the tunnel.

To further examine the image motion hypothesis, bees were trained and tested in conditions where image motion was eliminated or reduced. In tunnels with axially oriented stripes on the walls and floor, the bees showed no ability to gauge distance travelled.


Fig. 7. (a) Experimental setup for investigating visual odometry in bees. (b) Performance of bees in gauging distance travelled in tunnels lined with cross or axial stripes, as shown by distributions of searching locations. (c) Comparison of search distributions when bees are trained to a given distance with stripes of a particular spatial period (4 cm) and then tested in tunnels which present stripes of double or half the training period. (d) Comparison of search distributions when bees are trained to a given distance in a tunnel of a particular width (14 cm) and then tested in narrower or wider tunnels.

In these tests, the bees searched uniformly over the entire length of the tunnel, showing no tendency to stop or turn at the former location of the reward (thin curve, Fig. 7(b)). These results indicate that image motion is critical to odometry in bees, and confirm the hypothesis that distance flown is measured by integrating image motion. Unlike an energy-based odometer, a visually driven odometer would not be affected by wind or by the load of nectar that the bee carries.

Inspired by the bee's visual odometer, we have designed two different robots that use image motion to determine how far they have travelled.

One such machine is an augmented version of the centring robot, described above and pictured in Fig. 5.

This robot navigates in the centre of a corridor by balancing the image velocities of the side walls as described above. In addition, it computes the distance travelled along the corridor by integrating, over time, the image motion (optic flow) experienced on the two sides [15]. This computation will not yield the travel distance in an absolute sense, because the image motion depends upon the environment through which the robot travels, and upon the lateral distances of objects in it. But it will yield a number that is indicative of distance travelled, provided the robot always traverses the same environment. We have investigated two ways of using image velocity to estimate travel distance.


Fig. 8. Performance of a robot using visual odometry to travel a fixed distance along a corridor.

One way is to compute distance travelled in terms of the time integral of the sum of the (angular) image velocities on the two sides. That is,

$$\mathrm{Distance}_1 = \int (\omega_L + \omega_R)\, dt. \qquad (2)$$

This measure of distance travelled tends to be somewhat sensitive to the robot's position along the width of the corridor, especially if the machine travels close to one of the walls. This is because the integrand, being proportional to the sum of the reciprocals of the distances to the two walls, tends to be fairly constant as long as the robot is close to the midline of the corridor, but becomes very large when the robot approaches one of the walls.

Another approach is to compute distance travelled by integrating, over time, a quantity that is inversely proportional to the sum of the reciprocals of the image velocities on the two sides. That is,

$$\mathrm{Distance}_2 = \int \frac{4\, dt}{1/\omega_L + 1/\omega_R}. \qquad (3)$$

The Distance2 measure has the advantage that it tends to be independent of the robot's position along the width of the corridor. This is because the denominator of the integrand is proportional to the sum of the distances to the two walls, which is equal to the width of the corridor and is therefore independent of the robot's lateral position. This measure is therefore likely to be more robust to sideways meanderings of the robot as it travels down the corridor.
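A minimal sketch of how the two measures of Eqs. (2) and (3) might be accumulated from the same wall-speed measurements (the variable names and the discrete-time update are our own assumptions, not the robot's actual code):

```python
def update_odometers(omega_left, omega_right, dt, state):
    """Accumulate the two visual-odometry measures of Eqs. (2) and (3).

    state holds the running integrals 'distance1' and 'distance2'.
    distance1 integrates the sum of the wall image speeds; distance2
    integrates 4 / (1/omega_L + 1/omega_R), which stays proportional to
    forward travel regardless of the robot's lateral position, as long as
    the corridor width is fixed.  Both are in arbitrary, environment-
    dependent units, as discussed in the text.
    """
    eps = 1e-9
    state["distance1"] += (omega_left + omega_right) * dt
    if omega_left > eps and omega_right > eps:
        state["distance2"] += 4.0 * dt / (1.0 / omega_left + 1.0 / omega_right)
    return state

odo = {"distance1": 0.0, "distance2": 0.0}
for omega_l, omega_r in [(0.3, 0.3), (0.5, 0.2), (0.25, 0.4)]:   # made-up samples
    odo = update_odometers(omega_l, omega_r, dt=0.1, state=odo)
print(odo)
```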

The performance of the robot in measuring distance travelled, using Distance2, is shown in Fig. 8. In each run, the machine was meant to stop after it had traversed a fixed distance. Figs. 8(a)-(d) show four examples in which the robot started its run in the centre of the corridor, and performed its standard centring behaviour and measured distance travelled whilst maintaining an average travel speed of approximately 12 cm/s. Figs. 8(e)-(h) show four examples in which the robot started off-centre. It can be seen that, even though these runs are more meandering (because of slight instabilities in the centring behaviour), the robot traverses roughly the same length of corridor as in Figs. 8(a)-(d). Finally, Figs. 8(i)-(l) show four examples in which the robot again started off-centre, but travelled at a lower speed of 7.5 cm/s. The length of corridor traversed is again approximately the same as before, despite the meanderings and the lower speed.

Performance statistics for the robot's odometry are given in Table 1. It is clear that both distance measures yield useful estimates of distance travelled. However, Distance2 provides estimates that are slightly more consistent.

Of course, both distance measures will fail if the robot is placed in a different environment. For example, if the width of the corridor were doubled, the robot would travel twice the distance before stopping.

To overcome this shortcoming (which, by the way, is also a feature of honeybee odometry), we have designed another robot. This robot does not compute image motion per se. Rather, it uses a version of the image-interpolation algorithm described above, but in the reverse sense. The algorithm makes use of the fact that if a sensor captures images of the environment from two different reference positions, a known distance apart, the image corresponding to an unknown intermediate position can be approximated by a weighted linear combination of the images obtained from the two reference positions.


Table 1
Performance statistics of a robot using the Distance1 (and, in parentheses, Distance2) measures for odometry

Run              Min.       Max.       Mean       Standard deviation   SD as % of mean
Figs. 8(a)-(d)   22.624     23.831     23.187     0.551                2.375
                 (21.404)   (21.806)   (21.588)   (0.155)              (0.717)
Figs. 8(e)-(h)   22.661     25.469     24.093     1.008                4.185
                 (20.962)   (21.230)   (21.137)   (0.103)              (0.489)
Figs. 8(i)-(l)   21.885     24.406     23.157     1.004                4.334
                 (19.154)   (20.632)   (20.046)   (0.580)              (2.893)

Fig. 9. (a) Photograph of the egomotion-computing robot in the laboratory in which all experiments were undertaken, showing the grid on the floor and some of the objects in the environment. (b) An image captured by one of the video cameras via the conical reflector, which is used to obtain a horizontal, panoramic strip view of the environment. The black-and-white annulus represents the part of the image that was decoded to form the narrow strip. Individual visual units along the strip are indicated by the alternating black-and-white segments. (c) Comparisons between the path determined by the integration of motion calculated by the interpolation algorithm (dotted line) and the actual path followed by the robot (solid line). Three different trajectories are shown.


The intermediate position of the sensor can then be determined from the coefficients of the interpolation. This technique is accurate provided that the reference images are captured at positions that are close enough to permit accurate linear interpolation of the intermediate image [28].

This concept is applied to compute egomotion as follows. Two panoramic views of the world are captured simultaneously, a fixed, known distance apart on the axis along which the robot translates (the fore-aft axis). Each view corresponds to a ring-like, horizontal slice of the environment, and is acquired by a separate video camera mounted under a cone-shaped mirror (see Fig. 9(a)). The separation of the two cameras is equal to the largest displacement that is expected. Let f0(θ) and f1(θ) denote the images captured by the rear and front cameras, respectively. Now assume that the robot translates forward by an unknown distance, expressed as a fraction of the separation between the two cameras. We capture a third image f(θ) at this location, using the rear camera. We wish to determine the translation of the robot by measuring how far the rear camera has moved. The rear camera is now located somewhere between the original locations of the front and rear cameras. We assume that the new image captured by the rear camera at this intermediate location can be approximated by f̂(θ), a weighted linear combination of the reference images f0(θ) and f1(θ), as follows:

$$\hat{f}(\theta) = (1 - \alpha)\, f_0(\theta) + \alpha\, f_1(\theta), \qquad (4)$$

where α (0 ≤ α ≤ 1) specifies the fractional displacement of the rear camera relative to the distance between the two cameras. This equation is valid provided the local spatial gradient of the intensity profile is approximately constant between the points f0(θ) and f1(θ). In other words, it assumes that the intensity profile is smooth enough that the image displacement at any location is not greater than half the period of the highest local spatial frequency. (In practice, this can be ensured by low-pass filtering the images.) From (4) we see that when α = 0, f̂(θ) = f0(θ), and when α = 1, f̂(θ) = f1(θ). Thus, the approximation f̂(θ) satisfies the boundary conditions.

To determine the value of α that gives the best approximation to f(θ), we minimise the mean-square error between f(θ) and f̂(θ). That is, we minimise


$$\int_0^{2\pi} \left[ (1 - \alpha)\, f_0(\theta) + \alpha\, f_1(\theta) - f(\theta) \right]^2 d\theta \qquad (5)$$

with respect to α. This minimisation yields

$$\alpha = \frac{\int_0^{2\pi} \left[ f(\theta) - f_0(\theta) \right] \left[ f_1(\theta) - f_0(\theta) \right] d\theta}{\int_0^{2\pi} \left[ f_1(\theta) - f_0(\theta) \right]^2 d\theta}. \qquad (6)$$

Since f0(θ), f(θ) and f1(θ) are known, α can be readily computed. Thus, the translatory displacement of the robot can be measured optically by capturing three images and processing them as specified above.
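Below is a minimal sketch of Eq. (6) applied to sampled panoramic strips, with the integrals replaced by sums over the ring (our own function name; the published system additionally low-pass filters the strips, as noted above).

```python
import numpy as np

def fractional_displacement(f0, f1, f):
    """Estimate alpha of Eq. (6) from three panoramic strips.

    f0, f1 : reference strips from the rear and front cameras (1-D arrays,
             sampled around the full 360-degree ring and low-pass filtered)
    f      : strip captured by the rear camera after the unknown move
    Returns alpha in [0, 1]; the metric displacement is alpha times the
    camera separation.
    """
    f0, f1, f = (np.asarray(x, dtype=float) for x in (f0, f1, f))
    diff = f1 - f0
    alpha = np.sum((f - f0) * diff) / np.sum(diff ** 2)
    return float(np.clip(alpha, 0.0, 1.0))

# Synthetic check: if f is exactly the 30% interpolation between f0 and f1,
# the estimate recovers alpha = 0.3.
theta = np.linspace(0, 2 * np.pi, 250, endpoint=False)
f0 = np.cos(theta); f1 = np.cos(theta - 0.2)
print(fractional_displacement(f0, f1, 0.7 * f0 + 0.3 * f1))   # ~0.3
```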

An analogous approach can be used to measure rotations of the robot about the optical axis of the rear camera. In this case, the reference images can be obtained by software-rotating the image captured by the rear camera, because rotation about the optical axis causes all points in the ring image to be displaced by the same amount, irrespective of the distances of the objects in the environment.

We have embodied this technique in a mobile robot, thereby enabling it to measure its egomotion in an unknown environment. A view of the robot and its environment is shown in Fig. 9(a). The robot is approximately 40 cm long and moves on three wheels, consisting of two drive wheels at the rear and a castor wheel at the front. The drive wheels are independently controllable, allowing the robot to either (i) translate along the long axis, (ii) turn about a point on the rear axle midway between the two rear wheels or (iii) move along a curve, i.e. translate and rotate simultaneously. Fig. 9(a) also shows the two camera/cone assemblies. The image captured by each camera is 344 × 288 pixels with 128 grey levels (Fig. 9(b)). The egomotion algorithm uses a ring-shaped strip of this image, of radius 140 pixels and thickness 5 pixels, centred on the axis of the cone, that is divided into an array of 250 units. These are shown as a ring of black and white circles superimposed on the grey-level image in Fig. 9(b). This ring corresponds to a horizontal slice of the environment at a height of 15 cm above the floor, covering an elevational angle of 1°.
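As a rough illustration of the strip extraction just described (nearest-pixel sampling, hypothetical function name and centre coordinates; the published system may sample the annulus differently):

```python
import numpy as np

def unwrap_ring(image, centre, radius=140, thickness=5, n_units=250):
    """Sample a panoramic ring from a camera image into a 1-D strip.

    A minimal sketch of the strip extraction described in the text: the
    annulus of the given radius and thickness, centred on the cone axis,
    is divided into n_units angular bins and each bin is averaged.
    Nearest-pixel sampling is used for brevity.
    """
    cy, cx = centre
    angles = np.linspace(0, 2 * np.pi, n_units, endpoint=False)
    radii = radius + np.arange(thickness) - thickness // 2
    strip = np.empty(n_units)
    for i, a in enumerate(angles):
        ys = np.round(cy + radii * np.sin(a)).astype(int)
        xs = np.round(cx + radii * np.cos(a)).astype(int)
        strip[i] = image[ys, xs].mean()     # average across the strip thickness
    return strip

# e.g. strip = unwrap_ring(grey_frame, centre=(144, 172))  # 288 x 344 image
```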

The performance of the egomotion algorithm in computing the motion of the robot is illustrated in Fig. 9(c). The robot moved along three different trajectories, each encompassing 30-40 individual steps.


The figure compares the path determined by integrating the motions computed for the individual steps by the egomotion algorithm, with the actual path followed by the robot, as established from the grid marked out on the floor. Clearly, the algorithm performs well.

It is important to note that the egomotion algorithm requires absolutely no knowledge of the three-dimensional structure of the environment. For example, there is no need to assume that the world is locally planar. Nor is there need for any object recognition, feature matching, or even image-velocity computation. The algorithm yields accurate results, provided the environment carries sufficient texture to provide the relevant visual information. The image-interpolation algorithm, as applied here, uses the world as its own model to infer the motion of the robot. Recently, we have extended this approach to the measurement of egomotion in six degrees of freedom [29].

6. Conclusions

Analysis of vision in simple natural systems, such as those found in insects, can often point to novel ways of tackling tenacious problems in autonomous navigation. This is probably because insects, with their "stripped down" nervous systems, have been forced to evolve ingenious strategies to cope with visual challenges within their environment. This article has outlined four ways in which insects use motion cues to perceive their environment in three dimensions, and navigate in it. Each of these strategies has inspired a novel solution to a problem in machine vision or robotics. Observation of peering locusts has led to the development of novel algorithms and devices for rangefinding; observation of bees landing on artificial flowers has led to the design of robots that negotiate cluttered environments; and observation of bees flying through tunnels has spawned the construction of robots that navigate along corridors and use vision to compute their egomotion. In constructing these robots, our aim is not to copy insect vision in the literal sense. For instance, our second, egomotion-computing robot, with its forward and rear cameras, could hardly be described as an imitation insect. At present we do not know enough about the actual processing mechanisms that underlie insect vision to produce a carbon (or silicon) copy, anyway. Our aim, instead, is to carry out behavioural experiments to reveal the cues that insects use to navigate, and to design machines that use such cues to advantage.

Acknowledgements

This research was partly supported by the Australian Defence Science and Technology Organisation and an International Human Frontiers Science Program Grant R6-84/97.

References

[1] M.V. Srinivasan, How insects infer range from visual motion, in: F.A. Miles, J. Wallman (Eds.), Visual Motion and its Role in the Stabilization of Gaze, Elsevier, Amsterdam, 1993, pp. 139-156.

[2] M.V. Srinivasan, S. Venkatesh, From Living Eyes to Seeing Machines, Oxford University Press, Oxford, 1997.

[3] S. Exner, The Physiology of the Compound Eyes of Insects and Crustaceans, translated by R.C. Hardie, Springer, Berlin, 1891, pp. 130-131.

[4] G.K. Wallace, Visual scanning in the desert locust Schistocerca gregaria, Journal of Experimental Biology 36 (1959) 512-525.

[5] E.C. Sobel, The locust's use of motion parallax to measure distance, Journal of Comparative Physiology A 167 (1990) 579-588.

[6] M.G. Nagle, M.V. Srinivasan, P. Sobey, Robust depth extraction for mobile robots, Proceedings of the SPIE (Boston) 2056 (1993) 207-218.

[7] M.V. Srinivasan, Generalized gradient schemes for the measurement of two-dimensional image motion, Biological Cybernetics 63 (1990) 421-431.

[8] P. Sobey, M.V. Srinivasan, Measurement of optical flow using a generalized gradient scheme, Journal of the Optical Society of America 8 (1991) 1488-1498.

[9] M.G. Nagle, M.V. Srinivasan, P.J. Sobey, High resolution CCDs and their use in mobility devices, in: Proceedings, International Mobility Conference, Melbourne, 1994, pp. 330-334.

[10] G. Stange, M. Srinivasan, J. Dalczynski, Rangefinder based on intensity gradient measurement, Applied Optics 30 (1991) 1695-1700.

[11] M.V. Srinivasan, An image-interpolation technique for the computation of optic flow and egomotion, Biological Cybernetics 71 (1994) 401-416.

[12] M.G. Nagle, M.V. Srinivasan, Structure from motion: Determining the range and orientation of a surface by image interpolation, Journal of the Optical Society of America 13 (1995) 25-34.

[13] M.V. Srinivasan, M. Lehrer, W. Kirchner, S.W. Zhang, Range perception through apparent image speed in freely flying honeybees, Visual Neuroscience 6 (1991) 519-535.

[14] D. Coombs, K. Roberts, Bee-bot: Using peripheral optical flow to avoid obstacles, Proceedings of the SPIE (Boston) 1825 (1992) 714-721.

[15] K. Weber, S. Venkatesh, M.V. Srinivasan, Insect inspired behaviours for the autonomous control of mobile robots, in: M.V. Srinivasan, S. Venkatesh (Eds.), From Living Eyes to Seeing Machines, Oxford University Press, Oxford, 1997, pp. 226-248.

[16] G. Sandini, F. Gandolfo, E. Grosso, M. Tistarelli, Vision during action, in: Y. Aloimonos (Ed.), Active Perception, Lawrence Erlbaum, Hillsdale, NJ, 1993, pp. 151-190.

[17] A.P. Duchon, W.H. Warren, Robot navigation from a Gibsonian viewpoint, in: Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, San Antonio, TX, 1994, pp. 2272-2277.

[18] M.V. Srinivasan, S.W. Zhang, M. Lehrer, T.S. Collett, Honeybee navigation en route to the goal: Visual flight control and odometry, Journal of Experimental Biology 199 (1996) 237-244.

[19] M. Lehrer, M.V. Srinivasan, S.W. Zhang, G.A. Horridge, Motion cues provide the bee's visual world with a third dimension, Nature (London) 332 (1988) 356-357.

[20] M.V. Srinivasan, M. Lehrer, S.W. Zhang, G.A. Horridge, How honeybees measure their distance from objects of unknown size, Journal of Comparative Physiology A 165 (1989) 605-613.

[21] K. Nakayama, J.M. Loomis, Optical velocity patterns, velocity-sensitive neurons, and space perception: A hypothesis, Perception 3 (1974) 63-80.

[22] G.A. Horridge, The evolution of visual processing and the construction of seeing systems, Proceedings of Royal Society of London B 230 (1987) 279-292.

[23] N. Franceschini, J.M. Pichon, C. Blanes, From insect vision to robot vision, Philosophical Transactions of the Royal Society of London B 337 (1992) 283-294.

[24] P. Sobey, Active navigation with a monocular robot, Biological Cybernetics 71 (1994) 433-440.

[25] J.L. Barron, D.J. Fleet, S.S. Beauchemin, Performance of optical flow techniques, International Journal of Computer Vision 12 (1994) 43-77.

[26] P. Sobey, S. Sasaki, M. Nagle, T. Toriu, M.V. Srinivasan, A hardware system for computing image velocity in real time, in: B.G. Batchelor, S.S. Solomon, F.M. Waltz (Eds.), Machine Vision Applications, Architectures and Systems Integration, SPIE 1823 (1992) 334-341.

[27] M.V. Srinivasan, S.W. Zhang, N. Bidwell, Visually mediated odometry in honeybees, Journal of Experimental Biology 200 (1997) 2513-2522.

[28] J.S. Chahl, M.V. Srinivasan, Visual computation of egomotion using an image interpolation technique, Biological Cybernetics 74 (1996) 405-411.

[29] M. Nagle, M.V. Srinivasan, D. Wilson, Image interpolation technique for measurement of egomotion in six degrees of freedom, Journal of the Optical Society of America A 14 (1997) 3233-3241.

Mandyam Srinivasan holds an undergraduate degree in Electrical Engineering from Bangalore University, a Master's degree in Electronics from the Indian Institute of Science, a Ph.D. in Engineering and Applied Science from Yale University, and a D.Sc. in Neuroethology from the Australian National University. He was a Research Fellow in Neurobiology and Applied Mathematics at the Australian National University, an Assistant Professor of Zoology at the University of Zurich, and is presently Professor of Visual Sciences at the Australian National University's Research School of Biological Sciences and a Fellow of the Australian Academy of Science. Dr. Srinivasan's research focuses on the principles of visual processing in simple natural systems, and on the application of these principles to machine vision and robotics.

Javaan Chahl received his Bachelor of Computer Engineering from the University of Newcastle, Australia, in 1991, his Graduate Diploma in Neuroscience from the Australian National University in 1992, and completed his Ph.D. in 1996 at the Australian National University. He is currently a Post-Doctoral Fellow in the Research School of Biological Sciences at the Australian National University. His research interests include mobile robotics, machine vision, and arthropod behavior.

Keven Weber holds a Bachelor of Science degree, with first class honours, in Computer Science, and a Ph.D. in Computer Science, from Curtin University of Technology, Perth, Western Australia. His Ph.D. research was in the area of insect-inspired approaches to robot navigation. Further research interests include: biologically inspired autonomous robotics, behaviour-based AI, collective intelligence and group behaviour.


Svetha Venkatesh is a Professor at the School of Computing of Curtin University of Technology, Perth, Western Australia. She is currently the Chair of the Computer Science Department. Her research is in the areas of active vision, biologically based vision systems, image understanding and applications of computer vision to image and video database indexing and retrieval. She is the author of about 100 research papers in these areas and is a senior member of the IEEE.

Martin G. Nagle B.Sc. (UCD), Ph.D. (ANU). Research areas include artificial vision, motion-based scene analysis, colour vision in primates, and the development of vision-based devices to aid mobility in the visually impaired.

Shao-Wu Zhang studied applied mathematics at the Chinese University of Science and Technology and graduated in 1964. He was promoted to Assistant Professor in 1979, then Associate Professor in 1986, at the Institute of Biophysics, Chinese Academy of Sciences. As an Alexander von Humboldt Fellow he visited the Institute of Zoology at the Technical University of Darmstadt (Technische Hochschule) and the Max-Planck-Institute of Biological Cybernetics at Tübingen in Germany from 1980 to 1983. Since 1991, he has been at the Centre for Visual Science at the Australian National University. Dr. Zhang's research interests cover visual information processing in biological systems and engineering systems, biophysical aspects and modeling of visual systems, insect vision and behavioural analysis. Recently he has been focusing on learning and memory in simple nervous systems.