MULTISENSOR DATA FUSION FOR
AUTONOMOUS VEHICLE NAVIGATION IN RISKY ENVIRONMENTS
G.L. Foresti, Senior Member, IEEE, and Carlo S. Regazzoni#, Senior Member, IEEE
Department of Mathematics and Computer Science (DIMI)
University of Udine, Via delle Scienze 208, 33100 Udine, Italy
#Department of Biophysical and Electronic Engineering (DIBE)
University of Genoa, Via all'Opera Pia 11A, I-16145 Genova, Italy
Abstract This paper describes a multisensor data-fusion system for driving an autonomous
earthwork vehicle operating in a sanitary landfill1. The system acquires data from a set of
ultrasonic sensors, a laser range finder and several CCD cameras, and produces as output alarms
that indicate potentially dangerous situations, e.g., the presence of fixed or mobile obstacles in the
vehicle working area. The proposed system adds important functionalities to the vehicle, such as the
ability to avoid terrain holes or down slopes and to discriminate between heaps of waste to be compacted and
other man-made obstacles. Data fusion allows one to increase the system’s reliability and to
compensate for the inaccuracies and limited operating ranges of individual sensors. Experimental
results show the system’s functioning both under normal operational conditions and in the
presence of dangerous situations. Moreover, the performance of the system under bad environmental
conditions (e.g., rain, low lighting) has been evaluated.
Index Terms Autonomous vehicle driving, multisensor data fusion, feature extraction, outdoor
environments, ultrasonic sensors, laser range finder.
I. INTRODUCTION
RISK detection is a basic task for several computer-based systems such as autonomous vehicles
[1-4] and surveillance systems [5,6]. The possibility for a robot to interact with and operate in
unstructured environments without complete control by a human operator strictly
depends on its capability to detect risky situations (e.g., fixed or mobile obstacles, holes or
terrain slopes, etc.). A commonly accepted observation is that such complex tasks cannot be
solved by using a single sensor, and that a synergetic use of multiple sensors is mandatory. The
decreasing costs and the increasing performances of the sensors available today have made the
aforesaid tasks feasible. Moreover, real-time processing and fusion of multisensor information
are becoming possible thanks to the considerable progress made in the areas of distributed
architectures and processing systems.
The major advantage that a robot or an autonomous system can obtain by integrating
information from multiple sensors is more accurate information, including features
that may be impossible to acquire by using individual sensors alone. The use of multiple sensors
1 This research was partially funded by the CEE-ESPRIT project 6068 ATHENA (Advanced Teleoperation for
eartHwork Equipment Navigation).
can also increase the system’s reliability in the case of the failure of an individual sensor.
Complementary information from a group of disparate sensors can be exploited to compensate
for the inaccuracies and limited operating ranges of individual sensors. As a result, the overall
information turns out to be accurate over a very wide range.
Several systems recently developed in the context of autonomous-vehicle driving are focused
on real-time risk detection and/or multisensor-data fusion. A visual-based system for detection
and tracking of multiple vehicles in road traffic scenes was developed by Malik et al. [3]. This
system addresses the problem of occlusions in tracking multiple 3D objects in a known
environment by employing a contour-tracker algorithm based on intensity and motion
boundaries. The positions and motions of contours, represented by closed cubic splines, are
estimated by means of linear Kalman Filters. A multisensor visual-based system, named BVV3,
for autonomous vehicle driving and risk detection was proposed by Graefe [4]. The system
architecture includes multiple parallel processors, a system bus for inter-processor
communications, and a separate video-bus for digital image data transmission. An obstacle
detection module continuously scans an image to search for a feature that may represent the first
sign of a possible obstacle (e.g., another vehicle). Large objects (trucks) can be detected at very
long distances, up to 700 m, and smaller ones (cars) at about 350 m. When an object candidate
has been detected, 2D generic models describing the visual characteristics of objects belonging to
predefined classes (trucks, cars, etc.) are employed to classify road obstacles. The main limitation
of these systems is that they are based only on visual sensors and, consequently, depend on
environmental conditions (e.g., fast changes in lighting, uneven lighting, bad visibility, etc.). Rao
et al. [5] presented a complex multisensor surveillance system using CCD cameras, ultrasonic
sensors and optical barriers to track (in real time) people and robots moving around a factory
room. This system allows a complete decentralization of the Kalman filter equations (employed
for object tracking), thus eliminating computational bottlenecks; moreover, it can degrade slowly
in the case of sensor and processor failures. However, it exhibits some limitations: (a) multiple-
object tracking is not completely reliable due to possible communication deadlocks or livelocks,
and (b) occlusions and shadows limit the tracking capabilities of the system. Corrall [6] proposed
a multisensor system for airport stand surveillance, ground-traffic control and road-traffic
accident detection. In particular, it involves real-time analysis and description of complex scenes
(containing also object occlusions) belonging to image sequences, but is restricted to cases where
scene structures, objects and their expected behaviors are known.
In recent years, a number of researchers have begun to develop methods for integrating and
fusing information from multiple sensors. In order to manage the large amount of data generated
by such sensors, various integration and fusion approaches have been proposed in the literature
(e.g., neural networks [7], Bayesian networks [8], Kalman filter architectures [8], production-rule
systems [9], knowledge-based systems [10], and others [11,12]). An accurate overview of such
methods can be found in [13,14].
An important additional advantage of data-fusion architectures can be obtained by considering
systems based on a distributed organization of computational resources: multiple processors
gather and process information provided by multiple information sources to analyze observed
situations. This approach results in the development of Distributed Sensor Networks (DSNs)
[15], which are systems made up of connected, co-operating information sources organized into
highly structured, message-exchanging networks. In a DSN, sensors may be either physical (i.e.,
they generate information by translating observed physical quantities into electrical signals) or
virtual (i.e., they derive new information from existing data). The main goals of a DSN are to
process data acquired by different sensors, to integrate them (thus reducing the degree of
uncertainty naturally associated with acquired information) and to produce an interpretation of
the environment observed. The main advantages of a DSN are the following:
1. Each local node of the network has a degree of intelligence and knows only a part of the
whole problem to be solved. Therefore, the information representation on each node is
structurally simpler than the corresponding information representation in a centralized system.
2. Imposing local constraints at the lower levels decreases the number of possible solutions
from the lower to the higher levels of the distributed system. At the same time, the network
performance may improve if an appropriate fusion mechanism is used.
3. The amount of information is reduced as only significant information is propagated
through the network.
In this paper, we describe how a DSN can be applied to integrate data coming from multiple
sensors for the purpose of automating the operations currently performed by human drivers of
earthwork vehicles working at waste-disposal sites (sanitary landfills). The main objectives are to
operate such vehicles without need for drivers and to provide the vehicles with autonomous risk-
detection capabilities. Three different kinds of sensors have been employed: ultrasonic sensors
(USs), a laser range finder (LRF) and multiple CCD cameras. USs and the LRF are placed on-
board the earthwork vehicle and aim to detect risky situations in its neighborhood area (a circular
region with a radius of about 12 meters). Cameras are placed on high poles around the sanitary landfill,
and are used to detect and track intruders in order to predict possible collisions with the
autonomous vehicle. Data coming from these sensors are integrated to obtain a complete view of
the whole environment. Each detected risky situation is also spatially located on a 2D top-view
map.
II. SYSTEM DESCRIPTION
A. Vehicle automation and control
An autonomous vehicle mainly consists of a bulldozer equipped with a front blade for the
spreading and leveling of waste, and with steel wheels for the breaking and compacting of waste
(Fig. 1a). An on-board computer interfaces the actuators with some sensors in order to close the
vehicle’s control loop. The basic sensor data are represented by the position of the blade, the
steering angle between the two bodies of the vehicle, the speed of the vehicle and the engine
r.p.m. These data can be obtained by two linear potentiometers installed on the two jacks used to
move the blade and to steer the vehicle, by an encoder placed on the power shaft and by an
encoder interfaced with the alternator.
Figure 1b shows the general view of the vehicle control system; it can be roughly divided into
two parts connected through a high-speed radio link: the on-board part (which is in charge of the
real-time control of the vehicle) and the off-board part (which assists the human operator in
supervising the vehicle).
The off-board part of the system is equipped with visual cameras placed on high poles along
the perimeter of the working area, and is mainly in charge of detecting, locating and tracking any
intruder moving in the scene. It is also composed of a man-machine interface that provides the
operator with adequate devices for the input of commands and for data visualization, and of a
communication server for managing the data link with the on-board system.
The on-board part is more complex, being the interface between the commands coming from
the man-machine interface and the vehicle’s sensors and actuators. It is equipped with ultrasonic
sensors and a laser range finder to detect risky situations in the neighborhood area of the vehicle
(e.g., fixed or mobile obstacles, terrain slopes or holes, etc.). It is composed of several modules:
the vehicle motion control, which decides on the movements of the vehicle, the position
determination system, which provides the vehicle position and speed by means of an advanced
laser-based sensor, and the path planner, which computes the optimum trajectory of the vehicle.
An analog remote radio console is also present in the system architecture as an independent
auxiliary control system able to take control of the vehicle, at any moment, in case of a fault in
the radio link or in some module of the digital control system. For safety reasons, the analog
console always provides a stop button that overrides any other command.
B. Multisensor data fusion system
The aim of this paper is to detail the multisensor data-fusion system that ensures the driving
capabilities of the vehicle moving in a complex and risky environment.
A distributed sensor network organized into a hierarchical tree structure has been developed
(Figure 2). The tree consists of a root node and as many leaf nodes (sensor level) as the number
of real sensors used by the system (e.g., several CCD cameras, a laser range finder and sixteen
ultrasonic sensors). The second level (feature extraction) is composed of as many nodes as the
number of features extracted by the sensors. The third level of the tree (low-level data fusion) is
composed of a set of virtual sensors that integrate the informational content of one or more
features to accomplish different tasks. As we would like the fusion process to exhibit an
“intelligent-like” behavior, we assume that it is able to detect the existence of an obstacle before
tracking it. To this end, the following tasks are performed at the third level: (a) detection of static
obstacles, (b) detection of terrain holes or down slopes, (c) recognition of specific fixed objects
(e.g., trash heaps), (d) detection and location of mobile objects (e.g., other vehicles, pedestrians,
etc.). Each virtual sensor provides information about the estimated number of detected objects
and their 3D positions computed in an appropriate reference system. In particular, a Cartesian
(Xs,Ys,Zs) reference system is used by both the visual and ultrasonic sensors, whereas a polar
(ρ, θ, φ) reference system is used by the laser sensor. The origin of each system is placed in
the related sensor. Different voting-based strategies are applied to fuse data coming from
multiple virtual sensors of the same type, e.g., data coming from different cameras are merged
into a general reference system (X,Y,Z) related to the working site. The root node (high-level data
fusion) uses the symmetry and perturbation (SP) model to integrate the evidence coming from all
virtual sensors. The objective is to obtain fused information about the number of objects and
their 3D poses in the (X,Y,Z) general reference system. Moreover, a Kalman-based approach is
adopted to track any object moving in the working area of the vehicle and to estimate its motion
parameters. To this end, information from the virtual sensors performing the intruder-location
and intruder-detection tasks is required. Finally, the fused information is used by the vehicle
motion control to decide on the best action to take, e.g., stopping the vehicle, moving the vehicle
in a certain direction, etc.
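As an illustration only, the following Python sketch shows one way such a hierarchical sensor tree could be represented in software; the class name, its fields and the fusion callables are hypothetical and are not taken from the original system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One node of the hierarchical distributed sensor network (illustrative).

    Leaf nodes wrap physical sensors; intermediate nodes are virtual sensors
    that fuse the outputs of their children; the root performs high-level fusion.
    """
    name: str
    fuse: Callable[[list], object]                 # combines children outputs
    children: List["Node"] = field(default_factory=list)

    def output(self):
        if not self.children:                      # leaf: read the physical sensor
            return self.fuse([])
        return self.fuse([c.output() for c in self.children])
```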
III. APPLIED SENSORS AND FEATURE EXTRACTION
The proposed system is made up of different kinds of sensors placed both on-board and off-board
the vehicle. Several ultrasonic sensors placed at appropriate boundary positions around the vehicle
(i.e., 5 wide beams and 5 narrow beams) (Figs. 3a, 3b and 3c) and a laser range finder are on-
board the vehicle (Fig. 3d). The range of the wide-beam sensors is 12 m, their angular spatial
resolution is about 20°, their resolution is 1 m and their accuracy is about 1 mm in the existing
configuration. The narrow-beam sensors have the same characteristics, except for their range,
which is 6 meters. They are placed on fixtures with 2 degrees of freedom, thus they can move
along the x and y axes. By changing their angles to the y and x axes, it is also possible to cover
the area around the vehicle and to prevent it from going down a steep slope. Each ultrasonic
sensor gives the distance d_i to an obstacle in its field of view, where 1 m ≤ d_i ≤ 12 m.
The laser range finder sensor is placed on a fixture in the vehicle’s front part. The laser beam
points upwards, is reflected by a mirror at an angle between 0° and 45° to the vertical axis, and
rotates in parallel with the vertical axis. Thus, the laser beam scans a spherical segment of 16 m in
diameter, limited by the horizontal plane and a cone having its apex in the center of the spherical
segment at an angle of 90°. A complete data-acquisition cycle requires 10 complete revolutions
and takes about 2 sec. The data returned by the laser scanner are expressed in polar coordinates
and have the format (ρ, θ, φ), where ρ is the range measured at the horizontal and vertical angles
θ and φ, respectively (Figure 3e). The laser resolution is about 32 mm and its accuracy is about
100 mm in ρ, 2° on the horizontal plane, and 0.5° on the vertical plane. Each complete revolution
returns about 1600 data triplets.
Several color cameras have been installed on high poles at the entrance of the sanitary landfill
and at each corner of the working area in such a way that their fields of view cover the whole
site. First, the absolute difference D(x,y) between the current image I(x,y) and the background
image BCK(x,y) is computed at every time instant k (e.g., every 500 ms) as:

D(x,y) = |I(x,y) − BCK(x,y)|,   (x,y) ∈ [1..N, 1..N]     (1)
where N×N is the image size. Then, a hysteresis function with two thresholds, THRin and
THRout, is applied to each point of the difference image D(x,y) to establish whether the related
point of the current image belongs to the background or to a moving object. A binary image
B(x,y) is obtained, where changing pixels are set to 1 and background pixels are set to 0. A
morphological filter is used to eliminate isolated pixels or small groups of pixels [23]; as a result, the
output image is only composed of significant groups of connected pixels, called blobs, bi. As the
scene background is time-varying, a background-updating procedure is applied to estimate
significant changes in the scene [24].
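A minimal Python sketch of such a change-detection step is given below, assuming NumPy/SciPy; the threshold values, the 3×3 structuring element and the minimum blob area are assumptions, not the values used in the paper.

```python
import numpy as np
from scipy import ndimage


def detect_blobs(current, background, thr_in=40, thr_out=20, min_area=50):
    """Hysteresis change detection followed by morphological cleaning.

    current, background: 2D uint8 gray-level images of the same size.
    thr_in, thr_out: hysteresis thresholds (assumed values, cf. THRin/THRout).
    Returns a label image where each surviving blob b_i has a distinct label.
    """
    diff = np.abs(current.astype(np.int16) - background.astype(np.int16))  # eq. (1)
    strong = diff >= thr_in                      # surely changed pixels
    weak = diff >= thr_out                       # possibly changed pixels
    # Hysteresis: keep weak regions only if they contain a strong pixel.
    labels, _ = ndimage.label(weak)
    keep = np.unique(labels[strong])
    changed = np.isin(labels, keep[keep > 0])
    # Morphological opening removes isolated pixels or small groups [23].
    cleaned = ndimage.binary_opening(changed, structure=np.ones((3, 3)))
    blobs, n = ndimage.label(cleaned)
    for lbl in range(1, n + 1):                  # drop very small blobs
        if np.sum(blobs == lbl) < min_area:
            blobs[blobs == lbl] = 0
    return blobs
```

The minimum bounding rectangle and the center of mass of each blob can then be obtained, for instance, with scipy.ndimage.find_objects and scipy.ndimage.center_of_mass applied to the label image.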
Figure 4a shows a real image (chosen as a running example) representing an earthwork
vehicle moving into an outdoor test site. Figures 4b, 4c and 4d show respectively the background
image and the blobs obtained for different values of the thresholds THRin and THRout.
For each detected blob bi, the following features are computed on the image plane: (a) the center
of mass, ri=(xi,yi), and (b) the dimensions, l and h, of the minimum bounding rectangle (MBR).
Figure 4e shows the result of the focus-of-attention procedure applied to the image of the
running example. Noise effects and an inadequate setting of the THR thresholds can produce
false MBRs on the image plane. In the current case, the trees present in the scene are moved by
the wind; therefore, a false MBR is detected in the upper part of the image in Fig. 4e.
IV. LOW-LEVEL DATA FUSION
The main objective of low-level data fusion is to integrate the informative contents of the
features extracted from one or multiple sensors of the same type in order to obtain a rough
estimate of the number of objects present in the observed scene and their 3D positions. A specific
data-fusion procedure is applied to each kind of sensor. In particular, a weighted function based
on confidence measures is used to integrate the information coming from the different cameras
whose fields of view are partially overlapped. A voting mechanism is also adopted to infer the
presence of obstacles from the 3D data generated by the laser sensor. A simple grouping rule is
used to integrate the information coming from the ultrasonic sensors.
A. Visual-data integration
Multiple cameras, installed on high poles at fixed points (e.g., corners) of the sanitary landfill,
continuously acquire images from the monitored environment. The goal of the virtual sensors
associated with the CCD cameras is to detect and locate any intruder object moving inside the
working area of the autonomous vehicle. The detected blobs in each image are considered as
virtual data to locate the object in the 3D environment. In particular, the position rb=(xb,yb) of
the center of mass of the blob, and the dimensions, l and h, of the related MBRs are used.
The transformation of an image-plane point, r_0 = (x_0, y_0), into a 3D reference-system point,
R_w = (X_w, Y_w, Z_w), is a well-known ill-posed problem. However, if the camera parameters (camera
position, optical-axis orientation and focal length f) and the calibration matrix M [16,17] of the
camera are known, the problem can be regularized [31]. A linear system r_i = M·R_w, relating the 2D
coordinates r_i = (x_i, y_i) on the image plane to the 3D point R_w = (X_w, Y_w, Z_w) in the camera
reference system (Figs. 5a and 5b), has to be solved. By considering the matrix
M = [m_1, m_2, m_3]^T, where m_i represents the i-th row, a system characterized by two equations
and three unknowns is obtained:

x_i = (m_1 · R_w) / (m_3 · R_w)
y_i = (m_2 · R_w) / (m_3 · R_w)     (2)
By assuming the ground-plane hypothesis [18], the system is reduced to 2 equations and 2
unknowns, i.e., X_w and Z_w:

x_i (M_8 X_w + M_{10} Z_w + M_{11}) − (M_0 X_w + M_2 Z_w + M_3) = 0
y_i (M_8 X_w + M_{10} Z_w + M_{11}) − (M_4 X_w + M_6 Z_w + M_7) = 0     (3)
The ground plane hypothesis is assumed in order to regularize the location problem: all
objects (e.g., cars, lorries, etc.) moving towards the sanitary landfill are considered on the ground
plane (i.e., Y=0).
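The following sketch solves the 2×2 linear system of eq. (3) for X_w and Z_w; it assumes a 3×4 calibration matrix whose entries are indexed row-major as M_0..M_11, consistent with the notation of eq. (3), but the function itself is only illustrative.

```python
import numpy as np


def image_to_ground(xi, yi, M):
    """Back-project an image point (xi, yi) onto the ground plane Y_w = 0.

    M: 3x4 calibration matrix; m[k] below denotes the k-th entry of M
    flattened row-major, matching the indexing of eq. (3).
    Returns (X_w, Z_w) in the camera/world reference system.
    """
    m = np.asarray(M, dtype=float).flatten()
    # Rearranged eq. (3): A [X_w, Z_w]^T = b
    A = np.array([[xi * m[8] - m[0], xi * m[10] - m[2]],
                  [yi * m[8] - m[4], yi * m[10] - m[6]]])
    b = np.array([m[3] - xi * m[11],
                  m[7] - yi * m[11]])
    Xw, Zw = np.linalg.solve(A, b)
    return Xw, Zw
```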
For each camera, the projection of the blob position onto the 3D camera reference system is
performed. A suitable and simple confidence measure c_i, depending on the quality of the acquired
image, on the accuracy ε_c of the calibration matrix (computed off-line during the camera
calibration procedure) and on the estimated distance ẑ of the object from the sensor, is
associated with each estimated object position:

c_i = α_1 f(TN, LM) + α_2 ε_c + α_3 (1/ẑ)     (4)

where the coefficients α_i are normalization constants such that Σ_i α_i = 1. The image quality
function is expressed as [31]:

f(TN, LM) = w_1 TN^2 + w_2 (1/LM^2)     (5)
where w1 and w2 are weights used to normalize the different parameters and to enforce the most
significant contributions. The parameter TN represents the tenengrad measure associated with the
focus of the camera lens, and it can be computed as:
TN = Σ_{x=1}^{N} Σ_{y=1}^{N} S(x,y)^2,   for S(x,y) > Th     (6)

where S(x,y) = sqrt[ (i_x * I(x,y))^2 + (i_y * I(x,y))^2 ], i_x and i_y are the Sobel operators [30] for the
computation of the gradient in the x and y directions, respectively, and * is the convolution
operator. The parameter LM represents the average brightness, which is defined as:

LM = Σ_{i=0}^{255} l_i O_i     (7)

where O_i is the number of occurrences of the gray level l_i in the image.
The estimated object position is considered valid only if the related confidence value is above
a fixed threshold Thc (e.g., Thc =0.7).
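A minimal sketch of how this confidence measure could be computed is given below; the weights, exponents and gradient threshold follow the reconstruction of eqs. (4)-(7) above and are assumed, illustrative values only.

```python
import numpy as np
from scipy import ndimage


def confidence(image, eps_c, z_hat, alphas=(0.4, 0.3, 0.3),
               w1=1e-6, w2=1e4, grad_thr=100.0):
    """Confidence c_i of eq. (4) for one camera observation (illustrative).

    image: 2D gray-level array; eps_c: calibration-matrix accuracy;
    z_hat: estimated object distance. alphas, w1, w2 and grad_thr are
    assumed tuning constants, not values from the paper.
    """
    sx = ndimage.sobel(image.astype(float), axis=1)    # i_x * I
    sy = ndimage.sobel(image.astype(float), axis=0)    # i_y * I
    s = np.sqrt(sx ** 2 + sy ** 2)
    tn = np.sum(s[s > grad_thr] ** 2)                  # tenengrad, eq. (6)
    lm = max(float(image.mean()), 1.0)                 # brightness, cf. eq. (7),
                                                       # here averaged per pixel
    f_img = w1 * tn ** 2 + w2 / lm ** 2                # image quality, eq. (5)
    a1, a2, a3 = alphas
    return a1 * f_img + a2 * eps_c + a3 / z_hat        # eq. (4)
```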
If a given blob b_i is detected by K cameras, a weighted function is used to obtain a more
accurate object position. Let x̂_j = (x̂_j, ŷ_j, ẑ_j) be the 3D blob position estimated by the j-th camera.
The final object position is estimated according to the following relation:

x̂_b = (1/K) Σ_{j=1}^{K} β_j x̂_j     (8)

where β_j = c_j / Σ_j c_j is a weight parameter taking into account the confidence values of the
estimated positions computed by each single sensor.
Figure 4f shows the final MBRs, on the image of the running example, that survived the object-
location phase. The ground-plane hypothesis, together with a threshold on the minimum MBR
size, allows the system to eliminate false MBRs.
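The sketch below illustrates this multi-camera combination as a confidence-weighted average, a slight simplification of eq. (8); the threshold value and the interface are assumptions.

```python
import numpy as np


def fuse_camera_estimates(positions, confidences, thr_c=0.7):
    """Combine per-camera 3D blob positions, cf. eq. (8).

    positions: list of (x, y, z) estimates, one per camera.
    confidences: matching confidence values c_j (eq. (4)).
    Estimates with confidence below thr_c are discarded, as in the text.
    """
    valid = [(p, c) for p, c in zip(positions, confidences) if c >= thr_c]
    if not valid:
        return None
    pts = np.array([p for p, _ in valid], dtype=float)
    c = np.array([c for _, c in valid], dtype=float)
    beta = c / c.sum()                         # weights beta_j
    return (beta[:, None] * pts).sum(axis=0)   # weighted position estimate
```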
B. Laser-data integration
The virtual sensor associated with the laser range finder is aimed at processing only
information about fixed obstacles. Its main objective is to discriminate between trash heaps to be
compacted and other fixed obstacles (e.g., stopped vehicles, sanitary-landfill fixtures, etc.). The
shapes of trash heaps are generally irregular and their surfaces show a high degree of roughness,
whereas man-made objects are generally characterized by regular shapes and smooth surfaces.
Elementary shapes that indicate the presence of man-made objects are line segments [19] and
plane regions [20]. A measure of the entropy of a given region D_i in the working area, based on
an analysis of the roughness of the region surface, is applied to infer the presence of trash heaps
in the scene. This entropy measure is calculated on the laser data as:

Entropy(D_i) = − Σ_{i=1}^{Nmax} h_i log h_i     (9)

where h_i = n_i / (l·l) is the distance n_i of the i-th point of the observed region from the laser divided
by the total number of points in the region (an l×l square mask), and Nmax is the number of laser
data available in the considered region. Areas characterized by entropy values above a fixed
threshold (computed experimentally after several tests) represent possible trash heaps.
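The following sketch computes this measure over an l×l mask of laser range readings; the normalization and the handling of zero values follow the reconstruction of eq. (9) above and are assumptions.

```python
import numpy as np


def region_entropy(ranges):
    """Entropy of eq. (9) for an l x l mask of laser range readings.

    ranges: 2D array of distances n_i measured inside the region D_i.
    Each reading is normalized by the number of points in the mask
    (h_i = n_i / (l*l)), following the reconstruction of eq. (9).
    """
    n = ranges.astype(float).ravel()
    h = n / n.size                       # h_i = n_i / (l*l)
    h = h[h > 0]                         # avoid log(0)
    return -np.sum(h * np.log(h))

# Regions whose entropy exceeds an experimentally chosen threshold are
# flagged as possible trash heaps; smooth, regular surfaces score differently.
```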
Moreover, vertical or quasi-vertical planes are searched for in the image in order to find
regular objects. A voting-based mechanism [21] is applied that is similar to the Hough transform
for detecting line segments in edge images. A plane can be described in terms of polar
coordinates by means of two angles (σ, τ) and by its distance z from the origin of the reference
system centered in the center of mass of the vehicle [22]. More precisely, σ is the slant angle,
i.e., the angle between the normal to the plane and the laser direction, τ is the tilt angle, i.e., the
angle formed by the projection of the normal to the plane onto the plane normal to the laser
direction, and z is the distance of the plane from the origin of the reference system. These
three parameters uniquely locate a plane in the 3D reference system of the laser sensor.
Geometric transformations can be easily performed to transform laser data (expressed in polar
coordinates (ρ, θ, φ)) into a 3D Cartesian system (x, y, z) with the origin in the vehicle’s center of
mass:

x = ρ sin φ cos θ
y = ρ sin φ sin θ
z = ρ cos φ     (10)
A segmentation of the laser data can be done by accumulating evidence in the parameter space
(σ, τ, z). 3D points in the Cartesian reference system (X,Y,Z) vote for a set of buckets in the
parameter space according to a voting equation [22], and at the end of the process, the buckets
receiving a large number of votes can be considered as candidate planes in the observed scene.
The process is implemented on DSP boards in order to obtain, in real time, a rough 3D
reconstruction of the environment. The detection of vertical or quasi-vertical planes is associated
with the possible presence of man-made obstacles.
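A minimal sketch of such a Hough-style plane-voting scheme is given below, using the polar-to-Cartesian conversion of eq. (10); the bin counts, angular ranges and the voting rule are assumptions rather than the exact scheme of [21,22].

```python
import numpy as np


def polar_to_cartesian(rho, theta, phi):
    """Eq. (10): laser polar coordinates to Cartesian (vehicle-centered)."""
    x = rho * np.sin(phi) * np.cos(theta)
    y = rho * np.sin(phi) * np.sin(theta)
    z = rho * np.cos(phi)
    return np.stack([x, y, z], axis=-1)


def vote_planes(points, n_sigma=18, n_tau=36, n_z=32, z_max=16.0):
    """Accumulate evidence for planes in the (sigma, tau, z) parameter space.

    For every quantized normal direction (sigma, tau), each 3D point votes for
    the plane distance bucket closest to its projection onto that normal,
    in a Hough-style scheme. Buckets with many votes indicate candidate planes.
    """
    acc = np.zeros((n_sigma, n_tau, n_z), dtype=int)
    sigmas = np.linspace(0, np.pi / 2, n_sigma, endpoint=False)
    taus = np.linspace(0, 2 * np.pi, n_tau, endpoint=False)
    for i, sg in enumerate(sigmas):
        for j, tu in enumerate(taus):
            normal = np.array([np.sin(sg) * np.cos(tu),
                               np.sin(sg) * np.sin(tu),
                               np.cos(sg)])
            d = points @ normal                                  # plane distances
            bins = np.clip((np.abs(d) / z_max * n_z).astype(int), 0, n_z - 1)
            np.add.at(acc[i, j], bins, 1)
    return acc
```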
In particular, information about static man-made obstacles to be avoided and about trash heaps to
be compacted is sent to the path-planner module to decide on the vehicle navigation strategies.
C. Ultrasonic-data integration
Ultrasonic sensors are evenly distributed along the ring and they are divided into several
groups, sgj, j=1,..,M, where each group is composed of three neighboring sensors [25]. With this
sensor arrangement, the distance dg_j, measured by the j-th sensor group from the vehicle’s center
of mass to the obstacle, is expressed as:

dg_j = d_offset + min{ d_i : i = 3j−2, 3j−1, 3j },   j = 1,..,M     (11)

where d_offset is the known distance between the vehicle’s center of mass and the sensors. The
estimated distances dg_j are the features extracted from the ultrasonic sensors.
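A small sketch of this grouping rule, as reconstructed in eq. (11), is shown below; the offset value and the example readings are hypothetical.

```python
def group_distances(readings, d_offset=1.5):
    """Eq. (11): per-group obstacle distance from the vehicle's center of mass.

    readings: flat list of distances d_i (1 m <= d_i <= 12 m), ordered so that
    each consecutive triple {3j-2, 3j-1, 3j} forms one sensor group sg_j.
    d_offset (here 1.5 m, an assumed value) is the known distance between the
    vehicle's center of mass and the sensor ring.
    """
    groups = [readings[i:i + 3] for i in range(0, len(readings), 3)]
    return [d_offset + min(g) for g in groups if g]

# Example (hypothetical readings for two groups of three sensors):
# group_distances([4.2, 3.8, 5.0, 11.0, 9.7, 10.2]) -> [5.3, 11.2]
```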
D. Low-level risk detection
The virtual sensors that make up the third level of the data-fusion system implement the low-level
risk-detection module for the vehicle. Figure 6 shows the general flowchart of the low-level risk-
detection module. The primitive object detector virtual sensor processes data coming from the
ultrasonic sensors. These data are acquired using two different orientations along the y and x axes
(see Figs. 3b and 3c). The first orientation (about 70° with respect to the ground plane) makes it
possible to detect the presence of obstacles in the neighborhood of the vehicle, and the second
orientation (about 30° with respect to the ground plane) prevents the vehicle from going down a
steep down slope.
If the primitive object detector finds a risky situation (e.g., the presence of a terrain slope), the
vehicle is stopped immediately. A stop message is sent to the vehicle actuators and to the man
machine interface to alert the operator. The positions of the detected slopes or holes are also sent
to the path-planner module to compute a new vehicle trajectory.
If an obstacle is detected, the obstacle motion determination function uses the vehicle velocity
and direction to discriminate between static (sanitary-landfill boundaries, trash heaps, etc.) and
mobile objects (other industrial vehicles, trucks, pedestrians, etc.). Moreover, the obstacle
position determination function computes the positions of fixed objects and also estimates the
position of mobile obstacles every T1 sec (e.g., T1=500 msec). This information is sent to the
high-level data-fusion module to improve the object-tracking procedure.
The environment perception virtual sensor processes data coming from the laser range finder
in order to detect the presence of planes or irregular surfaces. If vertical or quasi-vertical planes
are detected, the presence of a man-made obstacle is inferred. The distance of the plane from the
laser sensor (z coordinate) is used as an estimate of the obstacle position. If non-flat regions
characterized by large entropy values are detected, they are considered as trash heaps.
The main objective of the intruder detection and location virtual sensor is the control of the
vehicle safety in the whole working area. This goal is achieved through a hierarchy of safety
levels, such that a level can be achieved only if the lower levels have been completely tested (Fig.
7). At the first level, intruder-detection procedures are performed on the images acquired by the
camera(s) placed at the sanitary-landfill gates. If significant blobs are detected, an alarm signal is
sent to both the operator and the vehicle to inform that intruders are coming into the working
area. The second level deals with the location of an intruder on a 2D top-view map of the
working area. Images are acquired by multiple cameras placed at particular points of the sanitary
landfill (e.g., corners) in such a way that their fields of view cover the whole area to be
monitored. First, the best camera to see and track the intruder is selected according to the known
geometry of the working area, the positions of the cameras, and the estimated intruder position at
the next time instant (computed by the tracking module). However, if partial or total occlusions
are present in the field of view of the selected camera (e.g., the dimensions of the MBR of the
blob have significantly changed, as compared with the previous frame, or the blob has
completely disappeared), the intruder location is computed separately by two contiguous
cameras. Then, significant blobs are searched for in the difference image, and the related MBRs
are computed. The 3D position and direction of the vehicle, computed by the position-
determination module (based on the global positioning system, GPS), are used to find on the image
plane the blob associated with the vehicle. A 3D-into-2D transformation, based on (a) a-priori
known information about the shape and dimensions of the vehicle, (b) perspective law equations,
and (c) the ground plane hypothesis, is performed. Finally, the position(s) of the intruder(s) is
sent to the operator display and to the high-level data-fusion module.
Each virtual sensor provides information about the estimated number of detected objects and
about their related 3D positions computed in the sensor-based reference system.
V. HIGH-LEVEL DATA FUSION
The main objective of high-level data fusion is to integrate all information coming from the
virtual sensors at the lower level in order to obtain an accurate estimate of the number of objects
present in the observed scene as well as their 3D positions.
The symmetry and perturbation (SP) model [26], a probabilistic representation model, and an
integration mechanism based on the extended Kalman filter (EKF) [27] for uncertain geometric
information are used for multisensor data integration. It is worth noting that both real and virtual
sensors render uncertain geometric information in terms of partiality and inaccuracy. Partiality
refers to the degree of freedom associated with different object locations, and to the way such
locations determine the locations of other objects related to them. Inaccuracy refers to the level of
precision in estimating object locations.
A different set of parameters (expressed in different reference systems) is used to represent
object locations by the different virtual sensors. Moreover, different geometric features used to
represent object locations have different degrees of freedom, e.g., the location of a blob on the
image plane is determined by two degrees of freedom, whereas the location of an object by laser
data is determined by three degrees of freedom. A reference system E is associated with any type
of sensor. For example, the location xWE of an object in the image reference system with respect
to the general reference system W = (X,Y,Z) is given by a transformation t_WE composed of two
Cartesian coordinates and an angle φ [27] (Fig. 8a):

x_WE = (x, y, φ)^T,   where t_WE = Trans(x, y) · Rot(z, φ)     (12)
The degrees of freedom that determine the location of an object are related to its symmetries of
continuous motion [28]. These symmetries are defined by a set SE of transformations that
preserve the object. For example, the symmetries of a straight line on the image plane are the set
of continuous translations (T_x) along the line (Fig. 8b). It is possible to represent the set of
symmetries using a row-selection matrix B_E, called the binding matrix. According to this
definition, two objects, whose locations are represented by the points A and B respectively, have
coincident locations if:

B_A x_AB = 0     (13)

where B_A denotes the binding matrix of both objects. For example, let A and B be two object
locations in two different reference systems whose relative location is given by:

x_AB = (x, y, φ)^T,   B_A = B_B = [1 0 0; 0 1 0]

According to equation (13), we have B_A x_AB = (x, y)^T = 0: the two points coincide if the relative
positions in their associated reference systems are zero.
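An illustrative sketch of this coincidence test, using the 2D-point binding matrix of the example above, is given below; the function name and tolerance are assumptions.

```python
import numpy as np


def coincident(x_ab, binding, tol=1e-6):
    """Eq. (13): two features coincide when B_A x_AB = 0.

    x_ab: relative location vector between the two features, e.g. (x, y, phi).
    binding: row-selection (binding) matrix B_A of the feature type.
    """
    return bool(np.all(np.abs(binding @ np.asarray(x_ab, dtype=float)) < tol))


# Two 2D points: the angle phi is a symmetry, so it is not constrained.
B_point = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
print(coincident([0.0, 0.0, 0.7], B_point))   # True: same position
print(coincident([0.2, 0.0, 0.0], B_point))   # False: x offset of 0.2
```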
Sensors give imprecise information, thus one only obtains an estimate of the location of a
given object. Normally, sensors give measurements with uniform distribution of errors. In the SP
model, the estimate of the location of a given object E is denoted by x̂_WE, and the error
associated with this estimate is expressed using a differential location vector d_E, relative to the
reference associated with the object (see Fig. 8a), so that the true location of E is given by

x_WE = x̂_WE ⊕ d_E     (14)

where ⊕ indicates the composition of two location vectors. A vector p_E formed by the non-null
elements of d_E is called the perturbation vector. These two vectors are related by the binding matrix
B_E:

d_E = B_E^T p_E     (15a)

p_E = B_E d_E     (15b)

To this end, the information associated with the estimated location of the object is represented by a
quadruple L_WE = (x̂_WE, p̂_E, C_E, B_E), where x_WE = x̂_WE ⊕ B_E^T p_E, p̂_E = E{p_E} and C_E = Cov{p_E}.
In this paper, we use the SP model along with the specialized version of the EKF for the SP
model, the SP filter, to estimate the location of obstacles from a set of partial and imprecise
observations in the vehicle-working environment.
Let x be the state vector whose value is to be estimated, and let there be n independent and partial
observations y_k of x, where k ∈ {1,..,n}, affected by white Gaussian noise:

y_k = ŷ_k ⊕ u_k     (16)

where u_k ~ N(0, S_k). Let each observation y_k be related to x by an implicit nonlinear function of the form

f_k(x, y_k) ≅ h_k + H_k (x − x̂) + G_k (y_k − ŷ_k)     (17)

where h_k = f_k(x̂, ŷ_k), H_k = ∂f_k/∂x evaluated at (x̂, ŷ_k), and G_k = ∂f_k/∂y_k evaluated at (x̂, ŷ_k).
The estimate x̂_n of the state vector and its covariance P_n after integrating the n measurements are:

x̂_n = P_n M_n     (18)

where

P_n = [ Σ_{k=1}^{n} H_k^T (G_k S_k G_k^T)^{-1} H_k ]^{-1}     (19a)

M_n = Σ_{k=1}^{n} H_k^T (G_k S_k G_k^T)^{-1} h_k     (19b)
In our approach, the state vector x represents the object location in the general reference system
W, and the vector yk contains the observations (object location) estimated by each virtual sensor
in its reference system.
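A minimal sketch of this batch, information-form fusion of eqs. (18)-(19) is shown below; the tuple-based interface for the virtual-sensor observations is a hypothetical convenience, not the system's actual API.

```python
import numpy as np


def fuse_observations(measurements):
    """Information-form fusion of n partial observations, eqs. (18)-(19).

    measurements: list of tuples (h_k, H_k, G_k, S_k), where h_k = f_k(x_hat,
    y_hat_k) is the residual of the k-th implicit measurement function, H_k and
    G_k are its Jacobians, and S_k is the observation covariance.
    Returns the fused state estimate x_n and its covariance P_n.
    """
    dim = measurements[0][1].shape[1]
    info = np.zeros((dim, dim))           # sum of H^T (G S G^T)^-1 H
    m_vec = np.zeros(dim)                 # M_n of eq. (19b)
    for h, H, G, S in measurements:
        w = np.linalg.inv(G @ S @ G.T)    # (G_k S_k G_k^T)^-1
        info += H.T @ w @ H
        m_vec += H.T @ w @ h
    P_n = np.linalg.inv(info)             # eq. (19a)
    x_n = P_n @ m_vec                     # eq. (18)
    return x_n, P_n
```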
A. Object tracking
An additional objective of the high-level data-fusion module is to estimate the motion parameters
of each mobile obstacle in order to prevent possible collisions with the autonomous vehicle and
decide the best trajectory to be followed. A Kalman-based tracking module, based on visual
observations, has been developed and applied independently to each camera [29]. It receives as
input the displacement Δr_b = (Δx_b, Δy_b) of the center of mass of each detected blob over two
consecutive frames, and the variation (Δl, Δh) of the dimensions of the minimum bounding rectangle.
The output is represented by the estimated depth Z_b of the detected blob in the general reference
system W, and by the estimated width P and length L of the detected object.
Let us consider the 3D motion of the center of mass R_b = (X_b, Y_b, Z_b)^T of an object moving
with translation and rotation motion parameters t = (U, V, W)^T and ω = (A, B, C)^T, related to the
camera reference system. The following relation holds:

Ṙ_b = ω × R_b + t     (20)

According to the considered application, the following constraints can be assumed: (a) local
planar motion of the vehicle, V = A = C = 0 (ground-plane hypothesis), and (b) null rotational vector
ω = 0 (we consider only the motion of the object centroid). As a consequence, we obtain:

Ẋ_b = U,   Ẏ_b = 0,   Ż_b = W     (21)

The third equation, Ż_b = W, represents the dynamic model for the quantity of interest Z_b. As far as
the other quantities of interest are concerned, if one considers the object as a rigid body, it is possible to write:
L̇ = 0 and Ṗ = 0.
An Extended Kalman Filter (EKF) is applied to estimate the quantities of interest, Zb, P and L.
The equations of the Kalman filter for the prediction phase are:

q̄_{k+1} = q̂_k + b_k     (22a)

Σ̄_{k+1} = Σ̂_k + Q_k     (22b)

where q_k = (Z_b, L, P)^T and b_k = (W, 0, 0)^T. In order to determine the updating equations,
it is necessary to linearize the measurement equation around the predicted value:

m_k = f(q_k) + n_k ≅ f(q̄_k) + LI_k (q_k − q̄_k) + n_k     (23)

where the function f describes the relation among the quantities of interest Z_b, P and L and the
system observations Δx, Δy, Δl and Δh, the process and measurement noises are Gaussian random
vectors with zero means and covariance matrices Q_k and R_k, respectively, and the matrix LI_k is a
4×3 matrix containing the coefficients of the variables linearized at the k-th step [29].
Finally, we can write the equations related to the updating phase:

K_k = Σ̄_k LI_k^T (LI_k Σ̄_k LI_k^T + R_k)^{-1}     (24a)

Σ̂_k = (I − K_k LI_k) Σ̄_k     (24b)

q̂_k = q̄_k + K_k (d_k − LI_k q̄_k)     (24c)

where d_k = (Δx, Δy, Δl, Δh)^T and K_k is the filter gain. The filter must be initialized by a vector q_0
and a covariance matrix Σ_0 that represents the accuracy of the initial estimate.
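The sketch below shows one predict/update cycle for the quantities of interest (Z_b, L, P), following eqs. (22)-(24); the linearized matrix LI_k is passed in because its coefficients depend on the camera model [29], and the b_k = (W·Δt, 0, 0)^T form, which makes the sampling interval explicit, is an assumption.

```python
import numpy as np


def ekf_step(q, sigma, W, dt, LI, R, Q, d):
    """One prediction/update cycle of the tracking EKF, eqs. (22)-(24).

    q: current estimate (Z_b, L, P); sigma: its 3x3 covariance.
    W: translational speed along the optical axis; dt: time step.
    LI: 4x3 linearized measurement matrix at this step (application-specific).
    R, Q: measurement and process noise covariances; d: measurement vector
    (dx, dy, dl, dh) of blob displacement and MBR variation.
    """
    # Prediction, eqs. (22a)-(22b): only Z_b evolves, L and P stay constant.
    b = np.array([W * dt, 0.0, 0.0])
    q_pred = q + b
    sigma_pred = sigma + Q
    # Update, eqs. (24a)-(24c).
    K = sigma_pred @ LI.T @ np.linalg.inv(LI @ sigma_pred @ LI.T + R)
    q_new = q_pred + K @ (d - LI @ q_pred)
    sigma_new = (np.eye(3) - K @ LI) @ sigma_pred
    return q_new, sigma_new
```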
B. High-level risk detection
Data coming from the virtual sensors are integrated to improve the driving capabilities of the
autonomous vehicle and to obtain a complete control of the whole working area in terms of
numbers of fixed and mobile obstacles, their estimated 3D positions and their estimated motion
parameters. The high-level data-fusion module is in charge of the vehicle supervision, and
produces risk messages that are sent to the vehicle motion control, the path planner and the
operator. Two different situations may occur.
First situation. One of the virtual sensors for intruder detection and location detects an intruder
vehicle moving in the sanitary landfill. Each intruder displacement is followed by the tracking
module, and estimates of its new position, speed and direction are computed (Fig. 9). On the
basis of the known vehicle position, speed and trajectory, a collision point computation function
is applied to verify if a collision between the intruder and the vehicle may occur. In the
affirmative, an alarm message containing information about the estimated collision point
(X, Y, Z) (i.e., Z = 0) and about the estimated time before impact, t_e, is sent to the man-machine
interface to alert the operator and to the vehicle motion control, which evaluates the degree of
risk and decides on the best action to face the situation: (a) stopping the compactor vehicle, (b)
slowing down its speed, or (c) ignoring the alarm message. The degree of risk is computed on
the basis of the estimated time before impact. If action (b) is chosen, attention is also focused on
the output of the virtual sensor in charge of detecting the object position. If an obstacle is
detected in front of the vehicle, the vehicle is stopped immediately, and a new alarm message is
sent to the man-machine interface.
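As a hedged illustration of how such a collision-point computation could be carried out, the sketch below performs a constant-velocity closest-approach test on the ground plane; the safety radius, the horizon and the interface are assumptions, while the stop threshold of about 8 s follows the experiments reported in Section VI.

```python
import numpy as np


def collision_check(p_vehicle, v_vehicle, p_intruder, v_intruder,
                    radius=3.0, horizon=60.0):
    """Estimate the closest approach between vehicle and intruder.

    Positions and velocities are 2D ground-plane vectors (Z = 0).
    radius is an assumed safety distance; returns (t_e, collision_point)
    if the two trajectories come closer than radius within the horizon,
    otherwise None.
    """
    dp = np.asarray(p_intruder, float) - np.asarray(p_vehicle, float)
    dv = np.asarray(v_intruder, float) - np.asarray(v_vehicle, float)
    if np.allclose(dv, 0.0):
        return None
    t_e = -np.dot(dp, dv) / np.dot(dv, dv)     # time of closest approach
    if t_e < 0 or t_e > horizon:
        return None
    if np.linalg.norm(dp + t_e * dv) > radius:
        return None
    point = np.asarray(p_vehicle, float) + t_e * np.asarray(v_vehicle, float)
    return t_e, np.append(point, 0.0)          # (X, Y, Z = 0)

# The vehicle is stopped when t_e falls below a threshold (about 8 s in the
# reported experiments).
```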
Second situation. No intruder moving in the sanitary landfill is detected by the intruder
detection and location virtual sensors, but the virtual sensor in charge of detecting the object position
finds a moving obstacle (e.g., a driver getting out of a lorry and moving in the sanitary landfill,
an animal, etc.). The vehicle is stopped immediately, and an alarm message containing the 3D
position of the detected obstacle is sent to the intruder detection and location virtual sensors and
to the man machine interface. Such a message is repeated every 500 msec and provides the
updated object position. The thresholds THRin and THRout of the change-detection algorithm are
relaxed in order to improve the detection capabilities of the system until the moving intruder is
located and then tracked.
The high-level data-fusion module is able to provide updated information about a risky
situation at every time instant (about 1 sec), thus making it possible to avoid useless
computations such as tracking of blobs not related to any interesting object and recognition of
objects to which semantic meanings have already been assigned. This data-reduction process,
combined with the one performed by the change detection module, sharply reduces the
computational load.
VI. EXPERIMENTAL RESULTS
Two different scenarios, representing different working situations of the autonomous vehicle,
have been considered. The first scenario shows the autonomous vehicle and an intruder (i.e., a
lorry) moving in the sanitary landfill and generating possible collision situations, and the second
concerns some tests in bad environmental conditions.
An experimental test site has been arranged in order to simulate the morphological
characteristics of a sanitary landfill for urban solid waste. Such a simulation has been realized by
reproducing the real geometry and by using piled materials. The test site has an area of 3500
square meters and a rectangular shape (50x70 meters); it is characterized by a ground with an
average slope of about 2%. An embankment has been realized to enclose the test site and to
uphold a Control Room [19]. Plinths, complete with anchoring bolts for positioning the video
cameras, have been installed around the site.
A. First scenario: collision situations
Figure 10a shows an image sequence acquired by the gate camera at a frequency of 2 frames per
second. The intruder detection virtual sensor detects an intruder and produces a risk message for the
operator and for the vehicle control module. The cameras are switched on and the intruder location
and tracking modules are activated in order to predict possible collision situations. Figure 10b
shows the scene acquired from an internal camera placed in the best position (i.e., the side opposite
to the gate). It is possible to see the autonomous vehicle stopped in the left corner of the test site,
and the intruder moving in a direction perpendicular to the gate side. The MBR of each detected
blob is displayed in the original images. The presence of shadows and noise can erroneously
enlarge the dimensions of some MBRs. However, the constraints on the maximum rotation and
translation speeds imposed by the EKF model, and its prediction capabilities, allow the system to
maintain a satisfactory object-location accuracy (in the range [-2,+2] meters) and to avoid wrong
estimates of the intruder trajectory. The object’s positions computed by the location virtual sensor
and the vehicle trajectory estimated by the tracking module are displayed on a 2D top-view map of
the site (Fig. 11a). Figure 11b shows the same image sequence as in Fig. 10a after some time. Now
the autonomous vehicle is moving and collision situations may occur. Data coming from ultrasonic
sensors are continuously processed to detect the presence of a terrain slope or holes. In particular,
after a few frames, both vehicles are moving toward the same area: the risk of collision is very high
(frames 23 and 24). The collision point computation function estimates the collision point, i.e.,
(10,45,0) and the time to impact, i.e., about 20 sec, on the basis of (a) the known vehicle position
(i.e., (45,45,0)) and speed (i.e., 2 m/sec) and (b) the estimated intruder position (i.e., (30,15,0)). The
collision point is estimated on the basis of the measure of the vehicle speed, which is more accurate
than the estimated intruder speed (computed by the tracking module and equal to about 5 m/sec in
this case). If the estimated time-to-impact is shorter than a given threshold (e.g., 8 sec under the
hypothesis of constant acceleration), the autonomous vehicle is stopped immediately. Some time
instants later (frames 26 to 28), the autonomous vehicle and the intruder are moving in opposite
directions (e.g., the intruder is moving toward the gate), but in a neighborhood area. The risk of
collision is very low, but, due to perspective, an occlusion situation occurs. The vehicle is occluded
by the intruder vehicle: the related binary shapes are superimposed and fused into one MBR (frame
28) on the image plane. The information coming from the visual sensors is not sufficient to locate
the intruder to a satisfactory accuracy. In order to eliminate the ambiguity as to the intruder position,
the high-level data-fusion module uses also the data coming from the primitive object detector
virtual sensor. Figures 12a and 12b show the detected obstacle (frames 27 and 28) in the
reconstructed images coming from the ultrasonic sensors. In frame 28, an example of the utility of the
used model is given: the fused estimate of the obstacle position (whose real position is
(25.2,13.4,0)) is (24.7,12.5,0), which is more accurate than the estimate obtained by the single laser
sensor. As a big obstacle has been detected in the neighborhood of the vehicle, the vehicle is
immediately stopped and a risk-alarm message is sent to the operator. The vehicle remains stopped
as long as the obstacle stays in its neighborhood area (frames 29 to 32).
Figure 13a shows another complex situation (frames 29 to 32): the intruder vehicle is coming
out of the camera field of view. The intruder detection and location virtual sensors are unable to
work. The tracking module continues to predict the intruder position and speed according to the
average of the last five values (see Fig. 11a). After frame 32, the intruder is going out of the vehicle
neighborhood area, so the vehicle restarts. In frame 36, another risky situation occurs: the primitive
obstacle detector finds a hole in the front-right trajectory of the vehicle (Fig. 13b). The autonomous
vehicle is immediately stopped. Information about the position of the hole is sent to both the
operator and the path planner, which decides on a new trajectory and restarts the vehicle.
Figure 13c shows another risky situation. The lorry is stationary in the middle area of the test
site. When the autonomous vehicle arrives near the lorry, the primitive obstacle detector virtual
sensor finds the obstacle and stops the vehicle. As the detected obstacle is stationary, the output of
the environment perception virtual sensor is analyzed to assign the obstacle to one of two classes:
static man-made object or trash heap to be compacted. A quasi-vertical plane with parameters
(σ*, τ*, z*) = (14, 33, 7) has been detected. The classification result (i.e., the text message “man-made
object, no compacting operation”) is sent to the operator.
B. Second scenario: bad environmental conditions
The proposed system must work 24 hours per day. The ultrasonic sensors and the laser range
finder work well under bad environmental conditions (low visibility due to rain or fog, and
uneven lighting), whereas the visual sensors show serious problems. Thanks to the cooperation
among the different sensors and to the related data-fusion strategies, the probability of false
object detection can be reduced.
B1. Low visibility
In order to simulate the situation of low visibility (due to heavy rain or fog), the original
images have been corrupted with salt and pepper noise. Up to 30% of the total number of pixels
per image have been modified in terms of gray level values (Figs. 14a and 14b). This kind of
noise, like a real rainy-weather situation, causes a remarkable degradation of the feature-
extraction process, as the size of the minimum bounding rectangle and the center of mass of a
blob change rapidly during the frame sequence. Consequently, the intruder trajectory estimated
by the EKF may not be very accurate. Figures 14c and 14d show the results of the change-
detection algorithm applied to the images in Figures 14a and 14b. Figures 14e and 14f show the
estimated trajectories of the detected intruder under different noisy conditions. These results
prove that the proposed approach is able to reduce the contributions of the vehicle shadow and
the corrupting noise, without greatly modifying the characteristics of the intruder trajectory. It is
important to note that, even if the system tracking performances are degraded, the high-level
data-fusion module is always able to detect and locate the intruder with acceptable errors.
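For reference, a short sketch of the noise model described above is given below; the 30% figure follows the text, while the salt/pepper split and the random-number interface are assumptions.

```python
import numpy as np


def add_salt_and_pepper(image, fraction=0.3, rng=None):
    """Corrupt up to `fraction` of the pixels with salt-and-pepper noise."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    mask = rng.random(image.shape) < fraction     # pixels to corrupt
    salt = rng.random(image.shape) < 0.5          # half salt, half pepper
    noisy[mask & salt] = 255                      # salt
    noisy[mask & ~salt] = 0                       # pepper
    return noisy
```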
B2. Uneven lighting conditions
We now examine the improvements obtained by considering the scene under uneven lighting
conditions: the gray levels of the original images contain insufficient quantities of information
(e.g., the gray-level mean is too low). A pre-processing phase is needed to simplify the
processing task of the visual-based virtual sensors.
Figure 15a represents a test image acquired towards sunset. In this case, we apply the window-
level contrast stretching function [30] (Fig. 15b). The result, shown in Fig. 15c, exhibits better
visual contrast without introducing noticeable clipping.
Figure 16a gives the example of an image acquired after sunset, towards night. In this case, we
apply two different nonlinear point-transformation functions to improve visual contrast: the
logarithmic function [30] (Fig. 16b) and the gamma function [30] (Fig. 16c). Results are
presented in Figs. 16d and 16e, respectively. It is worth noting that a simple contrast
manipulation function can improve the quality of the acquired images. For example, in the
image enhanced by the gamma function, the output is quite similar to an image acquired in
daylight under normal conditions. On the contrary, in some cases the logarithmic transformation
can make the output image too light; we therefore use this function only for images containing
many dark levels, where it works quite well.
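The two point transformations mentioned above can be sketched as follows; the gamma value and the scaling are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np


def gamma_transform(image, gamma=0.5):
    """Gamma correction: brightens dark images when gamma < 1."""
    norm = image.astype(float) / 255.0
    return np.clip(255.0 * norm ** gamma, 0, 255).astype(np.uint8)


def log_transform(image):
    """Logarithmic point transformation: strongly expands dark gray levels."""
    scale = 255.0 / np.log(256.0)
    return np.clip(scale * np.log1p(image.astype(float)), 0, 255).astype(np.uint8)
```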
However, the visual-based sensors are not able to operate at night; therefore, it is
necessary to install an artificial lighting system within the landfill.
C. System performances
System robustness has been evaluated mainly in terms of detection and location accuracy for a
limited set of objects. In normal conditions and on a large number of real tests (about 10^4
images), the proposed multisensor data-fusion system has provided a percentage of correct
intruder detection of 95%, a mean location accuracy of about 2 meters and a percentage of
correct risk detection in 92% of normal situations and 60% of complex situations showing object
occlusions. In particular, the primitive object detector based on ultrasonic data has provided a
successful object detection rate of about 87% and a successful terrain slope or hole detection rate
of about 76% on a large set of about 10^3 ultrasonic data. The environment perception virtual
sensor has provided a successful rate of detection of man-made static objects of about 85% and a
successful rate of trash heap detection of about 69% on a set of 5×10^4 range data.
D. System limitations
The main limitations of the proposed system are the following.
(a) In the current system, the recognition capabilities are restricted to distinguishing between
static man-made objects (e.g., stopped trucks, sanitary-landfill fixtures, etc.) and trash heaps
characterized by irregular shapes. Such a recognition process, performed by the environment
perception virtual sensor, is based only on the detection of planar surfaces in 3D-range data [33].
In some situations, it could be more effective to use also visual data (b/w or color images) and
the related features (e.g., edges, regions, straight segments, etc.) to improve the recognition task.
To this end, computer vision methods are currently being studied, where the object recognition
process is performed on both range and visual data.
(b) A further limitation related to object recognition is that the vehicle is not equipped with a
CCD camera that could be used to better discriminate between regular objects and trash heaps.
This solution is under study, but two main problems should first be solved: (1) the high
vibrations generated by the engine of the vehicle produce very noisy images; (2) acquired images
should be transmitted (in real time and in a robust way) to the remote control center. New
representations of the information content should be added to the system.
(c) Very bad environmental conditions (e.g., rain, fog, etc.), low lighting and the presence of
shadows mislead the visual-based virtual sensors, which may erroneously generate false alarms.
(d) The assumption of low-speed intruder vehicles moving in the sanitary landfill (maximum
speed: 20 km/h) has been made.
VII. CONCLUSIONS
A multisensor data-fusion risk-detection system designed for an autonomous vehicle operating
in an outdoor environment (i.e., a sanitary landfill) has been presented. The core of the system is
a distributed sensory network organized into a hierarchical tree structure composed of as many
leaf nodes as the number of real sensors employed, e.g., several cameras, a laser range finder, and
sixteen ultrasonic sensors. Nodes at the second level are devoted to extracting features from
sensor data. The third level of the tree (low-level data fusion) is composed of a set of virtual
sensors, which integrate the informative content of one or more features to perform different
tasks: (a) detection of static obstacles, (b) detection of holes or down slopes, (c) recognition of
specific fixed objects (e.g., trash heaps), (d) detection and location of mobile objects (e.g., other
vehicles, pedestrians, etc.). The root node (high-level data fusion) integrates the evidence coming
from each virtual sensor in order to obtain fused information (in the general reference system)
about the number of objects, their 3D poses, and estimates of their motion parameters. Finally,
fused information is used to decide on the best action to take, e.g., stopping the vehicle, moving
the vehicle in a given direction, etc.
Results refer to real scenarios representing different working situations for the autonomous
vehicle, and demonstrate the validity of the chosen solutions. The use of data-fusion strategies,
applied to complementary or redundant multisensor data, considerably helps in detecting risky
situations. Thanks to the capabilities and the robustness of the multisensor data-fusion system,
the autonomous vehicle can operate almost continuously, independently of changes in weather
conditions or differences in lighting over time.
Future work will be aimed at overcoming the main system limitations and at testing the
proposed system in different application domains (e.g., airports, nuclear environments, etc.).
REFERENCES
[1] E.D. Dickmanns, B. Mysliwetz, T. Christians, "An integrated spatial-temporal approach to
automatic visual guidance of autonomous vehicles," IEEE Trans. on Systems, Man and
Cybernetics, vol. 20, no. 6, pp. 1273-1284, 1990.
[2] N. Kehtarnavaz, N.C. Griswold, and J.S. Lee, "Visual control of an autonomous vehicle
(BART) - The vehicle following problem," IEEE Transactions on Vehicular Technology, vol. 40,
no. 3, pp. 654-661, 1991.
[3] D.Koller, J.Weber, and J.Malik, "Robust multiple car tracking with occlusion reasoning,"
Proc. of IV European Conf. on Computer Vision, Stockholm, Sweden, pp. 189-196, 1994.
[4] V. Graefe, "Vision for intelligent road vehicles," IEEE Workshop on Intelligent Vehicles,
Tokyo, Japan, pp. 1-6, 1993.
[5] B.S. Rao, H.F. Durrant-Whyte and J.A. Shen, "A fully decentralized multisensor system for
tracking and surveillance," Int. Journal of Robotic Research, vol.12, no.1, pp.20-43, 1993.
[6] D. Corrall, "VIEW: Computer Vision for surveillance applications", IEE Colloquium on
Active and Passive Techniques for 3D Vision, IEE, London, 8/1-3, 1991.
[7] E. Jouseau and B. Dorizzi, “Neural networks and fuzzy data fusion. Application to an on-line
and real time vehicle detection system”, Pattern Recognition Letters, Vol. 20, no. 1, 1999, pp.
97-107.
[8] R. C. Luo, M. Lin, and R. S. Scherp, "Dynamic multi-sensor data fusion system for intelligent
robots", IEEE Journal of Robotic Automation, Vol. 3, No. 4, pp. 386-396, 1988.
[9] A. R. Hanson, E. M. Riseman, and T. D. Williams, "Sensor and information fusion from
knowledge-based constraints", in Proc. SPIE Conference on Sensor Fusion, Vol. 931, C. W.
Weaver, Ed., Orlando, FL, pp. 186-196, April 1988.
[10] L. Pau, "Knowledge representation approaches in sensor fusion", Automatica, Vol. 25, No.
2, pp. 207-214, 1989.
[11] R. C. Luo, "Multisensor integration and fusion in intelligent systems", IEEE Trans. on
System, Man, and Cybernetics, Vol. 19, No. 5, pp. 901-928, September/October 1989.
[12] J.L. Crowley and Y. Demazeau, Principles and techniques for sensor data fusion, Signal
Processing, Vol. 32, no. 1-2, 1993, 5-27.
[13] J.K. Aggarwal and N. Nandhakumar, Special issue on Sensor Fusion, Journal of Applied
Intelligence, Vol. 5, No. 3, pp. 207-290, July 1995.
[14] L.A. Klein, Sensor and Data Fusion Concepts and Applications, SPIE, Bellingham, WA,
1993.
[15] S. S. Iyengar, R. L. Kashyap and R. N. Madan, "Distributed Sensor Networks - Introduction
to the Special Section", IEEE Transactions on System, Man, and Cybernetics, Vol. 21, No. 5, pp.
1027-1031, September/October 1991.
[16] R. Hartley, “Calibration of cameras using the essential matrix”, in Proceedings of DARPA
Workshop on Image Understanding, San Diego, CA, January 26-29, 1992 (Morgan Kaufmann,
San Mateo, CA), pp. 911-915.
[17] R.I. Hartley, “Kruppa’s equations derived from the fundamental matrix”, IEEE Transactions
on Pattern Analysis and Machine Intelligence, Vol. 19, No. 2, 1997, pp. 133-135.
[18] T.N. Tan, G.D. Sullivan and K.D. Baker, “Structure from motion using the ground plane
constraint”, in Proc. of 2nd European Conf. on Computer Vision, Santa Margherita Ligure, Italy, May
1992, pp. 254-262.
[19] P. Grossmann, “From 3D line segments to object and spaces”, Proc. IEEE Conf. on
Computer Vision and Pattern Recognition, San Diego, CA, pp. 216-221, 1989.
[20] Z. Zhang and O. Faugeras, “Finding clusters and planes from 3D line segments with
application to 3D motion determination”, Proc. of 2nd European Conf. on Computer Vision, Santa
Margherita Ligure, Italy, May 1992, pp. 227-236.
[21] H. Illingworth, and J. Kittler, "The adaptive Hough Transform", IEEE Trans. on Pattern
Analysis Machine Intell., Vol. 9, No. 5, pp.690-698, 1987.
[22] G.L. Foresti, V. Murino, and C.S. Regazzoni, "Vehicle Recognition and Tracking from
Road Image Sequences", IEEE Transaction on Vehicular Technology, Vol. 48, No. 1, 1999, pp.
301-318.
[23] P.T. Jackway and M. Deriche, “Scale-space properties of the multiscale morphological
dilation erosion”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no.
1, 1996, pp. 38-51.
[24] G.L. Foresti, "Object Detection and Tracking in Time-Varying and Badly Illuminated
Outdoor Environments", Optical Engineering, Vol. 37, No. 9, 1998, pp. 2550-2564.
[25] N.H.C. Yung and C. Ye, “An intelligent mobile vehicle navigator based on Fuzzy logic and
reinforcement learning”, IEEE Transactions on System, Man, and Cybernetics – Part B:
Cybernetics , Vol. 29, No. 2, 1999, pp. 314-321.
[26] J.D. Tardos, “Representing partial and uncertain sensorial information using the theory of
symmetries”, in Proc. of IEEE Intern. Conference on Robotics and Automation, Nice, France,
May 1992, pp. 1799-1804.
[26] C. Brown, H. Durrant-Whyte, J. Leonard, and B. Rao, "Centralized and decentralized
Kalman filter techniques for tracking, navigation, and control", American Control Conference,
Boston, MA, pp. 262-268, 1985.
[27] J.A. Castellanos, J.D. Tardós, J. Neira, "Constraint-Based mobile robot localization",
International Workshops on Advanced Robotics, Intelligent Machines, Salford, Manchester
(U.K.), April 1996.
[28] K.O. Arras and N. Tomatis, “Improving Robustness and Precision in Mobile Robot Localization
by Using Laser Range Finding and Monocular Vision”, in Proceedings of the Third European
Workshop on Advanced Mobile Robots (Eurobot 99), Zurich, Switzerland, Sept. 6-9, 1999.
[29] G.L. Foresti, and C.S. Regazzoni, "Localization and Tracking of Multiple Unknown Objects
in Real Environments", Electronics Letters, Vol. 31, No. 5, 1995, pp.355-356.
[30] J.C. Russ, The Image Processing Handbook, CRC Press, Boston, 1995.
[31] V. Murino, G.L. Foresti and C.S. Regazzoni, "Adaptive Camera Regulation for
Investigation of Real Scenes", IEEE Transaction on Industrial Electronics, Vol. 43, No. 5, 1996,
pp. 588-600.