A snake model for object tracking in natural sequences

20
Signal Processing: Image Communication 19 (2004) 219–238 A snake model for object tracking in natural sequences G. Tsechpenakis, K. Rapantzikos, N. Tsapatsoulis*, S. Kollias School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., Zografou, Athens 15780, Greece Received 18 October 2002; received in revised form 1 June 2003; accepted 31 July 2003 Abstract Tracking moving objects in video sequences is a task that emerges in various fields of study: video analysis, computer vision, biomedical systems, etc. In the last decade, special attention has been drawn to problems concerning tracking in real-world environments, where moving objects do not obey any afore-known constraints about their nature and motion or the scenes they are moving in. Apart from the existence of noise and environmental changes, many problems are also concerned, due to background texture, complicated object motion, and deformable and/or articulated objects, changing their shape while moving along time. Another phenomenon in natural sequences is the appearance of occlusions between different objects, whose handling requires motion information and, in some cases, additional constraints. In this work, we revisit one of the most known active contours, the Snakes, and we propose a motion-based utilization of it, aiming at successful handling of the previously mentioned problems. The use of the object motion history and first order statistical measurements of it, provide us with information for the extraction of uncertainty regions, a kind of shape prior knowledge w.r.t. the allowed object deformations. This constraining also makes the proposed method efficient, handling the trade-off between accuracy and computation complexity. The energy minimization is approximated by a force-based approach inside the extracted uncertainty regions, and the weights of the total snake energy function are automatically estimated as respective weights in the resulting evolution force. Finally, in order to handle background complexity and partial occlusion cases, we introduce two rules, according to which the moving object region is correctly separated from the background, whereas the occluded boundaries are estimated according to the object’s expected shape. To verify the performance of the proposed method, some experimental results are included, concerning different cases of object tracking, indoors and outdoors, with rigid and deformable objects, noisy and textured backgrounds, as well as appearance of occlusions. r 2003 Elsevier B.V. All rights reserved. Keywords: Active contours; Snakes; Tracking; Cluttered sequences; Occlusion 1. Introduction This work addresses the problem of object contour modeling and tracking in natural video sequences. Specifically, given a video stream in which a known object of interest is in motion, the goal is to track the object’s silhouette across time ARTICLE IN PRESS *Corresponding author. Tel.: +30-210-772-3037; fax: +30- 210-772-2492. E-mail addresses: [email protected] (G. Tsechpenakis), [email protected] (K. Rapantzikos), [email protected] (N. Tsapatsoulis), [email protected] (S. Kollias). 0923-5965/$ - see front matter r 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.image.2003.07.002

Transcript of A snake model for object tracking in natural sequences

ARTICLE IN PRESS

Signal Processing: Image Communication 19 (2004) 219–238

*Correspondin

210-772-2492.

E-mail addre

[email protected]

(N. Tsapatsoulis

0923-5965/$ - see

doi:10.1016/j.ima

A snake model for object tracking in natural sequences

G. Tsechpenakis, K. Rapantzikos, N. Tsapatsoulis*, S. Kollias

School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., Zografou,

Athens 15780, Greece

Received 18 October 2002; received in revised form 1 June 2003; accepted 31 July 2003

Abstract

Tracking moving objects in video sequences is a task that emerges in various fields of study: video analysis,

computer vision, biomedical systems, etc. In the last decade, special attention has been drawn to problems

concerning tracking in real-world environments, where moving objects do not obey any afore-known constraints

about their nature and motion or the scenes they are moving in. Apart from the existence of noise and environmental

changes, many problems are also concerned, due to background texture, complicated object motion, and deformable

and/or articulated objects, changing their shape while moving along time. Another phenomenon in natural sequences is

the appearance of occlusions between different objects, whose handling requires motion information and, in some

cases, additional constraints. In this work, we revisit one of the most known active contours, the Snakes, and we

propose a motion-based utilization of it, aiming at successful handling of the previously mentioned problems.

The use of the object motion history and first order statistical measurements of it, provide us with information

for the extraction of uncertainty regions, a kind of shape prior knowledge w.r.t. the allowed object deformations.

This constraining also makes the proposed method efficient, handling the trade-off between accuracy and

computation complexity. The energy minimization is approximated by a force-based approach inside the

extracted uncertainty regions, and the weights of the total snake energy function are automatically estimated as

respective weights in the resulting evolution force. Finally, in order to handle background complexity and partial

occlusion cases, we introduce two rules, according to which the moving object region is correctly separated from the

background, whereas the occluded boundaries are estimated according to the object’s expected shape. To verify the

performance of the proposed method, some experimental results are included, concerning different cases of object

tracking, indoors and outdoors, with rigid and deformable objects, noisy and textured backgrounds, as well as

appearance of occlusions.

r 2003 Elsevier B.V. All rights reserved.

Keywords: Active contours; Snakes; Tracking; Cluttered sequences; Occlusion

g author. Tel.: +30-210-772-3037; fax: +30-

sses: [email protected] (G. Tsechpenakis),

.gr (K. Rapantzikos), [email protected]

), [email protected] (S. Kollias).

front matter r 2003 Elsevier B.V. All rights reserve

ge.2003.07.002

1. Introduction

This work addresses the problem of objectcontour modeling and tracking in natural videosequences. Specifically, given a video stream inwhich a known object of interest is in motion, thegoal is to track the object’s silhouette across time

d.

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238220

varying images. Applications abound in computervision, video processing, including video analysisand understanding, object-based coding, content-based retrieval, remote surveillance, object recog-nition, as well as various biomedical problems,such as human organ motion analysis and track-ing. According to the application examined, objecttracking includes a number of problems thatemerge, making its successful performance a verychallenging task, especially regarding its generalityand its independence from initial constraints.Moreover, during the last decade, object trackinghas become one of the most important tasks, dueto the information that new technologies require.A representative example of these technologies, arethe MPEG-4 and MPEG-7 standards for videocoding, according to which object localizationalong time is needed, in order to separate the frontfrom the background, and handle them separately.Contrary to the MPEG-7 standard and manyother applications, such as surveillance for securitysystems, MPEG-4 and a wide variety of applica-tions require the exact contours of the movingobjects, for shape description and recognitionpurposes. In this framework, and depending onthe examined application, many approaches havebeen proposed in the literature, focusing either inthe highest accuracy, or the lowest computationalcomplexity. A great category of these approachesis the deformable templates [12] called activecontours, that have come up and been drawnspecial attention, due to their performance invarious problems, such as image and motionsegmentation [3,4,14,20], object tracking [5,9,21,22,27] and, similarly to these approaches,audiovisual speech recognition.Active contours were first introduced by Kass

et al. in 1988, with a model called Snake [13], andsince then many researchers have proposed var-ious modifications of it, in order to apply it to edgedetection, shape modeling and object trackingproblems, under different constraints. However,many problems concerning these models arise, dueto strong existence of noise, mostly in naturalcluttered sequences [26], the requirement of anappropriate shape initialization [28] and parametertuning [19]. Apart from these problems which aremet due to the definition of the snake models, and

thus can be considered as obvious and well known,there are also some cases that are not so farhandled by Snakes: (a) the performance of snakesin sequences with complex backgrounds, i.e.textured backgrounds containing strong edgesclose to the moving object boundaries, and (b)problems met when the desired moving objects getoccluded by obstacles, which may be static ormoving as well. These are mainly the reasons whymany researchers have proposed other activecontour models [11,20,22], utilizing region-basedinformation, such as motion, color and texture,stochastic approaches and appropriate shapeconstraints [23].In this work we focus on the above-mentioned

problems, i.e. existence of noise, backgroundcomplexity and occlusion handling, with a mod-ified snake model which utilizes the object’smotion history. Using the snake, which is a linearand thus computationally inexpensive model, andfirst-order statistics, we try to balance between thecomputational cost of the proposed method, andits accuracy. The latter is needed, mainly due tosome time-consuming morphological operationsapplied in each frame of the examined sequence.The proposed model is relatively robust toparameter tuning, while the required initializationof the snake in each frame is automaticallyextracted, along with a narrow band around it,in which the final solution (object contour) islocated. Thus, the proposed method consists ofthree main steps: (a) the frame pre-processing,using a morphological filter, and the extraction ofa morphological modified image gradient, (b) theextraction of the snake initialization and its‘‘uncertainty regions’’, in which the final solutionis located, and (c) the snake energy minimizationprocedure, which is approximated by a force-based curve evolution approach, inside the ex-tracted uncertainty regions. Finally, a motionestimation scheme is used to help handlingproblems related to object occlusion.Special attention has been given to the accuracy

of the proposed approach, resulting in efficientsolutions to a variety of applications: casesinvolving different classes of objects and objectmovements in natural sequences, where theamount of noise is not known, backgrounds that

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 221

are highly textured, and moving objects that getsuccessively occluded.

2. Snake model

Snakes are edge-based models widely used forobject shape modeling, in single images, and fortracking, in image sequences. So far, they havebeen successfully applied to problems where shapeprototypes of the desired objects are given. Ingeneral, snakes concern model and image dataanalysis, through the definition of a linear energyfunction and a set of regularization parameters.This energy function consists of two parts: (a) thedata-driven component (external energy), whichdepends on the image data according to a chosencriterion and (b) the smoothness-driven one(internal energy), which enforces smoothnessalong the snake. The goal is to minimize the totalsnake energy; this is achieved iteratively, afterconsidering an initial estimate for the object shape(prototype). Once such an appropriate initializa-tion is specified, the snake can converge to thenearby energy minimum, using gradient descenttechniques.According to the above formulation, a snake is

modeled as being able to deform elastically, butany deformation increases its internal energycausing a ‘‘restitution’’ force, which tries to bringit back to its original shape. At the same time, thesnake is immersed in an energy field (created bythe examined image), which causes a force actingon the snake. These two forces balance each otherand the contour actively adjusts its shape andposition until it reaches a local minimum of itstotal energy.Following the definition of the snakes and the

general idea they adopt, many problems arise,especially when examining cluttered natural se-quences. The first concerns the snake energyfunction and the snake’s convergence accordingto the energy minimization procedure: (a) there isa set of regularization parameters, used as weightsin the total function, that need to be set inappropriate values, (b) an appropriate snakeinitialization around the desired object has to bedefined manually, (c) the object boundaries have

to be distinct, that is the respective edges have tobe strong, and the background has to be relativelysmooth, so that the energy minimization proce-dure is not trapped in undesirable local minima(edges) other than the object boundaries. Theserequirements are very difficult to be met incluttered natural sequences, where weak objectedges, the existence of noise, highly texturedbackground or strong and persistent (along thetime) edges close to the desirable object bound-aries, are common and very frequent phenomena.Moreover, in real-world scenes, it often happensthat a moving object of interest gets successivelyoccluded by another static or moving object. Inthese cases, there is lack of necessary edgeinformation, and thus snake models cannot beefficiently applied, without the exploitation of anappropriate shape prior knowledge.Let us consider a snake representing a curve

defined by a position vector%X ðpÞ ¼ ðxðpÞ; yðpÞÞ on

the image plane, and is generally parameterized byp; 0pppP: The total energy function of the snakeEsnake; is then a weighted summation of an internalenergy factor Eint; corresponding to the summa-tion of a bending and a stretching energy term,and an external energy factor Eext; which denoteshow the snake evolves according to the examinedimage features:

Esnake ¼ a � Eint þ b � Eext; Eint ¼Z P

0

eintðpÞ dp;

Eext ¼Z P

0

eextðpÞ dp; ð1Þ

where eintðpÞ and eextðpÞ are the respective internaland external energy terms defined for each point p

of the curve, whereas a and b are normalizationparameters.For the internal energy Eint; a variety of

definitions have been proposed, according to theexamined application; indicatively three ap-proaches are mentioned: the B-snake [6], wherethe curve is approximated by B-spline polyno-mials, the affine-invariant (AI-) snake [10] and thecurvature-based snake [24] models. On the contrary,in most of the approaches, the external energy termat each point p is generally defined as [10]

eextðpÞ ¼ 1� jrGs�IðxðpÞ; yðpÞÞj � j%gðpÞ �

%nðpÞj; ð2Þ

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238222

where jrGs�IðxðpÞ; yðpÞÞj (Gaussian-of-Laplacian)denotes the magnitude of the gradient of the imageI ðIðx; yÞA½0; 1Þ convolved with a gaussian filter,of variance s at point ðxðpÞ; yðpÞÞ; corresponding tothe curve point p: The unit vectors

%gðpÞ and

%nðpÞ

denote the image gradient direction and thenormal vector of the snake at point p; respectively.The definition of the external energy term, given

in (2), is supposed to have the following effect:smooth regions of the examined image I ; where itsgradient has relatively low values, produce ex-ternal energy values close to 1, whereas imageedges produce external energy values close to 0. Inthis sense, the minimization procedure of the totalenergy function, forces the snake towards theedges closer to its initialization. This definitiondenotes that (a) the regions close to objectboundaries must be smooth without significantedges and noise, that could create undesiredminima, (b) if the background is noisy, then anappropriate value for the parameter s of thegaussian filter must be manually chosen. These aremainly the reasons why it is not easy to use thisdefinition for the external energy in naturalcluttered sequences.

2.1. Internal energy definition

In this work, we adopt a snake model thatutilizes the curvature Ksnake [19,24] and the pointdensity distribution DVsnake:

Ksnake ¼’x � .y � .x � ’y

ð ’x2 þ ’y2Þ3=2; DVsnake ¼

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi’x2 þ ’y2

p; ð3Þ

assuming that the snake points are not equallyspaced along the curve. The first and secondderivatives of ðx; yÞ define the velocity and theacceleration, along the curve:

’X ¼ ½ ’x; ’y ¼dx

dp;dy

dp

� �;

.X ¼ ½ .x; .y ¼d2x

dp2;d2y

dp2

� �; ð4Þ

and for each point p of the snake, they arecalculated in terms of the neighboring points p � 1and p þ 1:

’xðpÞ ¼ xðp þ 1Þ � xðp � 1Þ; ð5Þ

’yðpÞ ¼ yðp þ 1Þ � yðp � 1Þ; ð6Þ

.xðpÞ ¼ xðp � 1Þ � 2xðpÞ þ xðp þ 1Þ; ð7Þ

.yðpÞ ¼ yðp � 1Þ � 2xðpÞ þ yðp þ 1Þ: ð8Þ

The curvature and the point density distributionfunctions are not affine-invariant, as can be easilyproved, and can uniquely define a curve in specifictime instances, whereas curvature denotes whethera part of a curve is convex ðKsnake > 0Þ or concaveðKsnakeo0Þ [1]. In our model, the curvaturefunction is used to provide us with a localsmoothness constraint, since values of K close to0 denote smooth parts of the snake, but also asimilarity criterion in the tracking approach asdescribed in Section 3. The point density functionis the snake local elasticity constraint, since itrepresents the distances between neighboringpoints along the snake. From Eqs. (3) and (4) itis DVsnake ¼ j ’Xj: The internal energy of theproposed snake model is then,

eintðpÞ ¼ jKsnakeðpÞj þDVsnakeðpÞ; ð9Þ

where j � j denotes the magnitude sign. Thus, theminimization of the internal energy forces thesnake into a smoother form.

2.2. External energy definition

Regarding the external energy component, weuse an alternative definition, in order to deal withnoise, textured backgrounds and edges close toobject boundaries. The proposed procedure ex-tracts an image-based component, after appro-priately pre-smoothing the examined image, inorder to eliminate randomly distributed noise, andit provides a good alternative way for overcomingthe problems that appear in definition (2); theproposed method uses various morphologicaloperations that lead to a modified image gradient.The use of morphological operations may alsoeliminate object boundary edges, in cases thatsome of these edges are not well distinguishable,and thus force the snake into a more complicatedform than the desired one. This problem isovercome with the use of the internal energyconstraint, which does not allow great localdeformations of the snake. Moreover, some

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 223

background edges may remain even after themorphological procedures are applied, but thisproblem is overcome with the use hoof furtherconstraints, as described in Section 3.2.3.The first step is the image pre-smoothing with a

non-linear morphological filter, a so-called alter-nating sequential filter (ASF). This filter is basedon morphological area opening ð3Þ and closing ð�Þoperations with structure elements of increasingscale [15]. More specifically, if Si; i=1,y,4, is the2-pixel connected line segments oriented at 90ði �1Þ degrees, and nSi is the corresponding ðn þ 1Þ-pixel elements of size n ¼ 1; 2; 3;y; then theopenings an and closings bn that make up thefilter, for an image I ; are,

anðIÞðx; yÞ ¼ maxiA½1;4

½I3nSiðx; yÞ;

bnðIÞðx; yÞ ¼ miniA½1;4

½I�nSiðx; yÞ: ð10Þ

Then the filtered image IASF is obtained by thefollowing cascade:

IASF ¼ bnanyb2a2b1a1: ð11Þ

In our method, we use that filter to preserve theline-type features (edges) of each frame of asequence, without smoothing them, which cannotbe achieved with, e.g., median filtering. Fig. 1illustrates an example of structure elements ofdifferent orientations (0�; 90�; 180� and 270�) andincreasing scale (3 3; 5 5 and 7 7), used forthe construction of the ASF utilized in our

Fig. 1. Structure elements of increasing sca

implementation. Also, Fig. 2 illustrates the per-formance of image smoothing using the proposedASF: (a) is the original image and (b) is the filteredimage; it can be clearly seen that a great amount ofclutter is eliminated. For a better implementationof the ASF, we propose the use of structureelements of 8 orientations, i.e. 0�; 45�; 90�; 135�;180�; 225�; 270� and 315�; in order to preserveedges in a variety of orientations. The result of thisimplementation is illustrated in Fig. 2(c).After filtering each frame of the examined

sequence, we follow a procedure aiming atpreserving the most important regions, and thusthe most important edges of the frame. Thisprocedure is a part of the watershed transforma-tion [16], and involves the extraction of binarymarkers of the most important regions. Thesemarkers are appropriately combined with theimage gradient so that its variations, due to noiseor weak edges, will be eliminated.More specifically, as shown in Fig. 3, we

subtract a constant h from the filtered imageIASF; and we apply a morphological grayscalereconstruction opening between IASF and IASF � h:The result of this procedure is subtracted fromIASF; and what we obtain is an image representingthe local maxima of IASF; then, the most importantmaxima are obtained by thresholding this imagewith h=2: The same procedure is followed toextract the binary markers that represent theimage most important minima; combining the

le ðn ¼ 1; 2; 3Þ for ASF construction.

ARTICLE IN PRESS

Fig. 2. Frame pre-smoothing with the proposed ASF: (a) original frame; (b) filtered frame with structure elements of 4 orientations;

and (c) filtered frame with structure elements of 8 orientations.

Fig. 3. Binary markers extraction for modified gradient calculation: subtraction of a constant h from the initial image, grayscale

reconstruction opening, local maxima of the image, and binary markers extraction using h=2 as a threshold.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238224

binary marker images into one, a geodesic ero-sion reconstruction of image gradient G iscomputed,

Gm ¼ limk-N

½ðm~BÞ3GðkÞ; ð12Þ

where m is the binary markers image, B is asymmetric structuring element of radius 1, k is thenumber of the successive operations, and ‘‘~’’and ‘‘3’’ denote the flat erosion and supremumoperations, respectively [17].In our implementation, we normalize the image

intensity in the interval ½0; 1: Thus, a reasonablechoice of the constant h is 0:1php0:3; it isexperimentally proved that the value of h is notcrucial for the extraction of the image modifiedgradient. It must be also noted that instead of thegradient rIASF; we use the morphological image

gradient, defined as G ¼ ½s"IASF � ½s~IASF;where " and ~ are the dilation and erosionoperations, and s is a 3 3 structure element.After the modified image gradient Gm is

extracted, the external snake energy is definedas

eextðpÞ ¼ 1� G2mðpÞ � j

%gmðpÞ �

%nðpÞj; ð13Þ

where%gmðpÞ and

%nðpÞ denote the modified image

gradient unit vector and the normal vector of thesnake at the point p; respectively. Fig. 4 illustratesthe differences between (a) the image gradientrGs�I ðs ¼ 1Þ of Eq. (2) and (b) the proposedmodified gradient Gm of Eq. (13), and it can beclearly seen that a great amount of noise and weakedges is eliminated.

ARTICLE IN PRESS

Fig. 4. Differences between (a) the image gradient rGðs¼1Þ�I and (b) the proposed modified morphological image gradient Gm:

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 225

3. Proposed tracking method

In object tracking, the goal is to track object’scontour along time; this actually means that, foreach frame of a video sequence, we aim atseparating the object from the background. Dur-ing the last decade, many approaches for contourtracking have been proposed in the literature [25],which can be roughly categorized in three mainclasses: (a) edge-based methods [11], utilizing theedge information of the examined images, (b)region-based methods [18], that rely on theinformation obtained by region features (color,texture, etc.), and (c) combinations of edge andregion-based methods [22], which exploit andcombine the advantages of the previous twoclasses, formulating either active contours orgeometric alternatives of them, such as the Level-Set theory. Amongst these methods, there areapproaches relying on grouping motion informa-tion along time [18], introducing additional con-straints in terms of high level semantic knowledge[7], or utilizing prior knowledge for the desiredobject’s shape [23].The issue of object tracking in natural and

cluttered sequences, involves various problems,especially when its most general confrontation isneeded. The generality of a method is one of themost important goals, which means that we haveto deal with as few as possible initial constraints,or even make the tracking problem independentfrom initial conditions/constraints. The difficultiesthat usually arise in real-world scenes are: (a) non-rigid (or articulated) moving objects, (b) movingobjects with complicated contours, (c) object

motions that are not simple translations, but alsoinvolve rotations, objects approaching or drawingaway from the shooting camera (or camerazooming), (d) sequences with highly texturedbackground, (e) existence of noise or abrupt/gradual environmental (external) changes, suchas external lighting changes, (f) moving objectsthat get successively occluded by obstacles, thatcan be either static or moving.In this Section, we describe the proposed

tracking method that utilizes the snake modeldefined in the previous Section, and exploits theresults of a motion estimation scheme proposed in[2]. Our method consists of three main steps: (a)estimation of the snake’s initial position in frameI þ 1; given its position in the previous frame I ;using a robust motion estimation technique, (b)definition of a narrow band around the initializa-tion, in which the object contour is supposed to belocated, and (c) estimation of the contour’s finalposition in frame I þ 1: The narrow band aroundthe object is called ‘‘uncertainty region’’, and thisterm is used to account for the uncertainty factorthat is involved in estimating the shape and theposition of a moving object in a scene. In thissense, the problem of finding the correct contour isconstrained in a small image region, which allowsus to reduce the computational cost.

3.1. Snake initialization and uncertainty regions

extraction

Let us consider the contour CðIÞ of the desiredobject in frame I of a video sequence, defined by P

ordered points on the image plane, in terms of

ARTICLE IN PRESS

Fig. 5. Differences between a contour observed motion and its

instant motion in three successive time instances.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238226

complex numbers, i.e.

CðIÞ ¼ ½CðIÞðpÞ ¼ xðIÞðpÞ þ j � yðIÞðpÞ j pAP; ð14Þ

where%X ðIÞðpÞ ¼ ðxðIÞðpÞ; yðIÞðpÞÞ is the position

vector of each point pAP of the contour. We usethe term ‘‘observed motion’’, dc; to define themotion of the contour, obtained by a motionestimation scheme, that is expected to give us theposition of the contour in the next frame I þ 1:

dðIþ1Þc ¼ ½d ðIþ1Þc ðpÞ j pAP

¼ ½MFðI ;Iþ1ÞðxðpÞ; yðpÞÞ j pAP; ð15Þ

where MFðI ;Iþ1ÞðxðpÞ; yðpÞÞ is the motion vectorbetween I and I þ 1; at point ðxðpÞ; yðpÞÞ; com-puted using the motion estimation scheme pro-posed by Black et al. [2].In our work we utilize the specific motion

estimation technique, considering it as an efficientway to overcome the common problems that arisein estimating the displacements or the optical flowin video streams. The most important problemthat is handled by this technique is the trade-offbetween motion field over-smoothing and noisesensitivity. More specifically, the method proposedin [2] (a) eliminates noisy motion estimates, thatcannot give us the motion information we need,(b) handles motion estimates’ over-smoothing,which leads to smooth but not actual motioninformation, and (c) provides smooth motionfields on edges, which are the most informativefeatures; this can lead to successful separationbetween the moving front and the background, orbetween two objects moving close to each other.The latter advantage is exploited in our work, inorder to estimate the desired object contour intextured backgrounds and detect its possibleocclusions.The observed motion d ðIþ1Þ

c ðpÞ of each contourpoint CðIÞðpÞ is different from its actual motion,which is calculated after the contour new positionis estimated in the next frame I þ 1: Thus, wedefine the ‘‘instant motion’’ of the contourbetween the frames I and I þ 1 as

mðIþ1Þc ¼ ½mðIþ1Þ

c ðpÞ j pAP

¼ ½%X ðIþ1ÞðpÞ �

%X ðIÞðpÞ j pAP; ð16Þ

where%X ð�ÞðpÞ represents the position vector corre-

sponding to the snake point p: This equationrepresents the displacement of snake final solutionbetween the successive frames I and I þ 1; i.e. thedifference between the respective position vectors,after the object contour in the next frame isestimated.The snake initialization C

ðIþ1Þinit in the next frame

I þ 1 is defined in terms of the observed motion,

CðIþ1Þinit ¼ CðIÞ þ dðIþ1Þc : ð17Þ

Fig. 5 illustrates the difference between theobserved motion, according to the obtainedmotion field, of a contour point, and its instantmotion. Snake initializations and the vectors of theobserved motion are shown in dotted form,whereas the final positions of the contour andthe instant motions are illustrated in solid line. Thecircled points represent the point p at its initialposition in each frame, whereas the filled circles(dots) represent its final position, estimatedaccording to the procedure described in thefollowing subsections.In order to form the uncertainty regions around

the snake initialization in each frame of theexamined sequence, which actually represents thearea in which possible contour deformations canoccur, we exploit the contour motion history,obtained by the contour instant motions in the L

previous frames. In this way, we form the ‘‘expected’’contour motion %mðIþ1Þ

c ; obtained as the contour’smean motion in the previous L frames,

%mðIþ1Þc ¼ ½ %mðIþ1Þ

c ðpÞ j pAP

¼1

L

XI�1i¼I�L

mðiþ1Þc ðpÞ j pAP

" #: ð18Þ

ARTICLE IN PRESS

Fig. 6. The proposed tracking approach in steps: (a,b) Two

successive frames of a face sequence and the respective

contours. (c) Amplitude of the deviation between the observed

and the expected contour positions, leading to (d) the

uncertainty regions of the curve.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 227

The difference between the expected %mðIþ1Þc and

the observed dðIþ1Þc motion, is used to formulatethe uncertainty region, i.e. the region where theacceptable deformations of the snake should belocated. In particular, the expected solution (objectboundaries), for the next frame I þ 1; is located ina narrow band around the snake initializationC

ðIþ1Þinit ; whose width varies for each point p; and is

equal to 2 � sdðIþ1ÞðpÞ at each side of it (in thenormal direction to the curve C

ðIþ1Þinit Þ; where

sdðIþ1ÞðpÞ ¼ j %mðIþ1Þc ðpÞ � d ðIþ1Þ

c ðpÞj: ð19Þ

In the above formulation we adopt the idea thatthe object contour is more likely to move in thesame way, as it has been moving in the previous L

frames of the sequence. In cases where theexpected and observed contour positions coincide,there is clear evidence about the predicted contourposition and the uncertainty region is narrowed.Thus, the final solution is taken as the initial one;however, exact match of observed and expectedcontour position is practically impossible, due tocomputational inaccuracies.High values of sdðIþ1Þð�Þ denote either a problem

in the estimation of the observed contour position,or an abrupt change in the velocity of the trackedobject, compared to its mean velocity in theprevious frames. In both cases, the uncertaintyabout the predicted contour’s initial position ishigh and the corresponding uncertainty regionsare increased.Fig. 6 illustrates the proposed approach in steps,

in the case of face tracking. Figs. 6(a) and (b)present two successive frames of a face sequenceand the respective contours. Fig. 6(c) presents theamplitude of the computed deviation (in pixels)between the observed and the expected contourposition. Based on this deviation, the uncertaintyregions are then formed as shown in Fig. 6(d).The exact location of the object contour (final

solution) in the frame I þ 1 is obtained by solvingthe following equations:

CðIþ1Þ ¼ arg minrAR

½w1ðDðrÞK þ D

ðrÞDVÞ þ w2 � E

ðrÞext; ð20Þ

DðrÞK ¼

XpAP

½KðCðIÞÞðpÞ � KðrÞðpÞ2; ð21Þ

DðrÞDV ¼

XpAP

½DVðCðIÞÞðpÞ �DVðrÞðpÞ2; ð22Þ

EðrÞext ¼

XpAP

½eðrÞextðpÞ; ð23Þ

where KðCðIÞÞðpÞ; KðrÞðpÞ and DVðCðIÞÞðpÞ; DVðrÞðpÞ arethe curvature and the point density values of thecontour CðIÞ and the curve rAR at the point p;respectively. R is the set of all possible curves r

generated from the initialization CðIþ1Þinit inside the

extracted uncertainty region, and thus the mini-mization of Eq. (20) is actually a problem ofpicking out the correct curve. The parameters w1

and w2 represent the weights with which theenergy-based terms of Eq. (20) participate in theminimization procedure (corresponding to a and b

of Eq. (1)).The above equations impose the requirement of

the similarity between the obtained solution andthe contour estimated in the previous frame, interms of the smoothness and the elasticity of thesnake, which constrains the snake’s evolutiontowards the object boundaries (external energycomponent).

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238228

3.2. Force-based approach

The minimization of Eq. (20) is a time-consum-ing procedure, since the number of curves gener-ated from the snake initialization, inside theextracted uncertainty regions, is inhibitory forstraightforward implementation. In this subsectionwe propose a force-based approximation of theminimization procedure inside the extracted un-certainty regions.

3.2.1. Internal forces

If nðIþ1Þ ¼ ½%nðIþ1ÞðpÞjpAP and tðIþ1Þ ¼

½%tðIþ1ÞðpÞjpAP are the sets of the normal and thetangential vectors of the snake initialization C

ðIþ1Þinit

in the frame I þ 1; respectively, we define thefollowing forces:

FdðpÞ ¼ ½DVðCðIÞðpÞÞ �DVðCðIþ1Þinit

ðpÞÞ � tðIþ1ÞðpÞ; ð24Þ

FcðpÞ ¼ ½KðCðIÞðpÞÞ � KðCðIþ1Þinit

ðpÞÞ � nðIþ1ÞðpÞ: ð25Þ

The definition of these forces as being normal andtangential to the snake is motivated by the factthat Fd corresponds to the distances betweenneighboring points along the snake, and thus itrepresents the elasticity force, whereas Fc involvesthe curvature function, and thus represents thelocal deformation of the snake in its normaldirection. Comparing these two forces withEqs. (21) and (22), we actually propose twoforce-based shape similarity constraints, denotingthat the snake should not greatly deform its shapealong time, corresponding to the snake internalenergy components. In this sense, if two neighbor-ing points increase their distance during the snakedeformation, a tangential force Fd is producedtrying to bring them in their previous positions.Also, if the newly estimated position of a pointproduces higher or lower curvature value, i.e. thesnake becomes less smooth locally, then a restitu-tion force Fc; normal to the snake at this site, triesto bring the shape in its previous form.

3.2.2. External force

Let us define gm;pðkÞ; given by (26), be themodified image gradient function of all pixels k ¼xk þ j � yk that belong to the uncertainty region U;

and lie on the line segment that is defined by thenormal direction of the curve C

ðIþ1Þinit at point p;

gm;pðkÞ ¼ ½GmðkÞ j ðCðIþ1Þinit ðpÞ � kÞ �

%nðIþ1ÞðpÞ

¼ 1; kAU: ð26Þ

The use of this function, for every point of thesnake, is a way to determine the most salient edgepixel of the examined image I þ 1; inside theextracted uncertainty region, and thus indicate thepossible object boundary pixels, in the normaldirection of the snake. In this sense, the maximumof this function defines the direction of the snakeexternal force, as follows,

*kp ¼ arg maxk

½gm;pðkÞ; ð27Þ

sgnp ¼

þ if *kp inside the area defined

by CðIþ1Þinit ;

� otherwise;

8><>: ð28Þ

where sgnp denotes the sign/direction of theexternal force to be applied to C

ðIþ1Þinit ðpÞ: The

external snake force for each point CðIþ1Þinit ðpÞ is

then:

FeðpÞ ¼ sgnp � eextðpÞ �%nðIþ1ÞðpÞ: ð29Þ

This component is proportional to Gm and forcesthe snake to the salient edges inside the extracteduncertainty region. Thus when this force isapplied, each point of the snake marches towardsthe object boundary edges, with a step defined bythe value of the respective external energy termeextðpÞ: That is, when p is far from an edge, it iseextðpÞC1 and the snake marches with constant‘‘velocity’’; otherwise, when p approaches an edge,eextðpÞ reduces until it gets its minimum valueðeextðpÞC0Þ; and the snake stops. If we used theimage gradient jrGs�I j; then the function (26)would contain a great number of maxima and thusthe direction of the external force would probablylead to insufficient results. It must be also notedthat the maximum of the function gm;p for eachpoint p of the snake, corresponds to the minimumof the respective external energy term, accordingto Eq. (13).Although Eq. (27) provides the most salient

edge pixel in the normal direction of the snake at

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 229

each point p; there are cases that this pixel is notthe desired one, mainly due to four reasons:

* the ASF and the morphological operations,described in the previous Section, may eliminatenoise but also some of the edges on the objectboundaries, in case they are not strong enough,

* even the proposed modified image gradient maycontain background edges close to the movingobject, especially when these edges are strongand salient along the time (and thus cannot beeliminated with spatiotemporal filters),

* there are cases, not very usual in naturalsequences, that both the moving object andthe background are so smooth and with similarintensities, that the function gm;p in Eq. (26)does not have any maxima, and

* there are cases that the moving object getssuccessively occluded by an obstacle, that maybe a static or a different moving object, andthus there is loss of edge information for theoccluded object region; in that case it is likelythat the maximum of gm;p (if p is an occludedpoint) will lie on the boundary edge of theobstacle (which separates it from the movingobject).

3.2.3. Rule-based approach

In order to avoid the above-mentioned pro-blems, we use the information obtained from themotion field. In the following, we describe amethod that separates the moving object fromthe background, detects possible occlusions andhandles them, by estimating the occluded parts ofthe object contour.Without any loss of generality, let us suppose

that the background is highly textured and static,and the desired moving object gets successivelyhidden behind a static obstacle. Then, for eachpoint p of the snake, the maximum *kp ¼ ð *xp; *ypÞ ofgm;p does not probably lie on the desired edge. Letkl ¼ ðxl ; ylÞ and km ¼ ðxm; ymÞ be the surroundingpoints of *kp; on the line segment along which gm;p

is computed. Then, in order to accept *kp as thedesired maximum (object boundary pixel), thethree points *kp; kl and km must obey the followingtwo rules/constraints:

(a) *kp must divide that line segment in two parts:an immiscibly moving and an immiscibly staticone, that is

MFðI ;Iþ1ÞðklÞCd ðIþ1Þc ðpÞ and

MFðI ;Iþ1ÞðkmÞC0 or

MFðI ;Iþ1ÞðklÞC0 and

MFðI ;Iþ1ÞðkmÞCd ðIþ1Þc ðpÞ ð30Þ

and(b) *kp must be a moving point with velocity close

to dðIþ1Þc ðpÞ; that is

MFðI ;Iþ1Þð *kpÞCdðIþ1Þc ðpÞ: ð31Þ

Taking these constraints into consideration, weovercome cases that: (a) the maximum is found inbackground (Fig. 7(a)): it is not a moving one anddoes not separate two immiscible (according tomotion) parts of the function gm;p; (b) themaximum is found inside the moving object region(Fig. 7(b)): although it is a moving one, it does notdivide the function gm;p in such two parts, (c)occlusion occurs and the maximum is on theoccluding object boundary (Fig. 7(c)): the max-imum is not moving, although it makes the regiongm;p separation and (d) occlusion occurs and themaximum is in the occluding object region (Fig.7(d)): neither the maximum is moving, nor it doessuch a separation.If the above requirements are not met for the

global maximum *kp of the function gm;p; at point p

of the snake, we ignore it and continue searchingfor a local maximum that obeys these constraints.In occlusion cases, no such local maxima can befound and thus the external force is ignored,allowing the occluded parts of the snake to deformlocally according to the internal forces. Fig. 8illustrates detection of occlusion with the use ofthe above defined rules that the local maximum *kp;corresponding to a curve point p; must obey.Finally, it should be noted that in occlusion

cases, the more deformable a moving object is, orthe more an object gets occluded, the less accuratethe estimation of the occluded parts of its contouris. Moreover, when the desired object is alreadypartially occluded, for the hidden parts of it,instead of the respective observed motion ofEq. (15), we use the previously estimated instant

ARTICLE IN PRESS

Fig. 7. Four different cases in which the three points *kp; %kl and %km do not obey both of the two rules for the separation of the moving

object from the background or the occluding obstacle.

Fig. 8. Detection of occlusion using the two rules of Eqs. (30) and (31) for the local maximum *kp of the function gm;p for a point p of

the snake. The plots represent the function 1� gm;p and thus *kp are presented as local minima of the 1� gm;p:

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238230

motion mðIÞc (Eq. (16)) to initialize the snake in the

next frame. Then we compute the deviation asfollows:

sdðIþ1ÞðpÞ ¼ %mðIþ1Þc ðpÞ � mðIÞ

c ðpÞ: ð32Þ

The latter is due to the obvious failure of themotion estimation scheme for the occluded parts,

since no region information (intensity) is availablefor them.

3.2.4. Force-based approximation of the energy

minimization procedure

In the force-based approach, the snake initiali-zation C

ðIþIÞinit marches towards the object’s bound-

aries in the next frame I þ 1; due to the internal

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 231

and external forces applied to it. In this way,the minimization procedure of Eq. (20) isapproximated in an iterative manner similar tothe steepest descent approach [8], as it is summar-ized below.Let CðIþ1Þ;ðxÞ be the estimated contour in the x-

est iteration, in the next frame I þ 1; then thefollowing equations hold:

CðIþ1Þ;ð0Þ ¼ CðIþ1Þinit ; ð33Þ

CðIþ1Þ;ðxÞ ¼ CðIþ1Þ;ðx�1Þ þ DCðxÞ; ð34Þ

DCðxÞ ¼ ½CðIþ1Þ;ðx�1ÞðpÞ þ Fðx�1Þtot ðpÞ j pAP; ð35Þ

Fðx�1Þtot ðpÞ ¼w1 � ½Fðx�1Þ

c ðpÞ þ Fðx�1Þd ðpÞ

þ w2 � Fðx�1Þe ðpÞ; ð36Þ

where Fðx�1Þd ðpÞ; Fðx�1Þ

c ðpÞ and Fðx�1Þe ðpÞ are esti-

mated according to Eqs. (24), (25) and (29),respectively, on the basis of CðIþ1Þ;ðx�1Þ; instead ofC

ðIþ1Þinit : The exponent ðx� 1Þ in the forces’ repre-

sentations, denotes that they are calculated in thex� 1 iteration, according to the form of the curveCðIþ1Þ;ðx�1Þ (for the internal forces), and its positionon the image plane (for the external force).The final solution CðIþ1Þ in the next frame I þ 1

is obtained when one of the following criteria issatisfied:(a) the weighted summation of the correspond-

ing forces’ norms in the x-est iteration has lowervalue than the respected summation in the nextiteration, i.e.

F ðxÞt oa � F ðxþ1Þ

t ; ð37Þ

where

F ðxÞt ¼w1 �

XpAP

FðxÞc ðpÞ

þ

XpAP

FðxÞd ðpÞ

!

þ w2 �XpAP

FðxÞe ðpÞ

; ð38Þ

where a is a positive constant in the range 0oao1:When a is selected to be close to one, then CðIþ1Þ ismore likely to correspond to a local minimum

solution; lower values of a increase the number ofiterations and, therefore, the execution time. Theuse of the uncertainty regions allows values of aget close to 1.(b) the maximum number of iterations is

reached; in this case,

CðIþ1Þ ¼ CðIþ1Þ;ð*xÞ; *x ¼ arg minx

½F ðxÞt : ð39Þ

It must be noted that the use of the proposedsteepest descent approach does not ensure that thefinal contour corresponds to the solution ofEq. (20). However, under the constraints we pose,even if CðIþ1Þ corresponds to a local minimum, it isclose to the desired solution.

3.3. Weight estimation

As mentioned in Section 2, snake modelsgenerally involve a set of regularization para-meters (parameters a and b in Eq. (1)) taking partin the total energy function. In our approach, werepresent these parameters as w1 and w2; they arethe weights with which the internal and theexternal energy components of Eq. (20) contributein the minimization procedure. These weights arefirst calculated in the force-based implementation,in order to determine the importance of each forceterm, and thus determine the snake deformationaccording to (a) the snake elasticity and smooth-ness forces and (b) the force depending on theexamined frame features (edges).In order to make our method more straightfor-

ward, we propose an efficient and automaticestimation of these weights. Contrary to the energyminimization procedure of common snakes, wherethese weights are crucial for the accuracy of theobtained solutions, in our force-based method, w1

and w2 do not have to take specific values, butvalues that indicate the relativity between the‘‘internal’’ and ‘‘external’’ information we utilize.This is mainly due the fact that we haveconstrained the problem (snake deformation) ina narrow band, i.e. the uncertainty region, andthus the final solution cannot be far from theactual object boundaries. Also, as mentioned inthe force-based approach, although it is not easyto theoretically prove that the snake in its final

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238232

position will correspond to the solution of theenergy minimization, it is experimentally shownthat at least it is very close to (or probablycoincides with) it.For the weight w1 corresponding to internal

force terms, it suffices to count the snake zero-crossings, that is the points at which the curvaturefunction intersects the zero level. These pointscan determine how smooth the snake locally is,and thus denote how reliably we can apply theinternal forces. The number of zero-crossings Zc ofa snake CðIÞ ¼ ½CðIÞðpÞjpAP; in a frame I ; withcurvature function KCðIÞ (Eq. (3)) is calculatedaccording to

Zc ¼XpAP

1 if KCðIÞ ðpÞ � KCðIÞ ðp þ 1Þo0;

0 otherwise:

(ð40Þ

Regarding weight w2; corresponding to the ex-ternal force component, we simply compute thesummation mðext;UÞ of the external energy values atall the pixels q inside the extracted uncertaintyregion U:

mðext;UÞ ¼XqAU

eextðqÞ; ð41Þ

mext;U ¼1

P

XkAU

GmðkÞ: ð42Þ

In this sense, smooth uncertainty regions result tohigh values of mðext;UÞ; whereas uncertainty regionsthat are textured or contain strong edges, otherthan the object boundaries, result to significantlylower values of mðext;UÞ: Thus, if the extracteduncertainty region is textured, the external forcecomponent is assigned with lower weight, since itis considered as unreliable criterion for the snakedeformation.Practically, in order to set w1 and w2 values,

according to the above and taking into account thenumber of snake points and the uncertainty regionarea, after extracting Zc and mðext;UÞ; we computethe quantities,

#Zc ¼Zc

P; 0p #Zcp1; ð43Þ

where P is the number of all snake points, and

#mðext;UÞ ¼mðext;UÞ

Q; 0p #mðext;UÞp1; ð44Þ

where Q is the number of all pixels inside theuncertainty region ðqAUÞ; i.e. the uncertaintyregion area.Thus, given the values of #Zc and #mðext;UÞ; the

weights w1 and w2 are estimated by,

w1 ¼100

e5#Zc

; w2 ¼100

e5 #mðext;UÞ: ð45Þ

The factor 5, in the exponential parts of thedenominators, is experimentally found to give thedesired values of w1 and w2; as #Zc; #mðext;UÞA½0; 1:Let us now consider the examples of Fig. 9, wherethree objects of different shape complexity are inmotion in either textured or smooth backgrounds.Figs. 9a, d and g represent the original imagesalong with the moving object contours, Figs. 9b, eand h illustrate the proposed external energycomponents, whereas Figs. 9c, f and i are therespective curvature function plots. Indicatively, inthe second example, the background is verysmooth (Fig. 9(e)) and thus we estimate theexternal force with great reliability, but theaircraft’s contour is complicated, which can beseen in the curvature plot, and thus the internalforces are not reliable.

3.4. Time-window L

In Section 3.1 we described a method for thesnake uncertainty region formation, utilizingthe motion history of the tracked contour. In therespective Eq. (18), it is denoted that thesemeasurements involve the recent history of thetracked contour, i.e. the number L of the previousframes. We usually set L ¼ 5y15; according tothe object motion, that is, the ‘‘smoother’’ theobject motion is (almost constant velocity), themore informative a wide time-window is, and thuswe use higher value for L; otherwise, we shoulduse the very recent motion history, i.e. a low valuefor L:The question is what happens in the first frames

of the examined sequence. In the first frame, weconsider that the contour position has been

ARTICLE IN PRESS

Fig. 9. Curvature and external energy terms. (a,d,g) different cases of contours and background textures, (b,e,h) respective external

energies visualization and (c,f,i) respective curvature distributions.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 233

manually set, whereas for the second frame weproduce manually wide uncertainty regions. Thenin the next frames, the object contours may beestimated not as accurately as we desire, but astime passes the motion history is getting ‘‘en-riched’’ (updated) and thus the solutions getbetter.

4. Experimental results

In the experimental results we present in thisSection, we focus on natural sequences where veryinteresting cases can be encountered, due to theexistence of noise, object deformations, compli-cated motions, and unpredictable external condi-tions (such as lighting changes, temporal clutter

etc.). Moreover, cases with highly textured back-grounds and moving object partial occlusions arealso handled. The performance of the proposedapproach was tested over a large number of suchsequences, and in this section we choose toillustrate the most representative examples.Fig. 10 illustrates the case of tracking a car,

moving towards the shooting camera. In thissequence, although the moving object is rigid, itssilhouette is getting deformed slowly along time,due to its motion projection on the image plane.The background regions close to the object do notcontain strong edges; however, some edges on theobject boundaries are weak (low values of imagegradient). As can be seen in (Fig. 10(a)), we assumethat in the first frame the object contour is notestimated with the desired accuracy, but as long as

ARTICLE IN PRESS

Fig. 10. A moving car tracking in six (a–c) successive frames of a traffic sequence. The background is relatively smooth close to car’s

boundary, while the car’s contour is not very complicated.

Fig. 11. A sperm whale tale tracking in six (a–c) successive frames of a traffic sequence: an example of a very cluttered sequence with

highly textured background.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238234

motion history gets updated, the results are gettingbetter.Fig. 11 illustrates the tracking of the tail of a

sperm whale. Object tracking in this case isconsidered to be difficult, due to the highlytextured background and the strong clutter closeto the desired object boundaries. The accuracy ofthe obtained results is enhanced due to the framepre-filtering and the proposed modified image

gradient. The results show that a very attractiveapplication of the proposed method would be thesperm whale identification, based on its tail’scontour local formations, which is a task in marinebiology science.In Figs. 12 and 13 the case of a small moving

object that gets partially occluded is presented.Fig. 12 illustrates some indicative results of themotion estimation scheme [2] we utilize, and its

ARTICLE IN PRESS

Fig. 12. Indicative motion estimation results for the tracking example of Fig. 13.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 235

ability to preserve smooth estimates on theboundary between two different objects. Weexploit this advantage to track the finger nail inFig. 13, efficiently separating the moving frontfrom the background. Due to the slow motion ofthe finger, accurate nail contours are obtained,even in the frames where it is occluded by theobstacle (toy).In the final example of Fig. 14, the case of a

moving object that gets successively occluded isillustrated. The main problem in this case is theexistence of weak edges on the object boundaries,due to the shadowing effects and the intensitysimilarity between the moving object region andthe background, which leads to inaccurate motionestimates. On the other hand, the desired object isrigid, its motion is mainly translational (parallel to

the camera’s plane) and the obstacle’s ‘‘crucial’’edge is strong enough to indicate that occlusionoccurs.

5. Conclusions and further work

In this paper we have presented a modifiedsnake model assisted by rules, aiming at trackingobjects in natural sequences, where the amount ofnoise is unknown, backgrounds may be highlytextured, and partial object occlusions may occur.The internal energy of the proposed snake model isgiven in terms of two well known geometriccharacteristics, i.e. the point density distribution(elasticity) and the curvature (smoothness) func-tions, whereas the proposed external energy is

ARTICLE IN PRESS

Fig. 13. Tracking a small rigid object in partial occlusion.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238236

given in terms of a modified morphological imagegradient. The object’s motion history, along with arobust motion estimation scheme, provide thesnake initializations for the next frames of theexamined sequence, as well as uncertainty regionsaround these initializations, indicating the possi-ble/allowable snake deformations. In this way, weconstrain the contour estimation problem in asmall frame region, and we follow a force-basedapproach to approximate the snake’s energy

minimization procedure, avoiding the point corre-spondence problem between two successiveframes. Also, in order to handle object occlusions,as well as backgrounds with strong edges close toobject boundaries, we propose a rule-basedimplementation of the proposed tracking model,utilizing again the motion estimation scheme [2].The indicative experimental results we presentillustrate the proposed method’s success, when theafore mentioned problems occur.

ARTICLE IN PRESS

Fig. 14. Object tracking with insufficient motion field and partial occlusion.

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238 237

Our future work focuses on tracking movingobjects utilizing higher-order statistics, in order todeal with articulated motion of objects. Also, weare currently investigating the estimation of force(energy) weights, in order to establish a reliablerelation between them.

References

[1] Y. Avrithis, Y. Xirouhakis, S. Kollias, Affine-invariant

curve normalization for object shape representation,

classification and retrieval, Mach. Vis. Appl. 13 (2)

(2001) 80–94.

[2] M.J. Black, P. Anandan, The robust estimation of multiple

motions: parametric and piecewise-smooth flow fields,

CVIU 63 (1) (1996) 75–104.

[3] V. Caselles, R. Kimmel, G. Sapiro, C. Sbert, Minimal

surfaces: a geometric three dimensional segmentation

approach, Numer. Math. 77 (4) (1997) 423–425.

[4] L.D. Cohen, On active contour models and ballons,

CVGIP: Image Underst. 53 (2) (1991) 211–218.

[5] M. Daoudi, F. Ghorbel, A. Mokadem, O. Avaro, H.

Sanson, Shape distances for contour tracking and motion

estimation, Pattern Recogn. 32 (1998) 1297–1306.

[6] P. Delagnes, J. Benois, D. Barba, Active contours

approach to object tracking in image sequences with

complex background, Pattern Recogn. 16 (1995)

171–178.

[7] D. Gravila, L. Davis, 3-D Model-based tracking of

humans in action: a multi-view approach, in: Proceed-

ings of the IEEE Conference on Computer Vision and

Pattern Recognition (CVPR’96), San Francisco, CA, 1996,

pp. 73–80.

[8] S. Haykin, Neural networks, Macmillan College Publish-

ing Company, New York, 1994, pp. 124–126 (Section 5.3).

ARTICLE IN PRESS

G. Tsechpenakis et al. / Signal Processing: Image Communication 19 (2004) 219–238238

[9] M. Hoch, P. Litwinowicz, A practical solution for tracking

edges in image sequences with snakes, Visual Comput. 12

(2) (1996) 75–83.

[10] H.S. Ip, S. Dinggang, An affine-invariant active contour

model (AI-Snake) for model-based segmentation, Image

Vision Comput. 16 (2) (1998) 135–146.

[11] M. Isard, A. Blake, Contour tracking by stochastic

propagation of conditional density, in: Proceedings of

the European Conference on Computer Vision (ECCV’96),

1, Cambridge, UK, 1996, pp. 343–356.

[12] A.K. Jain, Y. Zhong, M.-P. Dubuisson-Jolly, Deform-

able template models: a review, Signal Process. 71 (1998)

109–129.

[13] M. Kass, A. Witkin, D. Terzopoulos, Snakes: active

contour models, Internat. J. Comput. Vision 1 (4) (1988)

321–331.

[14] A. Mansouri, T. Chomaud, J. Konrad, A comparative

evaluation of algorithms for fast computation of level set

PDEs with applications to motion segmentation, in:

Proceedings of the International Conference on Image

Processing (ICIP’01), Vol. 3, Thessaloniki, Greece, 2001,

pp. 636–639.

[15] P. Maragos, Noise suppression, in: V.K. Madisetti, D.B.

Williams (Eds.), The Digital Signal Processing Hand-

book, CRC Press, Boca Raton, FL, 1998, pp. 20–21

(Chapter 74).

[16] P. Maragos, Image segmentation, in: V.K. Madisetti, D.B.

Williams (Eds.), The Digital Signal Processing Hand-

book, CRC Press, Boca Raton, FL, 1998, pp. 25–26

(Chapter 74).

[17] P. Maragos, Morphological signal and image process-

ing, in: D. Williams (Eds.), The Digital Signal Process-

ing Handbook, CFC Press, Boca Raton, FL, 1998,

pp. 74.1–74.30.

[18] F. Meyer, P. Bouthemy, Region-based tracking using

affine motion models in long image sequences, CVGIP:

Image Underst. 60 (2) (1994) 119–140.

[19] F. Mohanna, F. Mokhtarian, Improved curvature estimation

for accurate localisation of active contours, in: Proceedings of

the International Conference on Image Processing (ICIP’01),

Vol. 2, Thessaloniki, Greece, 2001, pp. 781–784.

[20] S. Osher, J. Sethian, Fronts Propagating with curvature-

dependent speed: algorithms based on the Hamilton–

Jacobi formulation, J. Comput. Phys. 79 (1988) 12–49.

[21] N. Paragios, R. Deriche, A PDE—based level set approach

for detection and tracking of moving objects, in: Proceed-

ings of the Sixth International Conference on Computer

Vision (ICCV’98), Bombay, India, 1998, pp. 1139–1145.

[22] N. Paragios, R. Deriche, Geodesic active contours and

level sets for the detection and tracking of moving objects,

IEEE Trans. Pattern Anal. Mach. Intell. 22 (3) (2000)

266–280.

[23] N. Paragios, M. Rousson, Shape priors for level set

representations, European Conference in Computer Vi-

sion, Vol. 22(3), Copenhagen, Denmark, 2002.

[24] J.S. Park, J.H. Han, Contour motion estimation from

image sequences using curvature information, Pattern

Recogn. 31 (1) (1998) 31–39.

[25] J.S. Park, J.H. Han, Contour matching: a curvature-based

approach, Image Vision Comput. 16 (1998) 181–189.

[26] N. Peterfreund, Robust tracking of position and velocity

with Kalman filters, IEEE Trans. Pattern Anal. Mach.

Intell. 21 (6) (1999) 564–569.

[27] C. Vieren, F. Cabestaing, J.G. Postaire, Catching moving

objects with snakes for motion tracking, Pattern Recogn.

16 (1995) 679–685.

[28] C. Xu, J.L. Prince, Snakes, shapes, and gradient vector

flow, IEEE Trans. Image Processing 7 (3) (1998) 359–369.