Erratum to "Shape matching of partially occluded curves invariant under projective transformation



Shape matching of partially occluded curves invariant under projective transformation

Carlos Orrite and J. Elias Herrero*

Department of Electronics and Communication Engineering, University of Zaragoza, Centro Politecnico Superior, Zaragoza 50015, Spain

Received 5 June 2002; accepted 3 September 2003

Abstract

This paper describes a method to identify partially occluded shapes which are randomly oriented in 3D space. The goal is to match the object contour present in an image with an object in a database. The approach followed is the alignment method, which has been described in detail in the literature. Using this approach, the recognition process is divided into two stages: first, the transformation between the viewed object and the model object is determined, and second, the model that best matches the viewed object is found. In the first stage, invariant points under projective transformation (based on bitangency) are used, which drastically reduces the selection space for alignment. Next, the curves are compared after the transformation matrix is estimated between the image and the model in order to determine the pose of the curve that undergoes the perspective projection. The evaluation process is performed using a novel estimation of the Hausdorff distance (HD), called the continuity HD. It evaluates partially occluded curves in the image in relation to the complete contour in the database. The experimental results showed that the present algorithm can cope with noisy figures, projective transformations, and complex occlusions.
© 2003 Elsevier Inc. All rights reserved.

Keywords: Computer vision; Shape matching; Object recognition; Hausdorff distance; Alignment approach; Bitangency; Affine transformation

*Corresponding author. Fax: 34-976-76-21-11. E-mail address: [email protected] (J. Elias Herrero).

1077-3142/$ - see front matter © 2003 Elsevier Inc. All rights reserved. doi:10.1016/j.cviu.2003.09.005

Computer Vision and Image Understanding 93 (2004) 34–64

www.elsevier.com/locate/cviu

1. Introduction

Object recognition is one of the most important aspects of visual perception. The goal is to recognize objects in an image (i.e., match an object in an image with an object in a database). Shape-based object recognition involves different levels of complexity depending on the restrictions of the scene configuration:
• flat and rigid objects that can move in a plane,
• flat objects that can move and rotate in three-dimensional space,
• three-dimensional objects in rigid transformations with visible contours that are sharp (edges of a cube) or smooth (projected silhouette of a cylinder or a sphere),
• articulated objects with movable parts (e.g., scissors or the human body),
• real objects that bend, stretch and perform other complicated transformations.

In this paper, we describe a method to identify planar objects which are oriented in 3D space and partially occluded. Special attention is given to cases where the similarity between two curves is weak due to a severe spatial transformation. This may be the case under perspective, or under occlusion between several objects, where the contour includes different pieces of the objects in the image.

A large number of different methods have been proposed to solve this problem. Work on contour matching can be divided into two main groups: proximity matching and object decomposition [1,2].

1.1. Proximity matching methods

Proximity matching methods search for the best match while allowing rotation, translation, and scaling of each curve and minimizing the distance between matched key points [3–6].

Ayache and Faugeras [4] present a method, known as HYPER, to identify and locate objects lying on a flat surface. It analyses real scenes with randomly oriented and partially occluded flat industrial parts. The model position is defined by a transformation T that takes into account a rotation in the plane, a scaling, and a translation. Hypothesis generation is based on local compatibility, defined as the angle difference and length of segments after estimating the transformation T. Each hypothesis is evaluated by updating the model position with a recursive least-squares technique (Kalman filter) to update the estimated transformation T.

The main difference from our proposal is the paradigm used to generate and evaluate the hypotheses. Ayache and Faugeras use a linear approach due to the restrictions of the scene configuration (i.e., rotation in the plane of the image). This is not possible in our case, since the problem is non-linear (i.e., rotation of the object in 3D space). We use invariant points for alignment and search for the optimum solution in a reduced set of possibilities.

Zisserman et al. [5] recognize general curved objects using four points based on properties that are preserved under projection (e.g., incidence properties like tangency and points of tangency). Their approach is not useful in the case of occlusion because the points obtained from the input image are not equivalent to those obtained by the same method in the model view.


Huttenlocher and Ullman [6] use an application based on the alignment approach to recognize flat objects, such as rigid machine parts, that can translate and rotate in space and change scale. The recognition system identifies a small number of salient and stable points, such as strong maxima in curvature, deep concavities, and the centres of closed or almost closed blobs. This method is based on three points to determine the transformation, specified by six parameters: three for the rotation, two for the translation (under orthographic projection) and one for scaling. In this paper, we consider four points because the transformations can be projective and not only orthographic. On the other hand, it is difficult to identify stable points under projective transformations. For example, the curvature maxima used in [6] are in general not stable: a circle can be viewed as an ellipse with two maxima of curvature or, vice versa, the ellipse can be transformed into a circle, losing these extreme points. However, the most important difference between the present approach and [6] is the possibility of occlusion by other objects.

Fuzzy algorithms can also be used to align shapes [7]. Fuzzy alignment algorithms provide good results for closed and open curve boundaries, as well as broken boundaries or other feature point sets of shapes, even under affine transformation. However, none of these studies handles occlusion from the point of view of a single closed contour formed when one figure is partially occluded by another.

1.2. Contour decomposition into parts

Features can also be used to divide curves into shape elements or primitives [2,8–11]. Recognition by object decomposition into parts is related to the theory of Recognition by Components (RBC), developed by Biederman [12]. RBC originated in psychology as a theory of human image understanding. In perceptual recognition, the input image is segmented at regions of deep concavity into an arrangement of simple geometric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption is that a modest set of generalized-cone components (called geons) can be derived by contrasting five readily detectable properties of edges in a two-dimensional image (curvature, collinearity, symmetry, parallelism, and co-termination). The theory is quite useful since it assumes that the detection of these properties does not vary with viewing position. As a result, object perception is robust when the image is projected from other points of view.

Syntactical matching methods can be used for shape-based object recognition where curves represent the contour of the object. A curve can be an ordered list of shape elements with attributes such as length, orientation, bending angle, etc. String comparison algorithms can then transform one string into another. These include algorithms that seek isomorphism between attributed relational graphs and others that search for the largest set of mutually compatible matches.

To divide the curves into primitives we need to find significant points for segmentation. In human perception, the major features of a shape are mostly concentrated at critical points with high curvature [13]. Many applications and algorithms in computer vision partition the curve at points with depth discontinuities by detecting curvature [14–19]. Other methods neglect curvature [5,20], but most try to locate either points with high curvature (such as corners) or points with a high curvature derivative (smooth joints [16]). These approaches are very efficient for shapes with corners or when the transformation is not severe (only translation, rotation in 2D, and scaling). However, curvature maxima are not invariant points under projective transformation, although in practice they usually are. In this paper, we base our approach on properties that are preserved under projection (such as tangency), in order to obtain invariant points to segment the curve into primitives to be matched.

We attempt to take advantage of both approaches explained above. The alignment approach is used to find the transformation matrix and to match the contours which have undergone a projective transformation. The invariant points (bitangents) are located and the original contour is split into several curved pieces that could correspond to different objects that partially occlude each other.

In theory, the invariant approach solves the problem of shape matching. But in practice, the extracted features in the curves are subject to measurement errors and the calculated invariant points do not exactly match the stored invariant points in the database. One solution is to search the surroundings of each invariant point, but this significantly increases the computation cost of recognition. Instead, we use the Hausdorff distance, which performs well even in the presence of spurious features and noise.

The Hausdorff distance (HD) is used to search for an object that has been translated, rotated and scaled [21] or that has undergone an affine transformation [22]. More recently, a Hausdorff-oriented similarity measure has been proposed for robust object alignment [23]. We introduce a novel HD estimation for shape matching. It is more reliable when several objects occlude each other, producing a curve with different pieces of the contours of the objects present in the image.

The rest of this paper is organized as follows. Section 2 includes an overview of the method. Section 3 describes the alignment approach in more detail. Section 4 presents the feature extraction used to obtain invariant points for alignment. In Section 5 we describe the HD and its modified versions. Section 6 includes the results and a discussion of shape matching for isolated curves and in the presence of occlusion. Finally, the conclusions are presented in Section 7.

2. Overview of the method

This paper is focused on object recognition problems in computer vision under partial occlusion. To this end, we use the alignment approach, an efficient method exhaustively described in the literature [3–6]. In this approach, the recognition process includes two stages. In the first, we determine the transformation in space between the viewed object and the model object. In the second, we find the model that best matches the viewed object. The main problem of object recognition thus becomes one of correspondence: given four points in the image, the four corresponding points in the model have to be found. This combinatorial search problem has been solved in the past by means of a Genetic Algorithm (GA) [24], and, in general, GAs have been used for object recognition [9,25], despite their inherent computational cost.


In order to reduce the number of combinations between four points in the image and four points in the model, we use points that are invariant under projective transformation, such as bitangent points, which drastically reduces the selection space for alignment. A bitangent to the curve yields two points of tangency which are preserved under projection (i.e., constructing the points for a projected curve yields the projection of the original points). The main bitangents are selected, which reduces the time for hypothesis generation (see Section 4). Both curves are compared after estimating the transformation matrix between the image and the model. The comparison is evaluated using the HD, a measure defined between two sets of points (the model in the database and the image). The HD is reliable even when the image contains errors in point position, outliers, or missing points due to occlusion or failure in the contour detection process.

In this paper, the HD is used to recognize the projective transformation of a model in an image (i.e., to determine the pose of a planar object that undergoes a perspective projection). However, the traditional HD has some limitations, especially when there are several objects with different levels of occlusion. For this reason we introduce a novel estimation called the continuity HD.

The steps in the recognition process are shown in Fig. 1. The following sections include a detailed description of the steps in the recognition approach.

3. Projective transformations

In this section we analyze some aspects of projective geometry for machine vision. As we discussed above, the central problem in object recognition is finding the best transformation that maps an object model into the image data. If V denotes the object view to be recognized, M is an object model in the database, and T is the set of transformations that can be applied to the object model M, then the recognition process searches for the best transformation T that, applied to the model M, maximizes a given function F of fit quality between the model and the object.

A projective transformation between two planes is represented as a 3 × 3 matrix acting on homogeneous coordinates of the plane [5]. This transformation models the composed effects of a 3D rigid rotation and translation of the world plane (camera extrinsic parameters), a perspective projection onto the image plane, and an affine transformation of the final image (which covers the effects of changing camera intrinsic parameters). The general projective transformation T from one projective plane, P, to another, p, is represented as

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}. \tag{1}
\]

The coordinates before the transformation are represented by upper-case letters and after the transformation by lower-case letters.


Fig. 1. Recognition algorithm.


Cartesian coordinates are obtained from the previous expression according to the equations

\[
x = \frac{x_1}{x_3} = \frac{t_{11}X + t_{12}Y + t_{13}}{t_{31}X + t_{32}Y + t_{33}},
\qquad
y = \frac{x_2}{x_3} = \frac{t_{21}X + t_{22}Y + t_{23}}{t_{31}X + t_{32}Y + t_{33}}. \tag{2}
\]
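As an illustration of Eqs. (1) and (2), the following minimal Python/NumPy sketch applies a 3 × 3 projective transformation to Cartesian points. It is our own example, not part of the paper; the function name apply_homography and the normalization T[2, 2] = 1 are assumptions.

```python
import numpy as np

def apply_homography(T, points):
    """Map 2D points through a 3x3 projective transformation (Eqs. (1)-(2)).

    T      : 3x3 matrix, normalized so that T[2, 2] == 1.
    points : (n, 2) array of Cartesian coordinates (X, Y).
    Returns the transformed Cartesian coordinates (x, y) as an (n, 2) array.
    """
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # homogeneous (X, Y, 1)
    mapped = pts_h @ T.T                                    # (x1, x2, x3) = T (X1, X2, X3)
    return mapped[:, :2] / mapped[:, 2:3]                   # x = x1/x3, y = x2/x3
```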

In the previous equations, the parameter t33 is an arbitrary common scale factor that does not affect the Cartesian coordinates x and y, so we can choose t33 = 1. Consequently, the projective transformation matrix T requires eight independent parameters to define a unique mapping. Since each point in the plane provides two Cartesian coordinate equations, four point correspondences between two projectively transformed planes are required to define the transformation matrix uniquely, provided that no three of them are collinear. The resulting linear system of equations is

\[
\begin{bmatrix}
X_1 & Y_1 & 1 & 0 & 0 & 0 & -x_1 X_1 & -x_1 Y_1 \\
0 & 0 & 0 & X_1 & Y_1 & 1 & -y_1 X_1 & -y_1 Y_1 \\
X_2 & Y_2 & 1 & 0 & 0 & 0 & -x_2 X_2 & -x_2 Y_2 \\
0 & 0 & 0 & X_2 & Y_2 & 1 & -y_2 X_2 & -y_2 Y_2 \\
X_3 & Y_3 & 1 & 0 & 0 & 0 & -x_3 X_3 & -x_3 Y_3 \\
0 & 0 & 0 & X_3 & Y_3 & 1 & -y_3 X_3 & -y_3 Y_3 \\
X_4 & Y_4 & 1 & 0 & 0 & 0 & -x_4 X_4 & -x_4 Y_4 \\
0 & 0 & 0 & X_4 & Y_4 & 1 & -y_4 X_4 & -y_4 Y_4
\end{bmatrix}
\begin{bmatrix} t_{11} \\ t_{12} \\ t_{13} \\ t_{21} \\ t_{22} \\ t_{23} \\ t_{31} \\ t_{32} \end{bmatrix}
=
\begin{bmatrix} x_1 \\ y_1 \\ x_2 \\ y_2 \\ x_3 \\ y_3 \\ x_4 \\ y_4 \end{bmatrix}. \tag{3}
\]

Therefore, the main problem that object recognition must address becomes a problem of correspondence, where four corresponding points between model and image have to be found.
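Once a correspondence hypothesis is fixed, Eq. (3) determines T directly. A minimal sketch, assuming NumPy and the hypothetical helper name homography_from_4_points, builds and solves that 8 × 8 system:

```python
import numpy as np

def homography_from_4_points(model_pts, image_pts):
    """Estimate T from four point correspondences by solving Eq. (3).

    model_pts, image_pts : (4, 2) arrays of (X, Y) and (x, y) coordinates;
    no three of the four points may be collinear.
    """
    A, b = [], []
    for (X, Y), (x, y) in zip(model_pts, image_pts):
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y])
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y])
        b.extend([x, y])
    t = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(t, 1.0).reshape(3, 3)  # t33 fixed to 1
```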

If we consider that visual recognition is a problem that involves searching in a large space, a method is needed that resolves the combinatorial search to obtain a solution in a reasonable time. In this sense, we do not limit the performance of the algorithm to detecting invariant points, but use some of their properties to reduce the number of hypotheses to verify, as described in the next section.

4. Feature extraction

The aim here is to exploit certain properties preserved by projection to identify invariant points used for alignment, as well as to establish some classification in order to reduce the complexity of matching. The present algorithm is based on the alignment method, which identifies four point correspondences (no three collinear) between the image to recognize and the model image from the database. So, we extract so-called interest points, both in the object model image and in the scene image, to find the best match between these point sets.

It is important to observe that, since our matching algorithm is designed to deal with missing and spurious points, we assume that enough interest points can be extracted in the relevant image.


4.1. Points of tangency

Invariant points are obtained using properties that are preserved under projection, such as tangency. A bitangent to the curve is a line with two points of tangency that are also preserved under projection [5], since constructing the points from a projected curve yields the projection of the original points. An example of bitangents computed for an initial image and for a projective transformation of it is shown in Fig. 2. The bitangent points are the same in both cases.

Image bitangents are located using the following four-stage algorithm:
1. Order the curve in an anti-clockwise direction and perform a size normalization in order to use a common bitangency threshold for all images.
2. Detect points that lie on approximately straight portions of the curve. These cannot correspond to actual points of bitangency.
3. Find bitangents of the curve by computing its convex hull.
4. Split the curve at the previous bitangent points and recursively obtain the internal bitangents of each new curve.

Fig. 2. Example of computed bitangents, demonstrating the invariant points under projection.

To detect bitangent points we have to define a bitangency threshold that decides whether a point is close to a bitangent line. Sometimes a bitangent line may touch a curve at several close points, for instance, when they form a straight line. So, straight portions of the curve are found by fitting straight lines to short segments of the curve using a split-and-merge algorithm. The threshold used in this algorithm is the same as the one used for bitangency; it is defined as √2 × the discretization step, once the normalization process has taken place. When a bitangent touches a straight segment, the bitangent point is the first point of tangency nearest to the other bitangent point.

The next step is to compute bitangent points following an approach similar to the one provided by Sonka et al. [27] to detect the convex hull region. This algorithm is presented here as an intuitive way of detecting bitangents; a convex-hull code sketch follows the list.
1. First, we split the original curve into four segments corresponding to the tangent points on the bounding box of the original curve (Fig. 3A).
2. The first segment (P1P2) is obtained and we assign Pk = P1. The vector v represents the direction of the previous line segment of the bitangent. Initially, we consider the line formed by P1P2.
3. Search the curve in the anti-clockwise direction and compute the angle orientation φ for every boundary point Pn which lies after the point P1 (see Fig. 3B). The angle orientation φn is the angle formed by the vector PkPn. The point Pq satisfying the condition φq = min_n(φn) is a bitangent point.
4. If Pq is a neighbouring point of Pk, then Pk is not a bitangent point, so Pk = Pq and repeat step (3).
5. Assign v = PkPq and Pk = Pq.
6. Repeat steps (2)–(5) until Pq = P2 (Fig. 3C).
7. Repeat for the other three segments (P2P3, P3P4, P4P1).

Once the bitangent points are located, the contour is split recursively at these points. New bitangents are located for every new open curve generated in this way. Fig. 4A represents the original curve. In Fig. 4B, bitangents are represented by a straight line and the bitangent points are marked by a circle. To locate higher-order bitangents, we considered a new closed figure formed by the piece of the contour between two bitangent points and the associated bitangent straight line. The new bitangent points are shown in Fig. 4C for the largest bitangent found in the previous case, and inner bitangent points are represented in Fig. 4D, following the procedure recursively.
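The recursive splitting just described could be driven by a loop such as the following. This is a speculative sketch of ours, not the paper's implementation: the breadth-first traversal, the max_depth cap, and the treatment of each piece as implicitly closed by its bitangent chord are all our assumptions.

```python
import numpy as np

def recursive_bitangents(contour, find_bitangents, max_depth=3):
    """Collect bitangent points level by level, as in Fig. 4.

    find_bitangents : callable returning index pairs for a closed curve,
    e.g. hull_bitangents from the previous sketch. Each bitangent delimits
    a contour piece; that piece, implicitly closed by the bitangent chord,
    is searched in turn for inner bitangents.
    """
    levels, queue = [], [(contour, 0)]
    while queue:
        curve, depth = queue.pop()
        if depth >= max_depth or len(curve) < 4:
            continue
        for a, b in find_bitangents(curve):
            levels.append((depth, curve[a], curve[b]))
            piece = curve[a:b + 1] if a < b else np.vstack([curve[a:], curve[:b + 1]])
            queue.append((piece, depth + 1))
    return levels
```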

Obviously, in many cases it is impossible to guarantee several orders of interesting bitangents. Many objects are formed by straight lines (such as polygonal figures), so we need to extend the concept of bitangent to points at the ends of a straight line. In these figures the corners are located (projective invariant points) and managed in the same way as the bitangent points.

Fig. 3. Computing the bitangent line on a curve.

4.2. Concavity analysis

The main goal of this approach is to obtain invariant points (bitangents or corners) to compare the image with a model. However, when establishing the correspondences between both figures, some hypotheses can be eliminated by taking concavity (convexity) into account. For example, three levels of bitangents are computed, each represented by a different colour circle (Fig. 5). The bitangents in the first level are placed in a convex part. Each bitangent delimits a contour part including an inflection point, which is surrounded by the next inner bitangency level. The property of convexity or concavity alternates with each depth level.

This last property can be used to decrease the number of generated hypotheses. Only the correspondences formed by a couple of points with the same characteristic (convex or concave) will be valid, as seen in Fig. 6. In this figure, four bitangent points are shown with their possible correspondences. Each one of these points can be convex or concave, and only one of the 16 possible direct correspondences provides a valid coupling.

Fig. 4. Invariant points obtained by bitangency at different levels.

4.3. Bitangent location in practice

Fig. 5. Bitangents at different levels.

Fig. 6. Removing hypotheses by concavity analysis of bitangent points.

One problem with higher-order bitangents is the need for accurate measurements at fine scales, because they are more susceptible to noise. They tend to be more delicate and it is harder to assess an accurate transformation. On the other hand, internal bitangents may disappear after a severe projective transformation. So, special attention should be paid to first-order bitangents: they should be considered first since they are more likely to survive. Therefore, bitangents are chosen according to their depth, which is defined as the maximum distance between the contour of the figure delimited by the two tangency points and the bitangent (depths h1 and h2 for two bitangents are shown in Fig. 2). Bitangents are ordered by depth, where the main ones are the deeper ones, since they may persist even when the figure is greatly distorted.
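Depth as defined here is straightforward to compute; a small sketch of ours (bitangent_depth is an assumed name) measures the largest perpendicular distance from the delimited contour piece to the bitangent line:

```python
import numpy as np

def bitangent_depth(contour, i, j):
    """Depth of the bitangent joining contour points i and j: the largest
    perpendicular distance from the delimited contour piece to the line."""
    p, q = contour[i], contour[j]
    piece = contour[i:j + 1] if i < j else np.vstack([contour[i:], contour[:j + 1]])
    d = q - p
    # |z-component of the 2D cross product| / |d| is the point-to-line distance
    return np.max(np.abs(np.cross(d, piece - p))) / np.hypot(*d)
```

Sorting the candidate bitangents by decreasing bitangent_depth then puts the main (deepest, most persistent) ones first.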

In practice, mapping the model contour onto the image plane using the transformation matrix may cause deviations due to inappropriate identification of bitangent points. To minimize these deviations, one solution is to search the surroundings of each invariant point and compute a new transformation. Some authors [28,29] extend the Iterative Closest Point (ICP) algorithm to optimize the registration. The ICP operates by iterating over the set of points and minimizing the distance between the two sets, as briefly outlined in the following (a code sketch closes this subsection):
1. First, we determine an initial projective transformation T^(0) based on the correspondence of four bitangent points.
2. For all points (Xi, Yi) of the model we obtain the corresponding transformed points (xi, yi) following Eq. (2).
3. The nearest point (x'i, y'i) on the image object contour is searched.
4. Once we know the point correspondence between model points and image points, the projective transformation parameters T^(k) can be directly estimated by the LS method.
5. If the distance between T^(k) and the previous estimate T^(k-1) is less than a given value, the iteration is finished; otherwise we continue with step 2.

The problem with these techniques is the significant increase in computational cost for recognition. Therefore, in practice we have selected the most important bitangent points to find the best projective transformation and left the process of curve recognition to the HD, a metric which performs well even in the presence of noise and spurious features, as described in the next section.
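A sketch of this ICP loop, reusing apply_homography and the Eq. (3) system from the earlier examples, might look as follows. The function names, the k-d tree for the nearest-point search, and the convergence tolerance are our assumptions, not the paper's.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_refine(model_pts, image_pts, T0, iters=20, tol=1e-4):
    """Refine a projective transformation by the ICP loop of steps 1-5.

    model_pts : (n, 2) model contour; image_pts : (m, 2) image contour;
    T0 : initial 3x3 transformation from four bitangent correspondences.
    """
    tree = cKDTree(image_pts)
    T = T0.copy()
    for _ in range(iters):
        mapped = apply_homography(T, model_pts)   # step 2: map the model (Eq. (2))
        _, idx = tree.query(mapped)               # step 3: nearest image points
        A, b = [], []                             # step 4: LS estimate, as in Eq. (3)
        for (X, Y), (x, y) in zip(model_pts, image_pts[idx]):
            A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y])
            A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y])
            b.extend([x, y])
        t, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
        T_new = np.append(t, 1.0).reshape(3, 3)
        if np.max(np.abs(T_new - T)) < tol:       # step 5: successive estimates agree
            return T_new
        T = T_new
    return T
```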

5. Hausdorff measure

In many machine vision and pattern recognition applications it is important to identify instances of a model that are only partly visible, either due to occlusion or failure of the contour detection process. In this section we extend the definition of the HD to take into account objects that are partially hidden from view and, therefore, the contour to be recognized comes from contour pieces of the occluding shapes.

5.1. Previous works

In this section we briefly describe the HD as used in previous works [21,22], as well as the modified versions we use in this work. Based on our own experimentation, we noted some shortcomings with the traditional methods, which are described throughout the paper, but basically involve behaviour under noise and occlusion.


5.1.1. Hausdorff distance
The HD in the plane between two finite point sets I (representing an image) and M (representing a model to locate in the image) is defined in [21] as

\[
H(M, I) = \max(h(M, I), h(I, M)), \tag{4}
\]

where h(M, I) is the directed distance

\[
h(M, I) = \max_{m \in M} \min_{i \in I} \lVert m - i \rVert. \tag{5}
\]

In the previous equation, ‖·‖ is some norm in the plane (here the L2 norm). h(M, I) can be computed by taking each point of M, computing the minimum distance to the points of I, and taking the largest such distance; h(I, M) is computed similarly, and H(M, I) is the larger of these two distances. Following [21], h(M, I) is the forward distance from M to I, and h(I, M) is the reverse distance. If M and I are quite similar, except that a single point of M is far from every point of I, such as an outlier, then h(M, I), and therefore H(M, I), will be quite large. This sensitivity to outliers is unacceptable in practical recognition tasks.
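In code, Eqs. (4) and (5) reduce to a nearest-neighbour query. A minimal NumPy/SciPy sketch of ours (the function names are illustrative):

```python
from scipy.spatial import cKDTree

def directed_hd(M, I):
    """Directed distance h(M, I) of Eq. (5): for each point of M, the distance
    to its closest point of I; then the largest of these distances."""
    return cKDTree(I).query(M)[0].max()

def hausdorff(M, I):
    """Symmetric Hausdorff distance H(M, I) of Eq. (4)."""
    return max(directed_hd(M, I), directed_hd(I, M))
```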

5.1.2. Partial Hausdorff distance based on ranking
The previous problem can be solved by replacing Eq. (5) with

\[
h_K(M, I) = \mathop{K^{\mathrm{th}}}_{m \in M} \min_{i \in I} \lVert m - i \rVert, \tag{6}
\]

where K^th_{m∈M} g(x) denotes the K-th ranked value of g(x) over the set of distances, i.e., for each point of M the distance to the closest point of I is computed, and the points of M are ranked by their respective values of the distance [21]. We define q as the number of model points. If K = q, all the points are considered and the value is simply the HD h(M, I), as mentioned in the previous section. The main property of this definition is the automatic selection of the K "best matching" points of M, since we identify the subset of the model of size K that minimizes the HD.

The partial HD is now defined as

\[
H_{KL}(M, I) = \max(h_K(M, I), h_L(I, M)), \tag{7}
\]

where K and L control the sizes of the subsets of M and I considered when evaluating the forward and reverse distances, respectively.
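A sketch of Eqs. (6) and (7), reading the ranking as the K-th smallest nearest-neighbour distance (the names and the 1-based convention for K are our assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree

def directed_partial_hd(M, I, K):
    """K-th ranked directed distance h_K(M, I) of Eq. (6); K = len(M)
    recovers the plain directed distance h(M, I)."""
    d = np.sort(cKDTree(I).query(M)[0])  # nearest-neighbour distances, ranked
    return d[K - 1]                      # K-th smallest value (1-based K)

def partial_hd(M, I, K, L):
    """Partial Hausdorff distance H_KL(M, I) of Eq. (7)."""
    return max(directed_partial_hd(M, I, K), directed_partial_hd(I, M, L))
```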

Yi and Camps [26] also use this method, but the HD is calculated on a figure transformed from the original, represented in a new domain, and applied with shrink and rotate transformations. It provides good results for isolated pieces, but the values from the partial HD are highly dependent on K and L if there are several pieces in the image with partial occlusion.

5.2. Proposed methods

We have reviewed HD theory, its solutions and shortcomings. Now two novel solutions are suggested to solve these problems, specifically outliers and several objects with different levels of occlusion. In the next sections, the corresponding equations for the directed distance are developed.


5.2.1. Proximity Hausdorff distance
If the HD provides a measure of the separation between the model and the image, the new distance defined here identifies the piece of the image contour that is similar to the model. In this approach, the percentage of the image points close to the model is computed, i.e., those with a separation lower than a proximity threshold. The match between the image and the model improves as the percentage increases. The only parameter to be selected is the minimum distance threshold, denoted by P, below which two points are considered close together.

We define h as the set {m ∈ M : min_{i∈I} ‖m − i‖ < P}, i.e., the set of points of the model closest to the image. Some normalization needs to be introduced to compare with the previous definition of the HD. If K_P is the size of h and q is the number of contour points, the directed proximity distance h_P(M, I) is given by

\[
h_P(M, I) = \frac{K_P}{q}, \qquad 0 \le h_P(M, I) \le 1. \tag{8}
\]

In this approach, the matching is better when the value of the proximity Hausdorff directed distance is high (close to 1). So, we have to modify the expression for the proximity HD:

\[
H_P(M, I) = \min(h_P(M, I), h_P(I, M)). \tag{9}
\]

Using this approach, the selection of the parameter is not as critical. It is more appropriate in the case of partial occlusion and also provides information about the piece of the contour image that matches the model.
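Equations (8) and (9) translate directly into code. A minimal sketch of ours, with P in the same units as the point coordinates:

```python
import numpy as np
from scipy.spatial import cKDTree

def directed_proximity_hd(M, I, P):
    """Directed proximity distance h_P(M, I) of Eq. (8): the fraction of
    points of M whose nearest point of I lies closer than the threshold P."""
    d = cKDTree(I).query(M)[0]
    return np.count_nonzero(d < P) / len(M)   # K_P / q

def proximity_hd(M, I, P):
    """Proximity Hausdorff distance H_P(M, I) of Eq. (9); 1 is a perfect match."""
    return min(directed_proximity_hd(M, I, P), directed_proximity_hd(I, M, P))
```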

As mentioned earlier, the proximity HD has some advantages over the partial HD, but experimental results suggest that the values obtained for a clear mismatch are not as low as expected. For example, Fig. 7 shows three matching situations between an image and three different models.

As a result of the normalization process, a score of 1 is a perfect match between the image and the model. As expected, the PHD values of images 1 and 2 are lower than that of image 3, but we would prefer lower values for images 1 and 2, since the models clearly do not correspond to the image. In order to improve the matching process we define a new HD based on a continuity threshold.

Fig. 7. Results obtained for the proximity HD.


5.2.2. Continuity Hausdorff distance
As seen in the previous figure, many points from the model and the image are close to each other, so the values obtained by applying Eq. (9) are high. However, most of the close pieces of the contour are very short and discontinuous. We propose to modify the HD to take into account the length of the matched contour pieces, in addition to the proximity between both. We call this new approach the Continuity Hausdorff Distance; it depends on the proximity threshold P and the continuity length L.

We define h as a piece of contour formed by neighbouring points of the model M under the following two conditions:
(1) min ‖m − i‖ < P, where m ∈ M and i ∈ I,
(2) length(h) ≥ L.

According to condition (1), only close points between the image and the model are considered (controlled by parameter P). According to condition (2), only pieces of the model contour longer than a threshold (given by L) are taken into account to compute the continuity distance.

If K_L is the sum of the points belonging to all such sets h and q is the number of contour points, the directed continuity distance is given by

\[
h_C(M, I) = \frac{K_L}{q}, \qquad 0 \le h_C(M, I) \le 1. \tag{10}
\]

As in the previous case, higher values imply a better match and, therefore, better recognition. The expression for the continuity HD is given by

\[
H_C(M, I) = \min(h_C(M, I), h_C(I, M)). \tag{11}
\]

The values for similar contours are very close to those given by the partial HD, since the close contour pieces are very large. However, for falsely similar contours, the points corresponding to short contour pieces are not taken into account in the final result. Consequently, the values are significantly reduced.
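A sketch of Eqs. (10) and (11). Note two assumptions of ours: the piece length is counted here in contour samples rather than pixels, and runs that wrap around the closed contour are not merged:

```python
import numpy as np
from scipy.spatial import cKDTree

def directed_continuity_hd(M, I, P, L):
    """Directed continuity distance h_C(M, I) of Eq. (10): count only points
    lying in runs of consecutive contour points that are all closer than P
    to the other set (condition (1)) and at least L points long (condition (2))."""
    close = cKDTree(I).query(M)[0] < P
    kept = run = 0
    for c in np.append(close, False):   # trailing False flushes the last run
        if c:
            run += 1
        else:
            if run >= L:
                kept += run             # K_L accumulates qualifying pieces
            run = 0
    return kept / len(M)                # K_L / q

def continuity_hd(M, I, P, L):
    """Continuity Hausdorff distance H_C(M, I) of Eq. (11)."""
    return min(directed_continuity_hd(M, I, P, L),
               directed_continuity_hd(I, M, P, L))
```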

Fig. 8 shows the values obtained for the previous example. As seen in Fig. 8, the values of images 1 and 2 are much lower than that of image 3, which significantly reduces the possibility of false recognition. Since the continuity HD can recognize components (contour pieces in this case), it is suitable for cases that involve occlusion.

Fig. 8. Results obtained for the continuity HD.


6. Results and discussion

In this section we implement the algorithm and present the final results. We used Matlab 5.3 with several routines in C to improve the critical tasks of recognition. All the computations were performed on a Pentium IV processor (2.5 GHz). The data are provided to give an idea of the computation time involved.

We used a database of 45 images with 350–700 pixels (see Fig. 9). Each contour had at least 10 bitangents. A figure was selected from the database and a projective transformation was applied to obtain a new figure to be recognized. The projective transformations were obtained by giving random values to the parameters of the transformation matrix in Eq. (1), within a specific range to avoid "unusual" transformations (e.g., those that transform the image plane to a plane which is nearly parallel to the ground).

Fig. 9. Database used in the implementation of the proposed methods.

6.1. Isolated contours

In the simplest situation, the image is an isolated contour after applying a projective transformation to an object from the database. This contour is noise free (see below for contours with noise).

Fig. 10 shows the original contour corresponding to model 44 and the projective transformation. We obtained the bitangents to carry out the recognition process and applied both the proximity HD and the continuity HD.

6.1.1. Proximity HD
In this section we analyze the results from the method used for hypothesis evaluation (i.e., the proximity HD introduced in the previous section). In this simple situation with the whole contour, only four bitangent points are considered to reduce the number of generated hypotheses to be verified. The parameters in the algorithm include:

Images in the database: 45.
Pixel length in the images: from 350 to 700.
Points of bitangency considered: 4.
Proximity threshold: P = 3 × discretization step.

Fig. 11 is a bar diagram of the results for all images in the database. As explained above, the score was normalized to range from 0 to 1 (1 being a perfect match). From Fig. 11, it is clear that figure 44 had the highest score, since 100% of both contours are close together, making a perfect match. Therefore, this is the best approximation. For figures 11, 16, 24, 31, 32 and 42 there is no correspondence between bitangent points verifying the convexity restriction imposed by the method, so there is no transformation and the score is 0. For the rest of the images in the database the scores were much lower.

The three best approximations using the proximity HD are shown in Fig. 12. The recognition by this method was correct (perfect transformation between image and model) in all the situations that involved projective transformation of isolated contours.

Fig. 10. Original contour, projective transformation, and bitangents of the image to recognize.


6.1.2. Continuity HD
In this section we analyze the results using the continuity HD. We select the same target image depicted in Fig. 10 and limit the analysis to four bitangents to reduce the computational cost. The same parameters are used, in addition to L (the minimum length of the contour piece to be matched), which depends on the figure and the confidence measure used. Let W be a new parameter where

\[
W = \frac{\text{continuity HD for the winning figure}}{\sum (\text{continuity HD for all figures})}. \tag{12}
\]

As seen in Fig. 13, W is maximum when the minimum length is larger than 33. Except for model 44, no contour pieces of the models in the database longer than L match the target image. The shape of the curve is coherent because more figure pieces will coincide when the minimum length to be matched is reduced.

In conclusion, shape matching was optimal using both methods, but the continuity HD was more reliable and better at identifying the correct recognition hypotheses. This is because it includes an additional parameter which avoids considering short contour pieces for matching.

6.2. Reliability under noise

Here we analyze how the proposed method copes with noisy contours. Again we use isolated pieces generated by applying a projective transformation to an original piece and adding a one-pixel random distortion to the contour. The distortion affects the location of bitangent points, which are the key aspects of the method.

Fig. 14 shows the original image before transformation and blurring, the image to be recognized, and its bitangents.

Fig. 11. Results of applying the proximity HD to each image of the database.


6.2.1. Proximity HD
First we consider the results from the hypothesis evaluation method based on the proximity HD (see Fig. 15). The parameters were as follows:

Fig. 13. Results of applying the continuity Hausdorff distance for all images in the database.

Fig. 12. Results obtained for three images following the proximity HD.


Images in the database: 45.
Points of bitangency considered: 6.
Proximity threshold: P = 3 × discretization step.

In this example, we have selected six points of bitangency instead of the minimum four points required for alignment, since the location of bitangent points is less precise as a result of the noise. Increasing the bitangency points increases computation time. As seen in the previous bar diagram, model 1 had the highest score, close to 1.0. On the other hand, none of the other pieces scored higher than 0.5.

The transformations of the three solutions with the highest scores are shown in Fig. 16.

As opposed to the noise-free contours, there is no perfect match in the presence of noise. However, from the point of view of shape recognition there is a positive identification. The recognition was also correct in all cases after testing the same approach with different figures corrupted by (one-pixel) random noise.

6.2.2. Continuity HD
Here we present the results after applying the hypothesis evaluation method based on the continuity HD (see Fig. 17). For this approach we selected the following parameters:

Fig. 14. Target image after transformation and blurring and its bitangents.

Fig. 15. Results obtained for the proximity HD in a noisy environment.


Fig. 16. Different figures with their respective scores in a noisy environment.

Fig. 17. Results obtained for the continuity HD in a noisy environment.


Images in the database: 45.
Points of bitangency considered: 6.
Proximity threshold: P = 3 × discretization step.
Minimum length of the contour piece to be matched: L = 50 pixels.

As shown in the bar graph, only model 1 had a score. All the other models had a score of 0, since they did not contain a matching contour with a length above the fixed threshold parameter L. The score for model 1 is far from the maximum, since the matching between the image model and the noisy contour is less than perfect (Fig. 18).

We have compared the proximity and continuity HDs, but little has been said about the selection of the two parameters (P and L), which depends on the level of noise and the degree of occlusion. Before considering figure recognition under partial occlusion, two simple suggestions can be underlined: (1) parameter P should be high when dealing with noisy figures; (2) parameter L can start high, depending on the length of the contour figure (we selected a value corresponding to 10% of the contour image to be recognized), and can be reduced if no model is found with a significant score.

The next section includes some graphs for shape matching of partially occluded curves in terms of parameter L.

6.3. Recognition under partial occlusion

In this section we discuss the problem of occlusion, where new bitangents appear and the original ones disappear. A comparison is made to demonstrate that the continuity HD behaves better than other approaches. After this, the method is considered with respect to the parameters which define it. Finally, a set of occlusion examples is provided with two images that occlude each other.

6.3.1. The occlusion problem
As described in previous sections, the proposed recognition approach is based on the invariant properties of the bitangents after projective transformation. However, under partial occlusion some points disappear, others change position and, sometimes, new points appear (see the example in Fig. 19).

Fig. 18. An example of a model and an image to recognize, showing bitangents and the corresponding continuity HD.

Under partial occlusion we cannot be sure that all located bitangents are appropriate for generating the matching hypothesis. Therefore, in the alignment approach the number of points to be considered should be increased to find at least four corresponding points between model and image. Increasing the points increases the computational cost drastically, since many generated hypotheses need to be verified.

6.3.2. Comparison of Hausdorff measures
In this section the new HD variants are compared with the older versions (standard HD and partial 'ranking' HD). We selected a model from the database (42) that is occluded by a plane. The occlusion is produced by a line at 60° with respect to the horizontal axis: the part of the figure to the right of this line is erased, and the line itself becomes part of the figure. An example of how the occlusion is achieved can be seen in Fig. 20.

Fig. 19. Drawings to explain the loss of bitangency points.

Fig. 20. Example of occlusion.


The scores of the different Hausdorff metrics are compared in terms of their performance at different levels of occlusion, using the whole data set. In order to quantify the selectivity of the corresponding HD, we defined an 'influence' parameter, which indicates the values obtained by the other models in the database. This parameter should reflect, on the one hand, how many other models have a score different from 0 (represented by n_A) and, on the other hand, a statistical value reflecting the score of all models in the database (except for the searched figure), given by MEAN. This parameter is given by

\[
\text{influence} = \frac{n_A \cdot \mathrm{MEAN}}{a}, \tag{13}
\]

where a is a normalization factor, so 0 < influence < 1. A value of 0 indicates that there are no other figures with a significant score; a value of 1 indicates that all figures provide a perfect match. Fig. 21 compares three HDs in terms of the degree of occlusion (from 0 to 35%).
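A sketch of Eq. (13); since the paper does not specify the normalization factor a, it is taken as a free parameter here:

```python
import numpy as np

def influence(other_scores, a):
    """'Influence' of Eq. (13): n_A models with a non-zero score times their
    MEAN score, normalized by the factor a so the result stays below 1."""
    s = np.asarray(other_scores, float)
    n_A = np.count_nonzero(s > 0)
    return n_A * s.mean() / a
```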

The graphs may appear confusing at first glance, since the standard HD obtains the highest score at high levels of occlusion. However, as mentioned above, the occlusion is produced by a plane, so the open contour corresponding to the occluded figure is closed by a straight line. The valid points only correspond to the open curve, but for some degrees of occlusion the score is higher than expected because some points of the straight line match the model in the database, giving a false score.

In addition, this metric behaves poorly in relation to selectivity, as indicated by the high influence values (many models in the database with a high score). Moreover, at some degrees of occlusion, the scores of some models were also higher than that of the searched figure, as reflected by the increasing influence values. Thus, from the point of view of selectivity, the continuity HD behaves better, as reflected by the low values of the influence parameter at different degrees of occlusion.

Fig. 21. Comparison of three HDs. Solid lines indicate the score of the searched image and dashed lines indicate the influence of other images: (A) standard HD, (B) HD based on ranking, and (C) continuity HD (P = 3 and L = 50).

6.3.3. Continuity HD analysis
After showing that the continuity HD behaves better than the standard HD, we analyze the influence of parameter L under occlusion. Fig. 22 shows the scores given by this metric for different values of L (from 1 up to the whole length of the curve) at several degrees of occlusion. As expected, the score is always 1 for all values of L when there is no occlusion. Under some occlusion there is an L value beyond which the score is 0, because there are no contour pieces longer than the length given by L.

It should be noted that the score at some occlusions (e.g., 35%, L = 1) is higher than expected (i.e., 0.65). The reason for this is similar to that mentioned above when comparing the standard HD.

Finally, we analyzed the behaviour of the continuity HD in relation to the L parameter using the whole dataset. As in Fig. 21, we represented the score of the searched figure and the influence parameter at different degrees of occlusion for three different values of L (1, 30, and 50 pixels; see Fig. 23). Note that the proximity HD becomes a particular case of the continuity HD when L is equal to 1. All results were obtained with P equal to 3, high enough if we consider that the contour was noise free.

The most important conclusion we can extract from these figures is the strong decrease of the influence parameter for high values of L. As mentioned before, the choice of P and L depends on the level of noise and the degree of occlusion. On the other hand, as seen in Fig. 22, sometimes a null score is obtained at some degrees of occlusion when L is high. Thus, L should be high enough to reduce the influence parameter, but not so high as to obtain a null score.

Fig. 22. Behaviour of the continuity HD with respect to the occlusion level.

6.3.4. Occlusion examples
The proposed recognition method was applied to images with occlusion between several objects, where the contour includes different pieces of the objects in the image (see Fig. 24).

As expected, the number of bitangency points depends on the occlusion level and the number of occluded figures. This demonstrates the importance of finding internal bitangents in order to have enough points to recognize the figures in the image. However, including too many points will increase the duration of the recognition process. A trade-off must be made between the number of points needed to recognize all the figures and the computational cost.

Fig. 23. Continuity HD with respect to the parameter L: (A) L = 1, (B) L = 30, and (C) L = 50.

Fig. 24. Image obtained from occlusion between two database images and a random transformation.

In the previous sections we analyzed three evaluation methods based on the HD. The ones that included the proximity distance and the continuity approach recognized the occluded pieces better. The partial HD depends heavily on the appropriate selection of K, especially under occlusion, as described in the last section. The proximity HD is an acceptable approach for isolated pieces, but it may lead to false recognition under occlusion, since it does not take into account the length of the contour pieces and the whole contour is never present in the image. Therefore, it is possible that the scores of different contours with similar sections are close to that of the correct contour. The continuity HD solves these drawbacks by rejecting contour sections that are not long enough and recognizing the sections of the occluded figure that belong to the model database.

The following recognition examples dealing with partially occluded contours were computed using the continuity HD.

The recognition parameters for the previous contour were:
Images in the database: 45.
Points of bitangency considered in the occluded contour: 10.
Points of bitangency considered in the database contour: 10.
Proximity threshold: P = 3 × discretization step.
Minimum length of the contour piece to be matched: L = 50 pixels.

Results for image A are shown in Fig. 25.

Fig. 25. Figure A shape matching.


Fig. 26 shows the results for image B.

Fig. 26. Figure B shape matching.

Next, we present another example of occluded contour recognition. Fig. 27 shows the contour with occlusion between figures A and B.

Fig. 27. Another example of generating an occluded image.

We used the following recognition parameters for the contour in Fig. 27:
Images in the database: 45.
Points of bitangency considered in the occluded contour: 10.
Points of bitangency considered in the database contour: 10.
Proximity threshold: P = 3 × discretization step.
Minimum length of the contour piece to be matched: L = 50 pixels.

The recognized parts of the first contour present in the occluded image are depicted in Fig. 28. The results for image B are excellent, since there are enough preserved bitangency points in the occluded contour corresponding to the model, leading to a positive recognition. However, for figure A, the results are not as satisfactory due to the lack of bitangents. As a result, the estimated transformation matrix deviates slightly. One possible solution is to increase the points of bitangency in the occluded contour as well as in the model database contour, but the computation time is excessively long.

Table 1 compares the different numbers of points and the corresponding computation times.

With only a few bitangency points it is difficult to find the appropriate four-point correspondence, so the score given by the estimated transformation is too low. By increasing the number of points it is more likely that the appropriate four points in the image are selected; as a consequence, the results are more reliable, but the computation time increases significantly.

The score can be increased by reducing the length of the section to be matched, but this may lead to a false recognition, especially in heavily occluded images where only a small contour piece matches the model.

Fig. 28. The two best solutions for the last example: (A) 24.27% of the occluded contour; (B) 74.04% of the occluded contour.

Table 1
Computation times for several examples

Model bitangency points   Image bitangency points   % figure A   % figure B   Computation time (s)
4                         4                         None         None         0.0265
6                         6                         7.83         0.0          0.3483
8                         8                         20.58        22.88        14.52
10                        10                        24.27        70.04        174


The results for partially occluded contours are very satisfactory, despite the apparent difficulty in some situations. The success of this approach mainly depends on the number of bitangency points preserved under occlusion. For this reason, it is necessary to consider many points, although in some cases the computational cost is excessive.

7. Conclusions

This paper considers the problem of object recognition, with special emphasis on partially occluded planar objects. These problems are very important in industrial applications with possible occlusions of flat objects in 3D space. The method developed in this paper is based on two different approaches (object decomposition into parts and alignment methods) that provide a powerful framework to solve this problem.

In the alignment approach, four non-collinear points are taken from the input image and searched in the image database. The points are selected among invariant points, such as bitangents.

We used a metric based on the HD to evaluate the degree of correspondence between the image to be recognized and the model from the database. The classic HD has limitations for partially occluded curves. To avoid them we propose a new distance based on proximity and continuity. It provided satisfactory results when applied to a limited image database.

We also suggest some improvements to the proposed continuity HD in terms of matching occluded contours. If there are 'simple' long contours in the image, correspondence to some false model might have a higher score than correspondence to the short 'complicated' contour of the right model. As a result, the continuity distance should consider the complexity of the matched contours as well as their length.

The main problem with this approach is the computational time, which depends on the transformations that are permitted and the lack of restrictions. In any case, we have shown that this increase is quite linear with respect to the number of models in the database.

References

[1] D.W. Murray, Strategies in object recognition, GEC J. Res. 6 (2) (1988) 80–95.
[2] Y. Gdalyahu, D. Weinshall, Flexible syntactic matching of curves and its application to automatic hierarchical classification of silhouettes, IEEE Trans. Pattern Anal. Machine Intell. 21 (12) (1999) 1312–1328.
[3] S. Ullman, Aligning pictorial descriptions: an approach to object recognition, Cognition 32 (1989) 193–254.
[4] N. Ayache, O.D. Faugeras, HYPER: a new approach for the recognition and positioning of two-dimensional objects, IEEE Trans. Pattern Anal. Machine Intell. PAMI-8 (1) (1986) 44–54.
[5] A. Zisserman, D.A. Forsyth, J.L. Mundy, C.A. Rothwell, Recognizing general curved objects efficiently, in: Geometric Invariance in Computer Vision, MIT Press, Cambridge, MA, 1992, pp. 228–251.
[6] D.P. Huttenlocher, S. Ullman, Object recognition using alignment, MIT AI Memo 937.
[7] Z. Xue, D. Shen, E.K. Teoh, An efficient fuzzy algorithm for aligning shapes under affine transformations, Pattern Recognition 34 (2001) 1171–1180.
[8] H. Nishida, Model-based shape matching with structural feature grouping, IEEE Trans. Pattern Anal. Machine Intell. 17 (3) (1992) 315–320.
[9] M. Singh, A. Chatterjee, S. Chaudhury, Matching structural shape descriptions using genetic algorithms, Pattern Recognition 30 (9) (1997) 1451–1462.
[10] P.W. Huang, S.K. Dai, P.L. Lin, Planar shape recognition by directional flow-change method, Pattern Recognition Lett. 20 (1999) 163–170.
[11] K. Siddiqi, A. Shokoufandeh, S.J. Dickinson, S.W. Zucker, Shock graphs and shape matching, Internat. J. Comput. Vision 35 (1) (1999) 13–32.
[12] I. Biederman, Recognition-by-components: a theory of human image understanding, Psychol. Rev. 94 (2) (1987) 115–147.
[13] K. Siddiqi, B.B. Kimia, Parts of visual form: computational aspects, IEEE Trans. Pattern Anal. Machine Intell. 17 (3) (1995) 239–251.
[14] D.G. Lowe, Organisation of smooth image curves at multiple scales, Internat. J. Comput. Vision 3 (1989) 119–130.
[15] F. Mokhtarian, A.K. Mackworth, A theory of multiscale-based shape representation for planar curves, IEEE Trans. Pattern Anal. Machine Intell. 14 (1992) 789–805.
[16] H. Asada, M. Brady, The curvature primal sketch, IEEE Trans. Pattern Anal. Machine Intell. 8 (1986) 2–14.
[17] P. Saint-Marc, G. Medioni, Adaptive smoothing for feature extraction, in: Proc. DARPA Image Understanding Workshop, Boston, MA, 1988, pp. 1100–1113.
[18] C.H. Teh, R.T. Chin, On the detection of dominant points on digital curves, IEEE Trans. Pattern Anal. Machine Intell. 11 (1989) 859–872.
[19] C. Orrite, A. Alcolea, Identifying perceptually salient segments on planar curves, in: Preprints of the VII National Symposium on Pattern Recognition and Image Analysis, 1997, pp. 419–424.
[20] M.A. Fischler, H.C. Wolf, Locating perceptually salient points on planar curves, IEEE Trans. Pattern Anal. Machine Intell. 16 (1994) 113–129.
[21] D.P. Huttenlocher, G.A. Klanderman, W.J. Rucklidge, Comparing images using the Hausdorff distance, IEEE Trans. Pattern Anal. Machine Intell. 15 (9) (1993) 850–863.
[22] W.J. Rucklidge, Efficiently locating objects using the Hausdorff distance, Internat. J. Comput. Vision 24 (3) (1997) 251–270.
[23] D.-G. Sim, R.-H. Park, Two-dimensional object alignment based on the robust oriented Hausdorff similarity measure, IEEE Trans. Pattern Anal. Machine Intell. 10 (3) (2001) 475–483.
[24] C. Orrite, A. Alcolea, A. Campo, Recognition of partially occluded flat objects, in: Lecture Notes in Artificial Intelligence, vol. 1484, Springer, Berlin, 1998, pp. 242–252.
[25] P.K. Ser, C.S.T. Choy, W.C. Siu, Genetic algorithm for the extraction of nonanalytic objects from multiple dimensional parameter space, Comput. Vision Image Understanding 73 (1) (1999) 1–13.
[26] X.X. Yi, O.I. Camps, Line-based recognition using a multidimensional Hausdorff distance, IEEE Trans. Pattern Anal. Machine Intell. 21 (9) (1999) 901–916.
[27] M. Sonka, V. Hlavac, R. Boyle, Image Processing, Analysis and Machine Vision, second ed., Thomson Publishing, 1998.
[28] P.J. Besl, N.D. McKay, A method for registration of 3-D shapes, IEEE Trans. Pattern Anal. Machine Intell. 14 (2) (1992) 239–256.
[29] J. Feldmar, N. Ayache, F. Betting, 3D–2D projective registration of free-form curves and surfaces, Comput. Vision Image Understanding 65 (3) (1997) 403–424.
