Planar shape recognition by shape morphing

Rahul Singh, Nikolaos P. Papanikolopoulos*

Department of Computer Science and Engineering, University of Minnesota, 4-192 EE/CS Building, 200 Union Street SE, Minneapolis, MN 55455, USA

Received 16 December 1998; received in revised form 18 June 1999; accepted 18 June 1999

Pattern Recognition 33 (2000) 1683–1699

*Corresponding author. Tel.: +001-612-625-0163; fax: +001-612-625-0572. E-mail addresses: sing@cs.umn.edu (R. Singh), npapas@cs.umn.edu (N.P. Papanikolopoulos).

Abstract

A novel method based on shape morphing is proposed for 2D shape recognition. In this framework, the shape of objects is described by using their contour. Shape recognition involves a morph between the contours of the objects being compared. The morph is quantified by using a physics-based formulation. This quantification is used as a dissimilarity measure to find the reference shape most similar to the input. The dissimilarity measure is shown to have the properties of a metric as well as invariance to Euclidean transformations. The recognition paradigm is applicable to both convex and non-convex shapes. Moreover, the applicability of the method is not constrained to closed shapes. Based on the metric properties of the dissimilarity measure, a search strategy is described that obviates an exhaustive search of the template database during recognition experiments. Experimental results on the recognition of various types of shapes are presented. © 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Shape recognition; Shape morphing; Content-based retrieval; Pen-based computing

1. Introduction

Shape recognition is a fundamental problem of pattern recognition and machine vision. This problem arises in a variety of contexts, examples of which include retrieval by content from image databases, document image analysis, automated industrial inspection, analysis of medical imagery, as well as vision-based robotics and visual tracking. A large number of shape recognition techniques have been proposed in the literature. Broadly speaking, these may be classified by the shape representation framework and the dissimilarity measure used. Various schemes proposed for shape representation include representation using global features like moments [1], Fourier descriptors [2], autoregressive coefficients [3], texture [4], and color [5,6]. Other representational techniques have employed local features [7], eigenmode representations [8], subspace representations [9], skeletons [10], part-based descriptions [11], and boundary-based (contour) representations [12–14]. Some examples of various dissimilarity measures used in shape recognition include deformation-based measures like modal deformation energy [8], applications of standard metrics either directly on geometric shape descriptors [15] or on transformed shape representations [9], and measures defined on non-geometric shape attributes like color [5].

The problem of recognition when shapes are encoded by their contours is interesting for a multitude of reasons. First, contours are easy to obtain and can be used for shape representation regardless of shape convexity or the lack thereof. Second, recognition techniques based on contour analysis have broad applicability since contours are one of the most commonly used shape descriptors [16]. Finally, due to the fact that contours may be considered as high order curves, algorithms for contour-based recognition are potentially extendible to more generic shape classes including cursive words and hand-drawn sketches. In the context of contour-based shape representation, some of the dissimilarity measures that have been used include the sum of squares of the Euclidean distances from each vertex of a polygon to the convex hull of the other polygon [15], the $L_2$ distance between the turning functions of two polygons [17], the Hausdorff metric [18], and elastic deformation energy [19,20].

Generally speaking, in order to be effective, a recognition measure should satisfy the following properties proposed by Arkin et al. [17]:

• The measure should be a metric.
• It should be invariant under translation, rotation, and scale change.
• It should be easy to compute.
• It should match intuitive notions of shape resemblance.

From an applied perspective, certain additional properties are desirable in a generic shape recognition technique. These may include the applicability of a method regardless of shape convexity as well as its ability to deal with closed as well as open shapes. The latter requirement is of primary importance in applications like OCR and recognition problems in pen-based computing. Additionally, it contributes to the robustness of a recognition system since, in real images, noise and errors in edge linking may lead to non-closure of object contours. Another important property is the ability of a recognition system to handle shape deformations. Many contemporary applications like content-based retrieval by matching image contours with hand-drawn sketches, recognition of articulate shapes, and handwriting recognition require a recognition methodology to capture the perceptual similarity of two shapes. Often this means that two shapes have to be placed in the same similarity class even if they are deformed versions of each other. In such cases, many conventional distance measures perform poorly since the mathematical descriptions of the deformed shapes may not exactly match each other [21].

In recent literature, different solutions have been proposed to address the above issues. Bookstein [22] proposed the use of thin-plate splines to model shape deformations. Sclaroff et al. [8] describe a closed shape in terms of the eigenvectors of its stiffness matrix. Shape similarity is defined as the amount of modal deformation energy required to align two shapes. Yuille et al. [23] have used deformable templates to identify and track facial features. In the area of image retrieval, the QVE system [24] involves computing the correlation (with limited horizontal and vertical shifts) between a user sketch and an edge image in the database. Bimbo et al. [20] propose an elastic matching technique where the degree of matching, along with the deformation energy spent, is used to rank the similarity of hand-drawn sketches with database images. A similar idea has been used by Azencott et al. [19] for recognition of generic planar shapes.

The method described in this paper attempts to address this problem while conforming to the theoretical and applied criteria mentioned above. It differs from the techniques mentioned in the previous paragraph in that the identity of a shape is established by morphing its contour to templates stored in a database and using a quantification of the morph as a dissimilarity measure. The quantification is formulated in terms of the stretching and bending of the contours and is invariant to similarity transformations. Unlike modal matching [8], this formulation is not restricted to closed contours. Neither does it require extensive a priori shape modeling. Furthermore, the approach does not seek to model deformations based on simple horizontal and vertical shifts during the convolution of the template with the image. While the underlying idea of the proposed method is conceptually similar to the work of Azencott et al. [19] and Bimbo et al. [20] in that it uses deformations for matching shapes, it is (unlike the aforementioned works) invariant to rotations. Additionally, the morph provides, via the synthetic images, an image plane representation of the shape and pose transformation between the input and the template. In Refs. [25,26] we have shown that these synthetic images can be interleaved with real images of an object and used as visual feedback for an eye-in-hand manipulator. Based on the apparent motion described by the virtual images, the trajectory of the manipulator is controlled to perform positioning [26] and grasping tasks [25]. Thus, the approach described can form the basis of a unified framework for addressing the problem of shape recognition as well as that of using recognition to control purposive robotic actions.

We organize this paper by looking at shape representation issues in Section 2. The recognition method is described in Section 3. In Section 4, we propose a method to reduce the number of direct shape comparisons required for recognition. The experimental results are presented in Section 5. Finally, in Section 6, the paper is summarized, conclusions are drawn, and future work is outlined.

2. Shape representation and modeling

Shapes are represented in our approach by their contours. This choice is based on the ability of shape contours to effectively capture the visual form, as well as the applicability of contours for the representation of different types of shapes. The contour of each shape is modeled piecewise by virtual wires. Shape morphing occurs through deformation (stretching and bending) of the artificial wires. In this formulation, the shape recognition problem can be treated as an energy minimization one, where shape similarity is quantified by the energy consumed for stretching and bending one wire-form contour model to another. Shape morphing is guided by a few key points, which are determined by segmenting the object contour.


Fig. 1. Segmentation-point placement and B-spline reconstruction using Pavlidis' algorithm [35]. Curvature maxima are indicated by small squares and curvature minima are shown as small discs.

2.1. Contour segmentation

Contour segmentation is a well-established area in computer vision and many segmentation algorithms have been proposed. A taxonomy of these techniques can be provided by broadly dividing them into two classes: methods [27–29] that place segmentation points by minimizing some error norm, and methods [27,30–33] based on the identification of perceptually important points (corners). Contour segmentation can, in general, significantly reduce the number of contour points while maintaining a sufficiently accurate shape description. In the current context, the two issues that are of importance are representational accuracy and representational consistency. For rigid polygonal shapes, both error-based segmentation as well as dominant point detection techniques can provide adequate representational accuracy. The problem lies in that the results of polygonal approximation may be different (in terms of the number of segmentation points and their placement), especially in the presence of noise and orientation changes. On the other hand, in deformable shapes like hand-drawn line figures, sections of the contour are often characterized by slowly varying curvature. Segmentation of such contours using dominant point detection techniques leads to poor reconstruction [34]. Moreover, small deformations can significantly alter the number of segmentation points obtained with an error-based method.

In this work, depending on the problem context, we have used two different segmentation techniques. For problems like recognition of hand-drawn figures, where the shapes from the same class may vary due to deformations, we use the segmentation algorithm proposed by Pavlidis et al. [35]. For the recognition of rigid shapes, we use an algorithm that is based on a modification of the error-based segmentation technique of Ray et al. [29]. The modifications are primarily designed to obtain consistent segmentation results. In the following, we provide a brief description of the above techniques.

The basic idea of the algorithm proposed by Pavlidis et al. [35] is to represent a contour as a succession of high curvature points (corners) and relatively low curvature regions, each of which is represented by a single point called a key low curvature point (see Fig. 1). The actual algorithm consists of two parts. In the first part, corner points are detected by using the algorithm proposed by Brault and Plamondon [30]. These are then interleaved with low curvature points, which are computed by using a criterion that is conjugate to the one used for computing the corners. The primary advantages of this algorithm lie in the consistent segmentation pattern (corner – key low curvature point – corner) and the automatic identification of the region of support for computing both types of curvature extrema. Fig. 1 shows the segmentation and reconstruction of some hand-drawn shapes by using this technique.

The segmentation of rigid objects is done by using a modified version of the error-based algorithm of Ray et al. [29]. This algorithm produces a piecewise linear approximation of a contour by determining the longest possible line segment that can fit a set of contour points with the smallest possible error.

In our modifications, this algorithm is extended by computing the curvature at each segmentation point. Points having extremely low curvature are then suppressed. As a result of this operation, redundant points are removed while preserving the significant corner points. The segmentation list may, however, contain points that are due to noise or quantization errors. An additional merging procedure, similar to that suggested by Huang and Wang [13], is then applied to remove any such points. The strategy is based on computing the deviation of each segmentation point from the chord joining its neighbors. If this deviation is less than a predefined threshold, the corresponding point is removed. This merging procedure is repeated till no segmentation point having a deviation less than the threshold remains. In Fig. 2 we provide some examples of the algorithm's performance on rigid shapes at different orientations and positions.
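As an illustration of the merging step just described, the following Python sketch (not the authors' code; the function names and the deviation_threshold parameter are ours) repeatedly removes any segmentation point whose perpendicular distance to the chord joining its two neighbours falls below the threshold:

```python
import math

def point_to_chord_distance(p, a, b):
    """Perpendicular distance from point p to the chord joining a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    chord_len = math.hypot(dx, dy)
    if chord_len == 0.0:
        return math.hypot(px - ax, py - ay)
    # Twice the area of the triangle (a, b, p) divided by the chord length.
    return abs(dx * (ay - py) - dy * (ax - px)) / chord_len

def merge_segmentation_points(points, deviation_threshold):
    """Repeatedly drop interior points that deviate from the chord joining
    their neighbours by less than the threshold (cf. Huang and Wang [13])."""
    pts = list(points)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(pts) - 1):
            if point_to_chord_distance(pts[i], pts[i - 1], pts[i + 1]) < deviation_threshold:
                del pts[i]
                changed = True
                break  # restart the scan after each removal
    return pts
```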

3. Shape morphing and recognition

Let $S^I$ and $S^T$ be the input and the target shapes, respectively. The morph of $S^I$ to $S^T$ is a transformation of the shape, pose, and other available image attributes (like color or texture) of $S^I$ to those of $S^T$. The morph is characterized by a sequence of intermediate images that depict this transformation. In the current formulation, the shape of an object is described by its contour. The morph between two objects is therefore defined as the morph between their respective contours.

Fig. 2. Segmentation results for some shapes.

Fig. 3. Shape morphing.

In Fig. 3, two morphs are shown. In the first case (top row), the morph occurs between two instances of the same object differing in terms of their pose with respect to the camera. The intermediate images synthesized during the morph show, predominantly, the progressive transformation of the input pose to that of the target. In the lower row, we present an example where the morph occurs between two different shapes.

Let the input and the target shapes, as represented by their segmentation points, be denoted as $S^I = [S^I_0, \ldots, S^I_n]$ and $S^T = [S^T_0, \ldots, S^T_n]$, respectively. One possible way to morph $S^I$ and $S^T$ is through a cross-dissolve operation on the corresponding segmentation points of the two contours [36]:

$$S(t) = u S^I + t S^T = [u S^I_0 + t S^T_0,\; u S^I_1 + t S^T_1,\; \ldots,\; u S^I_n + t S^T_n] = [S_0(t), S_1(t), \ldots, S_n(t)], \qquad (1)$$

where $u = 1 - t$ and $S_i(t)$ is the $i$th contour point in the intermediate shape formed at time $t$. The time parameter $t$ is normalized to the interval $[0, 1]$.

The contours $S^I$ and $S^T$ will, in general, have a different number of segmentation points. Therefore, for a morph as defined by the cross-dissolve operation of Eq. (1) to occur, a point correspondence between the segmentation points in the input and the target is needed, wherein every segmentation point on the input contour corresponds to at least one segmentation point on the target contour and vice versa.
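A minimal sketch of the cross-dissolve of Eq. (1), assuming the two contours have already been put in point-to-point correspondence so that entry i of the input pairs with entry i of the target (the function and variable names below are ours, not from the paper):

```python
def cross_dissolve(S_I, S_T, t):
    """Cross-dissolve of Eq. (1): blend corresponding contour points of the
    input and the target at normalized time t in [0, 1]."""
    u = 1.0 - t
    return [(u * xi + t * xt, u * yi + t * yt)
            for (xi, yi), (xt, yt) in zip(S_I, S_T)]

# Example: ten intermediate shapes of a morph between two small triangles.
triangle_a = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
triangle_b = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
frames = [cross_dissolve(triangle_a, triangle_b, k / 9.0) for k in range(10)]
```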


Fig. 4. A physically invalid morph.

The following formulation for the computation of point correspondences is motivated by the physics-based approach of Sederberg et al. [36]. The cost of a point correspondence is defined as the sum of the stretching and bending required to deform the wireform contour so as to bring about the required correspondence. The stretching energy is computed for every segment (pair of points) and is defined as

$$E_s = k_s \frac{\left| (L_T - L_O)^2 - (L_I - L_O)^2 \right|}{(1 - c_s) L_{\min} + c_s L_{\max}}, \qquad (2)$$

where

$$L_{\min} = \min(L_O, \ldots, L_I, L_T) \quad \text{and} \quad L_{\max} = \max(L_O, \ldots, L_I, L_T).$$

In Eq. (2), $E_s$ denotes the stretching energy spent in the current deformation; $L_O$, $L_I$, and $L_T$ denote the segment lengths at the beginning, before the current deformation, and after the current deformation, respectively. The term $c_s$ corresponds to the penalty for segments collapsing to points and $k_s$ is the stretching stiffness parameter. The bending energy $E_b$ is computed for point triplets and denotes the cost of angular deformation. It is defined as

$$E_b = k_b \left| (\phi_T - \phi_O)^2 - (\phi_I - \phi_O)^2 \right|. \qquad (3)$$

In the above equation, $k_b$ indicates the bending stiffness, $\phi_O$ represents the original angle, and $\phi_I$ and $\phi_T$ denote the angle before the current deformation and the angle after the current deformation, respectively.
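The per-segment and per-triplet energies of Eqs. (2) and (3) translate directly into code. In the sketch below the default values for k_s, c_s and k_b are illustrative placeholders, not values reported in the paper:

```python
def stretching_energy(L_I, L_T, L_O, k_s=1.0, c_s=0.1):
    """Stretching energy of Eq. (2) for one segment: L_O is the original
    length, L_I the length before and L_T the length after the current
    deformation; c_s penalizes segments collapsing to points."""
    L_min = min(L_O, L_I, L_T)
    L_max = max(L_O, L_I, L_T)
    numerator = abs((L_T - L_O) ** 2 - (L_I - L_O) ** 2)
    return k_s * numerator / ((1.0 - c_s) * L_min + c_s * L_max)

def bending_energy(phi_I, phi_T, phi_O, k_b=1.0):
    """Bending energy of Eq. (3) for one point triplet: phi_O is the original
    angle, phi_I the angle before and phi_T the angle after the deformation."""
    return k_b * abs((phi_T - phi_O) ** 2 - (phi_I - phi_O) ** 2)
```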

The optimal morph between two contours is determined by the correspondence requiring the least stretching and bending energy. By constraining the deformations at the segmentation points, the following optimal substructure property may be observed: the optimum cost of the point correspondence $(S^I_i, S^T_j)$ equals the optimum cost of the prior point correspondence $(S^I_{i-1}, S^T_j)$, $(S^I_{i-1}, S^T_{j-1})$, or $(S^I_i, S^T_{j-1})$, plus the cost of establishing the correspondence $(S^I_i, S^T_j)$. Based on the above, an efficient ($O(mn)$) dynamic programming scheme can be constructed for morphing a contour $C_A$ with $m$ points to another $C_B$ having $n$ points. Since the energy computation described above requires a starting point correspondence, we define the optimal morphing between two contours $C_A$ and $C_B$ as

$$D_{morph}(C_A, C_B) = \min_{\Omega} E(C_A, C_B). \qquad (4)$$

Here, $\Omega$ denotes the set of all starting point correspondences between the contours $C_A$ and $C_B$. $D_{morph}(C_A, C_B)$ (hereafter called the degree of morphing) denotes the cost of the optimal morph between the contours $C_A$ and $C_B$.
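A minimal dynamic-programming sketch of the optimal-substructure recursion is given below. It assumes a fixed starting correspondence (input point 0 paired with target point 0) and a user-supplied cost(i, j) returning the stretching-plus-bending cost of establishing the correspondence between input point i and target point j; Eq. (4) would additionally minimize the result over all starting correspondences. The function and argument names are ours:

```python
def optimal_correspondence_cost(m, n, cost):
    """O(mn) dynamic program over the optimal-substructure recursion: the best
    cost of pairing input point i with target point j extends the best of
    (i-1, j), (i-1, j-1) or (i, j-1) by cost(i, j)."""
    INF = float("inf")
    dp = [[INF] * n for _ in range(m)]
    dp[0][0] = cost(0, 0)  # fixed starting correspondence
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                dp[i - 1][j] if i > 0 else INF,
                dp[i - 1][j - 1] if i > 0 and j > 0 else INF,
                dp[i][j - 1] if j > 0 else INF,
            )
            dp[i][j] = best_prev + cost(i, j)
    return dp[m - 1][n - 1]
```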

Since the formulation of the morph is based on a linear cross-dissolve operation (see Eq. (1)), physically valid intermediate shapes in the morph, even between instances of the same object, are not guaranteed unless the input and the target shapes are rotationally aligned. Such lack of physical validity is expressed by a crossover of the object contours during the intermediate stages of the morph. An example of a morph with physically invalid intermediate images is presented in Fig. 4.

Physically invalid morphing is inconsistent in the sense of our formulation, because the deformations caused by the crossovers are due to alignment differences and not shape differences. A solution to this problem can be obtained by warping the contours prior to the application of the cross-dissolve operation. This warp can be defined on the basis of the observation that the point correspondence obtained during the computation of the optimal morph is invariant to translation and rotation of the objects. Based on this correspondence, elongation axes can be computed for each shape. The rotation transformation between the two shapes can be estimated by computing the angle between the elongation axes. Similarly, the translational discrepancy between the shapes may be obtained by computing the vector joining the centroids of the two shapes. The shape morphing process is thus divided into two stages. The sequence of intermediate images generated during the first stage (warping) exhibits a progressive rectification of the rigid transformations (translation and rotation) between the input and the target shapes. This rectification involves an update of the coordinates describing the input contour. The correspondences between the input and the target contours, however, remain unaffected. These correspondences, along with the updated values of the contour points, are then used to rectify the shape deformations between the input and the target using the cross-dissolve operation. We would like to point out to the reader that in our formulation, the cross-dissolve operation is only defined on the geometric description of the shapes (coordinates of the segmentation points).

The recognition process thus consists of the following three steps:

1. Shape recognition: The template closest to the input shape, in terms of the stretching and bending energies, is identified. The correspondences between the segmentation points of the input and the target are determined by using the optimal substructure property described above.

2. Rotational and translational alignment: The elongation axes and the centroids are computed for the input and its corresponding template. Based on the translation vector between the shape centroids and the angle between the elongation vectors, the input coordinates are updated to align the shapes (see the sketch after this list).

3. Rectification of deformations: The updated input coordinates, along with the correspondences computed during shape recognition, are used to deform the input contour (by stretching and bending) till it becomes identical to the template.
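A hedged sketch of the alignment step (step 2) follows. It assumes the elongation axis can be taken as the principal axis of the segmentation points, computed from second-order central moments, which is one plausible reading of the paper's elongation axis; all names are ours:

```python
import math

def centroid(points):
    n = float(len(points))
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def elongation_angle(points):
    """Orientation of the principal (elongation) axis from second-order
    central moments of the segmentation points."""
    cx, cy = centroid(points)
    mu20 = sum((x - cx) ** 2 for x, y in points)
    mu02 = sum((y - cy) ** 2 for x, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)

def align_input_to_template(input_pts, template_pts):
    """Rotate and translate the input contour so that its elongation axis and
    centroid coincide with those of the template (recognition step 2)."""
    theta = elongation_angle(template_pts) - elongation_angle(input_pts)
    icx, icy = centroid(input_pts)
    tcx, tcy = centroid(template_pts)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    aligned = []
    for x, y in input_pts:
        # Rotate about the input centroid, then move onto the template centroid.
        dx, dy = x - icx, y - icy
        aligned.append((tcx + cos_t * dx - sin_t * dy,
                        tcy + sin_t * dx + cos_t * dy))
    return aligned
```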

3.1. Analysis of the recognition technique

Invariance of the dissimilarity measure to translation and rotation follows from the fact that the contours are described in object-centered coordinate systems and not absolute coordinate systems. Invariance to scale changes is obtained by normalizing the contour length. Furthermore, the formulation as well as the computation of the dissimilarity measure does not involve any assumptions about closure or convexity of contours. The intuition behind the proof for the metric properties of the measure comes from the fact that we model the shape changes as a conservative system. A formal proof of these properties is presented in Appendix A.

4. Reducing shape comparisons by using the triangle inequality and the primordial shape

The shape recognition framework described thus far assumes an exhaustive search of the database, involving the comparison of each template to the input shape in order to find the closest template to the input. Efforts to avoid an exhaustive search of the database so as to answer a similarity query are needed for the following two reasons:

1. Relatively high costs of computing the dissimilarity function on-line.

2. Potential increase in the size of the image database.

In this section, we examine the use of reference shapes in conjunction with the metric properties of the proposed dissimilarity measure to avoid an exhaustive search of the image database. The basic idea [37–39] lies in determining how shapes in the database are related to a predetermined reference shape. In addition to this, if the similarity between the input (query) shape and this reference shape can be computed, then the templates in the database that are highly dissimilar from the query can be excluded from contention, without resorting to costly on-line comparisons.

Let $S^I$ be the input shape (query). Further, let $D = \{S^T_1, S^T_2, \ldots, S^T_n\}$ be the image database. Denote by $C_A$, $C_B$, and $C_C$ three arbitrary shapes (contours). Let $D_{morph}(C_A, C_B)$ be the dissimilarity measure between the shapes $C_A$ and $C_B$, and let $S^T_R$ be the reference shape, $S^T_R \in D$. Since $D_{morph}(C_A, C_B)$ is a metric, it has the following properties:

$$D_{morph}(C_A, C_B) \geq 0, \qquad (5)$$

$$D_{morph}(C_A, C_B) = 0 \;\Leftrightarrow\; C_A \equiv C_B, \qquad (6)$$

$$D_{morph}(C_A, C_B) = D_{morph}(C_B, C_A), \qquad (7)$$

$$D_{morph}(C_A, C_B) + D_{morph}(C_B, C_C) \geq D_{morph}(C_A, C_C), \quad \forall C_C. \qquad (8)$$

Given an arbitrary shape $C_X$ in the image database, it follows from the triangle inequality (8) that

$$D_{morph}(S^I, C_X) \geq D_{morph}(S^I, S^T_R) - D_{morph}(C_X, S^T_R). \qquad (9)$$

Let $\epsilon$ be some threshold on shape similarity. For instance, $\epsilon$ may be selected as the distance between the input shape $S^I$ and a template shape that is closest to the input after a partial search through the database. It follows from Eq. (9) that if

$$D_{morph}(S^I, S^T_R) - D_{morph}(C_X, S^T_R) \geq \epsilon, \qquad (10)$$

then the comparison between $S^I$ and $C_X$ does not need to be considered. This is because the definition of $\epsilon$ guarantees the existence of at least one shape in the database that is closer to $S^I$ than $C_X$ with respect to the dissimilarity measure $D_{morph}(\cdot\,, \cdot)$. Similarly, we also have from the triangle inequality

$$D_{morph}(C_X, S^I) + D_{morph}(S^I, S^T_R) \geq D_{morph}(C_X, S^T_R). \qquad (11)$$

But from the symmetry of the dissimilarity measure, we have

$$D_{morph}(C_X, S^I) = D_{morph}(S^I, C_X). \qquad (12)$$

Substituting the above in Eq. (11) and simplifying, we get

$$D_{morph}(S^I, C_X) \geq D_{morph}(C_X, S^T_R) - D_{morph}(S^I, S^T_R). \qquad (13)$$

Once again, if

$$D_{morph}(C_X, S^T_R) - D_{morph}(S^I, S^T_R) \geq \epsilon, \qquad (14)$$

then the comparison between $S^I$ and $C_X$ does not need to be carried out.

The constraints in Eqs. (10) and (14) describe the criteria that can be used to exclude shapes in the database without influencing the correctness of the recognition process [37]. The comparison of the database shapes with the reference shape, $D_{morph}(C_X, S^T_R)$, is computed off-line. The comparison between the input shape and the reference shape, $D_{morph}(S^I, S^T_R)$, is done on-line for every new input. If, during the comparisons, a template shape is found that is closer to the query shape than the best match obtained till that point in the search, then the identity of the best match is updated to this template.
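Eqs. (10) and (14) can be folded into a single test: a template $C_X$ may be skipped whenever $|D_{morph}(S^I, S^T_R) - D_{morph}(C_X, S^T_R)| \geq \epsilon$, with $\epsilon$ taken as the smallest distance found so far. A sketch of such a pruned search follows (names and the callable d_morph are ours; the distances to the reference shape are assumed to have been precomputed off-line):

```python
def recognize_with_pruning(query, templates, dist_to_reference, reference, d_morph):
    """Find the template closest to the query, skipping any template whose
    triangle-inequality lower bound (Eqs. (10) and (14)) already exceeds the
    best distance found so far.
      templates         -- list of template shapes
      dist_to_reference -- precomputed D_morph(template, reference) per template
      d_morph           -- callable computing D_morph between two shapes
    """
    query_to_ref = d_morph(query, reference)      # one on-line comparison
    best_template, best_dist = None, float("inf")
    comparisons = 1
    for template, ref_dist in zip(templates, dist_to_reference):
        # Lower bound on D_morph(query, template) from the triangle inequality.
        if abs(query_to_ref - ref_dist) >= best_dist:
            continue  # cannot beat the current best match
        d = d_morph(query, template)
        comparisons += 1
        if d < best_dist:
            best_template, best_dist = template, d
    return best_template, best_dist, comparisons
```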

The speedup obtained by using the above idea depends on the computation of the threshold $\epsilon$. Computing a good value of $\epsilon$ can substantially ameliorate the search complexity. However, the difficulty lies in that computing a good value of $\epsilon$ itself involves searching large parts of the image database. A possible solution to these mutually conflicting goals can be obtained by using the idea of a primordial shape. Every shape $C_Y$ in the database is morphed to a primordial shape $S^T_P$. The primordial shape can be, for instance, a point or a line. The value of the morph $D_{morph}(C_Y, S^T_P)$ then becomes an indicator of the complexity of the shape $C_Y$. An unknown input $S^I$ is first morphed to the primordial shape $S^T_P$. The value of this morph is used to identify a subset $D_S$ of the image database $D$. The subset $D_S$ consists of shapes having the same order of similarity as $S^I$ with respect to $S^T_P$. Formally,

$$D_S = \{C_Y : C_Y \in D,\; |D_{morph}(C_Y, S^T_P) - D_{morph}(S^I, S^T_P)| < \delta\}. \qquad (15)$$

In the above equation, $\delta$ is a parameter whose value is provided. The reference shape can then be selected either as the primordial shape or as the shape closest to the input $S^I$ in terms of the cost of the morph from $S^T_P$. The search space is pruned by computing the dissimilarity measure over the shapes in $D_S$. An alternative to providing the parameter $\delta$ in Eq. (15) is to use a K-nearest neighbor rule to select the templates for which the shape comparison is carried out.

5. Experimental results

Three different sets of experiments were used to test the performance of the proposed system. In each of the experiments, a different data set was employed to test the applicability of the method in different domains. In the first experiment the method was used for the recognition of rigid objects. In this case the test images differed from the templates in terms of Euclidean transformations as well as due to poor thresholding and/or partial occlusions. The second experiment was designed to test the recognition performance on shapes which differ from their corresponding templates both by Euclidean transformations and deformations. The applicability of the method for the recognition of open shapes (handwritten cursive words in an on-line setting) was studied in the final experiment.

5.1. Recognition of rigid objects

The effectiveness of the proposed method was validated in an experiment with 16 objects. An image of each object, taken at an arbitrary location and orientation with respect to the camera, was stored as a template. The imaging plane was assumed to be perpendicular to the optical axis of the camera. In Fig. 5, the objects used in the experiment are shown. For each object, five test images were captured by arbitrarily varying the object pose in terms of translation, rotation and scale. The object contours were extracted after automatic moment-preserving thresholding. The recognition results are summarized in Table 1. In Fig. 6, examples of some noisy shapes from the test set are shown (bottom row), along with their corresponding templates (top row). For the cases shown in columns (a) and (b), the translation of the camera led the object to partially move out of the imaging region. In the first instance (column (a)), the recognition was not affected. For the instance shown in column (b), a misrecognition occurred. The shape variations shown in columns (c) and (d) were due to poor thresholding. Both instances were identified correctly.

Fig. 5. Examples of rigid shapes.

Fig. 6. Examples of noisy shapes used in the recognition experiment.

Table 1
Recognition results for rigid objects

Templates   Test shapes   Correctly recognized   Misrecognized   Recognition rate
16          80            79                     1               98.75%

The average values of the dissimilarity measure over the test cases for each shape are presented in Tables 2 and 3. The lowest value of the morph for each test set is underlined. In Table 4, we present the results for pruning the database search based on the use of the triangle inequality and the primordial shape. The test set used in this experiment is the same as the one for which the recognition values are presented in Tables 2 and 3. In this experiment, the value of the morph $D_{morph}(S^T_i, S^T_P)$ between each template shape $S^T_i$ and the primordial shape $S^T_P$ was computed off-line, along with the morph values between the templates. In the on-line recognition phase, each input was initially morphed to the primordial shape. The value of this morph, $D_{morph}(S^I, S^T_P)$, was then compared with the values of the morphs $D_{morph}(S^T_i, S^T_P)$. A selection criterion (K-nearest neighbors, denoted henceforth as k-NN) was used to define the subset of template shapes from which the identity of the input was established using the constraints defined in Eq. (10) and Eq. (14). The initial reference shape $S^T_R$ for each subset of templates was defined to be the template which was closest to the input in terms of the distance from the primordial shape.

Table 2
Average degree of morphing values (templates 1–8)

Templates       1        2        3        4        5        6        7        8
Test shapes
1            257.87   748.86   598.12   672.65   710.90   862.61   648.74   510.66
2            955.50   123.77   413.30   705.64   311.31   789.61   576.60   487.28
3            705.86   458.35   218.21   691.19   418.22   810.12   535.62   457.82
4            512.91   561.29   656.64    64.45   302.67  1296.70   125.87   605.22
5            612.82   272.64   414.44   341.25    91.00  1038.10   360.85   448.81
6            997.98   788.57   651.96  1269.90   966.67   249.26  1079.30   804.86
7            579.06   432.03   513.92   207.46   325.58  1044.50    90.73   536.59
8            560.71   392.93   446.17   725.53   411.54   717.71   624.52   146.35
9            454.38   517.94   556.84   119.63   353.85  1169.20   155.61   523.22
10           637.74   622.64   658.76   193.19   474.10  1282.70   199.74   558.12
11           705.45   233.01   325.90   639.79   210.34   767.89   478.65   472.54
12           919.27   265.60   331.16   813.49   390.15   710.22   641.96   543.37
13           761.71   473.08   439.39  1034.40   592.99   534.53   838.42   495.18
14           728.36   347.26   444.69   394.56   377.66   986.59   323.42   412.02
15           815.73   355.02   602.02   404.27   581.37   918.63   278.25   380.51
16           771.66   210.32   324.55   689.60   270.57   864.95   530.77   396.49

Table 3
Average degree of morphing values (templates 9–16)

Templates       9       10       11       12       13       14       15       16
Test shapes
1            585.11   686.62   671.31   791.66   744.59   687.49   872.89   753.69
2            677.40   745.02   247.05   231.22   517.49   428.31   443.33   208.07
3            621.80   663.35   301.18   290.54   395.78   485.72   617.66   331.91
4            148.10   215.85   531.18   755.18   988.87   342.45   378.89   609.38
5            375.89   475.28   199.65   381.70   586.18   338.07   568.63   275.92
6           1137.70  1178.50   771.63   628.56   467.02   938.76   924.05   776.21
7            205.02   253.35   422.09   604.21   768.88   288.72   341.82   415.04
8            636.38   636.61   344.73   437.92   399.27   477.11   442.34   353.25
9             39.08   101.14   467.97   761.95   892.65   281.38   410.54   546.85
10           129.43    64.38   589.40   852.05   952.88   257.22   358.76   645.72
11           573.85   631.73   137.77   258.50   405.93   407.27   527.82   197.93
12           815.06   868.29   282.65    92.01   373.43   549.18   569.08   247.04
13           908.83   969.32   420.80   364.95   180.63   687.73   614.99   504.04
14           331.88   240.05   413.58   566.94   674.91    70.94   280.58   420.11
15           425.85   407.73   501.38   553.58   592.91   331.76    72.19   413.18
16           642.01   697.38   223.61   244.67   438.63   379.38   502.87   109.32

Table 4
Performance of the search pruning strategy

Test shape   $D_{morph}(S^I, S^T_P)$   Clustering rule   Template subset   Shape identity   Number of comparisons   Correct recognition
1            1465.28                   3-NN              {1, 6, 13}        1                2                       Yes
2             499.55                   3-NN              {5, 2, 16}        2                3                       Yes
3             927.31                   3-NN              {3, 8, 10}        3                3                       Yes
4             781.96                   3-NN              {4, 7, 9}         4                3                       Yes
5             485.80                   3-NN              {5, 2, 16}        5                2                       Yes
6            1608.85                   3-NN              {6, 1, 13}        6                2                       Yes
7             785.70                   3-NN              {4, 9, 7}         7                3                       Yes
8             956.54                   3-NN              {3, 8, 10}        8                3                       Yes
9             800.26                   3-NN              {9, 4, 7}         9                3                       Yes
10            850.76                   3-NN              {10, 9, 4}        10               1                       Yes
11            616.07                   3-NN              {11, 16, 14}      11               3                       Yes
12            730.78                   3-NN              {15, 12, 7}       12               3                       Yes
13           1106.57                   3-NN              {13, 8, 3}        13               3                       Yes
14            680.74                   3-NN              {14, 15, 12}      14               1                       Yes
15            731.17                   3-NN              {15, 12, 7}       15               3                       Yes
16            566.52                   3-NN              {16, 11, 5}       16               2                       Yes
Average number of comparisons: 2.50

For the test set considered in Table 4, an average of 2.50 shape comparison operations of complexity $O(mn)$ were required for a recognition rate of 100%. In comparison, the general case where the query is compared to each of the templates requires 15 such comparisons. In the given experiment, the correct template for each test set was in the list of templates obtained by using the 3-NN rule. This, along with the use of Eqs. (10) and (14), led to the reduction in the number of comparisons. With an increase in the size of the image database, a rigid clustering criterion like k-NN may not guarantee the inclusion of the correct template in the subset of shapes on which the search is performed. The use of flexible clustering criteria is likely to provide more robust results than using fixed criteria or thresholds.

5.2. Recognition of deformable shapes

A collection of 15 shapes was used in this experiment. Amongst these shapes were hand-drawn outlines of industrial tools and objects, sketches of natural objects, and contours from medical images. A database of 60 sample sketches (four per template) was collected from a group of users. The aim of the experiment was to test the ability of the system to retrieve objects having the same visual form as the sketch irrespective of the shape variations introduced during the drawing. The users were shown each template prior to the collection of the respective test samples. However, during the actual drawing phase no references were allowed to the template. This step curtailed excessive drawing variations while ensuring that the recognition problem was not trivialized. No other constraints were imposed on the drawings. Some examples of hand-drawn sketches used as templates (in black borders), and corresponding sample sketches used as test shapes, are shown in Fig. 7. The recognition results for the shapes used in this experiment are reported in Table 5. For most of the test shapes, the proposed system exhibited good tolerance to shape variations. In Fig. 8, we present an example of content-based retrieval based on user sketches. The template is an industrial object and the query image is its hand-drawn sketch. In this particular example, the query image differed from the template in terms of translations, rotations and deformations. In the figure, the intermediate images of the morph show the rectification of rotations and deformations.

Fig. 7. Examples of deformable shapes.

Fig. 8. Content-based retrieval by morphing a user sketch to the correct database shape.

Table 5
Experimental results

Templates   Test shapes   Correct classifications   Misclassifications   Success rate
15          60            56                        4                    93.33%

5.3. Recognition of open shapes

In this experiment, the method was used for the recognition of unconstrained cursive handwriting in a user-dependent, on-line setting. Four users participated in the experiment. A vocabulary of ten randomly selected words was used (see Table 6). Each user provided one template sample and four test samples for each word. The data was collected using a WACOM UD-0608R tablet with an inking stylus. Table 7 contains the recognition results for each of the users.

Table 6
List of words used in the handwriting recognition experiment

adventure   banana     bookshelf   cannon   flywheel
guard       landmark   mad         peach    tooth

Table 7
Test results for cursive words

Users   Reference words   Test words   Correctly recognized   Recognition rate
1       10                40           40                     100.0%
2       10                40           36                     90.0%
3       10                40           37                     92.5%
4       10                40           39                     97.5%

Examples of correct morphing for the words adventure and guard, both taken from user 4, are shown in Fig. 9. In the case of the word guard, the reader may note the poor segmentation of the descender stroke for the letter g. The resultant loss of information in this case is similar to instances where parts of a shape get occluded. Correct recognition in the given example was facilitated by the features detected in the rest of the word as well as the ability of the recognition method to handle deformations. It is worthwhile to contrast this case with the one in Fig. 10, where a misrecognition occurred. The test sample in this case was written at a high speed and with insufficient pressure on the tablet. The curve describing this word, therefore, consisted of a small number of unevenly distributed points. This caused the segmentation algorithm to perform poorly in terms of capturing important feature points. Consequently, the cost of the morph to the template cannon was lower than that to the correct template (banana), which was the second lowest.

Fig. 9. Morphs of the words adventure and guard to their respective templates kept in the database (user 4). The input words are shown with the segmentation points superimposed.

Fig. 10. Morph of the word banana to the template for the word cannon (user 3), which erroneously produces the best match. The input word is shown with the segmentation points superimposed.

The results presented in this section are conceptually similar to the more detailed experiments described in a previous paper [34]. The basic differences lie in the use of a different dissimilarity measure, a different formulation of the shape-morphing process, and in that the preprocessing steps of rotation and slant normalization were omitted in the present case. While the proposed method is invariant to rotations, changes in the slant of letters within a word were treated as deformations during the recognition process.

6. Conclusions and future work

In this paper, we have described a technique for comparing shape similarity based on quantifying the morph of one shape to another. Each shape is represented by a polygonal approximation of its contour. Shape morphing occurs by the stretching and bending of the contours. Quantification of the morph is obtained by computing the incremental energy spent in deforming one shape to another as given by a physics-based model. The recognition methodology uses a pruning scheme, based on the mathematical properties of the morph, to significantly reduce the number of on-line shape comparisons during the recognition phase.

The proposed approach has been applied for the recognition of both rigid and deformable shapes in different application domains. The salient properties and advantages of the proposed method include:

1. Invariance of the method to translations, rotations and scale changes.
2. The method has the properties of a metric.
3. Its applicability to convex as well as non-convex shapes.
4. It can be used for the recognition of open shapes like letters and cursive words.
5. The method has low computational complexity and it is intuitive.

Due to its ability to handle deformations, the proposed recognition paradigm can be used in retrieval-by-content systems or in applications like pen-based computing where on-line recognition of deformable shapes is required. Another significant attribute of the proposed method is that the intermediate images in the morph describe the shape and pose transformation needed to align the input and the target images. The transformations in the morph plane are relative to a static virtual camera. They can, however, be interpreted in terms of a mobile real camera. In such a case, the pose transformation described by the virtual images can be used to control the camera motion so that the real views of the object being servoed correspond to the virtual views generated by the morph. Based on the above idea, we have obtained promising results in controlling robotic interactions like positioning and grasping by using image morphing [25,26].

An important constraint on the performance of the system is its dependence on the segmentation algorithm. In particular, we have observed that segmentation algorithms that lead to inconsistent point placement may lead to degradation of the recognition performance even though the error during shape segmentation is small. We have looked at two segmentation strategies that have reasonable performance in terms of consistent point placement. Better segmentation strategies, in the above sense, will improve the recognition performance. Since the proposed approach is based on matching contour descriptions, it performs poorly for shapes where the primary difference in appearance is due to internal topological features or other attributes like texture or color. Furthermore, since contour description is inherently sensitive to illumination and/or shadowing, the performance of the proposed technique will degrade in the presence of shadows and poor control of illumination. In the current experiments, a single arbitrarily selected image of each object was used as a template. Improvements in the recognition performance can be expected either by optimizing the choice of the template or by using multiple templates for each shape class. Extending the present framework to incorporate image attributes like color and texture, as well as recognition of 3D objects by morphing 3D shapes, are other possible directions of future work.

7. Summary

A novel method based on shape morphing is proposed for 2D shape recognition. In this framework, the shape of objects is described by using their contours. Shape recognition involves a morph between the contours of the objects being compared. The morph is quantified by using a physics-based formulation. This quantification serves as a dissimilarity measure to find the reference shape most similar to the input. The proposed dissimilarity measure is shown to have the properties of a metric as well as invariance to Euclidean transformations. The recognition paradigm is applicable to both convex and non-convex shapes. Moreover, the applicability of the method is not constrained to closed shapes. Based on the metric properties of the dissimilarity measure, a search strategy is described that obviates an exhaustive search of the template database during recognition experiments. Experimental results on the recognition of various types of shapes are presented.

Acknowledgements

Ioannis Pavlidis participated in the initial phase of this research. The proof of the metric properties has benefited from numerous discussions with Soumyendu Raha. The critique of Richard Voyles was instrumental in developing the segmentation algorithm used for rigid shapes. The presentation of this paper has also improved due to the comments provided by the anonymous reviewer. The authors wish to express their gratitude to each of the aforementioned. The research of Rahul Singh on this project was supported by the National Science Foundation through grants IRI-9410003 and IRI-9502245.

Appendix A. Proof of the metric property

To prove the metric properties of the dissimilarity measure, we consider the stretching energy and the bending energy separately. We have, from the definition of the stretching energy (see Eq. (2)),

$$E_s = |W_s| = k_s \frac{\left| (L_T - L_O)^2 - (L_I - L_O)^2 \right|}{(1 - c_s)\min(L_O, \ldots, L_I, L_T) + c_s \max(L_O, \ldots, L_I, L_T)}. \qquad (\text{A.1})$$


By assuming a hierarchy of shape complexity, as discussed in this paper and the related references, the shape recognition problem can be solved without an exhaustive search of the shape database. The basic idea lies in considering shapes in the database to be derived, by stretching and bending, from a primordial shape, like a point or a line. Given a query shape, the measure of its dissimilarity from the primordial shape can be used to identify a subset of templates similar to the input. Since the original length $L_O$ and the original angle $\phi_O$ are defined to be the length and angle prior to any deformations, they essentially are parameters of the primordial shape. Conceptually, the primordial shape should lie at the origin of the shape hierarchy. Its parameters $L_O$ and $\phi_O$ are therefore defined as

$$L_O \equiv \min(L_O, L_1, \ldots, L_k), \qquad (\text{A.2})$$

$$\phi_O \equiv \min(\phi_O, \phi_1, \ldots, \phi_k), \qquad (\text{A.3})$$

where $L_i$ and $\phi_i$ refer to the length and the angle after the $i$th deformation, respectively. We base the proof of the metric property on the above construction.

Consider the case of the stretching energy. Let the original length be denoted by $L_O$. The lengths after deformation (stretching) at the stages $i$, $j$, and $k$ are correspondingly denoted by $L_i$, $L_j$, and $L_k$, respectively. The stretching energy spent in compressing or expanding a wire of length $L_i$ to the length $L_j$ is represented as $E_s(i, j)$. From the formula of the stretching energy (see Eq. (2)), it may be verified trivially that

$$E_s(i, j) = |W_s(i, j)| \geq 0 \qquad (\text{A.4})$$

and

$$E_s(i, j) = E_s(j, i). \qquad (\text{A.5})$$

To prove the triangle inequality, we consider the following three types of length changes: monotonic decrease, monotonic increase, and non-monotonic length change.

1. Monotonic decrease: Let the lengths of the virtual wire at stages $i$, $j$, and $k$ be $L_i$, $L_j$, and $L_k$, respectively. Furthermore, $L_i \geq L_j \geq L_k$ (monotone decrease in length). To prove the triangle inequality, we need to show that

$$E_s(i, j) + E_s(j, k) \geq E_s(i, k). \qquad (\text{A.6})$$

Expanding each of the above terms by using the formula for the stretching energy, the inequality to be proved becomes

$$k_s \frac{\left| (L_j - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i} + k_s \frac{\left| (L_k - L_O)^2 - (L_j - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i} \geq k_s \frac{\left| (L_k - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i}.$$

The above inequality can be simplified to the following form:

$$\frac{(L_i - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_i} + \frac{(L_j - L_O)^2 - (L_k - L_O)^2}{(1 - c_s) L_O + c_s L_i} \geq \frac{(L_i - L_O)^2 - (L_k - L_O)^2}{(1 - c_s) L_O + c_s L_i}. \qquad (\text{A.7})$$

The right-hand side of the above inequality may be rewritten as

$$\frac{(L_i - L_O)^2 - (L_k - L_O)^2}{(1 - c_s) L_O + c_s L_i} \equiv \frac{(L_i - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_i} + \frac{(L_j - L_O)^2 - (L_k - L_O)^2}{(1 - c_s) L_O + c_s L_i}. \qquad (\text{A.8})$$

By substituting the above in the RHS of (A.7), we obtain an equality, and thus the proof.

2. Monotonic increase: Let the lengths of the virtual wire at stages $i$, $j$, and $k$ be $L_i$, $L_j$, and $L_k$, respectively. In this case, $L_i \leq L_j \leq L_k$. To prove the triangle inequality, we need to show that

$$E_s(i, j) + E_s(j, k) \geq E_s(i, k). \qquad (\text{A.9})$$

Expanding the above by using the formula for the stretching energy and simplifying for the absolute values, we get

$$k_s \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_j} + k_s \frac{(L_k - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_k} \geq k_s \frac{(L_k - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_k}. \qquad (\text{A.10})$$

Rewriting the right-hand side of the above inequality, we get

$$k_s \frac{(L_k - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_k} \equiv k_s \frac{(L_k - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_k} + k_s \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_k}. \qquad (\text{A.11})$$

Substituting the above on the right-hand side of inequality (A.10), we have

$$k_s \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_j} + k_s \frac{(L_k - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_k} \geq k_s \frac{(L_k - L_O)^2 - (L_j - L_O)^2}{(1 - c_s) L_O + c_s L_k} + k_s \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_k}. \qquad (\text{A.12})$$

The above inequality is valid iff

$$\frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_j} \geq \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_k}. \qquad (\text{A.13})$$

Clearly, the above is true because $L_j \leq L_k$. □

3. Non-monotonic length changes: Consider the case where the lengths of the virtual wire at stages $i$, $j$, and $k$ are $L_i$, $L_j$, and $L_k$, respectively. Furthermore, let $L_k \leq L_i \leq L_j$. Rewriting the triangle inequality in terms of the stretching energy, we obtain the following inequality, which needs to be proved:

$$k_s \frac{\left| (L_j - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_j} + k_s \frac{\left| (L_k - L_O)^2 - (L_j - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_j} \geq k_s \frac{\left| (L_k - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i}. \qquad (\text{A.14})$$

For the term on the RHS of the above inequality, we have

$$k_s \frac{\left| (L_k - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i} \leq k_s \frac{\left| (L_k - L_O)^2 - (L_j - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i} + k_s \frac{\left| (L_j - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i}. \qquad (\text{A.15})$$

Substituting the above in the RHS of inequality (A.14), we obtain the following inequality, the validity of which we need to prove:

$$k_s \frac{\left| (L_j - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_j} + k_s \frac{\left| (L_k - L_O)^2 - (L_j - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_j} \geq k_s \frac{\left| (L_k - L_O)^2 - (L_j - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i} + k_s \frac{\left| (L_j - L_O)^2 - (L_i - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_i}. \qquad (\text{A.16})$$

The above inequality, after simplification for the absolute values and regrouping of terms, reduces to

$$\frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_j} - \frac{(L_j - L_O)^2 - (L_i - L_O)^2}{(1 - c_s) L_O + c_s L_i} \geq \frac{(L_j - L_O)^2 - (L_k - L_O)^2}{(1 - c_s) L_O + c_s L_i} - \frac{\left| (L_j - L_O)^2 - (L_k - L_O)^2 \right|}{(1 - c_s) L_O + c_s L_j}. \qquad (\text{A.17})$$

The above may be further simplified as

$$\left[ (L_j - L_O)^2 - (L_i - L_O)^2 \right] \left[ \frac{(1 - c_s) L_O + c_s L_i - (1 - c_s) L_O - c_s L_j}{\bigl((1 - c_s) L_O + c_s L_j\bigr)\bigl((1 - c_s) L_O + c_s L_i\bigr)} \right] \geq \left[ (L_j - L_O)^2 - (L_k - L_O)^2 \right] \left[ \frac{(1 - c_s) L_O + c_s L_j - (1 - c_s) L_O - c_s L_i}{\bigl((1 - c_s) L_O + c_s L_i\bigr)\bigl((1 - c_s) L_O + c_s L_j\bigr)} \right], \qquad (\text{A.18})$$

from where it follows that

$$\frac{c_s (L_i - L_j)\left[ (L_j - L_O)^2 - (L_i - L_O)^2 \right]}{\bigl((1 - c_s) L_O + c_s L_j\bigr)\bigl((1 - c_s) L_O + c_s L_i\bigr)} \geq \frac{c_s (L_j - L_i)\left[ (L_j - L_O)^2 - (L_k - L_O)^2 \right]}{\bigl((1 - c_s) L_O + c_s L_j\bigr)\bigl((1 - c_s) L_O + c_s L_i\bigr)}.$$

For the above to hold true, we must have

$$(L_j - L_O)^2 - (L_i - L_O)^2 \geq (L_k - L_O)^2 - (L_j - L_O)^2 \qquad (\text{A.19})$$

or, equivalently,

$$(L_j - L_O)^2 \geq \frac{(L_k - L_O)^2 + (L_i - L_O)^2}{2}. \qquad (\text{A.20})$$

To prove the above, we note that from the initial conditions of the non-monotonic length changes we have $L_i \geq L_k$; therefore,

$$(L_k - L_O)^2 \leq (L_i - L_O)^2. \qquad (\text{A.21})$$

Substituting $(L_i - L_O)^2$ for $(L_k - L_O)^2$ on the RHS of inequality (A.20), we get

$$(L_j - L_O)^2 \geq \frac{2 (L_i - L_O)^2}{2}. \qquad (\text{A.22})$$

Clearly, the above inequality is true because $L_i \leq L_j$, hence the proof. For other cases of non-monotonic changes, the triangle inequality may be proved likewise.

The proof of the metric property for the bending energy is based on similar ideas. In the following, we denote the angle before any deformations by $\phi_O$ and the angles after bending at the $i$, $j$, and $k$ instances by $\phi_i$, $\phi_j$, and $\phi_k$, respectively. The bending energy spent in changing the angle $\phi_i$ to $\phi_j$ is denoted by $E_b(\phi_i, \phi_j)$.

From the formulation of the bending energy in Eq. (3), it is straightforward to note that

1. $E_b(\phi_i, \phi_j) \geq 0$.
2. $E_b(\phi_i, \phi_j) = E_b(\phi_j, \phi_i)$.

For the proof of the triangle inequality, we proceed on lines similar to the ones followed for the stretching energy, by considering angle changes that are monotonic and non-monotonic. For monotonic changes, we present the proofs for both decreasing and increasing angular changes. For non-monotonic angular changes, we present the proofs for two cases. The other cases may be proved analogously.

1. Monotonically decreasing angle changes: Let the angles at the instances $i$, $j$, and $k$ be $\phi_i$, $\phi_j$, and $\phi_k$, respectively. Further, let $\phi_i \geq \phi_j \geq \phi_k$. For the triangle inequality we need to prove that

$$E_b(\phi_i, \phi_j) + E_b(\phi_j, \phi_k) \geq E_b(\phi_i, \phi_k) \qquad (\text{A.23})$$

$$\Rightarrow\; k_b \left| (\phi_j - \phi_O)^2 - (\phi_i - \phi_O)^2 \right| + k_b \left| (\phi_k - \phi_O)^2 - (\phi_j - \phi_O)^2 \right| \geq k_b \left| (\phi_k - \phi_O)^2 - (\phi_i - \phi_O)^2 \right|. \qquad (\text{A.24})$$

Simplifying, we have

$$(\phi_i - \phi_O)^2 - (\phi_j - \phi_O)^2 + (\phi_j - \phi_O)^2 - (\phi_k - \phi_O)^2 \geq (\phi_i - \phi_O)^2 - (\phi_k - \phi_O)^2. \qquad (\text{A.25})$$

The above gives us an equality, thus proving the validity of the triangle inequality.

2. Monotonically increasing angle changes: For this case we consider the angle changes to have the following relationship: $\phi_i \le \phi_j \le \phi_k$. We thus have, for the triangle inequality,

$$
k_b\bigl|(\phi_j - \phi_O)^2 - (\phi_i - \phi_O)^2\bigr|
+ k_b\bigl|(\phi_k - \phi_O)^2 - (\phi_j - \phi_O)^2\bigr|
\ge
k_b\bigl|(\phi_k - \phi_O)^2 - (\phi_i - \phi_O)^2\bigr|.
\tag{A.26}
$$

Taking into consideration the relationships between the angles and accounting for the absolute values, we get

$$
(\phi_j - \phi_O)^2 - (\phi_i - \phi_O)^2 + (\phi_k - \phi_O)^2 - (\phi_j - \phi_O)^2
\ge
(\phi_k - \phi_O)^2 - (\phi_i - \phi_O)^2.
\tag{A.27}
$$

On simplification, we get an equality. □

3. Non-monotonic angle changes:

• Let $\phi_i \le \phi_k \le \phi_j$. The triangle inequality

$$
E_b(\phi_i, \phi_j) + E_b(\phi_j, \phi_k) \ge E_b(\phi_i, \phi_k)
\tag{A.28}
$$

may then be expanded as

$$
(\phi_j - \phi_O)^2 - (\phi_i - \phi_O)^2 + (\phi_j - \phi_O)^2 - (\phi_k - \phi_O)^2
\ge
(\phi_k - \phi_O)^2 - (\phi_i - \phi_O)^2.
\tag{A.29}
$$

This inequality then simplifies to

$$
(\phi_j - \phi_O)^2 \ge (\phi_k - \phi_O)^2.
\tag{A.30}
$$

Since $\phi_k \le \phi_j$, the above inequality holds.

• Consider the case where $\phi_k \le \phi_i \le \phi_j$. The triangle inequality then takes the following form:

$$
(\phi_j - \phi_O)^2 - (\phi_i - \phi_O)^2 + (\phi_j - \phi_O)^2 - (\phi_k - \phi_O)^2
\ge
(\phi_i - \phi_O)^2 - (\phi_k - \phi_O)^2.
\tag{A.31}
$$

Simplifying, we have

$$
(\phi_j - \phi_O)^2 \ge (\phi_i - \phi_O)^2,
\tag{A.32}
$$

which is true, since $\phi_j \ge \phi_i$. □
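As with the stretching energy, these properties are easy to exercise numerically. The paper's Eq. (3) is not reproduced in this appendix, so the following Python sketch assumes the bending energy in the form it takes in (A.24) and (A.26), $E_b(\phi_a, \phi_b) = k_b\,|(\phi_b - \phi_O)^2 - (\phi_a - \phi_O)^2|$, and checks non-negativity, symmetry, and the triangle inequality on random angle triples.

```python
import random

K_B = 1.0  # bending constant k_b; any positive value works for this check

def bending_energy(phi_a, phi_b, phi_O):
    """Bending energy in the form used in (A.24) and (A.26):
    E_b(phi_a, phi_b) = k_b * |(phi_b - phi_O)^2 - (phi_a - phi_O)^2|."""
    return K_B * abs((phi_b - phi_O) ** 2 - (phi_a - phi_O) ** 2)

random.seed(0)
for _ in range(100_000):
    phi_O, phi_i, phi_j, phi_k = (random.uniform(-3.2, 3.2) for _ in range(4))
    e_ij = bending_energy(phi_i, phi_j, phi_O)
    e_jk = bending_energy(phi_j, phi_k, phi_O)
    e_ik = bending_energy(phi_i, phi_k, phi_O)
    assert e_ij >= 0.0                                   # non-negativity
    assert e_ij == bending_energy(phi_j, phi_i, phi_O)   # symmetry
    assert e_ij + e_jk >= e_ik - 1e-12                   # triangle inequality
print("all three metric properties held for every sampled angle triple")
```

In this assumed form, $E_b$ is the absolute difference of the single quantity $(\phi - \phi_O)^2$ evaluated at the two angles, so the triangle inequality reduces to the ordinary triangle inequality for real numbers and the check is expected to pass for any $k_b > 0$.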

References

[1] R.L. Anderson, Real-time gray-scale video processing using a moment generating chip, IEEE J. Robotics Automat. 1 (1985) 70-85.
[2] C.C. Lin, R. Chellapa, Classification of partial 2D shapes using Fourier descriptors, IEEE Trans. Pattern Anal. Mach. Intell. 9 (1987) 686-690.
[3] R.L. Kashyap, R. Chellapa, Stochastic models for closed boundary analysis: representation and reconstruction, IEEE Trans. Inform. Theory 27 (5) (1981) 627-637.
[4] A. Pentland, R.W. Picard, S. Sclaroff, Photobook: content-based manipulation of image databases, Int. J. Comput. Vision 18 (3) (1996) 233-254.
[5] M.J. Swain, D.H. Ballard, Color indexing, Int. J. Comput. Vision 7 (1) (1991) 11-32.
[6] T.F. Syeda-Mahmood, Data and model-driven selection using color regions, Int. J. Comput. Vision 21 (1/2) (1997) 9-36.
[7] R.C. Bolles, R.A. Cain, Recognizing and locating partially visible objects: the local-feature focus method, Int. J. Robotics Res. 1 (1982) 57-82.
[8] S. Sclaroff, A.P. Pentland, Modal matching for correspondence and recognition, IEEE Trans. Pattern Anal. Mach. Intell. 17 (6) (1995) 545-561.
[9] H. Murase, S.K. Nayar, Visual learning and recognition of 3-D objects from appearance, Int. J. Comput. Vision 14 (1995) 5-24.
[10] M. Brady, H. Asada, Smoothed local symmetries and their implementation, Int. J. Robotics Res. 3 (1984) 36-61.
[11] T. Phillips, A shrinking technique for complex object decomposition, Pattern Recognition Lett. 3 (1985) 271-277.
[12] J. Chen, J.A. Ventura, Optimization models for shape matching of nonconvex polygons, Pattern Recognition 28 (6) (1995) 863-877.
[13] L. Huang, M.J. Wang, Efficient shape matching through model-based shape recognition, Pattern Recognition 29 (2) (1996) 207-215.
[14] I. Tchoukanov, R. Safaee-Rad, B. Benhabib, K.C. Smith, A new boundary-based shape recognition technique, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1992, pp. 1030-1037.
[15] P. Cox, H. Maitre, M. Minoux, C. Ribeiro, Optimal matching of convex polygons, Pattern Recognition Lett. 9 (1989) 327-334.
[16] P.J. van Otterloo, A Contour-Oriented Approach to Shape Analysis, Prentice-Hall, Hemel Hempstead, 1991.


[17] E.M. Arkin, L.P. Chew, D.P. Huttenlocher, K. Kedem, J.S.B. Mitchell, An efficiently computable metric for comparing polygonal shapes, IEEE Trans. Pattern Anal. Mach. Intell. 13 (3) (1991) 209-216.
[18] W. Rucklidge, Efficient Visual Recognition Using the Hausdorff Distance, Springer, Berlin, 1996.
[19] R. Azencott, F. Coldefy, L. Younes, A distance for elastic matching in object recognition, in: Proceedings of the 13th International Conference on Pattern Recognition, Vol. 1, 1996, pp. 687-691.
[20] A.D. Bimbo, P. Pala, Visual image retrieval by elastic matching of user sketches, IEEE Trans. Pattern Anal. Mach. Intell. 19 (2) (1997) 121-132.
[21] S.E. Sclaroff, Modal matching: a method for describing, comparing, and manipulating digital signals, Ph.D. Thesis, School of Architecture and Planning, Massachusetts Institute of Technology, 1995.
[22] F.L. Bookstein, Principal warps: thin-plate splines and the decomposition of deformations, IEEE Trans. Pattern Anal. Mach. Intell. 11 (6) (1989) 567-585.
[23] A. Yuille, P. Hallinan, Deformable templates, in: A. Blake, A. Yuille (Eds.), Active Vision, MIT Press, Cambridge, MA, 1992, pp. 21-38.
[24] K. Hirata, T. Kato, Query by visual example, content-based image retrieval, in: A. Pirotte, C. Delobel, G. Gottlob (Eds.), Advances in Database Technology - EDBT '92, Springer, Berlin, 1992.
[25] R. Singh, R.M. Voyles, D. Littau, N.P. Papanikolopoulos, Grasping real objects using virtual images, in: Proceedings of the IEEE Conference on Decision and Control, 1998.
[26] R. Singh, R.M. Voyles, D. Littau, N.P. Papanikolopoulos, Pose alignment of an eye-in-hand system using image morphing, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1998.
[27] H. Freeman, L. Davis, A corner-finding algorithm for chain-coded curves, IEEE Trans. Comput. 26 (1977) 297-303.
[28] T. Pavlidis, S.T. Horowitz, Segmentation of plane curves, IEEE Trans. Comput. 23 (1974) 860-870.
[29] B.K. Ray, K.S. Ray, Determination of optimal polygon from digital curve using L1 norm, Pattern Recognition 26 (4) (1993) 505-509.
[30] J.J. Brault, R. Plamondon, Segmenting handwritten signatures at their perceptually important points, IEEE Trans. Pattern Anal. Mach. Intell. 15 (9) (1993) 953-957.
[31] M.A. Fischler, R.C. Bolles, Perceptual organization and curve partitioning, IEEE Trans. Pattern Anal. Mach. Intell. 8 (1) (1986) 100-105.
[32] B.K. Ray, K.S. Ray, An algorithm for detection of dominant points and polygonal approximation of digitized curves, Pattern Recognition Lett. 13 (12) (1992) 849-856.
[33] P. Zhu, P.M. Chirlian, On critical point detection of digital shapes, IEEE Trans. Pattern Anal. Mach. Intell. 17 (8) (1995) 737-748.
[34] I. Pavlidis, R. Singh, N.P. Papanikolopoulos, On-line handwriting recognition using physics-based shape metamorphosis, Pattern Recognition 31 (11) (1998) 1589-1600.
[35] I. Pavlidis, N.P. Papanikolopoulos, A curve segmentation algorithm that automates deformable-model based target tracking, Technical Report TR 96-041, University of Minnesota, 1996.
[36] T.W. Sederberg, E. Greenwood, A physically based approach to 2D shape blending, Comput. Graphics 26 (2) (1992) 25-34.
[37] J. Barros, J. French, W. Martin, P. Kelly, M. Cannon, Using the triangle inequality to reduce the number of comparisons required for similarity-based retrieval, in: SPIE, Storage and Retrieval for Still Images and Video Databases, Vol. 2670, 1996, pp. 392-403.
[38] W.A. Burkhard, R.M. Keller, Some approaches to best-match file searching, Commun. ACM 16 (4) (1973) 230-236.
[39] M. Shapiro, The choice of reference points in best-match file searching, Commun. ACM 20 (5) (1977) 339-343.

About the Author - RAHUL SINGH received his Master of Science in Engineering degree in Computer Science (with excellence) from the Moscow Power Engineering Institute in 1993, the M.S. in Computer Science from the University of Minnesota in 1997, and the Ph.D. in Computer Science from the University of Minnesota in 1999. Currently he is a scientist at Exelixis Inc. in San Francisco, where he works on molecular shape recognition and its applications in the computational prediction of pharmacologically relevant molecular properties. In addition to the above areas, his research interests include computer vision, image morphing, document image analysis, and applications of virtual reality in vision-based robotics.

About the Author - NIKOLAOS P. PAPANIKOLOPOULOS (S'88-M'93) was born in Piraeus, Greece, in 1964. He received the Diploma degree in Electrical and Computer Engineering from the National Technical University of Athens, Athens, Greece, in 1987, the M.S.E.E. in Electrical Engineering from Carnegie Mellon University (CMU), Pittsburgh, PA, in 1988, and the Ph.D. in Electrical and Computer Engineering from Carnegie Mellon University, Pittsburgh, PA, in 1992. Currently, he is an Associate Professor in the Department of Computer Science at the University of Minnesota. His research interests include computer vision, pattern recognition, and robotics. He has authored or coauthored more than 90 journal and conference papers in the above areas. He was a finalist for the Anton Philips Award for Best Student Paper at the 1991 IEEE Robotics and Automation Conference. Furthermore, he was a recipient of the Kritski fellowship in 1986 and 1987. He was a McKnight Land-Grant Professor at the University of Minnesota for the period 1995-1997 and has received the NSF Research Initiation and Early Career Development Awards.
