Shape recognition from large image libraries by inexact graph matching

11
Shape recognition from large image libraries by inexact graph matching Benoit Huet * , Edwin R. Hancock Department of Computer Science, University of York, York, Y010 5DD, UK Abstract This paper describes a graph-matching technique for recognising line-pattern shapes in large image databases. The methodological contribution of the paper is to develop a Bayesian matching algorithm that uses edge-consistency and node attribute similarity. This information is used to determine the a posteriori probability of a query graph for each of the candidate matches in the database. The node feature-vectors are constructed by computing normalised histograms of pairwise geometric attributes. Attribute similarity is assessed by computing the Bhattacharyya distance between the histograms. Recognition is realised by selecting the candidate from the database which has the largest a posteriori probability. Ó 1999 Elsevier Science B.V. All rights reserved. Keywords: Image retrieval; Graph matching; Line patterns; Sensitivity study 1. Introduction Broadly speaking there are two sources of in- formation that can be tapped in the content-based retrieval of images from large databases. The first of these is to use a compact summary of the image attributes. One of the best known examples here is the attribute histogram originally popularised by Swain (1993) for retrieving colour images from databases. The idea has been extended to texture (Gimelfarb and Jain, 1996), local feature orienta- tion representations (Dorai and Jain, 1995; Ri- goutsos and Hummel, 1995) and the geometry of line patterns (Evans et al., 1993). The alternative to the use of attributes is to retrieve images based on their structural information content (Costa and Shapiro, 1995; Sengupta and Boyer, 1995). Here the basic idea is to recognise relational object de- scriptions by comparing graph-structure. This approach has been successfully used to recall ob- ject-models from large databases of engineering drawings by Costa and Shapiro (1995). Although both structural indexing and attribute histograms have been separately used to recall image data from databases, they have not been used in conjunction. The aim of this paper is to fill this gap in the literature by developing an ecient graph-matching algorithm which draws on attrib- ute histograms to perform structural recognition. In fact, since Barrow and Popplestone (1971) first suggested that relational structures could be used to represent and interpret 2D scenes, there has been considerable interest in the machine vision literature in developing practical graph-matching algorithms (Sanfeliu and Fu, 1983; Shapiro and Haralick, 1985; Gold and Rangarajan, 1996; www.elsevier.nl/locate/patrec Pattern Recognition Letters 20 (1999) 1259–1269 * Corresponding author. E-mail addresses: [email protected] (B. Huet), erh@ cs.york.ac.uk (E.R. Hancock) 0167-8655/99/$ - see front matter Ó 1999 Elsevier Science B.V. All rights reserved. PII: S 0 1 6 7 - 8 6 5 5 ( 9 9 ) 0 0 0 9 3 - 8

Transcript of Shape recognition from large image libraries by inexact graph matching

Shape recognition from large image libraries by inexact graphmatching

Benoit Huet *, Edwin R. Hancock

Department of Computer Science, University of York, York, Y010 5DD, UK

Abstract

This paper describes a graph-matching technique for recognising line-pattern shapes in large image databases. The

methodological contribution of the paper is to develop a Bayesian matching algorithm that uses edge-consistency and

node attribute similarity. This information is used to determine the a posteriori probability of a query graph for each of

the candidate matches in the database. The node feature-vectors are constructed by computing normalised histograms

of pairwise geometric attributes. Attribute similarity is assessed by computing the Bhattacharyya distance between the

histograms. Recognition is realised by selecting the candidate from the database which has the largest a posteriori

probability. Ó 1999 Elsevier Science B.V. All rights reserved.

Keywords: Image retrieval; Graph matching; Line patterns; Sensitivity study

1. Introduction

Broadly speaking there are two sources of in-formation that can be tapped in the content-basedretrieval of images from large databases. The ®rstof these is to use a compact summary of the imageattributes. One of the best known examples here isthe attribute histogram originally popularised bySwain (1993) for retrieving colour images fromdatabases. The idea has been extended to texture(Gimelfarb and Jain, 1996), local feature orienta-tion representations (Dorai and Jain, 1995; Ri-goutsos and Hummel, 1995) and the geometry ofline patterns (Evans et al., 1993). The alternativeto the use of attributes is to retrieve images basedon their structural information content (Costa and

Shapiro, 1995; Sengupta and Boyer, 1995). Herethe basic idea is to recognise relational object de-scriptions by comparing graph-structure. Thisapproach has been successfully used to recall ob-ject-models from large databases of engineeringdrawings by Costa and Shapiro (1995).

Although both structural indexing and attributehistograms have been separately used to recallimage data from databases, they have not beenused in conjunction. The aim of this paper is to ®llthis gap in the literature by developing an e�cientgraph-matching algorithm which draws on attrib-ute histograms to perform structural recognition.In fact, since Barrow and Popplestone (1971) ®rstsuggested that relational structures could be usedto represent and interpret 2D scenes, there hasbeen considerable interest in the machine visionliterature in developing practical graph-matchingalgorithms (Sanfeliu and Fu, 1983; Shapiro andHaralick, 1985; Gold and Rangarajan, 1996;

www.elsevier.nl/locate/patrec

Pattern Recognition Letters 20 (1999) 1259±1269

* Corresponding author.

E-mail addresses: [email protected] (B. Huet), erh@

cs.york.ac.uk (E.R. Hancock)

0167-8655/99/$ - see front matter Ó 1999 Elsevier Science B.V. All rights reserved.

PII: S 0 1 6 7 - 8 6 5 5 ( 9 9 ) 0 0 0 9 3 - 8

Wilson and Hancock, 1997). The main computa-tional issues are how to compare relational de-scriptions when there is signi®cant structuralcorruption (Sanfeliu and Fu, 1983; Shapiro andHaralick, 1985; Wilson and Hancock, 1997) andhow to search for the best match (Gold andRangarajan, 1996). Despite resulting in signi®cantimprovements in the available methodology forgraph matching, there has been little progress inapplying the resulting algorithms to large-scaleobject recognition problems. Most of the algo-rithms developed in the literature are evaluated forthe relatively simple problem of matching a model-graph against a scene known to contain the rele-vant structure. A more realistic problem is that oftaking a large number (maybe thousands) ofscenes and retrieving the ones that best match themodel. Because of the perceived fragility of thegraph-matching process, there has been rather lesse�ort directed at attempting to retrieve shapesusing relational information (Sengupta and Boyer,1995; Huet and Hancock, 1998a) than in the use oflow-level attribute histograms (Swain and Ballard,1991; Schiele and Crowley, 1996).

1.1. Paper outline

Here we aim to ®ll this gap in the literature byusing graph matching as a means of retrieving theshape from a large database that most closely re-sembles a query shape. Although the indexationimages in large databases is a problem of currenttopicality in the computer vision literature (Swain,1993; Niblack et al., 1993; Pentland et al., 1994;Gevers and Smeulders, 1992; Picard, 1993), thework presented in this paper is more ambitious.Firstly, we adopt a structural abstraction of theshape recognition problem and match using at-tributed relational graphs. Each shape in our da-tabase is a pattern of line segments. The structuralabstraction is a nearest neighbour graph for thecentre-points of the line segments. In addition, weexploit attribute information for the line patterns.Here the geometric arrangement of the line seg-ments is encapsulated using a histogram ofEuclidean invariant pairwise (binary) attributes.For each line segment in turn we construct anormalised histogram of relative angle and length

with the remaining line segments in the pattern.These histograms capture the global geometriccontext of each line segment. Moreover, we in-terpret the pairwise geometric histograms as mea-surement densities for the line segments which wecompare using the Bhattacharyya distance.

Once we have established the pattern represen-tation, we realise object recognition using aBayesian graph-matching algorithm. This is a two-step process. Firstly, we establish correspondencematches between the individual tokens in thequery pattern and each of the patterns in the da-tabase. The correspondences matches are soughtso as to maximise the a posteriori measurementprobability. Here we use an extension of thegraph-matching technique recently reported byWilson and Hancock (1997) in which we use thecorrespondence matches residing on the edgesrather than the nodes of our attributed relationalgraphs (ARGs) to assess consistency. Once theMAP correspondence matches have been estab-lished, then the second step in our recognitionarchitecture involves selecting the line patternfrom the database which has maximum matchingprobability.

2. MAP framework

Formally our recognition problem is posed asfollows. We abstract the shapes to be recognised asARGs. Each ARG in the database is a triple,G � �VG;EG;AG�, where VG is the set of vertices(nodes), EG is the edge set (EG � VG � VG), and AG

is the set of node attributes. In our experimentalexample, the nodes represent line-structures seg-mented from 2D images. The edges are establishedby computing the N -nearest neighbour graph forthe line-centres. Each node j 2 V is characterisedby a vector of attributes, xj and hence AG �fxj j j 2 V g. In the work reported here the attrib-ute-vector represents the contents of a normalisedpairwise attribute histogram.

The database of line patterns is represented bythe set of ARGs D � fGg. The goal is to retrievefrom the database D, the individual ARG thatmost closely resembles a query patternQ � �VQ;EQ;AQ�. We pose the retrieval process as

1260 B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269

one of associating with the query the graph fromthe database that has the largest a posterioriprobability of match. In other words, the classidentity of the graph which most closely corre-sponds to the query is

xQ � arg maxG02D

P �G0jQ�: �1�

However, since we wish to make a detailed struc-tural comparison of the graphs, rather than com-paring their overall statistical properties, we must®rst establish a set of best-match correspondencesbetween each ARG in the database and the queryQ. The set of correspondences between the query Qand the ARG G at iteration n is a relationf �n�G : VG 7! VQ over the vertex sets of the two graphs.The mapping function consists of a set of Carte-sian pairings between the nodes of the two graphs,i.e. f �n�G � f�a; a�; a 2 VG; a 2 VQg � VG � VQ. Al-though this may appear to be a brute force method,it must be stressed that we view this process ofcorrespondence matching as the ®nal step in the®ltering of the line patterns. We provide more de-tails of practical implementation in the experi-mental section of this paper.

With the correspondences to hand we can re-state our maximum a posteriori probability rec-ognition objective as a two-step process. For eachgraph G in turn, we locate the maximum a poste-riori probability mapping function f �n�G onto thequery Q. The second step is to perform recognitionby selecting the graph whose mapping functionresults in the largest matching probability. Thesetwo steps are succinctly captured by the followingstatement of the recognition condition:

xQ � arg maxG02D

maxf �n�

G0

P f �n�G0

���G0;Q� �: �2�

This global MAP condition is developed into auseful local update formula by applying the Bayesformula to the a posteriori matching probability.The simpli®cation is as follows:

P f �n�G

���G;Q� ��

p AG;AQ f �n�G

���� �P f �n�G

���VG;EG; VQ;EQ

� �P�VG;EG�P�VQ;EQ�

P �G�P �Q� :

�3�

The terms on the right-hand side of the Bayesformula convey the following meaning. The con-ditional measurement density p�AG;AQjf �n�G � mod-els the measurement similarity of the node-sets ofthe two graphs. The conditional probabilityP�f �n�G jEG;EQ� models the structural similarity ofthe two graphs under the current set of corre-spondence matches. The assumptions used in de-veloping our simpli®cation of the a posteriorimatching probability are as follows. Firstly, weassume that the joint measurements are condi-tionally independent of the structure of the twographs provided that the set of correspondencesis known, i.e. P �AG;AQjf �n�G ;EG; VG;EQ; VQ� �P�AG;AQjf �n�G �. Secondly, we assume that there isconditional independence of the two graphs in theabsence of correspondences. In other words, P �VG;EG; VQ;EQ� � P �VQ;EQ�P �VG;EG� and P �G;Q� �P�G�P�Q�. Finally, the graph priors P �VG;EG�,P�VQ;EQ�, P �G� and P �Q� are taken as uniformand are eliminated from the decision makingprocess.

To continue our development, we ®rst focuson the conditional measurement density,p�AG;AQjf �n�G � which models the process of com-paring attribute similarity on the nodes of the twographs. Assuming statistical independence ofnode attributes, the conditional measurementdensity p�AG;AQjf �n�G � can be factorised over theCartesian pairs �a; a� 2 VG � VQ which constitutethe correspondence match f �n�G in the followingmanner:

p AG;AQ f �n�G

���� ��

Y�a;a�2f �n�G

p xa; xa f �n�G �a�����

� a�:

�4�As a result the correspondence matches may beoptimised using a simple node-by-node discreterelaxation procedure. The rule for updating thematch assigned to the node a of the graph G is

f �n�G �a� � arg maxa2VQ

p xa; xa f �n�G �a�����

� a�

� P f �n�G

���EG;EQ

� �: �5�

In order to model the structural consistency ofthe set of assigned matches, we turn to the

B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269 1261

framework recently reported by Finch et al.(1997). This work provides a framework forcomputing graph-matching energies, i.e. the loga-rithms of the matching priors, using the weightedHamming distance between matched cliques. Sincewe are dealing with a large-scale object recognitionsystem, we would like to minimise the computa-tional overheads associated with establishing cor-respondence matches. For this reason, rather thanworking with graph neighbourhoods or cliques, wechose to work with the relational units of thesmallest practical size. In other words we satisfyourself with measuring consistency at the edgelevel. For edge-units, the structural matchingprobability P �f �n�G jVG;EG; VQ;EQ� is computed fromthe formula

ln P f �n�G

���VG;EG; VG;EQ

� ��

X�a;b�2EG

X�a;b�2EQ

ln�1n

ÿ Pe�s�n�a;as�n�b;b

� ln Pe 1�ÿ s�n�a;as�n�b;b

�o; �6�

where Pe is the probability of an error appearingon one of the edges of the matched structure. Thes�n�a;a are assignment variables which are used torepresent the current state of match and conveythe following meaning:

s�n�a;a � 1 if f �n�G �a� � a;0 otherwise:

��7�

With this notation, the rule for updating the cor-respondence match between the query graph Qand the graph indexed G in the database is

f �n�1�G �a� � arg maxa2VQ

ln p xa; xa f �n�1�G �a�����24 � a

��X�a;b�2EG

X�a;b�2EQ

ln �1n

ÿ Pe�s�n�a;as�n�b;b

� ln Pe 1�ÿ s�n�a;as�n�b;b

�o35: �8�

This process is applied in parallel to the nodes ofthe graph G for a ®xed number of iterations. Thetrade-o� between the number of iterations re-

quired and the retrieval quality is one of the sub-jects of the experimental study presented inSection 4.

3. Histogram-based consistency

We now furnish some details of the shape re-trieval task used in our experimental evaluation ofthe recognition method. In particular, we focus onthe problem of recognising 2D line patterns in amanner which is invariant to rotation, translationand scale. The raw information available for eachline segment are its orientation (angle with respectto the horizontal axis) and its length (see Fig. 1).To illustrate how the Euclidean invariant pairwisefeature attributes are computed, suppose that wedenote the line segments associated with the nodesindexed a and b by the vectors va and vb, respec-tively. We use two pairwise attributes. The ®rst ofthese is the relative angle given by

ha;b � arccosva � vb

jvaj jvbj� �

:

The second is the normalised length ratio be-tween the oriented baseline vector va and the vec-tor v0 joining the end (b) of the baseline segment(ab) to the intersection of the segment pair (cd).

#a;b � 11

2

��� Dib

Dab

�:

The two attributes are used as a feature-vectorza;b � �ha;b; #a;b�T for the line-segment pair.

Fig. 1. Geometry for shape representation.

1262 B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269

Each node in the shape graph, i.e. each line inthe pattern, is represented by the histogram of itspairwise geometric attributes to the remaininglines in the pattern. This histogram can be thoughtof as a local estimate of the probability distribu-tion for the pairwise attributes. Accordingly, theangle and position attributes ha;b and #a;b are bin-ned in a histogram. Suppose that Sa�l; m�� f�a; b� j ha;b 2 Al ^ #a;b 2 Rm ^ b 2 VDg is the setof nodes whose pairwise geometric attributes withthe node a are spanned by the range of directedrelative angles Al and the relative position attrib-ute range Rm. The contents of the histogram binspanning the two attribute ranges are given byHa�l; m� � jSa�l; m�j. Each histogram contains nA

relative angle bins and nR length ratio bins. Thenormalised geometric histogram bin-entries arecomputed as follows:

ha�l; m� � Ha�l; m�PnA

l0�1

PnR

m0�1 Ha�l; m� : �9�

The probability of match between the histogramsis computed using the Bhattacharyya coe�cientBa;a. The relationship between the coe�cient andthe a posteriori probability is as follows:

P f �n�G �a��

� a���xa; xa

��

PnA

l�1

PnR

m�1

������������������������������ha�l; m�ha�l; m�

pPa2VQ

PnA

l0�1

PnR

m0�1

������������������������������ha�l; m�ha�l; m�

p� exp�ÿBa;a�: �10�

With this modelling ingredient, and using thecorrespondence matches delivered by the graph-matching scheme outlined in Eq. (8),the conditionfor recognition is

xQ � arg maxG02D

X�a;b�2E0G

X�a;b�2EQ

nÿ Ba;a ÿ Bb;b

� ln �1ÿ Pe�s�n�a;as�n�b;b � ln Pe 1ÿ s�n�a;as�n�b;b

� �o;

�11�

where the assignment variables are taken at theterminal iteration of the correspondence-matchingprocess.

4. Alternative retrieval algorithms

The aim of this paper is to compare the graph-based recognition algorithm with some alterna-tives. The two algorithms use less complex graphrepresentations. The most straightforward uses aglobal histogram of the attributes on the edges of anearest neighbour graph. The second algorithmuses the set of attributes on the edges and realisescomparison on an element-by-element basis usinga robust error kernel. The two algorithms havebeen described in detail elsewhere (Huet andHancock, 1998a,b), but hitherto there has been noattempt at comparative sensitivity analysis.

4.1. Relational histograms

The idea here is to conglomerate the node his-tograms into a global histogram. This histogramprovides a statistical summary for the pairwiseattributes residing on the edges of a nearestneighbour graph (Huet and Hancock, 1998a). Thenormalised histogram bin-contents is given by

hGT �l; m� �

Pa2VG

Ha�l; m�Pa2VG

PnA

l0�1

PnR

m0�1 Ha�l; m� : �12�

The best-matching pattern is retrieved from thedatabase on the basis of similarity with the querypattern histogram. Our similarity measure is thehistogram correlation. The measure of patterncorrelation is the Bhattacharyya distance. Theclass identity of the retrieved pattern is

xQ � arg maxG2D

lnXnA

l�1

XnR

m�1

�������������������������������������hQ

T �l; m� � hGT �l; m�

q:

�13�

4.2. Feature-sets

Here our aim is to e�ect retrieval on the basis ofthe similarity of the set of attributes residing on theedges of the nearest neighbour graph (Huet andHancock, 1998b). One of the most popular waysof comparing a set of unordered observationswhose correspondences are unknown is to use theHausdor� distance. However, this measure is

B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269 1263

notoriously susceptible to measurement outliers.For this reason, we choose instead to gauge simi-larity using a robust error kernel. The class iden-tity of the retrieved pattern is

xQ � arg maxG2D

X�i;j�2EG

max�I;J�2EQ

Cr�kzQI ;J

�ÿ zG

i;jk��;

�14�where Cr�q� � exp�ÿq2=r� is a robust weightingkernel, zQ

I ;J a feature-vector from the query graphand zG

i;j is a feature-vector from a target graph.

5. Sensitivity analysis

The aim of this section is to investigate thesensitivity of the three graph-based retrievalstrategies to the systematics of the line-segmenta-tion process. To this end we have simulated thesegmentation errors that can occur when line seg-ments are extracted from realistic image data.Speci®cally, the di�erent processes that we haveinvestigated are listed below:· Extra lines: Here we have added additional lines

at random locations. The lengths and angles ofthe added lines have been generated by random-ly sampling the distribution for the existing im-age segments.

· Missing lines: Here we have deleted a knownfraction of line segments at random locations.

· Split lines: Here a prede®ned fraction of lineshave been split into two segments. The splittingprocess is e�ected by deleting an internal frac-tion of each line segment. The deleted segmentis randomly positioned along the line. The frac-tion of the line deleted is uniformly sampledfrom the range �0; 1�.

· Segment end-point errors: Here we have intro-duced random displacements in the end-pointpositions for a prede®ned fraction of lines.The distribution of end-point errors is Gauss-ian. The degree of error is controlled by thevariance of the Gaussian distribution.

· Combined errors: Here we have introduced thefour di�erent segment errors described abovein equal proportion.The performance measure used in our studies is

computed as follows. We query the database with

a sample of line patterns. For each pattern in turnwe determine whether or not the correct retrievaloccurs in the top-ranked position. By computingthe fraction of queries that return a correctly rec-ognised recall, we determine the average retrievalaccuracy.

We have conducted our experiments with in-exact queries. Here the query pattern is a distortedversion of the target in the database.

In Fig. 2 we show the accuracy of retrieval foreach of the di�erent error processes in turn as afunction of the fraction of added noise. In eachcase the solid curve is for the attribute histogram,the dashed curve is for the feature-sets and thedotted curve is for the graph-matching algorithm.The plots summarise the results for an inexactquery. In other words we query with a line patternthat is similar but not identical to one of the pat-terns in the database. For extra lines (Fig. 2(a)),split lines (Fig. 2(c)), end-point errors (Fig. 2(d))and combined errors (Fig. 2(e)), the graph-matching algorithm is signi®cantly better than theattribute histogram and also o�ers an improve-ment over the use of feature-sets. In the case ofmissing lines, however, the graph-matching algo-rithm only performs as well as the attribute his-togram and is signi®cantly poorer than the use offeature-sets. In other words, the method copes wellwith the addition of clutter (through either noiseor line-fragmentation) and measurement error, butperforms relatively badly when part of the linepattern is removed.

To conclude the sensitivity study, we focusmore closely on the role of segment end-point er-rors. The reason for this is that such errors willa�ect the accuracy of the relational measurements.Fig. 3 shows the average error in the relative angleattribute as a joint function of the fraction of linesa�ected by such errors and the standard deviationof the Gaussian position error. The main featureto note from this plot is that the angle error in-creases with both the fraction of a�ected lines andthe variance of the positional errors. Fig. 4 showsthe e�ect of line end-point position errors for in-exact queries. The di�erent curves in the plotscorrespond to di�erent values of the standard de-viation of the end-point position errors. Theyshow the accuracy of retrieval as a function of the

1264 B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269

Fig. 3. E�ect of introducing end-point errors on the line-segment's orientation.

Fig. 2. E�ect of various kinds of noise for similarity queries.

B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269 1265

fraction of lines a�ected by end-point errors. Asthe standard deviation of the position error in-creases, so the fraction of corrupt lines for whichperfect recall is possible decreases. The main pointto note from these plots is that the graph-matchingmethod degrades less rapidly under line end-pointerrors than the set-based method and the rela-tional histogram.

6. Recognition experiments

The aim of this section is to provide somequalitative examples of the graph-based recogni-tion scheme on a database of real-world line pat-

terns. We have conducted our recognitionexperiments with a database of 2500 line patternseach containing over a 100 lines. The line patternshave been obtained by applying line or edge de-tection algorithms as appropriate to the raw grey-scale images. Straight line segments are extractedby polygonising the resulting feature-maps. Foreach line pattern in the database, we construct theadjacancy graph of the line centre-points. Thefeature extraction process together with other de-tails of the data used in our study are described inrecent papers where we have focussed on the issuesof histogram representation (Huet and Hancock,1998a) and the optimal choice of the relationalstructure (Huet and Hancock, 1998b) for thepurposes of recognition.

Fig. 4. E�ect of introducing segment end-point errors on retrieval performance when the targets and the query are similar but not

identical.

1266 B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269

The recognition task is posed as one of re-covering the line pattern which most closely re-sembles a digital map. The original images fromwhich our line patterns have been obtained arefrom a number of diverse sources. However, asubset of the images are aerial infra-red line-scanviews of southern England. Two of these infra-red images correspond to di�erent views of thearea covered by the digital map. These views areobtained when the line-scan device is ¯ying atdi�erent altitudes. The line-scan device used toobtain the aerial images introduces severe barreldistortions and hence the map and aerial imagesare not simply related via a Euclidean or a�netransformation. The remaining line patterns inthe database have been extracted from trade-marks and logos. It is important to stress thatalthough the raw images are obtained from dif-ferent sources, there is nothing salient about theirassociated line pattern representations that allowsus to distinguish them from one-another. More-over, since it is derived from a digital map ratherthan one of the images in the database, the queryis not identical to any of the line patterns in themodel library. In order to show that the recog-nition method is not sensitive to the segmentationprocess, we have included four under and oversegmented versions of the low altitude image inthe database (Fig. 5). The results of our recog-nition experiments are presented in the form ofpanels of `thumbnail' images. The thumbnails areordered (from left-to-right and top-to-bottom)

according to their distance from the query image.The ranked thumbnail images are compared forthe graph-matching method, the feature-sets andthe global histogram in Fig. 6. In each case the®ve di�erent segmentations of the low-altitudetarget image appear in the top-ranked positions.In other words, although the controlled over andunder segmentation of the image results in sig-ni®cant variations in the shape-graph, there is noe�ect on recognition performance. The high alti-tude image of the same area appears at rank 6.However, in the case of the relational histogramand the feature-sets, the top 20 thumbnails arecontaminated with logos.

7. Conclusions

We have presented a practical graph-matchingalgorithm for data-mining in large structural li-braries. The main conclusion to be drawn fromthis study is that the combined use of structuraland histogram information improves recognitionperformance. There are a number of ways in whichthe ideas presented in this paper can be extended.Firstly, we intend to explore more a perceptuallymeaningful representation of the line patterns,using a grouping principle derived from Gestaltpsychology. Secondly, we are exploring the possi-bility of formulating the ®ltering of line patternsprior to graph matching using Bayes decisiontrees.

Fig. 5. Images from the database.

B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269 1267

Discussion

Gelsema: How does the performance of thismethod depend on the number of lines that youhave in your images, the number of lines that youare trying to match. I could imagine that there issome optimum. If you have too few lines, youcannot do the match at all, and if you have toomany lines, the whole system may become toocomplicated.

Hancock: I have some results for the histogram-based method, without using the graph technique.We found that if we have too many lines, there issaturation of the histogram; if we have too few,then there are small-sample or counting errors.

Gelsema: Does this in some way possibly restrictthe application areas in which you could use thismethod?

Hancock: Here we are looking at line patternsthat consists of typically between 300 and 400 to-kens. And I think that is a problem of fairly de-manding size. If you have larger graphs, there areseveral things you may do to try to restrict the

complexity of the graphs. For instance, you couldprune them in some way.

Sagerer: If you use such a procedure for largedatabases and you cannot guarantee that thegraphs in the database are all acquired by the samealgorithm, is your technique robust against dif-ferent types of line ®nders, graph generators?

Hancock: It is not clear what the e�ect of havingdi�erent graph generator processes would be. Ourresults show that you can obtain much betterperformance with Delaunay graphs rather thanwith the nearest neighbour graph of low order. So,generating the graphs, for instance, by a nearestneighbour rule, or by a Delaunay triangulation,will a�ect the recognition performance. If thestructures are two nearest neighbour graphs, youwould expect the performance to be rather poorerthan in the case of Delaunay graphs. The secondpoint is: What happens if we have images fromdi�erent sources and di�erent segmentation strat-egies? In fact, in this database, we did use di�erentsegmentation strategies. For the trademarks andlogos we used the Canny edge detector. And for

Fig. 6. The result of querying the database with the digital map. (a) Result obtained with the global relational histogram. (b) Result

obtained with the feature-sets. (c) Result obtained when graph-matching histogram is used. The images are ordered from left-to-right

and top-to-bottom in increasing distance from the query image, which is shown at the top of the ®gure.

1268 B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269

the straight line segments we used a relaxation line®nder, which I developed some years ago. Mixingthose two feature detection strategies did not ap-pear to bias the performance. We have repeatedthese experiments, trying to retrieve trademarks,instead of aerial images. We obtained good clus-tering of similar images. We have some examplesof alphabetic character retrieval, in which we getgood clustering results. And these images wereobtained with a segmentation strategy very di�er-ent from the one used in the aerial images.

References

Barrow, H.G., Popplestone, R.J., 1971. Relational descriptions

in picture processing. Machine Intelligence 5, 377±396.

Costa, M.S., Shapiro, L., 1995. Scene analysis using appear-

ance-based models and relational indexing. In: IEEE

Computer Soc. Internat. Symp. Computer Vision, pp.

103±108.

Dorai, C., Jain, A.K., 1995. View organisation and matching of

free-form objects. In: IEEE Computer Soc. Internat. Symp.

Computer Vision, pp. 25±30.

Evans, A.C., Thacker, N.A., Mayhew, J.W.E., 1993. The use of

geometric histograms for model-based object recognition.

In: Proc. 4th British Machine Vision Conf., September 1993,

pp. 429±438.

Finch, A.M., Wilson, R.C., Hancock, E.R., 1997. Softening

discrete relaxation. In: Mozer, M., Jordan, M., Petsche, T.

(Eds.), Advances in Neural Information Processing Sys-

tems. MIT Press, Cambridge, MA, pp. 438±444.

Gevers, T., Smeulders, A.W.M., 1992. Rnigma: an image

retrieval system. In: Proc. Internat. Conf. Pattern Recogni-

tion, pp. 697±700.

Gimelfarb, G.L., Jain, A.K., 1996. On retrieving textured

images from an image database. Pattern Recognition 29 (9),

1461±1483.

Gold, S., Rangarajan, A., 1996. A graduated assignment

algorithm for graph matching. IEEE Trans. Pattern Anal.

Machine Intell. 18, 377±388.

Huet, B., Hancock, E.R., 1998a. Relational histograms for

shape indexing. In: Proc. IEEE Internat. Conf. Computer

Vision, January 1998, pp. 563±569.

Huet, B., Hancock, E.R., 1998b. Fuzzy relational distance for

large-scale object recognition. In: Proc. IEEE Conf. Com-

puter Vision and Pattern Recognition, June 1998, pp. 138±

143.

Niblack, W., Barber, R., Equitz, W., Flickner, M., Glasman,

E., Petkovic, D., Yanker, P., Faloutsos, C., Taubin, G.,

1993. The QBIC project: Querying images by content using

color, texture and shape. In: Proc. SPIE Conf. Storage and

Retrieval of Image and Video Databases, San Jose,

California, Vol. 1908, pp. 173±187.

Pentland, A.P., Picard, R.W., Scarlo�, S., 1994. Photobook:

tools for content-based manipulation of image databases.

Storage and Retrieval for Image and Video Database II,

San Jose, California, pp. 34±47.

Picard, R.W., 1995. Light-years from Lena: video and image

libraries of the future. In: Internat. Conf. Image Processing,

Vol. 1, pp. 310±313.

Rigoutsos, I., Hummel, R., 1995. A Bayesian approach to

model matching with geometric hashing. Computer Vision

and Image Understanding 62, 11±26.

Sanfeliu, A., Fu, K.S., 1983. A distance measure between

attributed relational graph. IEEE Trans. Systems Man

Cybernet. 13, 353±362.

Schiele, B., Crowley, J.L., 1996. Probabilistic object recogni-

tion using multidimensional receptive ®eld histograms. In:

Proc. 13th Internat. Conf. Pattern Recognition, Vol. 2,

pp. 50±54.

Sengupta, K., Boyer, K.L., 1995. Organising large structural

databases. IEEE Trans. Pattern Anal. Machine Intell. 17

(4), 321±332.

Shapiro, L.G., Haralick, R.M., 1985. A metric for comparing

relational descriptions. IEEE Trans. Pattern Anal. Machine

Intell. 7 (1), 90±94.

Swain, M.J., 1993. Interactive indexing into image databases.

Image and Vision Storage and Retrieval, 95±103.

Swain, M.J., Ballard, D.H., 1991. Color indexing. Internat. J.

Comput. Vision 7 (1), 11±32.

Wilson, R., Hancock, E.R., 1997. Structural matching by

discrete relaxation. IEEE Trans. Pattern Anal. Machine

Intell. 19, 634±648.

B. Huet, E.R. Hancock / Pattern Recognition Letters 20 (1999) 1259±1269 1269