Nearest-neighbour classifiers in natural scene analysis



Pattern Recognition 34 (2001) 1601–1612

Nearest-neighbour classifiers in natural scene analysis

Sameer Singh*, John Haddon, Markos Markou

Department of Computer Science, University of Exeter, Prince of Wales Road, Exeter, Devon EX4 4PT, UK
Defence Evaluation and Research Agency, Farnborough, UK

Received 29 July 1999; accepted 27 March 2000

Abstract

It is now well-established that k nearest-neighbour classifiers offer a quick and reliable method of data classification. In this paper we extend the basic definition of the standard k nearest-neighbour algorithm to include the ability to resolve conflicts when the highest number of nearest neighbours is found for more than one training class (model-1). We also propose model-2 of the nearest-neighbour algorithm, which is based on finding the nearest average distance rather than the nearest maximum number of neighbours. These new models are explored using image understanding data. The models are evaluated on pattern recognition accuracy for correctly recognising image texture data of five natural classes: grass, trees, sky, river reflecting sky and river reflecting trees. On noise-contaminated test data, the new nearest-neighbour models show very promising results for further studies. We evaluate their performance with increasing values of neighbours (k) and discuss their future in scene analysis research. Crown Copyright 2001 Published by Elsevier Science Ltd. on behalf of Pattern Recognition Society. All rights reserved.

Keywords: Scene analysis; Classifiers; Nearest-neighbour method; Image understanding

1. Introduction

A considerable amount of research has been undertaken globally in the last decade on developing intelligent image processing systems. Although a range of basic image processing tools have been around for the last three decades, in the last decade a sharp increase in cheap computational power has meant that we are able to implement and test our models in real applications. Research has focussed on the following issues: the development of image segmentation methods that work in real noisy environments, encoding image component relationships, and developing the technology for intelligent classifiers. A number of studies have given generic surveys in the area. Kodratoff and Moscatelli [1] survey learning in image processing applications, discussing

* Corresponding author. Tel.: +44-1392-264061; fax: +44-1392-264067.
E-mail addresses: s.singh@exeter.ac.uk (S. Singh), jf_haddon@dera.gov.uk (J. Haddon), m.markou@exeter.ac.uk (M. Markou).

learning in 2D shape models, learning strategic knowledge for optimising model matching, learning in automated target recognition, and constraint rules for labelling. Rosenfeld [2] provides an extensive bibliography of computer vision research areas arranged by subject, and Skrzypek et al. [3] discuss a decade of research at UCLA on the application of neural networks in image processing. Yamamoto [4] details several issues in image understanding including active range finders, passive stereo sensing, 3D reconstruction, 3D scene analysis, dynamic scene analysis, automatic knowledge acquisition and autonomous vision systems. On defence applications of image understanding, Kohl and Mundy [5] describe a five-year DARPA program between CMU and Colorado State University, and Firschein [6] details a number of defence applications of image understanding. A detailed treatment of DARPA research in the USA is given by Simpson [7].

There are two key components to scene analysis: image segmentation and classifier analysis. Image segmentation is a key step in the understanding and interpretation of natural scenes and several different methods have been used to achieve this. Neural networks and statistical


clustering methods using texture, shape and colour information are popular methods of image segmentation. Campbell et al. [8] describe an image segmentation system using neural networks for outdoor environments, classifying vegetation, buildings, vehicles, roads, etc. on the basis of colour, texture and shape features. Liu and Yun [9] describe a vector quantisation approach for image segmentation eliminating the need for setting thresholds. The procedure uses a competitive neural network combining the advantages of the self-organising feature map and the K-means clustering method. Object recognition using Kohonen's self-organised feature maps is also discussed by Lakany et al. [10]. Kasparis et al. [11] detail a neural network object recognition system based on texture analysis. The texture features are based on Hough transform-based descriptors. Booth and Allen [12] use a neural network for real-time scene analysis by using a hardware implementation. In addition to texture, images can also be segmented using localised histograms. Beveridge et al. [13] describe such a system. Localised histograms are first simplified using a region merging algorithm before their application to real images. Fuzzy techniques have also been used in image segmentation. Dellepiane and Vernazza [14] use a region growing procedure for pixel grouping. Fuzzy techniques are used for pixel affinity measurement. In most cases, this results in partial segmentation that can be used with other knowledge about objects for labelling them. In those conditions where higher level information on image contexts is available for guiding lower level operations, better segmentation results are achieved [15]. The segmentation process should be optimised to get the best results through performance evaluation of texture-based segmentation algorithms [16].

Nearest-neighbour methods provide an important data classification tool for recognising object classes in pattern recognition domains. Standard models of classification, including nearest neighbours, can be studied from one of the several available books in the area [17]. The main objective of this paper is to develop two new versions of the nearest-neighbour method and apply them to the scene analysis problem. For image segmentation, the paper will use co-occurrence matrices and Hermite functions. The features extracted will be used as input to the newly developed classifiers. The first model will resolve conflicts in the k nearest-neighbour rule. The second model will be based on the closest average distance of samples of the classes involved. The performance of these two models will be evaluated on real image understanding data. The paper is organised as follows. In the next section we discuss nearest-neighbour methods and strategies for the improvement of traditional models. We will then discuss the image understanding data and its pre-processing. This description involves image pre-processing, texture analysis, feature selection and data generation. The results section will show the performance of the two models for one, three, five and seven nearest neighbours in terms of their recognition rates and discusses the confusion matrices produced. Finally, we conclude by highlighting further research in this area.

2. Model and strategies

Nearest-neighbour methods have been used as an important pattern recognition tool. In such methods, the aim is to find the nearest neighbours of an unidentified test pattern within a hypersphere of pre-defined radius in order to determine its true class. The traditional nearest-neighbour rule has been described as follows [17]:

• Out of N training vectors, identify the k nearest neighbours, irrespective of class label. k is chosen to be odd.

• Out of these k samples, identify the number of vectors, k_i, that belong to class ω_i, i = 1, 2, ..., M. Obviously Σ_i k_i = k.

• Assign x to the class ω_i with the maximum number k_i of samples.
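This standard rule can be sketched in a few lines (a minimal illustration only, not the implementation used in the paper; the toy data and the Euclidean metric are assumptions):

```python
import numpy as np

def knn_classify(train_X, train_y, x, k=3):
    """Standard k nearest-neighbour rule: majority vote among
    the k training vectors closest to test pattern x."""
    dists = np.linalg.norm(train_X - x, axis=1)   # distance to every training vector
    nearest = np.argsort(dists)[:k]               # indices of the k nearest neighbours
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # class with the maximum k_i

# Toy example with two classes
train_X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
print(knn_classify(train_X, train_y, np.array([0.2, 0.0]), k=3))  # -> 0
```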

Nearest-neighbour methods can detect a single or multiple number of nearest neighbours. A single nearest-neighbour method is primarily suited to recognising data where we have sufficient confidence that class distributions are non-overlapping and the features used are discriminatory [18]. In most practical applications, however, the data distributions for various classes overlap and more than one nearest neighbour is used for majority voting. In k nearest-neighbour methods, certain implicit assumptions about the data are made in order to achieve good recognition performance. The first assumption requires that individual feature vectors for various classes are discriminatory, i.e. that feature vectors are statistically different across the various classes. This ensures that a given test pattern is more likely to be surrounded by data of its true class than by data of different classes. The second assumption requires that the unique characteristic of a pattern that defines its signature, and ultimately its class, is not significantly dependent on the interaction between various features. In other words, nearest-neighbour methods work better with data where features are statistically independent. This is because nearest-neighbour methods are based on some form of distance measure, and nearest-neighbour detection of test data does not depend on feature interaction. Neural networks are better classifiers when data is strongly correlated across different features, as they can model this interaction by weight adjustment. In practice, the above assumptions are not always satisfied. In most applications, data is often non-linear, strongly correlated across various features, and has overlapping feature distributions across various classes. In such cases, for nearest-neighbour techniques to perform at a desired


level, further data analysis and algorithm adjustment is necessary. Data analysis can improve results by using principal components analysis (PCA) to remove feature dependencies, and by further pre-processing for normalisation, outlier removal and noise management. On the lines of algorithm modification, we propose two models that are modifications of the standard k nearest-neighbour rule. These models will enable an improvement to be made on the results of nearest-neighbour techniques for a range of pattern recognition problems.

The two proposed models based on the nearest-neighbour philosophy are described below. In model-1, we use two stages. In the first stage, if we find that a given class has more training samples closer to the test pattern than any other, then we declare this class an outright winner and allocate the test pattern to this class. However, if we find that there is an equal highest number of neighbours for two or more classes surrounding the test pattern, then we perform the second stage, called conflict resolution. At this stage, the class whose distance from the test data, averaged over all its training samples within the hypersphere, is found to be the smallest is declared the winner. In model-2, we do not consider the quantity of neighbours but only the average distance of classes from the test data. These average distances are based on the distance from the test data of all training samples of given classes found within the hypersphere. The class with the smallest distance from the test data is declared the winner and the test pattern is allocated to this class.

Model-1 NN rule:

• Out of n training vectors, identify the k nearest neighbours, irrespective of class label. k is chosen to be odd.

• Out of these k samples, identify the number of vectors, k_i, that belong to class ω_i, i = 1, 2, ..., M. Obviously Σ_i k_i = k.

• Assign x to the class ω_i with the maximum number k_i of samples.

• If two or more classes ω_i, i ∈ [1, ..., M], have an equal number E of maximum nearest neighbours, then we have a tie (conflict). Use the conflict resolution strategy.

• For each class involved in the conflict, determine the distance d_i between test pattern x = (x_1, ..., x_n) and class ω_i based on the E nearest neighbours found for class ω_i. If the mth training pattern of class ω_i involved in the conflict is represented as y_m^i = (y_{m1}^i, ..., y_{mn}^i), then the distance between test pattern x and class ω_i is

    d_i = (1/E) Σ_{m=1}^{E} √( Σ_{j=1}^{n} (x_j − y_{mj}^i)² ).

• Assign x to class C if its d_C is the smallest, i.e. x ∈ ω_C if d_C ≤ d_i for all i ∈ [1, ..., M], i ≠ C.
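Model-1 can be sketched as follows (an illustrative reading of the rule, not the authors' code; the toy three-class data and function name are assumptions):

```python
import numpy as np

def knn_model1(train_X, train_y, x, k=3):
    """Model-1 rule: majority vote with tie (conflict) resolution.
    Ties between classes sharing the highest neighbour count are broken
    by the smallest average distance to the test pattern."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]               # the k nearest, any class
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    tied = labels[counts == counts.max()]         # classes sharing the maximum count
    if len(tied) == 1:
        return tied[0]                            # outright winner, no conflict
    # Conflict resolution: average distance of each tied class's
    # neighbours found within the hypersphere
    avg = {c: dists[nearest][train_y[nearest] == c].mean() for c in tied}
    return min(avg, key=avg.get)

# Toy 1-D data with three classes; with k=3 each class contributes one
# neighbour, so the vote is a three-way tie resolved by average distance
train_X = np.array([[0.2], [0.4], [0.6], [5.0], [6.0], [7.0]])
train_y = np.array([0, 1, 2, 0, 1, 2])
print(knn_model1(train_X, train_y, np.array([0.0]), k=3))  # -> 0
```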

Model-2 NN rule:

• Out of n training vectors, identify the k nearest neighbours, irrespective of class label. k is chosen to be odd.

• Out of these k samples, identify the number of vectors, k_i, that belong to class ω_i, i = 1, 2, ..., M. Obviously Σ_i k_i = k.

• Find the average distance d_i that represents the distance between test pattern x = (x_1, ..., x_n) and the E_i nearest neighbours found for class ω_i, i = 1, ..., M. Only include classes for which samples were detected in the first step. If the mth training pattern of class ω_i found within the hypersphere is represented as y_m^i = (y_{m1}^i, ..., y_{mn}^i), then the distance between test pattern x and class ω_i is

    d_i = (1/E_i) Σ_{m=1}^{E_i} √( Σ_{j=1}^{n} (x_j − y_{mj}^i)² ).

• Assign x to class C if its d_C is the smallest, i.e. x ∈ ω_C if d_C < d_i for all i ∈ [1, ..., M], i ≠ C. The decision in this model does not depend on the number of nearest neighbours found but solely on the average distance between the test pattern and the samples of each class found.
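A matching sketch of model-2 (illustrative; the example is chosen so that majority voting and the average-distance rule disagree):

```python
import numpy as np

def knn_model2(train_X, train_y, x, k=3):
    """Model-2 rule: among the classes represented within the k nearest
    neighbours, pick the one with the smallest average distance to x,
    regardless of how many neighbours each class contributed."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    present = np.unique(train_y[nearest])         # only classes found in the first step
    avg = {c: dists[nearest][train_y[nearest] == c].mean() for c in present}
    return min(avg, key=avg.get)                  # smallest d_i wins

train_X = np.array([[0.1], [0.8], [0.9], [5.0]])
train_y = np.array([0, 1, 1, 0])
# k=3 finds one very close sample of class 0 and two of class 1:
# majority voting would answer 1, model-2 answers 0
print(knn_model2(train_X, train_y, np.array([0.0]), k=3))  # -> 0
```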

The two models can be explained using Fig. 1. In Figs. 1(a)–(d), we have assumed a total of five classes ('a' to 'e'). The samples of each class are represented by the symbols 'a' to 'e'. The test pattern is shown as the square block around which a hypersphere is drawn to determine the number of neighbours included in the analysis. In a traditional nearest-neighbour implementation, Fig. 1(a) would assign the test pattern to class 'a', as there are two samples of class 'a' and only one sample of class 'b' within the boundary. Such decisions are based only on the number of nearest neighbours found. In Fig. 1(b) we show the problem of neighbour conflict. In such cases, an equal number of training neighbours is found for more than one class when determining the class of the test pattern. We term this a conflict. Conflicts can be resolved either by increasing the size of the hypersphere, i.e. involving more neighbours for a clear-cut decision, or by using the conflict resolution described in model-1 above. Fig. 1(c) shows the model-2 process of finding the true class of the test pattern. Here the distance from a given class to the test pattern represents the averaged distance of all samples of that class found within the hypersphere. If all samples of a given class, e.g. 'e' in Fig. 1(c), lie outside the hypersphere, then these are not included in the analysis. In all nearest-neighbour methods, the number of neighbours analysed has a very important effect on the results of the analysis. This is shown in Fig. 1(d). In this figure, when using the inner sphere, the class assignment for the unknown test pattern is 'a'. When we consider more neighbours with the outer sphere, the class assignment changes to 'd'. Thus, one important parameter to optimise in nearest-neighbour methods is the number of neighbours included in the analysis.

The above models will be analysed in this paper on image understanding data. The modified nearest-neighbour


Fig. 1. k nearest-neighbour models and strategies: (a) traditional k nearest-neighbour model; (b) conflict resolution; (c) closest average distance model; and (d) hypersphere size effect on pattern recognition.

Fig. 2. Extract of generic image understanding system.

methods will be used to classify unidentified data of natural classes from segmented images of forward-looking infrared (FLIR) imagery on the basis of training data with samples from these five classes: grass, trees, sky, river reflecting trees, and river reflecting sky. The image understanding problem and the data used for this paper are explained in the next section.

3. Image understanding

The main thrust of our current work is on autonomous scene analysis. It is in this context that we evaluate the nearest-neighbour methods. The aim is to develop intelligent systems that are capable of accurate recognition of various classes in natural scenes. An extract from the generic image understanding system is shown in Fig. 2. The components of this extract will now be discussed.

• Image acquisition: In the example of Fig. 5, a FLIR image has been taken from a low-flying aircraft as it flew along a river and across a bridge. Note how the


far hillside of trees is reflected in the river beyond the bridge and the sky is reflected in the river in front of the bridge. In this image, hot areas appear white while cool areas are dark. Hot areas on the cliff beyond the river are reflected in the river. This image is typical of the sort of scenes that the techniques are designed to analyse, although in general, image quality will be considerably poorer.

• Segmentation: The FLIR image has been segmented using edge-based co-occurrence techniques [19,20] that are able both to segment the key regions of the image and to detect boundaries. A temporal component to the analysis has been incorporated into the co-occurrence matrix generation so that the quality of image segmentation is consistent between different images in the sequence.

• Texture description: Each of the large (>500 pixels) segmented regions has been subjected to detailed texture analysis. A co-occurrence matrix [21] of a region of a single texture is widely recognised as having a characteristic texture that can be described using a variety of techniques [22]. In this research, the segmented regions are described using edge-based co-occurrence matrices [23], an extension of the normal grey-level co-occurrence matrix that allows greater flexibility in the description of the texture. The predominant form of these matrices is Gaussian with an overlaid higher-level structure. The underlying Gaussian is due to Gaussian noise in the original imagery while the higher-level structure is due to the texture of the originating region. It is this structure which most co-occurrence-based texture measures seek to describe.

An edge co-occurrence matrix is generated for each region using an operator appropriate to the scale and orientation of the texture. The physical aspects of the imaging system are taken into account in the definition of this operator. The resultant edge co-occurrence matrix is decomposed using a set of discrete orthogonal Hermite functions defined on a lattice [24]. The zeroth-order Hermite describes the underlying Gaussian in the matrix, and hence the Gaussian noise in the image, while the higher orders describe the higher-order structure of the matrix and hence the texture of the region. The definition of the Hermite functions and the decomposition of the matrix are discussed in greater detail below.
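For illustration, a plain grey-level co-occurrence matrix (not the edge-based variant of [23]) can be computed as follows; the offset, quantisation and example patch are assumptions:

```python
import numpy as np

def cooccurrence_matrix(img, levels=4, dx=1, dy=0):
    """Grey-level co-occurrence matrix: C[i, j] counts how often grey
    level j occurs at offset (dy, dx) from grey level i in the region."""
    h, w = img.shape
    C = np.zeros((levels, levels), dtype=int)
    for y in range(h - dy):
        for x in range(w - dx):
            C[img[y, x], img[y + dy, x + dx]] += 1
    return C

# A vertically striped 4x4 patch quantised to 2 grey levels
img = np.array([[0, 1, 0, 1],
                [0, 1, 0, 1],
                [0, 1, 0, 1],
                [0, 1, 0, 1]])
C = cooccurrence_matrix(img, levels=2)
print(C)  # horizontal neighbours always differ: all counts lie off-diagonal
```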

• Feature classification: The Hermite decomposition of the edge co-occurrence matrices results in a low-order feature vector. This is used within the nearest-neighbour classifiers defined earlier.

• Performance assessment: The performance of the classifiers and of the whole image analysis is assessed. This will be used within feedback mechanisms to ensure both that the image analysis is sufficient for the application, and that new classes that occur within the data stream can be clustered and incorporated into future classifier systems.

Fig. 3 shows the layout of the modules that are being developed within the image understanding scenario and some aspects of their interaction and dependencies. Those that are shaded contain a number of sub-modules for the actual analyses. Those that have a wide border already contain intelligent components. The continuous arrows indicate the flow of data while the thick arrows indicate the flow of intelligence information. In traditional analysis, the data flow during the processing of a single image is vertical within the diagram. In this research, this is complemented by the flow of intelligence and feedback information, both during the analysis of a single image and during the analysis of a sequence of images. The latter is shown as the horizontal links between the vertical components of the diagram. A considerable amount of information flows forward between the processing of each image and enables the techniques to ensure that the processing is consistent across time. This information is used as a guideline, as changes in image content or imaging conditions must be reflected in subtle changes in process parameters. Light dotted arrows indicate the flow of this data/intelligence being passed forward in time.

Consider a function f(nΔx, mΔy), centred at (x₀, y₀) with standard deviations (σ_x, σ_y). This may be decomposed into pq discrete orthogonal Hermite functions:

    f(nΔx, mΔy) = Σ_{k=0}^{p} Σ_{l=0}^{q} f_{kl} φ_k(nΔx − x₀) φ_l(mΔy − y₀),    (1)

where the coefficients

    f_{kl} = Σ_n Σ_m f(nΔx, mΔy) φ_k(nΔx − x₀) φ_l(mΔy − y₀)    (2)

provide a low-order feature vector descriptive of the texture in the region. The error ε_{pq} in the expansion is given by

    ε²_{pq} = Σ_n Σ_m ( f(nΔx, mΔy) − Σ_{k=0}^{p} Σ_{l=0}^{q} f_{kl} φ_k(nΔx − x₀) φ_l(mΔy − y₀) )².    (3)

Since

    ε²_{p'q'} = ε²_{pq} − Σ_{k=p+1}^{p'} Σ_{l=q+1}^{q'} f²_{kl},   p' > p, q' > q,    (4)

this error will remain the same, or decrease, if additional terms are used in the expansion, i.e.

    ε²_{p'q'} ≤ ε²_{pq},   ∀(p' > p, q' > q).    (5)
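The monotone error behaviour of Eq. (5) can be illustrated with a 1D sketch, using continuous Hermite functions sampled on a fine grid as a stand-in for the discrete lattice functions of [24]; the test signal is an assumption:

```python
import numpy as np
from math import factorial, pi
from numpy.polynomial.hermite import hermval

def hermite_fn(n, x):
    """Orthonormal Hermite function phi_n(x) = H_n(x) exp(-x^2/2) / norm."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    norm = np.sqrt(2.0**n * factorial(n) * np.sqrt(pi))
    return hermval(x, coeffs) * np.exp(-x**2 / 2) / norm

x = np.linspace(-8, 8, 2001)
dx = x[1] - x[0]
f = np.exp(-(x - 0.5)**2 / 2)                   # signal to decompose

def expansion_error(p):
    """Squared residual ||f - sum_{n<=p} f_n phi_n||^2 on the grid."""
    approx = np.zeros_like(f)
    for n in range(p + 1):
        phi = hermite_fn(n, x)
        approx += (f @ phi) * dx * phi           # coefficient f_n = <f, phi_n>
    return np.sum((f - approx)**2) * dx

# Adding terms never increases the error, mirroring Eq. (5)
errs = [expansion_error(p) for p in (1, 3, 5, 7)]
print(errs)
```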


Fig. 3. Scene analysis: operational steps for recognising FLIR image sequences.

These equations assume axes parallel to a square grid; co-occurrence matrices, however, have axes along, and perpendicular to, the leading diagonal of the matrix. Accordingly, the above definitions are modified to include a translation along the x- and y-axes and a 45° rotation so that the axes of the basis functions coincide with the natural axes of the co-occurrence matrix. Fig. 4 shows the first few two-dimensional Hermite functions (not to the same scale) that are used to decompose the co-occurrence matrices.


Fig. 4. The first few 2D Hermite functions.

Fig. 5. Original scene image and segmented output.

Feature selection and generation: The co-occurrence matrix decomposition techniques defined above provide a low-order feature vector descriptive of the texture of a region. The zeroth-order coefficient describes the Gaussian noise while the higher orders describe the texture and are used as the 'raw' features in the classification analysis described in this paper. The following feature sets were derived for training and test purposes.

Training data: Originally, 121 coefficients are derived, f_{kl}, k, l = 0..10. These coefficients are then analysed for their discriminatory power using linear discriminant analysis (LDA). A total of 42 features are then selected for final analysis. The data is normalised within the [0, 1] range. A total of 3777 patterns are used.

Test data: Each feature in the training data is contaminated by a Gaussian noise distribution (sd = 1) to yield the test data. The noise added varies from 1 to 10% of the training data value. The test set size is the same as the training data, 3777 patterns (Grass = 1924, Tree = 1033, Sky = 273, River Reflecting Sky = 225, River Reflecting Trees = 321).
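The contamination procedure (additive Gaussian noise scaled by the training value, as given by the noise equation in Section 4) might be sketched as follows; the function name and seed are assumptions:

```python
import numpy as np

def contaminate(train, eps, seed=0):
    """Add Gaussian noise scaled by the training value:
    y_test = y + eps * y * N, with N ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    N = rng.standard_normal(train.shape)    # sd = 1 noise vector
    return train + eps * train * N

train = np.random.default_rng(1).random((3777, 42))   # stand-in for the 42 features
test_10pct = contaminate(train, eps=0.10)             # 10% noise test set
print(test_10pct.shape)  # (3777, 42)
```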

4. Results

Our previous analysis of the feature data shows that it is highly correlated and overlapping across different classes. In particular, it is difficult to separate vegetation (grass from trees) and the reflection of trees in the river from other vegetation. Linear methods perform particularly poorly with our data. In most trials with our image data, we found that average recognition rates vary between 40 and 50%. Past studies based on similar applications have shown the best results with neural networks [25,26]. This study sets out to demonstrate the following: (1) that nearest-neighbour methods perform robustly with increasing noise in the test data; (2) a comparative analysis of the two models proposed earlier against neural networks.

Tables 1 and 2 show the performance of the model-1 and model-2 nearest-neighbour algorithms. A total of 10 test


Table 1
Nearest-neighbour model-1 and model-2 recognition rates for 1 and 3 nearest neighbours (T = number of ties/conflicts; S_r = percentage of ties resolved in favour of the true class; R_1, R_2 = model-1 and model-2 recognition rates)

                     k = 1                          k = 3
Noise (%)    T   S_r (%)  R_1 (%)  R_2 (%)    T   S_r (%)  R_1 (%)  R_2 (%)
 1           0     0      100.0    100.0     241   100.0    81.3    100.0
 2           0     0       99.9     99.9     232   100.0    81.3     99.9
 3           0     0       99.4     99.4     236   100.0    81.0     99.4
 4           0     0       98.8     98.8     242    99.5    80.7     98.6
 5           0     0       97.7     97.7     252   100.0    81.0     97.5
 6           0     0       96.6     96.6     265    98.4    80.5     96.2
 7           0     0       94.7     94.7     281    96.0    79.9     93.7
 8           0     0       92.3     92.3     282    92.9    78.9     91.2
 9           0     0       90.4     90.4     304    88.4    77.4     89.2
10           0     0       87.9     87.9     294    84.0    76.4     86.6

Table 2
Nearest-neighbour model-1 and model-2 recognition rates for 5 and 7 nearest neighbours. Neural network performance is shown for comparison in the last column

                     k = 5                          k = 7
Noise (%)    T   S_r (%)  R_1 (%)  R_2 (%)    T   S_r (%)  R_1 (%)  R_2 (%)   N_net (%)
 1          308   78.9     76.3     99.9     193   82.9     72.9     99.5       89.1
 2          317   77.3     76.3     99.6     208   85.1     73.4     98.9       85.3
 3          335   79.1     76.7     99.2     226   86.7     73.4     97.9       79.1
 4          339   79.6     76.5     97.8     219   85.4     72.9     96.2       73.9
 5          343   78.4     76.1     96.1     221   85.5     72.9     94.1       69.0
 6          352   79.8     75.5     94.5     237   83.5     72.6     92.7       64.8
 7          370   76.5     75.5     91.8     228   82.5     72.2     89.8       61.3
 8          405   73.1     75.0     89.4     254   78.3     71.9     87.6       59.2
 9          418   74.6     74.5     86.7     254   78.7     71.0     85.0       57.1
10          429   70.4     73.1     84.5     261   77.0     71.0     82.0       54.7

sets are used with varying degrees of noise contamination. In Table 1, results for one and three nearest neighbours are shown, and in Table 2 results for five and seven nearest neighbours are shown. As mentioned earlier, the test sets are noise-contaminated training data. For each feature, additive Gaussian noise is added to the training data as

    y_test = y + εyN,

where y_test is the noise-contaminated data, y the training data, ε the noise percentage (for 10% noise, ε = 0.1) and N the Gaussian noise vector. Similar noise contamination experiments have been successful in analysing classifier performances in the past [27].

The first column in Table 1 shows the percentage of

noise in the test data. For the single nearest-neighbour trials (k = 1), there are no conflicts or ties and hence nothing to resolve; both T and S_r are therefore zero. The recognition rates R_1 and R_2 are the same, as in this case the two models are equivalent. The recognition rates achieved are considerably high, ranging between 100 and 87.9% for 1–10% noise. For three nearest neighbours (k = 3), the differences between the two models become clearer. Model-2 performs better on all trials. Model-1 conflicts increase almost linearly as the amount of noise in the test data is increased. Fortunately, most of these ties are successfully resolved in favour of the true class, as shown by the high values of S_r. In Table 2, similar trends are shown for the five and seven nearest-neighbour trials. The number of ties increases almost linearly for model-1, which continues to perform worse than model-2. The performance difference between the two models is greater when noise levels are lower.

A neural network model is used for comparing the quality of the nearest-neighbour results. A multilayer


Fig. 6. The recognition performance of the two models as noise increases.

Fig. 7. The difference in the recognition performances of the two models as noise is increased.

perceptron (MLP) is used with the backpropagation training algorithm. The architecture of the network is optimised by choosing a model of minimal complexity with the least generalisation error. The number of hidden nodes is increased for training the network and the system performance is measured on a fixed test set. The network with the best recognition rate is selected. The neural network has a total of 42 inputs, representing the 42 features selected, and five outputs corresponding to the class of the image object. The optimal number of hidden nodes was 200 with a single hidden layer, so the network configuration was chosen as 42 × 200 × 5. The results of neural network performance are shown in the last column of Table 2. The network performance is comparatively inferior to the best model-1 and model-2 performances, especially when the noise is increased. The results show that the nearest-neighbour classifiers are better able to pick out the correct class of noisy test data than the neural network model.

Figs. 6 and 7 show the classification performance in greater detail. In Fig. 6, we see two sets of lines. The first set of four lines at the top shows model-2 performance and the bottom three show model-1 performance. Both models perform in a stable manner. Model-2 degrades more than model-1 as noise is increased. The most dramatic effect of the noise change is shown by the neural network model, which exhibits a steep drop in classification performance from nearly 90 to 55%. So, how much gain do we get from using model-2 over model-1, and is this gain the same for different values of k and for changing noise levels? Fig. 7 shows this difference. For one nearest neighbour, the models are equivalent. The most gain is made for k = 7, then for k = 5 and then for k = 3. There is hardly any difference between the model-1 and model-2 gaps for the various values of k when the noise is at its maximum of 10%. On average, we can see that for almost all values of k, the difference between the two models is significant, at around 20%. In every condition, model-2 comes out superior.

Further information on the relative superiority of the

models can be shown using confusion matrices to detailwhich data were mostly misclassi"ed. In Appendix A, weproduce the confusion matrix of the single nearest-neigh-bour method that performed the best in our study. Re-member that both models (1 and 2) are equivalent in thiscase. The classi"er shows graceful degradation in perfor-mance as the amount of noise is increased [28]. Mistakesare mostly made when trees are classi"ed as grass or viceversa. Also mistakes are made when river re#ecting treesamples are misclassi"ed as of type grass or trees. Thisis understandable as the re#ection of vegetation in riveryields similar texture features to the vegetation itself. InAppendix B, the neural network confusion matrix isshown. The classi"cation results are poorer as discussedbefore. The neural network model makes all the mistakesmade by the nearest-neighbour models as described be-

fore. In addition, it also makes the mistake of classifyinggrass and tree samples as tree re#ection in the river.This shows that neural networks make errors in bothdirections from their confusion matrices whereasnearest-neighbour models show one direction error.The mistakes made by the neural network exacerbate asnoise is increased with a steep slope.
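The two nearest-neighbour decision rules compared above can be summarised in a short sketch. This is illustrative only: it assumes Euclidean distance, reads model-2 as "assign the class whose k nearest members have the smallest average distance", and resolves model-1 ties in favour of the tied class with the smaller mean distance; these details, and the names `X_train` and `y_train`, are assumptions for illustration rather than the authors' exact implementation.

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=3, model=1):
    """Sketch of the two nearest-neighbour models (illustrative only)."""
    d = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each training pattern
    classes = np.unique(y_train)
    if model == 2:
        # Model-2: for each class, average the distances of that class's
        # k nearest members; assign the class with the smallest average.
        avg = {c: np.sort(d[y_train == c])[:k].mean() for c in classes}
        return min(avg, key=avg.get)
    # Model-1: majority vote among the k nearest neighbours overall.
    nearest = np.argsort(d)[:k]
    votes = {c: int(np.sum(y_train[nearest] == c)) for c in classes}
    top = max(votes.values())
    tied = [c for c, v in votes.items() if v == top]
    if len(tied) == 1:
        return tied[0]
    # Tie between classes with equal neighbour counts: resolve by the
    # smaller mean distance among each tied class's voting neighbours
    # (one plausible resolution rule).
    mean_d = {c: d[nearest][y_train[nearest] == c].mean() for c in tied}
    return min(mean_d, key=mean_d.get)
```

For example, with two one-dimensional classes, `knn_predict(np.array([0.05]), X_train, y_train, k=3, model=1)` votes among the three closest training patterns, while `model=2` compares per-class average distances instead.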

5. Conclusion

In this paper we suggested two models of nearest-neighbour classifiers and applied them to image understanding data. Our data was obtained by texture analysis of natural scenes and shows the properties of most real data in similar contexts, i.e. it is non-linear, strongly correlated across various features and overlapping across various classes. The results shown in this paper for our two nearest-neighbour models are extremely encouraging. The results were analysed in comparison with neural networks on the same data. One limitation of the study is the imbalance between the patterns of the various classes, e.g. there are many more patterns for vegetation than for the other classes. In future studies we propose to address this by developing synthetic data modelled on the available real data. Such a database of synthetic data will allow further investigation with a range of classifiers including neural networks. We hope that this research contribution has highlighted the role of nearest-neighbour methods in areas where noise management in data is an important factor. We are confident that further development along the lines suggested in this paper will lead to even more accurate and sophisticated classifiers in the image understanding domain.

Appendix A

Nearest-neighbour (k = 1) confusion matrices for 10 trials with increasing noise in the test data. Rows are true classes and columns are assigned classes (G = grass, T = trees, S = sky, Rs = river reflecting sky, Rt = river reflecting trees). R1 and R2 denote the model-1 and model-2 recognition rates, which are identical for k = 1.

        G     T    S   Rs   Rt
G    1924     0    0    0    0
T       0  1034    0    0    0
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      0     0    0    0  321
Noise = 1%: R1 = 100%, R2 = 100%

        G     T    S   Rs   Rt
G    1923     1    0    0    0
T       1  1033    0    0    0
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      0     0    0    0  321
Noise = 2%: R1 = 99.9%, R2 = 99.9%

        G     T    S   Rs   Rt
G    1914    10    0    0    0
T       9  1025    0    0    0
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      0     0    0    0  321
Noise = 3%: R1 = 99.4%, R2 = 99.4%

        G     T    S   Rs   Rt
G    1899    25    0    0    0
T      19  1015    0    0    0
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      1     0    0    0  320
Noise = 4%: R1 = 98.8%, R2 = 98.8%

        G     T    S   Rs   Rt
G    1880    36    0    0    8
T      39   995    0    0    0
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      1     0    0    0  320
Noise = 5%: R1 = 97.7%, R2 = 97.7%

        G     T    S   Rs   Rt
G    1869    42    0    0   13
T      65   968    0    0    1
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      6     0    0    0  315
Noise = 6%: R1 = 96.6%, R2 = 96.6%

        G     T    S   Rs   Rt
G    1835    67    0    0   22
T      95   935    0    0    4
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt      8     4    0    0  309
Noise = 7%: R1 = 94.7%, R2 = 94.7%

        G     T    S   Rs   Rt
G    1808    77    0    0   39
T     139   886    0    0    9
S       0     0  273    0    0
Rs      0     0    0  225    0
Rt     14     8    0    0  299
Noise = 8%: R1 = 92.3%, R2 = 92.3%

        G     T    S   Rs   Rt
G    1775    93    0    0   56
T     164   857    0    0   13
S       0     0  273    0    0
Rs      0     0    2  223    0
Rt     25     9    0    0  287
Noise = 9%: R1 = 90.4%, R2 = 90.4%

        G     T    S   Rs   Rt
G    1728   125    0    0   71
T     179   833    0    0   22
S       0     0  273    0    0
Rs      0     0    2  223    0
Rt     37    18    0    0  266
Noise = 10%: R1 = 87.9%, R2 = 87.9%
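Each recognition rate quoted beneath a matrix is simply the number of correctly classified patterns (the diagonal) divided by the total number of test patterns. As a quick check, a sketch using the 10%-noise nearest-neighbour matrix above:

```python
import numpy as np

# Rows = true class, columns = assigned class (G, T, S, Rs, Rt);
# counts taken from the 10%-noise table in Appendix A.
cm = np.array([
    [1728, 125,   0,   0,  71],
    [ 179, 833,   0,   0,  22],
    [   0,   0, 273,   0,   0],
    [   0,   0,   2, 223,   0],
    [  37,  18,   0,   0, 266],
])

# Recognition rate = diagonal (correct) count / total test patterns.
rate = 100.0 * np.trace(cm) / cm.sum()
print(round(rate, 1))  # prints 88.0
```

The result is consistent with the roughly 87.9% figure the text reports for the noisiest trial (small differences come from rounding).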

Appendix B

Neural network confusion matrices for ten trials with increasing noise in the test data. Rows are true classes and columns are assigned classes (G = grass, T = trees, S = sky, Rs = river reflecting sky, Rt = river reflecting trees).

        G     T    S   Rs   Rt
G    1662   178    2    2   80
T     109   905    0    0   20
S       0     1  272    0    0
Rs      1     1    3  219    1
Rt      9     2    1    0  309
Noise 1%: 89.14% classification

        G     T    S   Rs   Rt
G    1598   209    2    1  106
T     151   846    0    0   37
S       0     1  272    0    0
Rs      2     1    2  219    1
Rt     23     6    1    0  291
Noise 2%: 85.35% classification

        G     T    S   Rs   Rt
G    1485   295    2    1  141
T     207   762    0    0   65
S       0     1  272    0    0
Rs      1     2    4  217    1
Rt     43    25    1    0  252
Noise 3%: 79.11% classification

        G     T    S   Rs   Rt
G    1368   366    2    1  187
T     246   709    0    0   79
S       0     1  269    3    0
Rs      1     2    6  215    1
Rt     66    44    0    0  211
Noise 4%: 73.39% classification

        G     T    S   Rs   Rt
G    1289   420    2    2  211
T     269   668    0    1   96
S       2     1  265    5    0
Rs      4     2    9  210    0
Rt     93    52    0    0  176
Noise 5%: 69.04% classification

        G     T    S   Rs   Rt
G    1195   484    1    2  242
T     287   637    0    1  109
S       2     1  261    9    0
Rs      6     2    9  208    0
Rt    110    63    0    0  148
Noise 6%: 64.83% classification

        G     T    S   Rs   Rt
G    1132   517    2    2  271
T     312   592    1    1  128
S       3     2  254   14    0
Rs      7     2    9  207    0
Rt    114    76    0    0  131
Noise 7%: 61.31% classification

        G     T    S   Rs   Rt
G    1101   549    2    2  270
T     335   564    1    1  133
S       4     4  249   16    0
Rs      7     3    9  206    0
Rt    122    82    0    0  117
Noise 8%: 59.22% classification

        G     T    S   Rs   Rt
G    1077   575    2    4  266
T     356   540    1    1  136
S       5     6  232   29    1
Rs      9     4   11  201    0
Rt    124    90    0    0  107
Noise 9%: 57.10% classification

        G     T    S   Rs   Rt
G    1048   592    2    4  278
T     379   506    1    1  147
S       9     7  219   37    1
Rs      9     4   11  201    0
Rt    132    94    0    0   95
Noise 10%: 54.77% classification
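The appendices report performance on test data contaminated with 1–10% noise, but the paper does not state how the noise was generated. The sketch below shows one common choice, additive zero-mean Gaussian noise scaled to a percentage of each feature's standard deviation; the function name, the noise model itself, and the `X_test` placeholder are assumptions for illustration, not the authors' procedure.

```python
import numpy as np

def contaminate(X, percent, rng=None):
    """Add zero-mean Gaussian noise scaled to `percent` of each feature's
    standard deviation. This is one plausible reading of "p% noise"; the
    paper does not specify its noise model."""
    rng = np.random.default_rng() if rng is None else rng
    scale = (percent / 100.0) * X.std(axis=0)       # per-feature noise level
    return X + rng.normal(0.0, 1.0, X.shape) * scale

# e.g. ten noisy test sets at 1%..10%, mirroring the appendix trials,
# where X_test would be the (n_patterns, 42) feature matrix:
# noisy_sets = [contaminate(X_test, p) for p in range(1, 11)]
```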

References

[1] Y. Kodratoff, S. Moscatelli, Machine learning for object recognition and scene analysis, Int. J. Pattern Recognition Artif. Intell. 8 (1) (1994) 259–304.

[2] A. Rosenfeld, Image analysis and computer vision: 1997 [survey], Comput. Vision Image Understanding 70 (2) (1998) 239–373.

[3] J. Skrzypek, E. Mesrobian, D. Gunger, Neural networks for computer vision: a framework for specifications of a general purpose vision system, Proc. SPIE 1076 (1989) 16.

[4] K. Yamamoto, Future directions in computer vision and image understanding: ETL perspectives, Proceedings of the 10th ICPR Conference, Atlantic City, Vol. 1, 1990, pp. 32–37.

[5] C. Kohl, J. Mundy, The development of the image understanding environment, Proceedings of the IEEE CVPR Conference, Seattle, 1994, pp. 443–447.

[6] O. Firschein, Defence applications of image understanding, IEEE Expert 10 (5) (1995) 11–17.

[7] R.L. Simpson, Computer vision: an overview, IEEE Expert 6 (4) (1991) 11–15.

[8] N.W. Campbell, W.P.J. Mackeown, B.T. Thomas, T. Troscianko, Interpreting image databases by region classification, Pattern Recognition 30 (4) (1997) 555–563.

[9] H. Liu, D.Y.Y. Yun, Adaptive image segmentation by quantisation, Proc. SPIE 1766 (1992) 322–332.

[10] H.M. Lakany, E.G. Schukat-Talamazzini, H. Niemann, Object recognition from 2D images using Kohonen self-organised feature maps, Pattern Recognition Image Anal. 7 (3) (1997) 301–308.

[11] T. Kasparis, G. Eichmann, M. Georgiopoulos, G.L. Hieleman, Image pattern algorithms using neural networks, Proc. SPIE 1297 (1990) 298–306.

[12] R. Booth, C.R. Allen, A neural network implementation for real-time scene analysis, Proc. SPIE 1001 (2) (1988) 1086–1092.

[13] J.R. Beveridge, J. Griffith, R.R. Kohler, A.R. Hanson, E.M. Riseman, Segmenting images using localised histograms and region merging, Int. J. Comput. Vision 2 (3) (1989) 311–347.

[14] S. Dellepiane, G. Vernazza, A fuzzy approach to cue detection and region merging for image segmentation, in: V. Cantoni, R. Creutzburg, S. Levialdi, G. Wolf (Eds.), Recent Issues in Pattern Analysis and Recognition, Springer, Berlin, 1989, pp. 58–64.

[15] J.T. Allen, H.S. Porter, Alternatives for the image segmentation problem, Proceedings of the Southcon/90 Conference, Electr. Conventions Manage, 1990, pp. 138–143.

[16] F. Sadjadi, Performance evaluation of a texture-based segmentation algorithm, Proc. SPIE 1483 (1991) 185–195.

[17] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press, New York, 1999.

[18] S. Singh, A single nearest neighbour fuzzy approach for pattern recognition, Int. J. Pattern Recognition Artif. Intell. 13 (1) (1999) 49–54.

[19] J.F. Haddon, J.F. Boyce, Integrating spatio-temporal information in image sequence analysis for the enforcement of consistency of interpretation, Digital Signal Processing 8 (4) (1998) 284–293.

[20] J.F. Haddon, J.F. Boyce, Image segmentation by unifying region and boundary information, IEEE Trans. Pattern Anal. Mach. Intell. 12 (10) (1990) 929–948.

[21] R.M. Haralick, K. Shanmugan, I. Dinstein, Texture features for image classification, IEEE SMC-3 (6) (1973) 610–621.

[22] R.M. Haralick, Image texture survey, in: P.R. Krishnaiah, L.N. Kanal (Eds.), Handbook of Statistics, Vol. 2, 1982, pp. 399–415.

[23] J.F. Haddon, J.F. Boyce, Co-occurrence matrices for image analysis, IEE Electron. Commun. Eng. J. 5 (2) (1993) 71–83.

[24] J.F. Haddon, J.F. Boyce, Spatio-temporal relaxation labelling applied to segmented infrared image sequence, Proceedings of the 13th International Conference on Pattern Recognition, IEEE Press, Austria.

[25] J.F. Haddon, J.F. Boyce, Texture classification of segmented regions of FLIR images using neural networks, Proceedings of the First International Conference on Image Processing, Texas, 1994.

[26] J.F. Haddon, Adaptive scene analysis, Proceedings of the Workshop on Advanced Concepts for Intelligent Vision Systems (ACIVS'99), Baden-Baden, 1999.

[27] S. Singh, Effect of noise on generalisation in massively parallel fuzzy systems, Pattern Recognition 31 (11) (1998) 25–33.

[28] S. Singh, J.F. Haddon, M. Markou, Nearest neighbour strategies for image understanding, Proceedings of the Workshop on Advanced Concepts for Intelligent Vision Systems (ACIVS'99), Baden-Baden, 1999.

About the Author: SAMEER SINGH was born in New Delhi, India and graduated from the Birla Institute of Technology, India with a Bachelor of Engineering degree with distinction in Computer Engineering. He received his Master of Science degree in Information Technology for Manufacturing from the University of Warwick, UK, and a Ph.D. in speech and language analysis of stroke patients from the University of the West of England, UK. His main research interests are in image processing, medical imaging, neural networks and pattern recognition. He is the Director of the Pattern Analysis and Neural Networks group at Exeter University. He serves as the Editor-in-Chief of the Pattern Analysis and Applications journal published by Springer, Editor-in-Chief of the Springer book series 'Advances in Pattern Recognition', Chairman of the British Computer Society specialist group on Pattern Analysis and Robotics, Editorial Board member of the Neural Computing and Applications journal, and Editorial Board member of the Perspectives in Neural Computing book series by Springer. He is a Fellow of the Royal Statistical Society, and a Member of BMVA-IAPR, IEE and IEEE.

About the Author: DR. JOHN HADDON is the Principal Scientist at the Image Processing and Decomposition Techniques Section, Weapons System Sector at DERA, Farnborough. Dr. Haddon's main research interest lies in the area of image processing techniques with a strong mathematical basis for the analysis of infrared images for military targeting applications. Dr. Haddon works closely with the PANN laboratory as an Honorary Research Fellow at the University of Exeter. Dr. Haddon has been the UK representative and alternate Chairman on NATO AC/243 (Panel 3) on image processing and the UK member on TTCP WTP-7 (Guidance, Control and Fuzing)/KTA-2 on neural network technology applied to weapon seekers. Dr. Haddon has published a large number of research papers in international conferences and journals in the area of image processing (Haddon 1990–1998).

About the Author: MARKOS MARKOU was born in Larnaca, Cyprus, where he obtained his B.Sc. in computing at P.A. College. He joined the Department of Computer Science at the University of Exeter in October 1998 for the M.Sc. in New Generation Computing. His research was in the area of 'Intelligent Scene Analysis' and covered the classification of objects found in outdoor scenes using a variety of statistical classifiers, such as different types of nearest-neighbour methods, and biologically inspired classifiers such as neural networks. After completing his master's degree, he decided to continue his research with a Ph.D. at Exeter in the same area.
