Spectral and rank order approaches to texture analysis

Stefano Fioravanti, Roberto Fioravanti, Francesco G. B. De Natale
Dipartimento di Ingegneria Elettronica e Biofisica, Università di Genova, Via All'Opera Pia 11A, 16145 Genova - Italy

Radek Marik, Majid Mirmehdi, J. Kittler, M. Petrou
Department of Electronic and Electrical Engineering, University of Surrey, Guildford GU2 5XH - United Kingdom

Abstract. There are two major approaches to texture analysis, both supported by physiological evidence: those based on the spatial statistics of the textured image, and those based on its spectral properties. One of the most sophisticated spectral approaches to texture is that based on the Wigner distribution, where the attributes computed for each pixel encapsulate both the local spectral and phase properties of the local Fourier transform in a unique real spectrum. On the other hand, some of the most efficient methods which operate in the spatial domain alone are those based on rank order functions. Before one embarks on the use of the sophisticated methods, it is worth exploring the efficient ones to the limit of their performance. In this paper we investigate these two major approaches and compare their performance both in terms of quality of results and efficiency. The problem we consider is that of detecting defective blobs and cracks on complex textural backgrounds. We show that in most cases rank order approaches can perform well, although no unique method can be employed for both types of defects. On the other hand, the Wigner approach with a very small modification can cope with both types of defects and handle even the identification of very subtle cracks. Thus, it seems that for any real time inspection system, the rank order approaches should form the front end, with the more sophisticated methods coming into play when necessary.

1. INTRODUCTION

With the advancement of electronic means of communication, the interrogation of databases and the distant inspection of facilities, safety critical components and product quality are becoming increasingly popular. In a sense, a person's queries which are normally answered by the input of his/her senses will increasingly have to be answered by the output of a computerised sensor that could operate some distance away. Thus, one expects that an automatic system of visual inspection may come into play when the relevant question is asked. For such systems to become acceptable in the market place, near human performance is required for all their functions. One of the most intricate functions they are expected to perform is that of inspection of patterned surfaces. This is the problem we are dealing with in this paper. More specifically, we are interested in one of the most subtle versions of the problem, namely that of defect detection in randomly textured images.

However, before one proceeds to identify diversions from a certain perceived pattern, one has first to decide how to represent that pattern in computer language. Thus, the issue of texture representation becomes of paramount importance when it comes to texture defect detection.

Texture has received considerable attention in the image analysis literature. A number of distinct approaches have been suggested for texture representation, falling into two major classes: those based on the spatial statistics of the textured image, and those based on its spectral properties. Among the methods that operate in the spatial domain, the most restrictive in terms of the assumptions they make about the textural properties they can characterize are the structural methods [1, 2], which are based on the view that texture is a regular pattern generated by a repeated placement of well defined textural primitives. Early psychophysical studies [3, 4] motivated the development of methods exploiting second order image statistics, exemplified by the co-occurrence matrix representation [5]. The generative methods attempt to represent texture by models which can reproduce statistically similar textural characteristics [6, 9]. The last category of methods in this class is based on rank order statistics or mathematical morphology [10].

Vol. 6, No. 3 May-June 1995 287

On the other hand, neurophysiological studies support the view that representation in the human vision system involves a Fourier-like decomposition of the visual stimulus into spatial frequency components [11]. These studies had a seminal influence on the development of spectral representations of texture in the form of either the energy of the outputs of a bank of filters tuned to different spatial frequency bands [12, 14] or the power spectrum itself. One of the most sophisticated spectral approaches to texture is that based on the Wigner distribution, where the attributes computed for each pixel encapsulate both the local spectral and phase properties of the local Fourier transform in a unique real spectrum. Thus, they achieve a spatial/spatial frequency representation of the texture pattern. This method is reputed to be very powerful, but computationally very expensive. That is why, before one embarks on the use of the sophisticated methods, it is worth exploring the efficient ones to the limit of their performance. Some of the most efficient methods which operate in the spatial domain alone are those based on rank order functions.

Thus, in this paper we explore these two major approaches on the same problem and the same set of images and compare their performance both in terms of quality of results and efficiency. In particular, the problem we consider is that of detecting hair-like cracks on complex textural backgrounds. The experimental comparisons will be carried out on Brodatz textures with cracks superimposed on them and on ceramic and granite tile textures with natural cracks.

The paper is organised as follows. In section 2 we introduce the spectral approach and adopt the Wigner distribution as its most powerful realization. The rank order based approach is developed in section 3, where for crack detection we focus on an adaptive rank order filtering scheme in which the normal textural properties are brought to bear on the filter design. Section 4 collates the experimental results and provides a comparative study. Summary and conclusions are given in section 5.

2. WIGNER DISTRIBUTION REPRESENTATION

The short time Fourier transform is a commonly used method with which one can compute the frequency content of an image in the vicinity of a pixel by placing a window around it and taking the Fourier transform of the windowed function. The problem with this approach is that the Fourier transform produced this way is a complex array and usually only its magnitude (i.e. the spectrum) is used to associate a set of frequency domain features to each pixel. The Wigner distribution, on the other hand, produces a real valued set of features which encapsulate both the magnitude and the phase information that characterize a signal in the frequency domain. This is achieved by first creating a symmetric function from the signal and then taking its Fourier transform which, as a consequence, is real.

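The Wigner distribution was initially defined as a conjoint time and time frequency representation of an infinite one dimensional signal [15, 17]. Its two dimensional extension suggested in [18] is defined by:

$$WD(x, y, \xi, \zeta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f\left(x+\frac{\alpha}{2}, y+\frac{\beta}{2}\right) f^{*}\left(x-\frac{\alpha}{2}, y-\frac{\beta}{2}\right) \exp\left[-j(\xi\alpha+\zeta\beta)\right] d\alpha\, d\beta = 4\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x+\alpha, y+\beta)\, f^{*}(x-\alpha, y-\beta) \exp\left[-2j(\xi\alpha+\zeta\beta)\right] d\alpha\, d\beta \qquad (1)$$

where $f(x, y)$ is a two dimensional image function, $f^{*}(x, y)$ its complex conjugate, $\xi$ and $\zeta$ are the angular frequencies in the $x$ and $y$ directions respectively and $\alpha$, $\beta$ are spatial displacement parameters.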

In the above expression the image function $f(x, y)$ is treated like a continuous function. In reality of course, we have a sampled version of it from which the continuous function must be reconstructed:

$$f_s(x, y) = f(x, y)\,\mathrm{comb}(x, y, \Delta x, \Delta y) = \sum_{r=-\infty}^{\infty}\sum_{s=-\infty}^{\infty} f(r\Delta x, s\Delta y)\,\delta(x - r\Delta x, y - s\Delta y) \qquad (2)$$

where $\Delta x$ and $\Delta y$ are the sampling intervals. The discrete Wigner distribution then can be shown to be:

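$$WD_s(x, y, u, v) = \sum_{r}\sum_{s}\sum_{m}\sum_{n} f[m\Delta x, n\Delta y]\, f^{*}[(r-m)\Delta x, (s-n)\Delta y]\; \delta\left(x - r\frac{\Delta x}{2}, y - s\frac{\Delta y}{2}\right) \exp\left\{-j2\pi\left[u(2m-r)\Delta x + v(2n-s)\Delta y\right]\right\} \qquad (3)$$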

In the above expression $u$ and $v$ are frequencies. Something we notice straight away about the Wigner distribution is that the function whose Fourier transform we compute is the product of the original function with itself. This means that the resultant function has a Fourier transform which is the convolution of the Fourier transform of the initial function with itself. Thus, if we assume that the initial function was band-limited, the new function will still be band-limited but within a much wider band. More precisely, from the exponential term in the defining equation, we can observe that the Wigner distribution of the image function is sampled at double the resolution ($\Delta x/2$, $\Delta y/2$) of the normal sampling interval ($\Delta x$, $\Delta y$) of the discrete Fourier transform. Hence, the image must be oversampled. Generally, $\Delta x$, $\Delta y$ is one pixel; therefore, for a given image, twice as many pixels are needed to compute the discrete Wigner distribution as are needed to compute the discrete Fourier transform. Thus, in order to avoid aliasing, we must sample the image at the ordinary sampling rate ($\Delta x$, $\Delta y$) after it has been processed with a low pass filter. Alternatively, we can interpolate pixel samples to obtain an alias-free distribution.

288 ETT

However, in the case of crack detection, we do not want to low pass filter the image. This is because the local information that is associated with the crack pixels is essential, and low pass filtering the image will remove it and blur the cracks. We also want to avoid interpolating the image to provide more samples, as calculating a Wigner distribution is already computationally very expensive. Furthermore, one should note that the crack detection algorithm that we develop is training based, so the aliasing effect on the Wigner distribution is learned during the training phase. Therefore, the multiplicative constant of value 2 in the exponential term of eq. (1) will not be considered in this paper. This means that the image function will be sampled at the normal sampling rate ($\Delta x$, $\Delta y$), so that a faster version of the Wigner distribution can be calculated using Fast Fourier Transform (FFT) techniques.

Since the limits of integration in the definition of the Wigner distribution are infinite, it cannot be computed directly. Accordingly, Martin et al. [19] introduced a computable approximation to the Wigner distribution that they called the pseudo-Wigner distribution, the 2D extension of which is defined as:

$$PWD(x, y, u, v) = \sum_{\alpha=-N}^{N}\sum_{\beta=-N}^{N} H(\alpha, \beta) \sum_{r=-M}^{M}\sum_{s=-M}^{M} g(r, s)\, f(x+r+\alpha, y+s+\beta)\, f^{*}(x+r-\alpha, y+s-\beta)\, \exp\left[-2\pi j\,\frac{u\alpha + v\beta}{P}\right] \qquad (4)$$

where

$$u, v = 0, \pm 1, \ldots, \pm N$$

$$P = 2N + 1$$

and $H(\alpha, \beta)$ and $g(r, s)$ are windowing functions, with $(2N+1) \times (2N+1)$ and $(2M+1) \times (2M+1)$ being the sizes of the corresponding windows. It is desirable to choose windowing functions that eliminate or reduce the undesirable effects of aliasing and of the Gibbs phenomenon due to sampling and truncation. In the Fourier domain a windowing function should be a reasonable approximation of an impulse (delta) function, with a compromise between making the width of the main lobe as small as possible and the amplitude of the ripple side lobes as small as possible. A prolate spheroidal wave function, which is optimal in spectral energy within a specific bandwidth, is the best candidate [20]. Kaiser [21] has shown that in the one dimensional case, the prolate spheroidal wave function may be well approximated by the modified Bessel function of zero order, appropriately scaled. The Bessel function approximation is nearly optimal and much easier to compute than the prolate spheroidal wave function.
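As an illustration of how eq. (4) maps onto the FFT, the following is a minimal sketch rather than the implementation used in this work. It assumes that no local averaging is applied (so $g$ reduces to a single sample), that the pixel lies at least $N$ samples away from the image border, and that the window `window` plays the role of $H$; the function name `pseudo_wigner_at` and the default uniform window are illustrative choices only.

```python
import numpy as np

def pseudo_wigner_at(image, x, y, N, window=None):
    """Local pseudo-Wigner spectrum at pixel (x, y) over a (2N+1)x(2N+1) kernel.

    Sketch only: g(r, s) is taken as a single sample (no averaging) and the
    pixel is assumed to be at least N samples away from the image border.
    """
    P = 2 * N + 1
    if window is None:
        window = np.ones((P, P))          # assumption: uniform H if none is given
    # f(x + a, y + b) for a, b = -N..N
    fwd = image[x - N:x + N + 1, y - N:y + N + 1]
    # f(x - a, y - b): the same patch flipped about its centre
    bwd = fwd[::-1, ::-1]
    prod = window * fwd * np.conj(bwd)
    # Move the kernel centre (a = b = 0) to index 0 so that the DFT matches
    # exp[-2*pi*j*(u*a + v*b)/P], then put the dc term back at the centre.
    spec = np.fft.fft2(np.fft.ifftshift(prod))
    return np.fft.fftshift(spec).real     # imaginary part is rounding noise only
```

The returned array is indexed so that its central element is the dc component ($u = v = 0$). For a real image and a symmetric window the result is symmetric about that centre, which is why only half of the components need to be kept as features.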

Hence, following [22], the $H(\alpha, \beta)$ windowing function in the pseudo-Wigner distribution is chosen to be a Kaiser window and is defined as:

where

and $-N \le k, l \le N$

with $(2N+1) \times (2N+1)$ being the size of the kernel, which is zero outside this region. $\gamma$ is the parameter that governs the trade off between the main lobe width and the side lobe ripple amplitude of the spectrum. Typical values of $\gamma$ are in the range $4 \le \gamma \le 9$.
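A Kaiser window of this size can be generated with standard numerical libraries. The sketch below uses NumPy's one dimensional `np.kaiser` and forms a two dimensional kernel as a separable outer product; the separable form and the function name `kaiser_window_2d` are assumptions made for illustration, and the two dimensional window actually used may differ.

```python
import numpy as np

def kaiser_window_2d(N, gamma):
    """(2N+1)x(2N+1) Kaiser kernel built as an outer product (assumed form)."""
    w = np.kaiser(2 * N + 1, gamma)   # 1-D window: I0(gamma*sqrt(1-(k/N)^2)) / I0(gamma)
    return np.outer(w, w)

# Example: a 7x7 kernel with gamma in the typical range 4..9
H = kaiser_window_2d(3, 6.0)
```

Larger values of $\gamma$ give lower side lobes at the price of a wider main lobe.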

The other windowing function $g(r, s)$ appearing in eq. (4) allows local averaging. Any averaging will smooth the signal and may blur the crack we wish to detect to the point of making it undetectable. Thus, in this paper this function was not used at all. To keep to the proper formalism, we may say that we chose a rectangular data window defined as:

(7)

where $r$ and $s$ are integers, $\delta$ is Kronecker's delta and $i$, $j$ are also integers that take values such as to identify positions within the smoothing window of size $(2M+1) \times (2M+1)$ around pixel $(r, s)$.

2.1. Texture crack detection algorithm

The crack detector that we describe here is based on the work described in [23] and is able to detect cracks on random or regular textural backgrounds. Basically, it consists of three parts:

- System training for the learning of the underlying texture.
- Analysis of the test image and calculation of the Mahalanobis distance map.
- Post-processing to isolate the crack pixels.

In the first stage, the pseudo-Wigner spectrum at each pixel position of a defect-free image is calculated. Each local Wigner spectrum is normalised to have unit dc spectral component. In other words, the absolute magnitudes of the spectral components are not used, as they are in many other cases [24, 25]. This is because it was noticed that the information needed for the detection of cracks was best encapsulated by the general shape of the spectrum and not necessarily by the exact value of each Wigner spectral component. The Wigner distribution is a real function and the phase information is implicitly encapsulated in the negative parts of the spectrum. Therefore, we do not lose any phase information after normalisation. The normalised amplitude of each spectral component is then considered as our local texture feature, and only half of those features need to be retained due to symmetry. Generally, defective pixels can be isolated in the feature space by using some sort of optimal distance measure from the distribution of the pixels of the underlying texture. The Mahalanobis distance seems to be appropriate. However, when the covariance matrix of the distribution was computed, it was found to be singular, an indication that the features used were not independent. It became obvious, therefore, that a new set of features needed to be computed such that the covariance matrix in the feature space was invertible.

Let us denote by $f$ the local feature vector associated with each pixel of the defect-free image and by $\Sigma$ the covariance matrix of their distribution. We can diagonalize $\Sigma$ by writing:

$$\Sigma = U \Lambda U^{T} \qquad (8)$$

where $U$ is the matrix made up from the eigenvectors of $\Sigma$ used as columns, $U^{T}$ is its transpose and $\Lambda$ is a diagonal matrix of the eigenvalues of $\Sigma$ arranged in descending order of magnitude along its diagonal. Suppose now that we retain only the $m$ largest eigenvalues and we set the rest to zero. The corresponding transformation matrix $\tilde{U}$ will then consist of the corresponding $m$ eigenvectors only.

We can thus define new feature vectors $\tilde{f}$ assigned to each pixel by using the linear transformation matrix $\tilde{U}^{T}$:

$$\tilde{f} = \tilde{U}^{T} f \qquad (9)$$

The new feature vector $\tilde{f}$ consists of $m$ components only, which are uncorrelated with each other and encapsulate the most important features of the distribution.

In the second stage, the local texture features are computed from the test image as described in the training phase. In the new feature space with the reduced dimensionality, we can use the Mahalanobis distance to measure the distance of each pixel of the test image from the cluster of pixels of the training image. The new covariance matrix of the distribution is the truncated matrix $\tilde{\Lambda}$. Thus we can create a residual map of the test image which contains the Mahalanobis distance of each pixel from the distribution of the defect-free image in the feature space. Let $d$ be the Mahalanobis distance function. Then for each pixel location $[i, j]$, we have

$$d[i, j] = \left(\tilde{f}[i, j] - \tilde{M}\right)^{T} \tilde{\Lambda}^{-1} \left(\tilde{f}[i, j] - \tilde{M}\right) \qquad (10)$$

where $\tilde{M}$ is the transformed mean feature vector of the underlying texture. Clearly, pixels with large distance measures are potential crack pixels. One could simply threshold the distance map to isolate the defective pixels. This, however, does not create a very clean output, and in some applications one may need to identify the crack lines accurately. Hence, some post-processing is necessary, and this is described in the next subsection.
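The training and analysis stages described above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes the per-pixel features have already been stacked into an array of shape (number of pixels, number of features), it uses the quadratic form of the Mahalanobis distance, and the names `train_texture_model` and `mahalanobis_map` are hypothetical.

```python
import numpy as np

def train_texture_model(features, m):
    """Learn mean, the m leading eigenvectors and eigenvalues from defect-free features."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)          # may be singular, as noted in the text
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]         # indices of the m largest
    return mean, eigvecs[:, order], eigvals[order]

def mahalanobis_map(features, mean, U_m, lam_m):
    """Quadratic-form Mahalanobis distance of each feature vector in the reduced space."""
    proj = (features - mean) @ U_m                # transformed features minus transformed mean
    return np.sum(proj ** 2 / lam_m, axis=1)      # truncated covariance is diagonal
```

Because the retained components are uncorrelated, inverting the truncated covariance matrix reduces to dividing by the retained eigenvalues. The resulting vector of distances can be reshaped to the image grid to form the residual map.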

2.2. Optimal line filtering

The post-processing method we used on the residual map is the optimal line filtering approach. Given the assumption that cracks are mostly generated by a sudden exertion of external force or by material fatigue, the crack features embedded in the Mahalanobis map should have a dominant orientation, that is, horizontal or vertical, instead of being spiral in shape. We can then convolve the Mahalanobis distance map with a line filter in the direction normal to the basic orientation of the linear features. The orientation is estimated by comparing the variances of responses computed from the distance map in the horizontal and vertical directions. Obviously, the direction that has the smallest variance is the basic orientation of the linear features.

The line filter that we used [26] is a one dimensional directional filter which detects lines. All linear features with widths within a factor of 2 of the width of the feature for which the filter is optimal can be detected. The filter parameters are designed optimally by modelling the intensity profile of the linear features in the Mahalanobis distance map and maximising a composite performance criterion [26]. When a local maximum in the output is detected, a hypothesis is generated that there is a linear feature passing through it. Since the filter is developed around the assumption that the linear feature we want to detect is adequately described by a certain model, we know what sort of output is expected from the filter when a true linear feature is encountered. Thus, when the hypothesis of the presence of a linear feature is generated, a template of the expected filter response is invoked and a matching procedure similar to a $\chi^2$ test is applied. If the value of the residual of this template matching is below a certain threshold, a linear feature is marked. The strength of the linear feature marked is calculated as the difference between the response of the filter at the position of the central pixel and the average response of the filter at two neighbouring positions symmetrical about the centre, where the expected response is known to have another local extremum of the opposite sign from the central one. This number is considered to be the contrast of the linear feature. Subsequently, hysteresis thresholding is applied to these contrast values.

The filters as described in [26] are one dimensional, i.e. they are only 1 pixel wide. In some cases the result could be improved if some smoothing was applied in the direction orthogonal to the direction of convolution, before the convolution with the line filter. The line filter effectively smooths the signal (in the direction of convolution) and at the same time estimates its second derivative in the same direction. It was therefore considered most appropriate to use, for the smoothing alone, the line filter integrated twice. Such a filter would be expected to be "optimal" for smoothing, in the sense that it would preserve the linear feature to be detected as well as possible. Thus, what we effectively do is to convolve the image with a two dimensional filter $h(y) f(x)$ (with $f(x)$ being the line detection filter and $h(x)$ the function $f(x)$ integrated twice), which is separable and thus very efficient. The smoothing filter $h(x)$ is given by:

$$h(x) = \frac{1}{2a^{2}}\left[-K_1 e^{ax}\cos(ax) + K_2 e^{ax}\sin(ax) + K_4 e^{-ax}\cos(ax) - K_3 e^{-ax}\sin(ax)\right] + \dots \qquad (11)$$

for $-d \ge x > -w$

and

$$h(x) = \frac{1}{2a^{2}}\left[-N_1 e^{ax}\cos(ax) + N_2 e^{ax}\sin(ax) + N_4 e^{-ax}\cos(ax) - N_3 e^{-ax}\sin(ax)\right] + \dots + \frac{1}{2} N_5 x^{2} + L_3 x + L_4 \qquad (12)$$

for $0 \ge x > -d$

The values of the constants $L_1$ - $L_4$ are chosen so that the two branches of the solution match smoothly at $x = -d$ and the filter vanishes smoothly at $x = -w$, where $w$ is its half size. All other parameters that appear in the above expression are as defined in [26], and it is beyond the scope of the present work to go into more detail about them. The only parameters that are not discussed in [26] are $L_1$ - $L_4$, and we give their values here in Table 1 for features of varied sharpness (expressed by parameter $s$), calculated for feature half-width $d = 1$ and $l = 10$. This filter should be scaled and used in the same way as the line detection filter described in [26].

It should be noted that the line filter post-processing is only necessary if one wishes to identify the location of the cracks exactly. One may simply threshold the residual map in order to obtain an extended ("thick crack") defective region, which in some cases may be more useful (if, for example, one wishes to have an enclosing polygon for the fault so that in subsequent treatment of the granite slab, such as cutting it into tiles, that region is avoided).

Further, we must mention that when we experimented with cracks which are slightly brighter or slightly darker than the mean grey value, the algorithm appeared to work better for the darker cracks. This is due to the normalisation we apply to the spectra by dividing by the dc component; a low dc component (darker background) tends to accentuate the variability between the defect-free spectra from one location to the next and thus decreases the discriminatory power of each spectral component. We obtained very good quality results for brighter cracks by reversing the polarity of the image. Knowing when to reverse the polarity is not a problem, as cracks usually tend to be either darker or brighter than the background depending on the material. On granite, for example, the cracks tend to be brighter, so for best results the polarity of the image should be reversed from the beginning.

Table 1 - The $L_1$ - $L_4$ values for features of varied sharpness $s$

3. ADAPTIVE RANK ORDER OPERATORS

When there is a need for real-time processing a different approach can be considered: rank order based operators present several advantages in terms of computational complexity, low dependency on texture statistics, and robustness to noise.

Let us consider a generic Rank Order Filter (ROF) [27, 28] with rank $r$ and mask $M$, defined as follows:

$$y_i = E_{r,M}[x_i] = \mathrm{Rank}_{(N-1)r+1}\{x_{i-j} \mid j \in M\} \qquad (13)$$

where the operator $\mathrm{Rank}_n\{X\}$ extracts the $n$-th element of the array $X$ previously ordered in ascending sequence, and $N$ is the number of elements of the mask $M$. It can be observed that common filters like minimum, maximum and median are obtained from eq. (13) by setting the value of $r$ equal to 0, 1, or 0.5 respectively.
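A rank order filter of this kind can be sketched in a few lines for a one dimensional signal. This is an illustrative sketch only: the mask is given as a list of integer offsets $j$, border samples are replicated (an assumption, since border handling is not specified here), and the function name `rank_order_filter` is hypothetical.

```python
import numpy as np

def rank_order_filter(x, offsets, r):
    """1-D rank order filter with rank r in [0, 1] and mask given as offsets j."""
    x = np.asarray(x, dtype=float)
    offsets = np.asarray(offsets)
    # Zero-based position of the element of rank (N-1)r + 1 in the sorted window.
    # Non-integer positions are rounded (an assumption).
    n = int(round((len(offsets) - 1) * r))
    idx = np.arange(len(x))[:, None] - offsets[None, :]   # samples x[i - j], j in M
    idx = np.clip(idx, 0, len(x) - 1)                      # replicate border samples
    return np.sort(x[idx], axis=1)[:, n]

signal = [3, 1, 4, 1, 5, 9, 2, 6]
minimum = rank_order_filter(signal, [-1, 0, 1], 0.0)   # r = 0
median = rank_order_filter(signal, [-1, 0, 1], 0.5)    # r = 0.5
maximum = rank_order_filter(signal, [-1, 0, 1], 1.0)   # r = 1
```

The three calls show how a single routine yields the minimum, median and maximum filters by changing only the rank.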

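Given the formal definition of the ROF, we can introduce the dual filter, defined as a filter with rank $(1 - r)$ and a mask mirrored with respect to the mask center. Calling $D$ the dual of $E$, the following equation holds:

$$D_{r,M}[x_i] = \mathrm{Rank}_{(N-1)(1-r)+1}\{x_{i+j} \mid j \in M\} \qquad (14)$$

On the basis of eq. (14), a formal definition of a generic Rank Order Based Filter (ROBF) with rank $r$ and mask $M$ is expressed by the following:

$$\mathrm{ROBF}_{r,M}[x_i] = D_{r,M}\left[E_{r,M}[x_i]\right] \qquad (15)$$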

It can be noticed that simply by fixing the value of $r$ in eq. (15) it is possible to generate some interesting filters like Opening ($r = 0$) and Closing ($r = 1$) with a generic mask $M$. Such filters, usually called morphological filters, have been extensively used to solve the problem of thin discontinuities addressed in this section and are the subject of the following pages.

3.1. Morphological filtering

Among the various types of morphological filters [29], the operators based on combinations of Closings and Openings have already been used in the past as effective methods to identify image contours [30] or particular structures [31]. In this work we selected the operator given by the difference between a Close and an Open as the best candidate for the identification of thin structures. In the following we will refer to this operator as ClO (Close less Open) to distinguish it from the CO operator, usually defined as the sequential application of Close and Open filters; ClO is simply defined as follows:

$$ClO(I) = Close(I) - Open(I) = Erosion(Dilation(I)) - Dilation(Erosion(I)) \qquad (16)$$

Fig. 1 - Result of a 3 × 3 ClO operator on two bidimensional objects.

The effect of such an operation can be better clarified by the simple example of Fig. 1: at the top, a 3 × 3 ClO is applied to two round shaped objects, while at the bottom the same operation is performed on two thread-like elements. It is evident that, while in the former case the effect of the filtering is minimal since the erosion does not completely eliminate the object, in the latter case the ClO enhances the thin structures. It is to be pointed out that the result would not change whether the objects were dark on a light background, where the thin structures disappear in the Closing operation, or light on a dark background, with the thin structures eliminated by the Opening operation.

Fig. 2 - Result of a 3 × 3 ClO operator on a texture image by Brodatz (left: original image; right: image filtered and magnified by a factor 2).

In more complex situations, the ClO operator in its general formulation is not suitable for two reasons: first, it detects not only the anomalous structures, but also the thin configurations that are normally present in the texture, and second, it is sensitive to noise, in the sense that it enhances regular variations of the intensity value. Such problems result in a non-null output even when the filter is applied to a homogeneous texture, as shown in Fig. 2.
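The behaviour of the ClO operator of eq. (16) on thin and compact objects can be reproduced with a small sketch. It assumes a flat square structuring element and replicated image borders, neither of which is prescribed above, and the helper names (`erosion`, `dilation`, `clo`) are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_reduce(img, size, reducer):
    """Apply a min or max over every size x size neighbourhood (edge-replicated)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))

def erosion(img, size=3):
    return _window_reduce(img, size, np.min)

def dilation(img, size=3):
    return _window_reduce(img, size, np.max)

def clo(img, size=3):
    """Close less Open: Erosion(Dilation(I)) - Dilation(Erosion(I))."""
    closing = erosion(dilation(img, size), size)
    opening = dilation(erosion(img, size), size)
    return closing - opening

# A one pixel wide line is enhanced, a 5 x 5 blob gives no response.
line = np.zeros((9, 9)); line[:, 4] = 1.0
blob = np.zeros((11, 11)); blob[3:8, 3:8] = 1.0
line_response = clo(line)          # equals the line itself
dark_line_response = clo(1 - line) # same response for a dark line on a light background
blob_response = clo(blob)          # all zeros
```

With a square structuring element and a square blob the response is exactly zero; for round objects, as in Fig. 1, a small residual response is to be expected.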

3.2. Adaptive morphological filtering

In this section we will discuss how the introduction of adaptivity criteria is useful to reduce the sensitivity of morphological filtering to noise and to the local variations typical of textures. As the ClO filter is very simple, there are only two elements that can be considered for achieving spatial adaptivity: the dimension of the mask and its shape. The former parameter is the easier to modify, but does not in general produce the result needed: in fact, it often happens that the thickness of the irregularities to be identified is similar in size to the acceptable variations of the texture. The latter is more interesting, for it is possible to generate and utilize an optimized structuring element, that is, a ClO mask that produces a minimum response when applied to an image without irregularities.

In [32] a general solution to the problem of optimum adaptation of the mask in Rank-Order based filtering is proposed. Here, a set of candidates $A$ is fixed a priori, and whether or not the $j$-th element (pixel) of $A$ belongs to the mask is expressed by the sign of a real variable $m_j$. On the basis of such a formulation of the problem, it is possible to define a cost function to be minimized (the goal is to obtain a null response on a regular texture); the method used to perform the optimization is the classical gradient descent algorithm.

Analysing the Salembier solution in detail, some major points of criticism should be expressed; in particular, the definition of the continuous variables $m_j$ simply appears as a mathematical artifice to make the cost function differentiable, but does not change the underlying binary nature of the problem. As a matter of fact, the output of the filter is not sensitive to any variation of the parameter $m_j$ that does not modify its sign: this means that the cost function to be minimized is a step function, and is therefore not suitable for the application of a gradient descent. Fig. 3 shows two examples of cost functions in a 2D domain (for higher dimensions the concept does not change); it is evident that, at each iteration, the direction in which to move is computed by looking at the partial derivatives at the borders of the current plateau. Consequently, the next step will move to another plateau or remain in the same one, depending on the step size, but it is possible to reach only the configurations that have Hamming distance 1. Moreover, due to the high discontinuity of the function, the joint use of the two partial derivatives does not always perform well: Fig. 3b) shows a typical example where this strategy produces the worst possible result.

Fig. 3 - Example of cost functions in a 2D domain.

The proposed algorithm addresses the optimization problem in a more direct way. Two hypotheses are first made:

i) given a minimum number $N_e$ of non-null elements, every algorithm tends to reach this number in a few steps: it is therefore convenient to keep $N_e$ fixed;

ii) since the optimization involves the global image, the computation of the score to be associated to each candidate is based on the average value of the cost function over the whole image.

Given such rules and a generic cost function $J$ that penalizes the higher outputs of the filter, the algorithm proceeds iteratively by searching for the best swap between two candidates of the mask. In other words, at each step all the pairs of inside-mask and outside-mask elements are exchanged, and the corresponding average cost function is evaluated. If the cost of the current configuration is lower than all the new costs, the algorithm ends; otherwise, the best configuration is chosen and another iteration is launched.
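The iterative swap search can be sketched as follows. This is a minimal illustration under stated assumptions: the mask is a boolean vector over the candidate elements, `cost` is any callable returning the average cost of a mask (in practice, the average filter output over a defect-free image), and the search stops when no swap gives a strict improvement. The name `optimise_mask` is hypothetical.

```python
import itertools
import numpy as np

def optimise_mask(initial_mask, cost):
    """Greedy best-swap search that keeps the number of mask elements fixed."""
    mask = np.asarray(initial_mask, dtype=bool).copy()
    best_cost = cost(mask)
    while True:
        inside = np.flatnonzero(mask)
        outside = np.flatnonzero(~mask)
        best_swap = None
        # Try every exchange of one inside-mask and one outside-mask element.
        for i, o in itertools.product(inside, outside):
            trial = mask.copy()
            trial[i], trial[o] = False, True
            c = cost(trial)
            if c < best_cost:                 # strict improvement only (assumption)
                best_cost, best_swap = c, (i, o)
        if best_swap is None:                 # current configuration is the best
            return mask, best_cost
        mask[best_swap[0]], mask[best_swap[1]] = False, True

# Toy cost for illustration only: each candidate element carries a fixed penalty.
weights = np.array([5.0, 1.0, 4.0, 2.0])
best_mask, best_value = optimise_mask([True, False, True, False],
                                      lambda m: float(weights[m].sum()))
```

Because the cost decreases strictly at every accepted swap and the number of configurations is finite, the loop always terminates, although possibly in a local minimum.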

The following example better clarifies this procedure. Suppose we have a 2 × 2 search area (a total of $A = 4$ candidate elements) and want to construct a structuring element of two elements ($N_e = 2$); the situation is represented in a 3D space in Fig. 4. The vertices of the hypercube represent all the possible configurations of the mask, with 1, 2, 3 or 4 non-null elements: a 0 in position $i$ means that the $i$-th element of the mask is not used, while a 1 in the same position means that it has to be considered in the filtering. In particular, the marked vertices represent the configurations with 2 non-null elements that we are interested in. Starting from the vertex [1010], the algorithm will consider all four vertices at distance two, produced by the swapping of two elements (i.e., [0110], [0011], [1001], and [1100]), and will compute the minimum of the cost function: if the starting configuration is the best, then the algorithm is stopped; otherwise, the algorithm is launched again from the new point, and the only remaining configuration (i.e., [0101]) is introduced.
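The configurations listed in this example can be checked by direct enumeration. The short sketch below represents each mask as a bit string; the function name `swap_neighbours` is illustrative.

```python
from itertools import combinations

def swap_neighbours(config):
    """All configurations obtained by exchanging one '1' with one '0'."""
    ones = [i for i, b in enumerate(config) if b == "1"]
    zeros = [i for i, b in enumerate(config) if b == "0"]
    result = []
    for i in ones:
        for o in zeros:
            bits = list(config)
            bits[i], bits[o] = "0", "1"
            result.append("".join(bits))
    return result

start = "1010"
neighbours = swap_neighbours(start)
# All two-element masks over four candidates
all_two = {"".join("1" if i in pair else "0" for i in range(4))
           for pair in combinations(range(4), 2)}
remaining = all_two - set(neighbours) - {start}
```

Of the six two-element masks, four are reachable from [1010] by a single swap, and one is left over for a later iteration.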

In general, if the search area has a dimension $A$ and the non-null elements are $N_e < A$, then at each iteration it will be necessary to calculate $N_e \times (A - N_e)$ new configurations, and therefore $N_e \times (A - N_e)$ values of the cost function. As this way of proceeding is very expensive, a sub-optimal solution yielding good results has been developed, based on a separation of the two operations that compose a swap: in practice, the transition from one mask to another is achieved by first finding the best configuration at distance one from the starting point (raising $N_e$ to $N_e + 1$) and then the best configuration at distance one from the intermediate point. In this way the cost function has to be calculated only $N_e + (A - N_e) = A$ times.
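One move of this sub-optimal scheme can be sketched as follows. The sketch assumes that, in the second step, only the $N_e$ original elements are candidates for removal, which is what gives exactly $A$ cost evaluations per move; the function name `two_step_move` and the toy cost are illustrative only.

```python
import numpy as np

def two_step_move(mask, cost):
    """Add the best outside element, then remove the best original element."""
    mask = np.asarray(mask, dtype=bool)
    inside = np.flatnonzero(mask)
    outside = np.flatnonzero(~mask)

    def with_flip(m, k):
        out = m.copy()
        out[k] = not out[k]
        return out

    # Step 1: A - N_e evaluations, raising the mask size to N_e + 1.
    grown = min((with_flip(mask, o) for o in outside), key=cost)
    # Step 2: N_e evaluations, dropping one of the original elements.
    return min((with_flip(grown, i) for i in inside), key=cost)

# Toy cost for illustration only, with a counter of cost evaluations.
weights = np.array([5.0, 1.0, 4.0, 2.0])
evaluations = []

def toy_cost(m):
    evaluations.append(1)
    return float(weights[m].sum())

first = two_step_move([True, False, True, False], toy_cost)
count_after_first = len(evaluations)      # A = 4 evaluations instead of N_e (A - N_e)
second = two_step_move(first, toy_cost)
```

For a realistic search area the saving is substantial: with $A = 25$ and $N_e = 9$, a full swap search needs $9 \times 16 = 144$ evaluations per iteration, against 25 for the two-step move.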

Fig. 4 - Cost function domain with possible two-element masks pointed out.

Concerning the configuration from which the minimization procedure starts, a reasonable choice is to select a symmetrical central block of the desired dimension in the search area: this does not create any a-priori preference in the evolution of the minimization. To validate this hypothesis experimentally, a number of trials were performed, each with a randomly selected starting point: the results confirmed that the best solution is in general achieved by a symmetrical initialization.
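A symmetrical central starting mask can be built as in the following sketch. The function name `central_mask` is hypothetical, and the sketch assumes that the desired cardinality is a perfect square, so that the starting block is itself square and can be centred exactly in the search area.

```python
def central_mask(side, block):
    """Return a side x side mask with a centred block x block patch of ones."""
    if block > side or (side - block) % 2 != 0:
        raise ValueError("block must fit and be centrable in the search area")
    margin = (side - block) // 2
    return [
        [1 if margin <= r < side - margin and margin <= c < side - margin else 0
         for c in range(side)]
        for r in range(side)
    ]


# Example: 9 non-null elements on a 5 x 5 search area.
start = central_mask(5, 3)
```

With these arguments the mask has a 3 x 3 patch of ones surrounded by a one-pixel border of zeros.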

3.3. Decision criterion and cost function

Two major points have still to be defined: the selection of a criterion to detect the presence or absence of an anomaly, and consequently the definition of a reliable cost function to be used during the optimization procedure. The choice of an effective decision criterion requires the specification of the main properties that characterise a good decision; by studying the outputs of the filter in different situations, three requirements have been identified:

i) a pixel must be penalized proportionally to its value;

ii) if there is a number of contiguous pixels with non-null value, they must also be penalized proportionally to their number;

iii) as this operation has to be performed several times during the optimization, the number and the complexity of the operations involved must be very low.

The proposed criterion, called Anomaly Presence Degree (APD), is defined as follows:

$$\mathrm{APD}(A) = \ldots \qquad (i \in A)$$

where $A$ is a square search area with side $L$ and $f_{dg}(i)$ is the output of the filter for the $i$-th pixel. During the analysis phase the image is subdivided into blocks of dimension L x L, and for each block the APD parameter is evaluated and compared with a threshold $th_{APD}$: if the threshold is exceeded, the block is classified as anomalous, otherwise it is considered regular. Usually the dimension $L$ can be fixed in the range 30 to 60 pixels; in the tests carried out the value $L = 40$ was adopted.

From what was previously said, it is clear that the goal of the adaptation algorithm, when applied to a normal image used for training, is to minimize the maximum APD value present in the image; the cost function $I$ is therefore defined as:

$$I = \max_{A \in \Omega} \mathrm{APD}(A) \qquad (18)$$

where $\Omega$ is the set of square blocks into which the image is subdivided. Once the optimization is completed, the minimum value of $I$ can be used to define the value of the threshold $th_{APD}$:

where $\alpha$ is a real positive constant, varying in the range (0, 1], to be fixed depending on the characteristics of the image to be analysed. This allows one to define the sensitivity to small variations (the lower the value of $\alpha$, the higher the sensitivity).
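The block-wise decision and the cost function $I$ can be sketched as follows. This is an illustrative sketch only: the scoring function `sum_apd` is a placeholder assumption (a plain sum of the filter outputs in a block), not the APD definition, and the threshold is simply passed in by the caller rather than derived from $\alpha$. All function names are hypothetical.

```python
def block_scores(response, side, score):
    """Score every non-overlapping side x side block of a 2D filter response."""
    rows, cols = len(response), len(response[0])
    scores = {}
    for r in range(0, rows - side + 1, side):
        for c in range(0, cols - side + 1, side):
            block = [row[c:c + side] for row in response[r:r + side]]
            scores[(r, c)] = score(block)
    return scores


def sum_apd(block):
    """Placeholder block score (assumption): sum of the filter outputs."""
    return sum(sum(row) for row in block)


def cost_function(response, side, score=sum_apd):
    """Cost I of Eq. (18): the maximum block score over the training image."""
    return max(block_scores(response, side, score).values())


def anomalous_blocks(response, side, threshold, score=sum_apd):
    """Top-left corners of the blocks whose score exceeds the threshold."""
    scores = block_scores(response, side, score)
    return [corner for corner, value in sorted(scores.items()) if value > threshold]


# Tiny 4 x 4 response split into 2 x 2 blocks.
example = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 2, 0, 0],
]
```

During training, the mask optimization would call `cost_function` on the response of a defect-free image for each candidate mask; during analysis, `anomalous_blocks` flags the blocks to be reported.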

The following example shows the application of the adaptive Close-Open filtering (AC/O) and compares this approach with non-adaptive filtering. In Fig. 5, the Brodatz texture of Fig. 2 is filtered using two different masks, each formed by 9 non-null elements on a 5 x 5 square search area: a fixed central mask $m_c$ (left) and an adaptive mask $m_a$ resulting from the optimization algorithm (right).

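$$
m_c = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
m_a = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$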

Fig. 5 - Outputs of the non-adaptive and adaptive filtering applied to Fig. 2.
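The filtering step itself can be illustrated with a small pure-Python sketch of a flat grey-scale close-open filter driven by an arbitrary binary mask. Several points are assumptions rather than details taken from the text: close-open is taken to mean a closing followed by an opening, pixels falling outside the image are ignored, and the `response` function (absolute difference between the image and its filtered version) is only one plausible way of turning the filter output into a defect response. All names are hypothetical.

```python
def _offsets(mask):
    """Offsets (dr, dc) of the non-null mask elements relative to the mask centre."""
    centre_r, centre_c = len(mask) // 2, len(mask[0]) // 2
    return [(r - centre_r, c - centre_c)
            for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def _rank_filter(image, offsets, pick):
    """Apply pick (min or max) over the in-image pixels selected by offsets."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            values = [image[r + dr][c + dc] for dr, dc in offsets
                      if 0 <= r + dr < rows and 0 <= c + dc < cols]
            # Assumption: keep the pixel unchanged if the mask selects nothing.
            out_row.append(pick(values) if values else image[r][c])
        out.append(out_row)
    return out


def erode(image, mask):
    return _rank_filter(image, _offsets(mask), min)


def dilate(image, mask):
    # Dilation uses the reflected structuring element.
    reflected = [(-dr, -dc) for dr, dc in _offsets(mask)]
    return _rank_filter(image, reflected, max)


def close_open(image, mask):
    closed = erode(dilate(image, mask), mask)   # closing
    return dilate(erode(closed, mask), mask)    # opening of the closed image


def response(image, mask):
    """Assumed defect response: |image - close_open(image)| per pixel."""
    filtered = close_open(image, mask)
    return [[abs(a - b) for a, b in zip(row, f_row)]
            for row, f_row in zip(image, filtered)]


# Fixed central mask: 9 non-null elements on a 5 x 5 search area.
M_C = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Flat background with a single bright pixel standing in for a thin defect.
image = [[10] * 7 for _ in range(7)]
image[3][3] = 50
defect_map = response(image, M_C)
```

In this toy case the closing leaves the bright pixel untouched and the opening removes it, so the response is non-zero only at the defect. An adapted mask would be passed in exactly the same way as `M_C`.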

The effect of the adaptation is evident: the mask $m_a$ is less sensitive to the variations of the texture, thus producing a lower response (the output values of Fig. 5 have been magnified for the sake of visibility). In Fig. 6 the same texture was artificially altered by introducing a thin structure, while Figs. 7 a) and 7 b) show the response of the C/O filter applied using the two masks $m_c$ and $m_a$, respectively. In Figs. 8 a) and 8 b) the filtered images are binarized by a simple threshold to emphasize the higher performance of the adaptive method. In Fig. 8 c) the sparse points caused by noise are eliminated by discarding the blocks under the threshold $th_{APD}$.

Fig. 6 - Texture of Fig. 2 artificially altered with a thin defect.

Fig. 7 - Outputs of the non-adaptive and adaptive filtering applied to Fig. 6.

Fig. 8 - a), b) Results of thresholding the respective filtered images of Fig. 6; c) only the blocks with APD > $th_{APD}$.

4. COMPARATIVE EXPERIMENTAL RESULTS AND DISCUSSION

The two approaches to crack detection in textured surfaces described in the paper have been tested on a representative set of images of natural textures. The first four images, Figs. 9 a) - d), are from the Brodatz album and have simulated cracks superimposed.

Some simulated cracks are continuous lines while others are partly continuous and partly semi-randomly dotted spots placed in close proximity. Figs. 10 a) - b) are images of ceramic tiles with natural cracks. Figs. 11 a) - b) are acquired from natural granite stones and also present natural cracks.

The results of the application of the pseudo-Wigner distribution for the detection of cracks are shown at the bottom-left of each figure. The kernel window parameter $N$ was fixed to 3, which implied windows of size 7 x 7. The residual map was computed by comparing the pseudo-Wigner distribution texture features of the test image against those obtained on a defect-free training image. The post-processing line filter which enhances the residual map was tuned to give the best performance according to the physical nature of the crack. Thus, for the simulated cracks its width was narrower than for the real cracks.

It is apparent from the results that in all cases the crack defects have been reliably detected. The processing time was of the order of 200 - 250 s for 256 x 256 images on a SUN-SPARC 2 workstation.

Fig. 9 - Four examples of synthetic defect identification on Brodatz texture images. Upper figures are the originals, while lower-left and lower-right are the segmented flaws obtained with the pseudo-Wigner and ACO filters, respectively.

Fig. 10 - Two examples of real defect identification on ceramic tile images. Upper figures are the originals, while lower-left and lower-right are the segmented flaws obtained with the pseudo-Wigner and ACO filters, respectively.

Fig. 11 - Two examples of real defect identification on ornamental stone (granite) images. Upper figures are the originals, while lower-left and lower-right are the segmented flaws obtained with the pseudo-Wigner and ACO filters, respectively.

The thresholded outputs of the rank order filters for crack detection are shown at the bottom-right of each figure. The cardinality of the structuring elements was set to 9 for the images of Figs. 9 and 10, with a search area of 5 x 5, while for the images of Fig. 11, because of the strong noise and the wider crack size, the cardinality was 25 with a search area of 7 x 7. However, in the latter case, a multiresolution approach can be applied in order to reduce the computational complexity. The APD criterion was computed for blocks of size 15 x 15. The figures show the output of those blocks with APD > $th_{APD}$, highlighting the crack structures. No post-processing line filter has been applied, so the residual noise could still be reduced.

As a result, there are some false detections, which could however be removed through post-filtering, but generally the crack structures are identified. The processing time was about 1.1 s for cardinality 25 and 0.5 s for cardinality 9, for 256 x 256 images on an HP9000/750 workstation.

A number of observations can be made when comparing the performance of the two approaches. The pseudo-Wigner approach appears to be more sensitive to crack defects while at the same time producing considerably less noisy output. Some false positive pixels (noise) in both methods could be easily rejected based on the knowledge of the physical process giving rise to the cracks. Thus in granite, very short (a few pixels long) cracks are highly unlikely to occur and can easily be filtered out. However, on ceramic tiles some crack-like defects can be very short, and in such situations any noise filtering could result in the elimination of true defects. On textiles, again, thin defects are likely to be elongated (broken thread) and noise suppression may be appropriate.

For a number of textures the Rank Order based approach gives acceptable results. From the point of view of real-time processing the Rank Order based approach is very efficient and could be very easily realised with low-cost hardware. The pseudo-Wigner distribution procedure, on the other hand, is computationally demanding and its hardware implementation would call for a special-purpose multiprocessor system in a much higher cost bracket.

5. CONCLUSIONS

Visual inspection will have an increasingly important role as part of automatic systems designed to respond to remotely-placed user queries. One particular problem would be that of inspection of patterned surfaces and in this paper we have dealt with the problem of hairline crack detection on randomly textured images.

Two main approaches were detailed, namely the pseudo-Wigner distribution, which provides a conjoint spatial/spatial-frequency representation of the texture pattern, and the spatial domain based Rank Order approach, which acts on first-order image statistics and morphological aspects.

Initially we described the pseudo-Wigner distribution. It was found that the local crack information was best encapsulated in the general shape of the spectrum. Also, we discarded the local averaging window of the classic pseudo-Wigner distribution as it was found to smooth and blur the crack signals we wished to detect. Next, we described the crack detection algorithm consisting of an initial training stage and a testing stage. During the training stage the statistical distribution of the Wigner spectra of the underlying texture was computed using the pseudo-Wigner formulae. In the testing stage the residual map of the Mahalanobis distance of the local spectrum from that distribution was calculated, followed by post-processing with the optimal line filter. Furthermore, the optimal line filter was described in some detail.

Afterward, we introduced an algorithm that makes use of adapted morphological filters. We showed how the normal texture structures can be brought to bear on the filter design in order to efficiently detect the abnormalities. In particular, we studied the shape of the cost function to be minimized for learning the normal texture and we developed and described an efficient algorithm for filter optimization.

Finally, we provided the results of the application of both approaches and concluded that the user has the choice of technique, which depends on the type of texture to be analysed. In many applications the Rank Order based approach is likely to provide an acceptable solution which will be easy to implement in hardware. For more difficult textures, the pseudo-Wigner distribution provides a very powerful solution, but opting for such a tool will be associated with processing costs two orders of magnitude higher.

Acknowledgment

This work was partially supported by the CEC within the framework of the project BRE-2-OOXX -> BE-5638.

Manuscript received on 15 September, 1994


