Variable Pyramid Structures for Image Segmentation



COMPUTER VISION, GRAPHICS, AND IMAGE PROCESSING 49, 346-356 (1990)

Variable Pyramid Structures for Image Segmentation

S. BARONTI, A. CASINI, AND F. LOTTI

Istituto di Ricerca sulle Onde Elettromagnetiche, Consiglio Nazionale delle Ricerche, IROE-CNR, via Panciatichi 64, 50127 Florence, Italy

AND

L. FAVARO AND V. ROBERTO

Università degli Studi di Udine, Dipartimento di Matematica e Informatica, via Zanon 6, 33100 Udine, Italy

Received December 16, 1987; revised April 12, 1989

Linked pyramid structures have proved to be a useful tool in digital image processing for many applications because of their ability to address problems at different levels of detail. Some variations on existing pyramid algorithms, suggested by practical use, have been investigated for the segmentation of compact objects in noisy IR images. In particular, we report on the efficacy of increasing the span in the very last iterations in order to correct the link deficiency of the boundary nodes. We also report on a method which separates segment roots at any level in the pyramid and merges the segments under the constraint of the maximum number of regions to be distinguished. The method is applied to IR image segmentation and comparative results are given. © 1990 Academic Press, Inc.

1. INTRODUCTION

A grey level pyramid structure is a set of images of the same scene, at different resolutions [1-3]. The original image, a $2^M \times 2^M$ pixel matrix, is at the base level (bottom, l = 0); any other matrix at the generic level l is generated with one quarter of the elements (nodes) of the matrix at level l - 1, up to the one-node vertex (l = M) of the pyramid.

The grey level of each node at level l > 0 is computed by means of a local operator, usually the mean or the median, applied to the nodes at level l - 1.
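As an illustration of this construction, the following minimal sketch builds a grey-level pyramid with the mean operator. It assumes non-overlapping 2 × 2 blocks and a square image whose side is a power of two; the function name `build_pyramid` is hypothetical and not taken from the cited algorithms, which may use larger, overlapping windows.

```python
import numpy as np

def build_pyramid(image):
    """Return [level 0, ..., level M] for a 2^M x 2^M grey-level image."""
    base = np.asarray(image, dtype=float)
    side = base.shape[0]
    if base.shape != (side, side) or side & (side - 1) != 0:
        raise ValueError("expected a square image with a power-of-two side")
    levels = [base]
    while levels[-1].shape[0] > 1:
        prev = levels[-1]
        n = prev.shape[0] // 2
        # Assumption: each node is the mean of a non-overlapping 2 x 2 block below it.
        levels.append(prev.reshape(n, 2, n, 2).mean(axis=(1, 3)))
    return levels

pyramid = build_pyramid(np.arange(64.0).reshape(8, 8))
print([level.shape for level in pyramid])
```

Replacing `mean` with `np.median` over the same axes would give the median variant mentioned in the text.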

Pyramids of any other quantity can be generated in the same way, although the algorithms usually start from a pyramid of grey levels.

Multiple links can be established among nodes by means of some homogeneity criterion and of a proper choice of the “span” of the pyramid [2]. The span is a quantity which determines the set of possible links for a node and which is usually kept constant throughout the recursive procedure run on the structure.

In Section 2 we point out that the variation of the span before the last iterations can avoid some artifacts arising in image segmentation due to the forced choice of some links, particularly between nodes on the borders.

In Section 3 a segmentation algorithm is discussed which implements the variable span and improves flexibility and efficiency by identifying segment roots anywhere in the grey-level pyramid only on the basis of a user-provided maximum number of regions. The algorithm is applied to noisy IR images; the results are discussed in Section 4 and compared with those of other algorithms.

0734-189X/90 $3.00 Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.


2. LINKED PYRAMIDS WITH VARIABLE SPAN

The span c (c ≥ 1) is a key concept in linked pyramid structures: it is the half-side width of the square block of 2c × 2c nodes at level l - 1 which comprises all the possible “sons” to which an inner node at any level l (l > 0) can be linked; an inner node at level l can be linked with up to c × c “fathers” at level l + 1 [2].

In pyramid algorithms all links from a node to its fathers can be taken into account and can be assigned different weights by means of a suitable similarity function [3, 4]. To lighten the computing burden, only one link to a single preferential father is commonly preserved [4, 1].

The configuration of the links throughout the pyramid is recursively updated until it comes to a steady state, usually after a few iterations [6]. At each iteration the values of the quantities associated with a node are recomputed according to the values of the linked sons. The upward links can be varied accordingly and a new iteration can be started.

In segmentation applications [1, 3, 4] each algorithm has its own way of separating some nodes in the grey-level pyramid, which may become the roots of trees whose leaf nodes on the base level are the pixels of the candidate segments.

FIG. 1. Link propagation at the borders: (a) original; (b) segmented version with span c = 2 in which propagation effects are evident at the corners; (c) segmentation of (a) with c = 3; (d) as in (b), but varying c to 3 at the last iteration. In (c) and (d) border effects are removed.


After the initialization, a span c = 2 is commonly used throughout the procedure. In this paper we propose a variable span implementation: we increase the span to c = 3 in the final iterations.

This variant is suggested by the scarcity of link choices imposed by span c = 2 on the boundary and corner nodes. In fact, at any level, the boundary nodes have only two candidate fathers and the corner nodes have only one candidate father to link to.
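The scarcity of choices at the borders can be checked by enumerating the candidate fathers of a node. The sketch below is illustrative only: it assumes that, along each axis, father f covers the sons from 2f - (c - 1) to 2f + c (a 2c × 2c block centred on the father's own 2 × 2 block), and the function name `candidate_fathers` is hypothetical.

```python
def candidate_fathers(x, y, n, c):
    """Fathers at the level above that node (x, y) of an n x n level may link to.

    Assumed geometry: along each axis, father f covers sons 2f-(c-1) .. 2f+c.
    Fathers that would fall outside the (n/2) x (n/2) upper level are dropped.
    """
    def along_axis(p):
        return [f for f in range(n // 2) if 2 * f - (c - 1) <= p <= 2 * f + c]
    return [(fx, fy) for fx in along_axis(x) for fy in along_axis(y)]

n = 8
positions = {"corner": (0, 0), "boundary": (0, 3), "inner": (3, 3)}
for c in (2, 3):
    counts = {name: len(candidate_fathers(x, y, n, c)) for name, (x, y) in positions.items()}
    print("span", c, counts)
```

Under this assumed geometry, a corner node has a single candidate with c = 2, so its link is forced, whereas c = 3 gives it several fathers to choose from.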

This fault is propagated downward in the pyramid, through the linked sons, to wide zones in the bottom level image and can cause striking segmentation errors. This is apparent if we compare the original image of Fig. 1a with its segmented version of Fig. 1b, in which the dark areas at the corners are due to that cause.

The higher number of possible links to candidate fathers given by span c = 3 usually overcomes the problem (see Fig. 1c) at the expense of computing speed. On the other hand, a larger span increases the danger of preserving isolated noisy pixels inside a region, since they can be more easily tied to outer regions of similar grey level.

To deal with the boundary problem and maintain c = 2, some authors embed the original image into a larger background image [4]; others extend the image beyond its boundaries as if it were periodic [3]; others still reflect inward each link toward the missing external father [1]. The latter solution requires a procedure which depends on the position of the nodes; moreover, in the subsequent computation, the maximum number of sons linkable to the boundary fathers is greater than the standard one.

We propose a method in which no correction is required for the boundary effect: after some iterations with span c = 2, a few more iterations are performed with span c = 3. This simple solution allows us to implement a uniform procedure and turns out to be quite effective, as can be seen in Fig. 1d, where the peripheral segmentation faults of Fig. 1b have been eliminated with only a small increase of the computing time.
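The following sketch shows how a span schedule can be plugged into a single-father relinking loop. It is a simplified illustration under stated assumptions, not the implementation used for the figures: the son/father geometry (father f covers sons 2f - (c - 1) to 2f + c on each axis), the initial 2 × 2 block means, the interleaved bottom-up update, and the names `fathers_of`, `relink`, `schedule` and `top_size` are all assumptions introduced here.

```python
import numpy as np

def fathers_of(p, n_up, c):
    # Assumed geometry: along each axis, father f covers sons 2f-(c-1) .. 2f+c.
    return [f for f in range(n_up) if 2 * f - (c - 1) <= p <= 2 * f + c]

def relink(image, schedule=(2, 2, 2, 2, 3, 3), top_size=2):
    """Single-father relinking; schedule[k] is the span c used at iteration k."""
    values = [np.asarray(image, dtype=float)]
    while values[-1].shape[0] > top_size:          # initial pyramid: 2 x 2 block means
        v = values[-1]
        n = v.shape[0] // 2
        values.append(v.reshape(n, 2, n, 2).mean(axis=(1, 3)))
    links = []
    for c in schedule:
        links = []
        son_area = np.ones_like(values[0])         # every base pixel has area 1
        for l in range(len(values) - 1):
            n, n_up = values[l].shape[0], values[l + 1].shape[0]
            link = np.zeros((n, n, 2), dtype=int)
            total = np.zeros((n_up, n_up))
            area = np.zeros((n_up, n_up))
            for x in range(n):
                for y in range(n):
                    cands = [(fx, fy) for fx in fathers_of(x, n_up, c)
                             for fy in fathers_of(y, n_up, c)]
                    # Keep only the link to the father with the closest grey level.
                    best = min(cands, key=lambda f: abs(values[l + 1][f] - values[l][x, y]))
                    link[x, y] = best
                    total[best] += son_area[x, y] * values[l][x, y]
                    area[best] += son_area[x, y]
            # Fathers take the area-weighted mean of their linked sons, if they have any.
            values[l + 1] = np.where(area > 0, total / np.maximum(area, 1), values[l + 1])
            links.append(link)
            son_area = area
    # Follow the links from every pixel up to its root at the top level.
    n0 = values[0].shape[0]
    xs, ys = np.meshgrid(np.arange(n0), np.arange(n0), indexing="ij")
    for link in links:
        xs, ys = link[xs, ys, 0], link[xs, ys, 1]
    return xs * values[-1].shape[0] + ys, values
```

With the default schedule, four iterations run with c = 2 and the last two with c = 3; passing `schedule=(2,) * 6` reproduces the constant-span behaviour for comparison.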

3. ROOT MERGING ALGORITHM

Some segmentation algorithms described in the literature [3, 1] use a grey-level pyramid with a top level L < M chosen in such a way that the number $4^{M-L}$ of top nodes is equal to or slightly larger than the preset number of segments to be identified in the image: at the end of the iterations, the nodes at the top level are the roots of the segment trees. As pointed out in [4], such a strategy is more a clustering of the pixels into $4^{M-L}$ classes than a segmentation into compact regions: distinct regions may be forcibly relinked to the same top root.

Others, instead of choosing only one father for each node, preserve all the possible links and assign a proper weight to each of them: after the stabilization, the configuration of the weights allows the root nodes to be identified, even at levels lower than the top [3, 4]. This approach is very interesting because it avoids the forced links to the top roots and the need to predefine the number of roots, but it is computationally very expensive, since $c^2$ links must be maintained for each inner node.

We propose an algorithm in which each node links to a single father and becomes a root if the grey level difference between the two exceeds a predefined threshold.


It may happen that some region is not connected to a single root, but has roots on the same level as well as on neighboring levels. Therefore it is convenient to manage the merging of roots on the basis of:

- their grey level difference in an 8-neighboring window at the same level;

- the proximity of the zones projected from the roots onto the base level.

If one applies this strategy even to low levels in the pyramid, it may happen that too many small regions are extracted and a noisy image is produced. Thus it is convenient to fix a minimum level below which no root is identified and a minimum area size for the region associated with a root, by using the a priori knowledge of the image content.

Instead of dealing with the grey level and the minimum area thresholds, we prefer to establish the maximum number of regions as the unique free parameter and to derive the thresholds from it.
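The root criterion discussed so far (grey level difference with the father, minimum level, minimum area) can be summarized in a small predicate. This is a hedged sketch: the `Node` record, the function name `is_candidate_root`, and the choice of treating the minimum level as inclusive are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    level: int           # pyramid level of the node
    grey: float          # grey level of the node
    father_grey: float   # grey level of the most similar candidate father
    area: int            # number of base-level pixels linked to the node

def is_candidate_root(node, t_grey, t_area, min_level):
    """True when the node is cut from its father and may start its own segment."""
    return (node.level >= min_level                       # assumption: minimum level is inclusive
            and abs(node.grey - node.father_grey) > t_grey
            and node.area > t_area)

print(is_candidate_root(Node(level=3, grey=100.0, father_grey=118.0, area=60),
                        t_grey=10.0, t_area=25, min_level=2))
```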

The algorithm steps are the following:

(a) Both the grey level threshold $T_g$ and the area threshold $T_a$ are initialized to suitable values chosen on the basis of the a priori knowledge about the image contrast and the minimum size of the objects to be considered distinct.

(b) The pyramid is initialized with span c = 2.

(c) The iterative relinking algorithm [1] is run for a convenient number of iterations (e.g., 7). At each iteration, at the pyramid levels above the minimum, a node is said to be a candidate root if:

- the grey level difference with the most similar father exceeds the grey level threshold $T_g$;

- the number of associated pixels in the bottom image is above the area threshold $T_a$.

A candidate root does not contribute to the properties of a node in the upper levels.

(d) After the last iteration, a grouping procedure is applied to 8-neighboring roots when the differences of their grey levels are below $T_g$. Each of these groups is characterized by the mean grey level and by the total number of pixels in the bottom level linked to the grouped roots.

Two groups are merged if the difference of their mean grey levels is below $T_g$, and if a “proximity” condition is verified. To define the latter condition we operate as follows:

- At the bottom level, the minimum rectangular window containing the projections from all the roots of a group is extracted and its area ($A_i$, i = 1, 2) is computed;

- The area $A_{12}$ of the minimum window containing both windows is computed, and $A_{\max}$ denotes the larger of $A_1$ and $A_2$;

- The ratio $A_{\max}/A_{12}$ is evaluated; if it exceeds a predefined value (e.g., 0.7) the proximity condition is verified and the new group is formed with a grey level which is the average, and an area which is the sum, of those of the two groups.

The process is iterated as long as there are groups to merge. Each remaining group now identifies a region.


(e) If the total number of regions does not exceed the user-provided maximum, the whole procedure is ended. Otherwise, the area and the grey level thresholds are updated (see below) and the procedure continues from step (c), with fewer iterations than before (e.g., 4).
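The merging of step (d) can be sketched as follows. The code is an illustration under assumptions: each group is represented by a dictionary with hypothetical keys `grey`, `area` and `box` (an inclusive bounding window at the bottom level), the merged grey level is taken as the plain average of the two groups, and the proximity ratio is read as the larger of the two window areas divided by the area of the window enclosing both.

```python
def box_area(box):
    x0, y0, x1, y1 = box                      # inclusive pixel coordinates
    return (x1 - x0 + 1) * (y1 - y0 + 1)

def enclosing_box(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def proximity_ok(box1, box2, min_ratio=0.7):
    # Ratio of the larger window to the window enclosing both; close to 1 when they overlap.
    larger = max(box_area(box1), box_area(box2))
    return larger / box_area(enclosing_box(box1, box2)) > min_ratio

def merge_groups(groups, t_grey, min_ratio=0.7):
    """Merge pairs of groups until no pair satisfies both conditions."""
    groups = [dict(g) for g in groups]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                a, b = groups[i], groups[j]
                if (abs(a["grey"] - b["grey"]) < t_grey
                        and proximity_ok(a["box"], b["box"], min_ratio)):
                    groups[i] = {"grey": (a["grey"] + b["grey"]) / 2,   # assumed plain average
                                 "area": a["area"] + b["area"],
                                 "box": enclosing_box(a["box"], b["box"])}
                    del groups[j]
                    merged = True
                    break
            if merged:
                break
    return groups
```

Restarting the pair scan after every merge keeps the sketch short; it is quadratic in the number of groups, which is acceptable for the few roots that survive the thresholds.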

In order to accelerate the convergence, the thresholds are updated by means of heuristic increment functions which exploit the intermediate results.

We found it effective to vary $T_a$ only if the number $N_g$ of the resulting regions is significantly larger than the user-provided maximum (e.g., by more than 70%).

In this case we compute the numbers of groups $N_1$ and $N_2$ whose areas are respectively smaller than $T_a + dT_a$ and $T_a + 2dT_a$. If $N_1$ is large enough (e.g., $N_1 > 0.25N_g$), the new area threshold $T_a'$ is computed as

$$T_a' = T_a + dT_a\,(1 + N_1/N_g); \tag{3.1}$$

otherwise, if $N_2$ is large enough (e.g., $N_2 > 0.3N_g$), then

$$T_a' = T_a + 2\,dT_a\,(1 + N_2/N_g); \tag{3.2}$$

in all the other cases only $T_g$ is updated. To this aim we adopted the rule, suggested by experience,

$$dT_g = 0.5\sqrt{\sigma R}, \tag{3.3}$$

where σ is the standard deviation of the grey levels of the regions and $R$ is the ratio between $N_g$ and the user-provided maximum.

The meaning of (3.3) is that $dT_g$ increases with the grey level variability of the segmented image: when the regions have similar grey levels, a small $dT_g$ will be sufficient to reduce the number of segments at the following iterations. The dependence on $R$ weights the distance from the allowed number of segments: if there are only a few regions more than the user-provided maximum, a small variation of $T_g$ will be sufficient to move under the limit. The square root law and the 0.5 factor were found to give good results with IR images.
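The update rules can be collected in one function, sketched below. Several points are assumptions rather than statements of the text: the reading of (3.3) as 0.5 times the square root of the product of the standard deviation and the ratio, the update of the grey level threshold by simple addition of the increment, the interpretation of "significantly larger" as more than 70% above the maximum, and the names `update_thresholds` and `d_area` (the area step, whose value is not specified in the text).

```python
import math

def update_thresholds(t_grey, t_area, d_area, group_areas, sigma, n_max):
    """Return the updated (grey threshold, area threshold) after one pass."""
    n_g = len(group_areas)
    if n_g > 1.7 * n_max:                      # assumed reading of "> 70%"
        n1 = sum(a < t_area + d_area for a in group_areas)
        n2 = sum(a < t_area + 2 * d_area for a in group_areas)
        if n1 > 0.25 * n_g:
            return t_grey, t_area + d_area * (1 + n1 / n_g)        # rule (3.1)
        if n2 > 0.3 * n_g:
            return t_grey, t_area + 2 * d_area * (1 + n2 / n_g)    # rule (3.2)
    # Otherwise only the grey level threshold moves, by the increment of rule (3.3).
    d_grey = 0.5 * math.sqrt(sigma * n_g / n_max)
    return t_grey + d_grey, t_area

print(update_thresholds(10.0, 10.0, 10.0, [5, 5, 5] + [40] * 7, sigma=20.0, n_max=4))
print(update_thresholds(10.0, 10.0, 10.0, [40] * 5, sigma=20.0, n_max=4))
```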

4. RESULTS

Some results obtained by applying the described algorithm are presented in Figs. 2 and 3.

Fig. 2a shows an IR image of a multilayer composite material under inspection by means of pulsed thermography [8]; a light square defect (a flaw) is present together with a black reference hole at the lower right corner; the light patterns on the bottom left half are due to the alphanumeric frame labelling. In Fig. 2b the result of our algorithm is reported; a maximum number of five regions was chosen.

Figures 2c, d, for comparison purposes, show the outputs of the algorithms described respectively in [1, 4]. In the former, a span c = 2 was chosen, with no correction for border effects and with a maximum number of four regions to be extracted; in the latter, the number of regions is determined by the algorithm itself and is rather high (11 regions).


FIG. 2. Test on a composite material by means of pulsed thermography: (a) original: a square defect inside the material (a flaw) appears in the central part of the image; (b) segmentation with our algorithm; (c), (d) results with other pyramid algorithms ([4, 3], respectively).

Figure 3a refers to a different IR test image: the fault consists of two flaws which appear in the central part of the image. Some parameters were modified with respect to Fig. 2: a maximum number of four regions was chosen for our algorithm (Fig. 3b); a span c = 3 was imposed on the others (Figs. 3c, 3d, respectively).

A subjective evaluation of the comparative results shows the good behavior of our algorithm.

In order to give a performance evaluation of these algorithms, some synthetic test images have been processed after the addition of white gaussian noise of fixed standard deviation. An example is given in Fig. 4, where Fig. 4a is the original image (128 × 128 pixels) composed of a circular object of 5013 pixels with grey level 100 and a background of 11,371 pixels with grey level 120; in Fig. 4b, white gaussian noise with standard deviation σ = 3 has been added, while Figs. 4c, d, e show, respectively, the results of the algorithms previously considered for Fig. 2.
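A test image of this kind can be generated along the following lines. This is only a sketch: the radius and the centring of the disc are assumptions chosen so that the object covers roughly 5013 pixels, and the exact pixel count of the original test image is not guaranteed to be reproduced; the function names are hypothetical.

```python
import numpy as np

def make_test_image(size=128, radius=39.95, obj_grey=100.0, bg_grey=120.0):
    """Dark disc on a lighter background; returns the image and the object mask."""
    yy, xx = np.mgrid[:size, :size]
    centre = (size - 1) / 2                    # assumption: disc centred in the image
    mask = (xx - centre) ** 2 + (yy - centre) ** 2 <= radius ** 2
    return np.where(mask, obj_grey, bg_grey), mask

def add_noise(image, sigma, seed=0):
    """Add white gaussian noise of standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return image + rng.normal(0.0, sigma, image.shape)

clean, truth = make_test_image()
noisy = add_noise(clean, sigma=3.0)
print(clean.shape, int(truth.sum()))
```

Changing `seed` gives the independent noise samples needed when the test is replicated.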

For comparison purposes, Fig. 4f shows the result obtained from Fig. 4b by simple thresholding.


FIG. 3. Test on a composite material by means of pulsed thermography: (a) original: the fault consists of two flaws in the central part of the image; (b), (c), (d) as in Fig. 2.

It is not straightforward to compare the results of the various algorithms quantitatively, since the resulting regions usually differ in shape as well as in number; however, for the present test we require two regions for a result. This is easily obtained in our algorithm by setting the maximum number of regions to 2; for the others, we attribute each resulting segment to the object class or to the background by applying a grey-level threshold. This threshold minimizes the total number of wrongly segmented pixels (WPN).
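The reduction of a multi-segment result to two classes can be sketched as an exhaustive search over thresholds placed at the segment means. The code assumes that the object is darker than the background, as in the synthetic test image, and that a ground-truth mask is available; the function name `best_class_threshold` is hypothetical.

```python
import numpy as np

def best_class_threshold(labels, image, truth_object):
    """Return (threshold, WPN, assigned-object mask) minimising the wrong pixels."""
    seg_ids = np.unique(labels)
    means = np.array([image[labels == s].mean() for s in seg_ids])
    index_map = np.searchsorted(seg_ids, labels)     # segment index of every pixel
    best = None
    # Candidate thresholds: below every segment, then at each segment mean.
    for t in np.concatenate(([-np.inf], np.sort(means))):
        # Assumption: segments with mean grey level <= t form the (darker) object class.
        assigned = (means <= t)[index_map]
        wpn = int(np.sum(assigned != truth_object))
        if best is None or wpn < best[1]:
            best = (float(t), wpn, assigned)
    return best
```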

For example, the noisy result of Fig. 4d derives from the initial choice, in the algorithm, of four segments which correspond to the four top nodes of the pyramid [1]. The result after thresholding appears in Fig. 5.

In order to evaluate the performances of the considered algorithms, we plotted the estimated values of the two conditional error probabilities:

P(assigned object | true background) = probability that a pixel in the background region is classified in the object segment;

P(assigned background | true object) = probability that a pixel in the object region is classified in the background segment.
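For a single two-class result, the two conditional error frequencies can be estimated as in the sketch below; the function name `conditional_errors` and the boolean-mask representation are assumptions made for illustration, and both classes are assumed to be non-empty in the ground truth.

```python
import numpy as np

def conditional_errors(assigned_object, truth_object):
    """Return (P(assigned object | true background), P(assigned background | true object))."""
    assigned_object = np.asarray(assigned_object, dtype=bool)
    truth_object = np.asarray(truth_object, dtype=bool)
    # Fraction of true background pixels that were put in the object segment.
    p_obj_given_bg = float(np.mean(assigned_object[~truth_object]))
    # Fraction of true object pixels that were put in the background segment.
    p_bg_given_obj = float(np.mean(~assigned_object[truth_object]))
    return p_obj_given_bg, p_bg_given_obj

truth = np.array([[1, 1], [0, 0]], dtype=bool)
assigned = np.array([[1, 0], [0, 0]], dtype=bool)
print(conditional_errors(assigned, truth))
```

Averaging these frequencies over several noisy replicas of the same image gives the estimates of the probabilities.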


FIG. 4. Example of algorithm tests on a synthetic image: (a) original; (b) white gaussian noise was added to the original; (c) result of the proposed algorithm; (d), (e) results of the algorithms [4, 3], respectively; (f) segmentation on (b) by simple thresholding.

Some results are reported in Figs. 6-8, respectively, for σ = 3, 5, and 8. The grey-level link algorithm described in [1] is indicated as GLL(1); GLL(2) is the same algorithm with a span increased to c = 3 at the last iteration; WGLL indicates the weighted grey-level link algorithm in [4]. RM (root merge) is our algorithm and THR is the segmentation by simple thresholding.

In the implementation of WGLL we imposed a lower limit of 0.01 on the variability σ(P) of the 16 sons of an element P in the pyramid, in order to avoid the divergence of the link strength function (see [4, p. 224]).

The probabilities were estimated as average conditional error frequencies computed over several replicas of the test image in Fig. 4a with different independent samples of the noise. The number of replicas was 24 for σ = 3 and 12 for σ = 5, 8, to obtain standard deviations of the estimated probabilities not greater than their mean values.

FIG. 5. Region clustering. The image of Fig. 4d after imposing two regions as output.

In particular, WGLL has a standard deviation rapidly decreasing with the number of experiments; in fact, it gives the most stable results with respect to the noise sample.

GLL(1), GLL(2), and RM have a stability lower than that of THR, which does not imply that their segmentation capability is lower. In fact, the error probability and the statistical stability are not sufficient criteria to appreciate the segmentation capability of the algorithms. This can be seen in Fig. 9, where WGLL and RM were applied to a test image obtained by adding a linear grey-level wedge to the image in Fig. 4b. The wedge range is 0 to 50% of the previous contrast (σ = 3 for the noise).

FIG. 6. Plot of the conditional error probabilities of the considered algorithms for a noise standard deviation σ = 3 (the object/background contrast is 20). Legend: RM, GLL(1), GLL(2), WGLL, THR; labelled axis: P(assigned background | true object).

FIG. 7. Plot of the conditional error probabilities for a noise standard deviation σ = 5. Labelled axis: P(assigned background | true object).

FIG. 8. Plot of the conditional error probabilities for a noise standard deviation σ = 8. Labelled axis: P(assigned background | true object).

FIG. 9. Qualitative comparison between WGLL and RM. A linear wedge with a range 0 to 50% of the object/background contrast was added. The noise standard deviation is σ = 3. The display lookup table was stressed to show the segmented regions.


It can be seen that RM, according to the imposed limit on the maximum number of regions (2), gives a segmentation of more direct use. Obviously, results of the same quality can be obtained from the output of WGLL if a threshold is placed so as to minimize the WPN for a two-segment classification scheme. This is an important point to be taken into account because, in common applications, variations of the background level are usual (see, for example, Figs. 2 and 3).

CONCLUSION

In this work some applications of pyramid structures to the segmentation of noisy IR images have been discussed.

In such a domain, the importance of varying the span has been stressed and some algorithms have been implemented taking into account the span as a variable parameter of the structure.

A solution to the problem of border pixels is proposed and a new recursive algorithm, which accepts, as input, the maximum number of regions to be extracted, is presented.

ACKNOWLEDGMENT

We thank Dr. A. Hall of Barr & Stroud for allowing us to use the IR images presented above.

REFERENCES

1. P. J. Burt, T. H. Hong, and A. Rosenfeld, Segmentation and estimation of image region properties through cooperative hierarchical computation, IEEE Trans. Systems Man Cybern. SMC-11, 1981, 802-809.

2. W. I. Grosky and R. Jain, A pyramid-based approach to segmentation applied to region matching, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 1986, 639-650.

3. T. H. Hong, K. A. Narayanan, S. Peleg, A. Rosenfeld, and T. Silberberg, Image smoothing and segmentation by multiresolution pixel linking: Further experiments and extensions, IEEE Trans. Systems Man Cybern. SMC-12, 1982, 611-622.

4. T. H. Hong and A. Rosenfeld, Compact region extraction using weighted pixel linking in a pyramid, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 1984, 222-229.

5. T. H. Hong and M. Shneier, Extracting compact objects using linked pyramids, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6, 1984, 229-231.

6. S. Kasif and A. Rosenfeld, Pyramid linking is a special case of ISODATA, IEEE Trans. Systems Man Cybern. SMC-13, 1983, 84-85.

7. M. Shneier, Using pyramids to define local thresholds for blob detection, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-5, 1983, 345-349.

8. T. S. Durrani, K. Boyle, A. Rauf, F. Lotti, A. Casini, and S. Baronti, Computer Aided Thermal Techniques for Real Time Inspection on Composite Materials, Reports 1-5, September 1983-October 1987, University of Strathclyde, Department of Electronic Engineering, Glasgow, UK.