
International Journal of Advanced Research in Computer Science and Software Engineering
Volume 3, Issue 9, September 2013, ISSN: 2277 128X
Research Paper, available online at: www.ijarcsse.com

Implementing Level Set Methods for Segmenting Malayalam Characters in Document Images

Bijeesh T V¹, Nimmi I P², Rekha P K³
Department of Computer Science & Engineering
Sree Narayana Guru College of Engineering, Payyannur, Kerala, India.

Abstract— Algorithms for binarizing images and segmenting the areas of interest have always attracted great interest from researchers all over the world. Such algorithms are of great importance in converting historical document images into an editable document format. Document images may have non-uniform backgrounds and stains, or are often subjected to degradation, which makes segmentation a difficult task. Segmentation is the first step in developing an Optical Character Recognition (OCR) system. In this paper we implement level set theory based segmentation on document images in an Indian regional language, Malayalam. Two segmentation methods based on level set theory, active contours with selective local or global segmentation [1] and a fast global minimization algorithm [2], are discussed in this paper. Segmentation of characters from a document image is accomplished by taking advantage of a highly flexible active contour scheme. A curvature-based force used to control the evolution of the contour leads to more natural-looking results. The proposed implementation benefits from the level set framework, which has been highly successful in other contexts as well, such as medical image segmentation and road network extraction from satellite images.

Keywords— Binarization, Segmentation, OCR, Level set, Active contour

I. INTRODUCTION

Document binarization is the stage of image processing applications that converts a grayscale image into a binary image document. In the context of OCR, it can be viewed as the process of extracting text pixels from the background. Removing unwanted data at this initial stage makes the subsequent steps easier, but due care must be taken, as errors creeping in at this stage affect the performance of the following steps. The difficulty of document image binarization increases with the complexity of the image. Non-uniformity of pixel intensities, due to ink smearing and variations in pen pressure, is a common problem in handwritten documents, while certain font shapes with very thin segments may be problematic in printed document images. Physical characteristics of the original media, along with digitization artifacts, further increase the complexity of the binarization process. Paper appearance and texture, stains, bleed-through, and various other degradations caused by exposure to humidity or light are commonly found problems. Because of these numerous difficulties, a strong binarization method rather than a simpler one is needed, which leads us to global binarization methods. However, it must be noted that global binarization has failed in some challenging cases, which is when we go in search of more complex schemes.

Global binarization methods are categorized as intensity based and edge based. The color/intensity based class consists of methods that consider variations in color or intensity, while the edge based class includes methods in which abrupt changes across edges are the main source of classification information. The intensity-based methods can be split into two subgroups: thresholding-based methods and clustering methods. Niblack's [3] method can be considered the first local threshold method, but its output tends to be noisy in background-only regions. Sauvola and Pietikäinen [4] solved this problem by introducing a generalization of Niblack's method that tries to identify the background regions. One of the main drawbacks of the local methods is their high computational cost; failure to segment the inner parts of thick characters adds to their drawbacks. Grid-based modelling reduces the computational time considerably, while the integral image technique has also been used to reduce the computational cost. Grid-based modelling helps transform global thresholding methods into their local versions.
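As a concrete illustration of the local-thresholding idea surveyed above, the following sketch computes a Sauvola-style threshold with NumPy/SciPy. It is only illustrative: the window size and the parameters k and R are common default choices, not values taken from [4] or from this paper.

```python
# Illustrative sketch of Sauvola-style local thresholding (not code from this
# paper). T = m * (1 + k * (s / R - 1)), with m and s the local mean and
# standard deviation; window, k and R are typical default values.
import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_binarize(gray, window=25, k=0.5, R=128.0):
    gray = gray.astype(np.float64)
    mean = uniform_filter(gray, window)                  # local mean m(x, y)
    mean_sq = uniform_filter(gray * gray, window)        # local mean of squares
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))  # local std s(x, y)
    threshold = mean * (1.0 + k * (std / R - 1.0))
    return (gray > threshold).astype(np.uint8)           # 1 = background, 0 = ink
```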

Edge-based methods usually use a measure of the changes across an edge, such as the gradient or other partial derivatives, or directly apply an edge detector such as the Canny edge detector. Recent works report that image contrast has been used to identify text pixels and to estimate local threshold values in order to binarize highly degraded historical document images. Edge-based methods are not always the best choice, however, because edges may not be sharp due to fading of ink or degradation, which produces a high level of variation in the gradient across edges; this is also the main reason the gradient itself is considered important.


Prior geometrical knowledge or estimates regarding the document reduce the complexity. Knowing the stroke width makes it easier to reject small spurious objects or to fill unwanted gaps.

II. PROPOSED METHOD

In the proposed method for image segmentation and binarization, we exploit level set theory and partial differential equations. The two types of methods used are edge-based methods and region-based methods.

1. Level set method

We begin by choosing a level set function, which generally is an implicit representation of a curve. Initially we fix the contour C0 arbitrarily and evolve it according to some force. Usually an image property is used as the force that drives the level set evolution.

Figure 1: A level set function

Let the level set function be $\phi(x, y, t)$, where $t$ is an artificial time variable we introduce. The evolution of the level set function is governed by the equation

\[ \frac{\partial \phi}{\partial t} = F \, |\nabla \phi| \qquad (1) \]

$\phi(x, y, 0)$ corresponds to the initial contour C0. The driving force $F$ acts in a direction normal to the contour, and $\nabla \phi$ is the gradient of the level set function $\phi(x, y, t)$.
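A minimal numerical sketch of the evolution in equation (1) is given below, assuming F is a precomputed force image of the same size as phi; the time step and iteration count are illustrative values, not parameters from this paper.

```python
# Minimal sketch of equation (1): d(phi)/dt = F * |grad(phi)|, integrated with
# an explicit Euler step. F, dt and the iteration count are assumptions made
# for illustration only.
import numpy as np

def evolve_level_set(phi, F, dt=0.1, iterations=100):
    for _ in range(iterations):
        gy, gx = np.gradient(phi)              # spatial gradient of phi
        grad_mag = np.sqrt(gx**2 + gy**2)      # |grad(phi)|
        phi = phi + dt * F * grad_mag          # explicit time step
    return phi
```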

Based on active contour techniques, we have two models for boundary detection: edge-based and region-based models.

1.1 Edge based model

Here segmentation is done based on the discontinuities in the image. An edge stopping function may be represented by

\[ g(x, y) = \frac{1}{1 + |\nabla (G_{\sigma} * f(x, y))|^{2}} \qquad (2) \]

Here $G_{\sigma}$ is a Gaussian filter used to diffuse or smear the edges. The function $g(x, y)$ slows down the evolution of the level set curve at the edges; its value is close to zero near the edges.
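The edge stopping function of equation (2) can be computed directly, for example as in the sketch below; the smoothing scale sigma is an illustrative choice, not a value specified in the paper.

```python
# Sketch of equation (2): g(x, y) = 1 / (1 + |grad(G_sigma * f)|^2).
# sigma is an assumed smoothing scale.
import numpy as np
from scipy.ndimage import gaussian_gradient_magnitude

def edge_stopping_function(f, sigma=1.5):
    grad_mag = gaussian_gradient_magnitude(f.astype(np.float64), sigma)
    return 1.0 / (1.0 + grad_mag ** 2)   # close to 0 on edges, close to 1 in flat areas
```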

1.2 Region based model

This model does not rely on discontinuities in the image; rather, it partitions the image into "objects" and "background" based on pixel intensity similarity. We try to minimize the following energy:

\[ \min_{C, c_1, c_2} E(C, c_1, c_2) = \int_{inside(C)} |f - c_1|^{2} \, dx \, dy + \int_{outside(C)} |f - c_2|^{2} \, dx \, dy + \mu \, \mathrm{Length}(C) \qquad (3) \]

Here $c_1$ and $c_2$ are the average pixel values inside and outside the contour respectively. The value of this energy is minimum when the contour C lies on the edges of the objects to be segmented.
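For reference, a sketch of how the energy in equation (3) could be evaluated for a candidate contour is shown below. The contour is represented by a binary mask, the length term is crudely approximated by the number of boundary pixels, and the weight mu is an illustrative value.

```python
# Sketch of evaluating the region-based energy of equation (3) for a contour
# given as a binary mask (True = inside). The length estimate and mu are
# rough, illustrative choices.
import numpy as np

def region_energy(f, inside_mask, mu=0.5):
    f = f.astype(np.float64)
    inside = inside_mask.astype(bool)
    c1 = f[inside].mean()                    # average intensity inside C
    c2 = f[~inside].mean()                   # average intensity outside C
    fit = np.sum((f[inside] - c1) ** 2) + np.sum((f[~inside] - c2) ** 2)
    # crude length estimate: pixels whose vertical or horizontal neighbour differs
    boundary = (inside ^ np.roll(inside, 1, axis=0)) | (inside ^ np.roll(inside, 1, axis=1))
    return fit + mu * boundary.sum()
```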

2. Algorithms adopted for segmentation

This work focuses mainly on two segmentation algorithms in the context of developing an OCR system for the Malayalam language. The algorithms are discussed in detail in the following sections.

2.1. Active contours with selective local or global segmentation


This algorithm is based on a region-based active contour model. Its main advantage is that the contours stop even at blurred or weak edges. Both the exterior and interior boundaries of objects can be obtained by this algorithm, which makes it a potential candidate for the OCR context.

2.1.1 Selection of level set function

As mentioned earlier, we begin by choosing a level set function and placing its initial contour arbitrarily. A widely used level set function is a signed distance function, but in our method we use the level set function given in equation (4):

\[ \phi(x, t = 0) = \begin{cases} -\rho, & x \in \Omega_{0} - \partial\Omega_{0} \\ 0, & x \in \partial\Omega_{0} \\ \rho, & x \in \Omega - \Omega_{0} \end{cases} \qquad (4) \]

where $\rho > 0$ is a constant, $\Omega_{0}$ is a subset of the image domain $\Omega$, and $\partial\Omega_{0}$ is the boundary of $\Omega_{0}$.
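A small sketch of this binary initialisation is given below for a rectangular region Omega_0 chosen by the user; rho = 1 is an illustrative constant.

```python
# Sketch of the initialisation in equation (4): phi = -rho inside Omega_0,
# +rho outside, and 0 on the boundary. Omega_0 is assumed to be a rectangle
# supplied by the caller; rho is an illustrative constant.
import numpy as np

def init_level_set(shape, top, left, bottom, right, rho=1.0):
    phi = np.full(shape, rho, dtype=np.float64)   # outside Omega_0
    phi[top:bottom, left:right] = -rho            # inside Omega_0
    phi[top, left:right] = 0.0                    # boundary of Omega_0
    phi[bottom - 1, left:right] = 0.0
    phi[top:bottom, left] = 0.0
    phi[top:bottom, right - 1] = 0.0
    return phi
```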

2.1.2 Choosing the Force function, F

The evolution of the level set curve is controlled by equation (1). In our method, we are using a signed pressure force (SPF) function as the force function. This is a region-based function which takes into account the pressure force inside and outside the object of interest. The contour shrinks when outside the object and expands when inside the object.

We define the SPF function for an image $I(x)$ as

\[ spf(I(x)) = \frac{I(x) - \dfrac{c_1 + c_2}{2}}{\max\!\left( \left| I(x) - \dfrac{c_1 + c_2}{2} \right| \right)}, \quad x \in \Omega \qquad (5) \]
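The SPF function can be computed as in the sketch below. Here the region inside the current contour is taken to be where phi < 0, matching the initialisation in equation (4); this sign convention is an assumption of the sketch.

```python
# Sketch of equation (5). c1 and c2 are the mean intensities inside and
# outside the current contour; "inside" is taken as phi < 0 to match the
# initialisation sketched above (an assumption of this illustration).
import numpy as np

def spf(I, phi):
    I = I.astype(np.float64)
    inside = phi < 0
    c1 = I[inside].mean()                      # average intensity inside C
    c2 = I[~inside].mean()                     # average intensity outside C
    shifted = I - (c1 + c2) / 2.0
    return shifted / np.max(np.abs(shifted))   # normalised to [-1, 1]
```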

2.1.3 Final formulation

By substituting our SPF function in equation (1), the equation that governs the curve evolution becomes

\[ \frac{\partial \phi}{\partial t} = spf(I(x)) \cdot \alpha \, |\nabla \phi|, \quad x \in \Omega \qquad (6) \]

This method stops the contour even at weak or blurred boundaries. Another advantage of this method is the use of a binary function to initialise the level set function, which can be constructed more easily than the most widely used signed distance function.
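Putting the pieces together, a simplified evolution loop for equation (6) might look like the sketch below. Smoothing phi with a Gaussian after each step is used here as a simple regulariser in place of re-initialisation; alpha, dt, sigma, the iteration count and the sign conventions are illustrative assumptions rather than the exact scheme of [1].

```python
# Simplified sketch of the evolution in equation (6):
# d(phi)/dt = spf(I(x)) * alpha * |grad(phi)|.
# The SPF term is recomputed every iteration; all parameters are illustrative.
import numpy as np
from scipy.ndimage import gaussian_filter

def segment_spf(I, phi, alpha=20.0, dt=0.1, sigma=1.0, iterations=200):
    I = I.astype(np.float64)
    for _ in range(iterations):
        inside = phi < 0
        c1, c2 = I[inside].mean(), I[~inside].mean()   # region means
        shifted = I - (c1 + c2) / 2.0
        force = shifted / np.max(np.abs(shifted))      # SPF, equation (5)
        gy, gx = np.gradient(phi)
        grad_mag = np.sqrt(gx**2 + gy**2)              # |grad(phi)|
        phi = phi + dt * alpha * force * grad_mag      # equation (6) update
        phi = gaussian_filter(phi, sigma)              # keep phi smooth
    return phi < 0                                     # final segmentation mask
```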

2.2 Global minimization algorithm for active contour models

The algorithm discussed in the previous section was found to have limitations when images with small letters were given as input, and the time taken for segmentation was high. The second algorithm we discuss here is a fast global minimization algorithm for active contour models developed by Xavier Bresson. This algorithm minimizes the underlying energy faster than the previous algorithm. The objective is to compute a global minimizer of the active contour model; the energy is minimized with the help of the calculus of variations.

The classical level set formulation is a non-convex energy minimization problem, and thus the final solution depends on the initial contour: the shape, size and position of the initial contour greatly affect the result. Due care must be taken while designing the initial contour, because a bad initial contour will result in a bad solution and the segmentation will not be proper. To avoid this dependency, the energy to be minimized is first convexified and then a global minimizer is computed, which is independent of the initial contour. The detailed description and derivations can be found in [2]. This work aims at exploiting the power and speed of this algorithm in our context of segmenting characters for OCR.
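To make the convexification idea concrete, the sketch below minimises a relaxed version of the region-based energy over a soft labelling u in [0, 1] (convex in u for fixed region means) by projected gradient descent and then thresholds u. This is only a simplified illustration of the principle; it is not Bresson's actual minimization scheme from [2], and lam, tau and the iteration count are assumed values.

```python
# Simplified sketch of a convexified active contour: minimise
# TV(u) + lam * sum(r * u) over u in [0, 1], where r = (c1 - f)^2 - (c2 - f)^2,
# then threshold u. This illustrates the idea only, not the algorithm of [2].
import numpy as np

def convex_segmentation(f, lam=0.05, tau=0.2, eps=1e-8, iterations=300):
    f = f.astype(np.float64)
    f = (f - f.min()) / (f.max() - f.min() + eps)      # normalise to [0, 1]
    u = f.copy()                                       # initial guess
    for _ in range(iterations):
        c1 = (f * u).sum() / (u.sum() + eps)           # mean where u is high
        c2 = (f * (1 - u)).sum() / ((1 - u).sum() + eps)
        r = (c1 - f) ** 2 - (c2 - f) ** 2              # region term
        gy, gx = np.gradient(u)
        norm = np.sqrt(gx**2 + gy**2) + eps
        div_y = np.gradient(gy / norm)[0]              # d/dy of normalised gradient
        div_x = np.gradient(gx / norm)[1]              # d/dx of normalised gradient
        curvature = div_y + div_x                      # descent direction for TV(u)
        u = np.clip(u + tau * (curvature - lam * r), 0.0, 1.0)
    return u > 0.5                                     # threshold the soft labelling
```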

2.3 Advantages of level set based segmentation methods over conventional methods

In the traditional approach to segmentation, the input image is partitioned into sub-images, which are then further segmented. The operation of attempting to decompose the image into classifiable units is called "dissection". The document image initially has to undergo line segmentation, which extracts each line as a separate image. These images then undergo word segmentation, which gives images containing single words. Then each character is segmented separately and given for classification. This segmentation is done using histogram profiling.

The level set based segmentation methods have many advantages over these classical methods. No histogram profiling is required, and there is no need for line segmentation or word segmentation when level set based methods are used. The characters are segmented in a single step, which considerably reduces the time taken for segmentation.

Level set methodologies can also enable better classification results since they are based on active contours. The algorithm generates an outer contour and inner contours for characters that are closed in nature, and this contour information can act as a very good feature for classification. For Indian languages, in which most characters have closed shapes, this method can be extremely effective.

III. EXPERIMENTS AND RESULTS


The experiments are done on images which do not require any pre-processing such as skew correction. The algorithms are applied separately to the input images and their results compared.

The selective local or global segmentation algorithm behaved quite unpredictably on the various input images given to it. The degree of error increased when a few sentences were given as input rather than a single word, and the initial contour itself was not proper in many cases. The second algorithm, the fast global minimization algorithm, however, was found to be very effective in segmenting and binarizing document images with a certain level of degradation.

1. Contour Evolution

This experiment demonstrates how level set evolution is used to segment characters. It is the responsibility of the programmer to place the initial contour at a proper position. This initial position of the contour affects the speed and accuracy of the segmentation. The initial contour, intermediate contours and the final contour are shown in Figure 1 with green, red and blue colour respectively.

Figure 1: Contour evolution for the Malayalam letter 'ka'.

As the next step, an image with a word in Malayalam is given as input to the selective local or global segmentation method. Figure 2 shows the contours formed around and inside the letters that have to be segmented.

Figure 2: Original Image and the binarized image using selective local segmentation algorithm.

When the same image was given as input to the global minimization algorithm, the characters were segmented properly and a good-quality binarized image was obtained.

Figure 3: Original Image and the binarized image using global minimization algorithm.


Though both algorithms performed with the same accuracy on the previous word, with another image of a different Malayalam word the selective local or global segmentation algorithm failed to segment properly. The contour was formed only around a single letter in the word, and thus the binarized image contains only this particular letter.

Figure 4: Segmented and binarized image output of selective local segmentation.

The same image, when given as input to the global minimization algorithm, produced a much better output: the characters are segmented properly and hence a better binary image was generated. Figure 4 and Figure 5 show the segmentation performed by the two algorithms.

Figure 5: Segmented and binarized image output of fast global minimization algorithm.

The selective local segmentation algorithm proved to be highly unreliable in our subsequent experiments. When images containing a sentence and a paragraph were given as inputs, the segmentation was only partial. The global minimization algorithm, on the other hand, was highly reliable for all the inputs.

Figure 6: Segmented Image and the binarized image using selective local segmentation algorithm.

Figure 7: Original Image and the binarized image using fast global minimization algorithm.


As the characters became smaller and the number of characters in the image increased, the selective local or global segmentation method suffered in quality.

Figure 8: Segmented Image and the binarized image using Zhang's method.

Figure 9: Original Image and the binarized image using Bresson's method.

Apart from the quality of the segmented image, a considerable difference in execution time was also noted when the two above-mentioned methods were compared. The fast global minimization algorithm produced the output in much less time than the selective local or global segmentation method. Table 1 shows the exact time taken by the two algorithms for our experiments with a Malayalam letter, word, and sentence respectively.

Table 1: Time taken for segmentation by the selective local segmentation method and the fast global minimization algorithm.

Input for the algorithm              | Selective local segmentation method | Fast global minimization algorithm
------------------------------------ | ----------------------------------- | -----------------------------------
A Malayalam letter                   | 1.696996 seconds                    | 1.030770 seconds
A Malayalam word                     | 9.563408 seconds                    | 1.206599 seconds
A Malayalam word of a different font | 10.629329 seconds                   | 1.446882 seconds
A Malayalam sentence                 | 63.065542 seconds                   | 2.617572 seconds


Our experiments with various document images clearly indicate better performance by the fast global minimization algorithm when compared to the algorithm based on selective local segmentation. The former gives much better output in a shorter time when degraded images are given as input.

So far in our experiments, we have used only document images which are relatively noise free. In this section, we apply the fast global minimization method to a noisy document image. After speckle noise was added to the test document image and the algorithm was applied, the segmentation was highly inaccurate. To overcome this, we performed edge enhancing diffusion on the noisy image and then performed segmentation. The result after performing diffusion was found to be satisfactory. The segmentation results for the noisy image before and after applying diffusion are shown in Figure 8 and Figure 9.

Figure 8 (a): Noisy image before diffusion (b) Segmentation result on the noisy image

The characters were not segmented at all when speckle noise with intensity 0.2 was added to the test document image.

Figure 9 (a): Noisy image after diffusion (b) Segmentation result after diffusion

Segmentation was done properly after edge enhancing diffusion was applied to the noisy image. This diffusion almost nullified the effect of the speckle noise that was added to the test image.
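For completeness, a sketch of the kind of diffusion pre-processing used here is shown below. It follows the classical Perona-Malik scheme [11]; the exact edge enhancing diffusion used in our experiments may differ, and the parameters are illustrative.

```python
# Sketch of Perona-Malik style anisotropic diffusion [11], used here only to
# illustrate diffusion-based denoising before segmentation. kappa, dt and the
# iteration count are assumed values.
import numpy as np

def anisotropic_diffusion(img, iterations=20, kappa=30.0, dt=0.15):
    u = img.astype(np.float64)
    for _ in range(iterations):
        # differences to the four nearest neighbours
        dn = np.roll(u, -1, axis=0) - u
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        # conduction coefficients: small across strong edges, large elsewhere
        cn, cs = np.exp(-(dn / kappa) ** 2), np.exp(-(ds / kappa) ** 2)
        ce, cw = np.exp(-(de / kappa) ** 2), np.exp(-(dw / kappa) ** 2)
        u = u + dt * (cn * dn + cs * ds + ce * de + cw * dw)
    return u
```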

For an image with a high level of noise, this method produced satisfactory segmentation results when performed after pre-processing the image using edge enhancing diffusion. We added a high level of speckle noise to our test image, performed edge enhancing diffusion, and then applied the proposed method for segmenting the characters. Figures 10(a) and 10(b) show the experiment outputs.

These experiments underline the fact that level set methods can be used for segmenting characters for OCR (Optical Character Recognition) technologies. There is presently no good character recognition system for Indian languages. In our work, we attempt to exploit level set theory to segment the characters of the Malayalam language. As explained in previous sections, we used the method proposed in [2] to get the active contours on the characters, and the characters are


segmented based on the active contours. The experiment result for character segmentation using the proposed method is shown in Figure 10.

Figure 10: Segmented characters using fast global minimization method

For a Malayalam document image, the level set based method was found to give satisfactory results in character segmentation. Even for a considerably degraded image, suitable pre-processing helps in achieving better segmentation.

IV. FUTURE SCOPE

As discussed in previous sections, level set based methods can be used effectively for obtaining active contours of characters. In addition to providing better character segmentation results, level set methods can further improve classification, since they provide better features for classification. Features such as the number of contours, the location of contours and the length of each contour can be extremely good features for classification. This idea can be exploited in developing an OCR (Optical Character Recognition) system, which can give higher accuracy than existing OCR systems. So far there has not been any good OCR software for any of the Indian languages. These level set based methodologies, when combined with a good classification algorithm such as SVM, can lead to highly accurate OCR software for any language. Selecting the right set of features and extracting those features would be a challenging task.
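As an illustration of the kind of contour-based features mentioned above, the sketch below counts the closed contours of a binarised character and measures their total length using scikit-image. The function and feature names are hypothetical; they only indicate how such features could be fed to a classifier such as an SVM.

```python
# Hypothetical sketch of contour-based feature extraction for classification.
# The function and feature names are illustrative, not part of this paper.
import numpy as np
from skimage import measure

def contour_features(binary_char):
    """binary_char: 2-D array with 1/True for the ink pixels of one character."""
    contours = measure.find_contours(binary_char.astype(float), 0.5)
    total_length = sum(
        np.sqrt(((c[1:] - c[:-1]) ** 2).sum(axis=1)).sum() for c in contours
    )
    return {"n_contours": len(contours), "total_length": float(total_length)}
```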

V. CONCLUSION

Two level set based methodologies for Malayalam character image segmentation are discussed in this paper. These methods have better computational efficiency than many other conventional level set based methodologies because of the use of a better level set function and a region based signed pressure force as the function governing the curve evolution. Our study of the two algorithms leads us to the conclusion that the fast global minimization algorithm gives a far better output than the selective local segmentation method when the input image is degraded. Initially we restricted our experiments to images which did not require pre-processing. In the later section we applied the level set based method to a noisy document image which was first smoothed using an edge enhancing diffusion method, and were able to produce a satisfactory segmentation result. In our work we have used the methods discussed here for segmenting characters from document images, but these methods can also be used for segmenting images in various applications such as medicine, satellite imaging, and degraded historical documents.

REFERENCES

[1] K. Zhang, L. Zhang, H. Song, and W. Zhou, "Active contours with selective local or global segmentation: A new formulation and level set method," Image and Vision Computing, vol. 28, no. 4, pp. 668-676, 2010. doi: 10.1016/j.imavis.2009.10.009.
[2] X. Bresson, "A Short Guide on a Fast Global Minimization Algorithm for Active Contour Models," 2009, pp. 1-19.
[3] W. Niblack, An Introduction to Digital Image Processing. Englewood Cliffs: Prentice Hall, 1986.
[4] J. Sauvola and M. Pietikäinen, "Adaptive document image binarization," Machine Vision and Media Processing Group, Infotech Oulu, University of Oulu, P.O. Box 4500, FIN-90401 Oulu, Finland, 1999.
[5] C. Li, C. Kao, J. C. Gore, and Z. Ding, "Minimization of region-scalable fitting energy for image segmentation," IEEE Transactions on Image Processing, vol. 17, pp. 1940-1949, 2008.
[6] T. Chan and L. Vese, "Active contours without edges," IEEE Transactions on Image Processing, vol. 10, no. 2, pp. 266-277, 2001.
[7] C. Li, C. Xu, C. Gui, and M. D. Fox, "Level set evolution without re-initialization: A new variational formulation," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005, vol. 1, pp. 430-436.
[8] L. Vese and T. Chan, "A multiphase level set framework for image segmentation using the Mumford and Shah model," Int. J. Comput. Vis., vol. 50, pp. 271-293, 2002.
[9] S. Lankton and A. Tannenbaum, "Localizing region-based active contours," IEEE Transactions on Image Processing, vol. 17, no. 11, November 2008.
[10] T. Brox and J. Weickert, "Level set segmentation with multiple regions," IEEE Transactions on Image Processing, vol. 15, no. 10, pp. 3213-3218, Oct. 2006.
[11] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, pp. 629-640, 1990.
[12] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations. New York: Springer-Verlag, 2002.