Magnetic Resonance Brain Image using Modified FCM Segmentation and Support Vector Machine Classification

K.VINODHINI II MTECH(IT)

K.L.N COLLEGE OF INFORMATION TECHNOLOGY

R.THILAGAVATHI II MTECH(IT)

K.L.N COLLEGE OF INFORMATION TECHNOLOGY

K.HARINEE II MTECH(IT)

K.L.N COLLEGE OF INFORMATION TECHNOLOGY

ABSTRACT

Medical image segmentation and classification play an important role in medical research. Image segmentation is often required as a preliminary and indispensable stage in computer-aided medical image processing. In existing works, feature extraction is limited. This paper extracts many features, such as Contrast, Homogeneity, Entropy, Correlation, Energy, Maximum Probability and Cluster Shade, by combining (hybrid) segmentation and classification using modified FCM and SVM respectively. A modified Fuzzy C-means (FCM) algorithm is used, which has an improved computation rate, a modified cluster center and a modified criterion for updating the membership values. Texture-based features such as GLCM (Gray Level Co-occurrence Matrix) features play an important role in medical image analysis. Several statistical features are extracted to yield better classification performance. To select the discriminative features among them, the Sequential Forward Selection (SFS) algorithm is used. A Support Vector Machine (SVM) classifier is used to classify whether the input test image is normal or abnormal (benign or malignant). ROC curve analysis is done to calculate the misclassification rate.

Keywords: GLCM, Modified FCM, SFS, SVM, ROC

1. Introduction

Magnetic resonance imaging is a valuable tool in the clinical and surgical environment because of characteristics such as superior soft tissue differentiation, high spatial resolution and contrast. It does not expose patients to harmful ionizing radiation. Image segmentation is the first step in image analysis and pattern recognition; it is one of the most difficult tasks in image processing and determines the quality of the final result of the analysis [1]. Clustering is one of the widely used image segmentation techniques. It classifies patterns in such a way that samples of the same group are more similar to one another than samples belonging to different groups. There has been considerable interest recently in the use of fuzzy clustering methods, which retain more information from the original image than hard clustering methods. The modified Fuzzy C-means algorithm is widely preferred because of its additional flexibility, which allows pixels to belong to multiple classes with varying degrees of membership. The gray level co-occurrence matrix method considers the spatial relationship between pixels of different gray levels. The method calculates a GLCM by counting how often a pixel with a certain intensity i occurs in relation to another pixel with intensity j at a certain distance d and orientation θ. In this proposed work, the SFS method begins with an empty set of features and adds features sequentially until the selection criterion has reached a minimum or all the features have been added to the model. The Support Vector Machine (SVM), based on statistical learning theory, has a solid theoretical foundation and high generalization ability, especially for datasets with a small number of samples in a high dimensional space.

2. Proposed methodology

The proposed methodology consists of the following stages: MRI image database, preprocessing, modified FCM based segmentation, feature extraction, feature selection and classification.

Fig 1. Flow diagram of Proposed Methodology

2.1 MRI image database

A set of MRI brain images comprising tumor and non-tumor cases was collected from radiologists. The images used are 256×256 gray level images with intensity values ranging from 0 to 255.

[Fig 1 flow diagram: MRI Image Database → Preprocessing (Median Filtering, Histogram Equalization) → Segmentation (Modified Fuzzy C-means) → Feature extraction (GLCM) → Feature selection (Sequential Forward Selection) → Classification by SVM (Support Vector Machine) → Analysis]

Fig 2. Normal and abnormal brain images: (a) normal, (b) abnormal

Fig 3. Samples of Testing and Training Data Sets: (a) samples of testing set, (b) samples of training set

2.2 Preprocessing

The input image is first passed through a median filter to reduce noise, and the filtered output is then histogram equalized to enhance the contrast.

i) Median Filtering: In the median filtering operation, the pixel values in the neighborhood window are ranked according to intensity, and the middle value (the median) becomes the output value for the pixel under evaluation. Median filtering does not shift boundaries, as can happen with conventional smoothing filters, so it preserves edges. Since the median is less sensitive than the mean to extreme values (outliers), those extreme values are removed more effectively.

ii) Histogram Equalization: This enhances the contrast of an image by transforming the values in an intensity image, or the values in the color map of an indexed image, so that the histogram of the output image approximately matches a specified histogram.
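The two preprocessing steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Matlab code used in the paper: the 3×3 window, the border handling (only neighbours inside the image are used), the flat target histogram and the function names `median_filter` and `equalize_histogram` are all choices made here for the example.

```python
from statistics import median

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighbourhood.
    Assumption: border pixels use only the neighbours inside the image."""
    h, w, r = len(img), len(img[0]), size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = int(median(window))
    return out

def equalize_histogram(img, levels=256):
    """Map gray levels through the normalised cumulative histogram
    (assumption: the target histogram is flat)."""
    h, w = len(img), len(img[0])
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = h * w
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in img]
    return [[round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1)) for v in row]
            for row in img]

# Hypothetical 4x4 patch with one isolated impulse ("salt") pixel.
noisy = [[10, 10, 10, 10],
         [10, 255, 10, 10],
         [10, 10, 12, 12],
         [12, 12, 12, 12]]
filtered = median_filter(noisy)
enhanced = equalize_histogram(filtered)
```

In this toy patch the impulse pixel is replaced by the value of its neighbours, and equalization then stretches the remaining narrow intensity range over the full 0 to 255 scale.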

2.3 Segmentation

2.3.1 FCM clustering

The FCM algorithm assigns pixels to each category by using fuzzy memberships. Let X = (x1, x2, …, xN) denote an image with N pixels to be partitioned into c clusters, where xj represents multispectral (feature) data ([2]-[3]). The algorithm is an iterative optimization that minimizes the cost function defined as follows:

$$J = \sum_{j=1}^{N} \sum_{i=1}^{c} u_{ij}^{m} \, \lVert x_j - v_i \rVert^{2} \qquad (1)$$

where uij represents the membership of pixel xj in the ith cluster, vi is the ith cluster center, ║.║ is a norm metric, and m is a constant. The parameter m controls the fuzziness of the resulting partition, and m is set to 2.

One of the important characteristics of an image is that neighboring pixels have similar feature values, so the probability that they belong to the same cluster is comparatively high. This spatial information is important in clustering, but the standard FCM algorithm does not fully utilize it. To exploit the spatial information, a modified membership function is defined as

(2)

where Sij is called the spatial function, and N(xj) represents a square window centered on pixel xj in the spatial domain. The spatial function Sij represents the probability that pixel xj belongs to the ith cluster. The spatial function of a pixel for a cluster is large if the majority of its neighborhood belongs to the same cluster. A square window of size 3×3 is used. In a homogeneous region, the spatial function enhances the original membership, and the clustering result remains unchanged. For a noisy pixel, however, it reduces the weighting of a noisy cluster by using the labels of the neighboring pixels. As a result, misclassified pixels from noisy regions can be easily corrected. Each iteration of the algorithm has two steps ([2]-[3]). The first step is to calculate the membership function in the spectral domain, and the second step is to map the membership information of each pixel to the spatial domain and then compute the spatial function from it. The iteration proceeds with the new membership function that incorporates the spatial function. The iteration is stopped when the maximum difference between the cluster centroids at two successive iterations is less than a threshold value (= 0.01). After convergence, defuzzification is applied to assign each pixel to the cluster for which its membership is maximal.
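A sketch of the spatial function and the modified membership, consistent with the definitions above and following the form of the spatial FCM formulation in [3], is given below. The exponents p and q, which weight the original membership against the spatial function, and the notation u'ij for the modified membership are assumptions taken from that formulation and do not appear in the text above.

```latex
S_{ij} = \sum_{k \in N(x_j)} u_{ik},
\qquad
u'_{ij} = \frac{u_{ij}^{\,p} \, S_{ij}^{\,q}}{\sum_{k=1}^{c} u_{kj}^{\,p} \, S_{kj}^{\,q}}
```

With p = q = 1 the modified membership is simply the original membership reweighted by the summed memberships of the 3×3 neighborhood and renormalized over the c clusters.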

2.3.2 The Modified FCM algorithm (MFCM)

Step 1: Set the number of clusters c and the parameter m. Initialize the fuzzy cluster centroid vector randomly and set ε = 0.01.

Step 2: compute

Step 3: compute .

Step 4: update .

Step 5: update

Repeat Steps 4 and 5 until the following termination criterion is satisfied:

|v_new - v_old| < ε
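The iteration can be illustrated with a small Python sketch of FCM with a spatial function on gray-level intensities. It is written under several assumptions that are not taken from the paper: the exponents of the membership and spatial terms are both 1, the centroids are initialised evenly over the intensity range instead of randomly (to keep the run reproducible), and the function name `spatial_fcm` and the test patch are hypothetical.

```python
def spatial_fcm(img, c=2, m=2.0, eps=0.01, max_iter=100):
    """Fuzzy C-means on gray levels with a 3x3 spatial function (sketch)."""
    h, w = len(img), len(img[0])
    pixels = [(y, x) for y in range(h) for x in range(w)]
    lo = min(min(r) for r in img)
    hi = max(max(r) for r in img)
    # Assumption: evenly spaced initial centroids instead of random ones.
    v = [lo + (hi - lo) * (i + 0.5) / c for i in range(c)]
    for _ in range(max_iter):
        # Membership in the spectral (intensity) domain.
        u = [[[0.0] * w for _ in range(h)] for _ in range(c)]
        for y, x in pixels:
            d = [abs(img[y][x] - vi) + 1e-9 for vi in v]
            for i in range(c):
                u[i][y][x] = 1.0 / sum((d[i] / dk) ** (2.0 / (m - 1.0)) for dk in d)
        # Spatial function: summed membership over the 3x3 window,
        # then the modified (renormalised) membership.
        u_mod = [[[0.0] * w for _ in range(h)] for _ in range(c)]
        for y, x in pixels:
            s = [sum(u[i][j][k]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for k in range(max(0, x - 1), min(w, x + 2)))
                 for i in range(c)]
            norm = sum(u[i][y][x] * s[i] for i in range(c))
            for i in range(c):
                u_mod[i][y][x] = u[i][y][x] * s[i] / norm
        # Centroid update using the modified membership.
        v_new = []
        for i in range(c):
            num = sum(u_mod[i][y][x] ** m * img[y][x] for y, x in pixels)
            den = sum(u_mod[i][y][x] ** m for y, x in pixels)
            v_new.append(num / den)
        converged = max(abs(a - b) for a, b in zip(v_new, v)) < eps
        v = v_new
        if converged:
            break
    # Defuzzification: each pixel goes to its maximum-membership cluster.
    labels = [[max(range(c), key=lambda i: u_mod[i][y][x]) for x in range(w)]
              for y in range(h)]
    return labels, v

# Hypothetical patch: dark tissue on the left, bright region on the right.
img = [[20, 25, 22, 200, 210],
       [18, 30, 24, 205, 198],
       [22, 21, 26, 202, 207],
       [19, 23, 20, 199, 204]]
labels, centroids = spatial_fcm(img)
```

The stopping test mirrors the termination criterion above: the loop ends once no centroid moves by more than ε between two successive iterations.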

Two types of cluster validity functions, fuzzy partition and feature structure, are often used to evaluate the performance of different clustering methods. The representative functions for the fuzzy partition are the partition coefficient Vpc and the partition entropy Vpe.

The reduction in Vpe with the MFCM, as compared with the conventional technique, is 50% for the original images and 64.6% for the noisy data (Table 1).
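The two validity functions are not written out in the text. The sketch below uses their usual definitions, Vpc as the mean of the squared memberships and Vpe as the mean membership entropy, with the natural logarithm assumed. A crisper partition gives a Vpc closer to 1 and a Vpe closer to 0. The function names and the two toy membership matrices are hypothetical.

```python
import math

def partition_coefficient(u):
    """V_pc = (1/N) * sum_j sum_i u_ij^2; u[j] holds the c memberships of pixel j."""
    return sum(uij ** 2 for row in u for uij in row) / len(u)

def partition_entropy(u):
    """V_pe = -(1/N) * sum_j sum_i u_ij * ln(u_ij), taking 0 * ln(0) as 0."""
    return -sum(uij * math.log(uij) for row in u for uij in row if uij > 0) / len(u)

# Two pixels, two clusters: a perfectly crisp and a maximally fuzzy partition.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]

vpc_crisp, vpe_crisp = partition_coefficient(crisp), partition_entropy(crisp)
vpc_fuzzy, vpe_fuzzy = partition_coefficient(fuzzy), partition_entropy(fuzzy)
```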

Fig 4. Test image for segmentation and classification.

2.3.3 The clustering results of image using various FCM techniques

| Image | Technique | Vpc | Vpe |
| Original MRI images | Conventional FCM | 0.888 | 0.234 |
| Original MRI images | Modified FCM | 0.924 | 0.117 |
| Noise added MRI images | Conventional FCM | 0.779 | 0.414 |
| Noise added MRI images | Modified FCM | 0.909 | 0.147 |

Table 1. Clustering results

2.4 Feature extraction

The procedure for extracting textural properties of an image in the spatial domain was presented by Haralick. The gray level co-occurrence matrix method considers the spatial relationship between pixels of different gray levels [4]. The method calculates a GLCM by counting how often a pixel with a certain intensity i occurs in relation to another pixel with intensity j at a certain distance d and orientation. Each element (i, j) in the GLCM is the number of times that a pixel with value i occurred in the specified relationship to a pixel with value j in the raw image. Co-occurrence matrices are calculated for four directions: 0°, 45°, 90° and 135°.
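As an illustration, the following Python sketch builds a co-occurrence matrix for a small image and derives a few of the texture features named in the abstract. It is a sketch under assumptions: the matrix is not symmetrised, homogeneity uses the 1 / (1 + |i - j|) weighting, entropy uses a base-2 logarithm, and the names `OFFSETS`, `glcm` and `texture_features` are introduced here for the example only.

```python
import math

# Pixel offsets (row step, column step) for the four orientations.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(img, levels, angle=0, d=1):
    """Count how often gray level i has gray level j at distance d and the given angle."""
    dy, dx = OFFSETS[angle]
    h, w = len(img), len(img[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy * d, x + dx * d
            if 0 <= yy < h and 0 <= xx < w:
                counts[img[y][x]][img[yy][xx]] += 1
    return counts

def texture_features(counts):
    """Normalise the counts to probabilities and compute texture statistics."""
    total = sum(map(sum, counts))
    p = [[v / total for v in row] for row in counts]
    cells = [(i, j, pij) for i, row in enumerate(p) for j, pij in enumerate(row)]
    return {
        "contrast": sum((i - j) ** 2 * pij for i, j, pij in cells),
        "energy": sum(pij ** 2 for _, _, pij in cells),
        "homogeneity": sum(pij / (1 + abs(i - j)) for i, j, pij in cells),
        "entropy": -sum(pij * math.log2(pij) for _, _, pij in cells if pij > 0),
        "max_probability": max(pij for _, _, pij in cells),
    }

# Hypothetical 4x4 image with four gray levels (0 to 3).
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
counts = glcm(img, levels=4, angle=0)
features = texture_features(counts)
```

Repeating the call with `angle` set to 45, 90 and 135 gives the other three matrices, whose features can be averaged or concatenated before feature selection.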

2.5 Feature selection

The SFS algorithm is a bottom-up search procedure which starts from an empty feature set S and gradually adds features selected by an evaluation function that minimizes the mean square error (MSE). In each iteration, the feature to be included is selected from the remaining available features that have not yet been added to the feature set, so that the extended feature set produces a lower classification error than the addition of any other feature would.

2.6 Sequential Forward Selection

Starting from the empty set, sequentially add the feature x+ that results in the highest objective function J(Yk + x+) when combined with the features Yk that have already been selected.

1. Start with the empty set Y0 = Ø.

2. Select the next best feature x+ = argmax[J(Yk + x)] over all x ∉ Yk.

3. Update Yk+1 = Yk + x+, increment k and go to step 2.

SFS is widely used for its simplicity and speed. Many variants and applications have been proposed based on the SFS algorithm [5].
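The three steps above can be sketched as a short greedy loop. In this sketch the search stops after a fixed number of features k, which is an assumption made for brevity. The objective function, the per-feature scores and the redundancy penalty are hypothetical stand-ins for a real evaluation such as cross-validated classification error.

```python
def sequential_forward_selection(features, objective, k):
    """Greedy SFS: grow the selected set one feature at a time."""
    selected = []                      # Y_0 is the empty set
    remaining = list(features)
    while remaining and len(selected) < k:
        # x+ = argmax over x not in Y_k of J(Y_k + x)
        best = max(remaining, key=lambda x: objective(selected + [x]))
        selected.append(best)          # Y_{k+1} = Y_k + x+
        remaining.remove(best)
    return selected

# Hypothetical usefulness scores, with a penalty for one redundant pair.
SCORE = {"contrast": 0.6, "energy": 0.5, "entropy": 0.55, "homogeneity": 0.3}
REDUNDANT = {frozenset(("contrast", "entropy")): 0.4}

def toy_objective(subset):
    """Sum of individual scores minus penalties for redundant pairs."""
    value = sum(SCORE[f] for f in subset)
    for pair, penalty in REDUNDANT.items():
        if pair <= set(subset):
            value -= penalty
    return value

chosen = sequential_forward_selection(SCORE, toy_objective, k=2)
```

Here "entropy" has the second-highest individual score but is passed over in the second round because it is redundant with "contrast", which shows why the objective is evaluated on the combined set and not on each feature alone.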

2.7 Classification: SVM classifier

(i) Data setup: The dataset contains three classes (normal, and abnormal divided into benign and malignant), each with N samples. The original data are plotted in 2D for visual inspection.

(ii) An SVM with a linear kernel of range (-t, 0) is taken in order to find the best value of the parameter C using 2-fold cross validation (50% of the data is used for training and the remaining data is used for testing).

(iii) After finding the best value of C, the entire data set is trained again using this parameter value.

(iv) Plot the support vectors, which are the vectors close to the hyperplane.

(v) Plot the decision area, i.e. cases with one category of the target variable are on one side of the plane and cases with the other category are on the other side of the plane.

SVM maps input vectors to a higher dimensional vector space where an optimal hyper plane is constructed ([6]-[8]). Among the many hyper planes available, there is only one hyper plane that maximizes the distance between itself and the nearest data vectors of each category. This hyper plane which maximizes the margin is called the optimal separating hyper plane and the margin is defined as the sum of distances of the hyper plane to the closest training vectors of each category.
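For reference, the standard soft-margin formulation of this optimization is sketched below. The symbols are assumptions introduced here and are not taken from the paper: w and b define the hyperplane, y_i ∈ {-1, +1} is the class label of training vector x_i, ξ_i are slack variables, and C is the penalty parameter tuned by cross validation in step (ii).

```latex
\min_{w,\,b,\,\xi}\ \ \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \,(w^{\top} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0,
\qquad
\text{margin} = \frac{2}{\lVert w \rVert}
```

Minimizing the norm of w maximizes the margin 2/‖w‖, which is the sum of the distances from the hyperplane to the closest training vectors of each category.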

3. Results and Discussions

The algorithm is developed using Matlab 7.8 and tested on a database of approximately 30 test images. The sample input image taken for analysis is first preprocessed using a median filter and histogram equalization.

The preprocessed input is then segmented using the modified fuzzy C-means algorithm to detect the tumor, and the result is shown in Fig. 5.

Fig 5 Segmented output for the given test image

Fig. 6 Classified output for the given test image

The segmented output is then analyzed using feature extraction and classification. The features describe the relevant properties of the image in a feature space and so characterize the input for the classifier. If the extracted features are carefully chosen, they are expected to capture the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the full size input. A GLCM is constructed, from which different statistical features are obtained. The Sequential Forward Selection algorithm is used here as an automatic feature selection algorithm.

Classification follows the feature extraction stage, and an SVM classifier is used. The goal of SVM modeling is to find the optimal hyperplane that separates clusters of vectors in such a way that cases with one category of the target variable are on one side of the plane and cases with the other category are on the other side of the plane. The vectors near the hyperplane are the support vectors.

Receiver Operating Characteristic (ROC) analysis is done to assess the ability of the classifier to identify true positives while rejecting false positives.

3.1 Classifier Performance analysis: Accuracy and Mis-classification rate

The Accuracy is the probability that the test correctly classifies the subjects; the Mis-classification rate is its complement to 1. In statistics, the F1 score (also F-score or F-measure) is a measure of a test's accuracy. It considers both the Precision P (positive predictivity) and the Sensitivity S of the test to compute the score, F1 = 2PS / (P + S). P is the number of correct results divided by the number of all returned results. S is the number of correct results divided by the number of results that should have been returned. The F1 score is the harmonic mean of the Precision and the Sensitivity; it reaches its best value at 1 and its worst at 0.
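These measures follow directly from the counts of true and false positives and negatives. The sketch below computes them for a hypothetical set of ten labels; the labels are made up for illustration and are not the paper's test data, and the function name `classification_metrics` is an assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, misclassification rate, precision, sensitivity and F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "misclassification": 1 - accuracy,
            "precision": precision, "sensitivity": sensitivity, "f1": f1}

# Hypothetical labels: 1 = malignant, 0 = benign; one benign case is
# predicted as malignant.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

With one false positive among ten cases, the accuracy is 0.9, the precision is 5/6, the sensitivity is 1 and the F1 score is 10/11.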

3.2 ROC Curve Analysis

The performance of each test is characterized in terms of its ability to identify true positives while rejecting false positives. The accuracy of a diagnostic test can be summarized in terms of an ROC curve. The Receiver Operating Characteristic (ROC) curve is a popular tool in medical and imaging research. It conveniently displays diagnostic accuracy expressed in terms of sensitivity (or true-positive rate) against 1-specificity (or false-positive rate) at all possible threshold values. The ROC curve for the given test image is shown in Fig. 7.
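The construction of an ROC curve can be sketched as a sweep over decision thresholds. The scores and labels below are hypothetical classifier outputs, not results from the paper, and the helper names `roc_curve` and `auc` are introduced here for the example.

```python
def roc_curve(scores, labels):
    """Return (false-positive rate, true-positive rate) points, one per threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
        fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical decision scores (higher means more likely positive) and labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_curve(scores, labels)
area = auc(curve)
```

Each point pairs 1-specificity on the horizontal axis with sensitivity on the vertical axis. For these six samples the area under the curve is 8/9, since eight of the nine positive-negative pairs are ranked correctly.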

It is observed that the straight line is the hyperplane that separates benign and malignant brain images. The clusters of benign images lie below the hyperplane and the clusters of malignant images lie above it. One benign image is found to be misclassified as malignant.

Fig. 7 Receiver Operating Characteristic (ROC) curve (axes: Sensitivity versus 1-Specificity)

4. Conclusion

In this paper, MRI brain image segmentation is performed using the modified fuzzy C-means algorithm. To obtain a better classification rate, different statistical features are extracted using the GLCM, and SFS is used to select the discriminative features among them. An SVM is used to classify the input MRI brain image as normal or abnormal, and the abnormal images are then further separated into benign and malignant categories. A multi-SVM classifier has been used for this three-class classification. This kernel technique helps to obtain a more accurate result. In the proposed work about 90% of the images are classified accurately and 10% are misclassified, which is the limitation of the proposed work. To overcome the problem of SVM training, hybrid classifiers can be used. More texture features can also be added to improve the classification accuracy. In future, 3D image analysis can be done for the classification of MRI brain images.

References:

[1] Ping Wang, Honglei Wang, "A Modified FCM Algorithm for MRI Brain Image Segmentation", IEEE International Seminar on Future Biomedical Information Engineering, 2008.

[2] M. Shasidhar and V. Sudheer Raja, "MRI Brain Image Segmentation using Modified Fuzzy C-means Clustering Algorithm", IEEE International Conference on Communication Systems and Network Technologies, 2011.

[3] Keh-Shih Chuang, Hong-Long Tzeng, Sharon Chen, Jay Wu, Tzong-Jer Chen, "Fuzzy c-means clustering with spatial information for image segmentation", Computerized Medical Imaging and Graphics (Elsevier), vol. 30, 2006, pp. 9-15.

[4] S. Daniel Madan Raja, A. Shanmugam, "Artificial Neural Networks Based War Scene Classification using Invariant Moment and GLCM Features", International Journal of Engineering Science and Technology, vol. 3, no. 2, Feb 2011.

[5] A. Marcano-Cedino, J. Quintanilla-Dominguez, "Feature Selection using Sequential Forward Selection and Classification applying Artificial Metaplasticity Neural Network", IECON 2010 - 36th Annual Conference on IEEE Industrial Electronics Society.

[6] Lei Guo, Youxi Wu, "Research on the Segmentation of MRI Image Based on Immune Support Vector Machine", The 1st International Conference on Bioinformatics and Biomedical Engineering (ICBBE), IEEE, 2007.

[7] Najafi, Shahla, "A new approach to MRI brain images classification", 19th Iranian Conference on Electrical Engineering (ICEE), IEEE, 2011.

[8] Azhar Quddus, Otman Basir, "Semantic Image Retrieval in Magnetic Resonance Brain Volumes", IEEE, 2012.