www.ijecs.in
International Journal of Engineering and Computer Science ISSN: 2319-7242
Volume 5, Issue 10, Oct. 2016, Page No. 18350-18357
Automated Diagnosis of Lung Tumors Using Segmentation Techniques

S. Piramu Kailasam¹, Dr. M. Mohammed Sathik²
¹Research and Development Centre, Bharathiar University, India. E-mail: [email protected]
²Principal, Sadakathullah Appa College, India. E-mail: [email protected]
Abstract— The objective is to detect cancerous lung nodules in 3D CT chest images and to classify the lung disease and its severity. Although much research has been done in this area, the problem remains challenging. FCM segmentation is used to extract the lung region. Several feature extraction techniques are used, including the bag of visual words based on the histogram of oriented gradients, wavelet-transform-based features, the local binary pattern, SIFT, and Zernike moments. The particle swarm optimization algorithm is used to select the best features.
Keywords— Lung CT, image segmentation, PSO, SVM, ELM, k-NN, NB.
I. INTRODUCTION
Due to the increasing rates of smoking and air pollution, lung cancer is a leading cause of death in many countries. CT is the best modality for diagnosing lung disease; time and cost are the two important factors. Early detection of lung nodule growth improves the patient's chance of cure, and the severity is determined from the stage of the lung cancer. The radiologist improves diagnostic efficiency by counting nodule growth across stages. X-ray imaging is not sufficient for early detection of lung cancer [2], [4]; CT plays an important role in cancer staging evaluation. The task is challenging because of low contrast and the variation in nodule size and location in CT imaging. Distinguishing cancerous nodules from blood vessels is particularly difficult because, in the central lung regions, nodules are easily confused with blood vessels imaged in cross section. Approaches to lung nodule detection include feature-based, template-matching-based, and neural-network-based methods. In [5], the organs of interest and the lung area were classified into two clusters: an air cluster and a cluster of other organs. Using Gaussian distributions as reference images, a template matching algorithm detects the nodules in chest CT images. Tao Xu et al. [1] proposed an automatic global edge and region force (ERF) field guided method for lung field segmentation; experimental results demonstrated that the method is time efficient and improves the accuracy, sensitivity, specificity, and robustness of the segmentation results. Accurate lung segmentation allows the detection and quantification of abnormalities in the lungs. An automatic method for segmentation of the lungs and lobes from thorax CT scans was proposed by van Rikxoort et al. [7]: a region growing approach segments the region, morphological operations are applied, and multi-atlas segmentation is then applied to the results. The Sobel edge detection method [9] has been used to segment CT lung images. Jun Lai et al. [8] used a fully automatic segmentation for pulmonary vessels in plain thoracic CT images. A vessel tree reconstruction algorithm [10] reduced the number of false positives by 38%. Camarlinghi et al. [11] used three different computer-aided detection techniques to identify pulmonary nodules in CT images.
Abdullah and Shaharum [12] used feed-forward neural networks to classify lung nodules in X-ray images with a small feature set of area, size, and perimeter. Kuruvilla et al. [13] extracted six distinct parameters, including skewness and the fifth and sixth central moments, from segmented single slices and evaluated accuracy with a feed-forward back-propagation neural network. Riccardi et al. [16] presented a new algorithm that automatically detects nodules with 71% overall accuracy using 3D fast radial transforms. Suk et al. [17] used a deep Boltzmann machine on neuroimaging data of the brain for AD/MCI diagnosis; the method achieves an excellent diagnostic accuracy of 95.52%. To the best of our knowledge, there has been no work that uses deep features for lung nodule classification. In this paper, the training image is preprocessed and the lung region is extracted using FCM segmentation. Lung cancer is then detected by extracting features.
Cancerous lung nodule detection from CT chest images
The block diagram of the proposed method is shown in Fig. 1. To detect the location of the cancerous lung nodules, this paper uses a novel algorithm. First, the input image is denoised with a Wiener filter, and contrast is enhanced by histogram equalization. FCM segmentation is then used to extract the lung region. The details of these steps are discussed below:
Fig.1 Cancerous lung nodule detection
A. GETTING THE ORIGINAL IMAGE
In this module, the original CT image is obtained for segmentation of the lung.
B. PREPROCESSING
Put simply, preprocessing means making the image ready for processing. It improves the image data by suppressing unwanted degraded pixels that would interfere with further processing.
I. Apply Denoising using Wiener Filter
In signal processing, the Wiener filter produces an estimate of a desired or target random process by linear time-invariant (LTI) filtering of an observed noisy process, assuming known stationary signal and noise spectra and additive noise. The Wiener filter minimizes the mean square error between the estimated random process and the desired process; its goal is to filter out the noise that has corrupted a signal, and it is based on a statistical approach. Typical filters are designed for a desired frequency response; the Wiener filter takes a different approach: one is assumed to know the spectral properties of the original signal and the noise, and one seeks the LTI filter whose output comes as close to the original signal as possible. Wiener filters are characterized by the following:
1. Assumption: signal and (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
2. Requirement: the filter must be physically realizable/causal (this requirement can be dropped, resulting in a non-causal solution).
3. Performance criterion: minimum mean-square error (MMSE).
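A minimal sketch of this denoising step, using SciPy's adaptive Wiener filter (this is an illustrative sketch, not the authors' code; the 5 x 5 window size is an assumed parameter):

```python
import numpy as np
from scipy.signal import wiener

def denoise_ct_slice(img: np.ndarray, window: int = 5) -> np.ndarray:
    """Apply an adaptive Wiener filter to a 2-D CT slice (illustrative sketch)."""
    img = img.astype(np.float64)        # wiener expects floating-point input
    return wiener(img, mysize=window)   # local mean/variance based MMSE estimate
```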
II. Apply Contrast Enhancement
Contrast enhancement improves the visibility of structures in the image by expanding the range of intensity values. The imadjust operation increases the contrast of the image by mapping the values of the input intensity image to new values such that, by default, 1% of the data is saturated at the low and high intensities of the input data.
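A minimal NumPy sketch of this default behaviour (an assumed re-implementation mirroring MATLAB's imadjust defaults, not the paper's code):

```python
import numpy as np

def imadjust_like(img: np.ndarray, low_pct: float = 1.0, high_pct: float = 99.0) -> np.ndarray:
    """Linearly stretch intensities so ~1% of pixels saturate at each end."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    out = (img.astype(np.float64) - lo) / max(hi - lo, 1e-12)
    return np.clip(out, 0.0, 1.0)       # output normalized to [0, 1]
```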
C. LUNG REGION EXTRACTION
I. Apply FCM Segmentation
In fuzzy clustering, every point has a degree of belonging to clusters, as in fuzzy logic, rather than belonging completely to just one cluster. Thus, points on the edge of a cluster may be in the cluster to a lesser degree than points in the center of the cluster. An overview and comparison of different fuzzy clustering algorithms is available in the literature. Any point x has a set of coefficients giving its degree of being in the kth cluster, wk(x). With fuzzy c-means, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster:
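In standard fuzzy c-means notation (restated here, since the original formula did not reproduce), the centroid $c_k$ and the degree of belonging $w_k(x)$ used in the steps below are:

$$c_k = \frac{\sum_{x} w_k(x)^{m}\,x}{\sum_{x} w_k(x)^{m}}, \qquad w_k(x) = \frac{1}{\sum_{j}\left(\lVert x - c_k\rVert / \lVert x - c_j\rVert\right)^{2/(m-1)}}$$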
The degree of belonging, wk(x), is related inversely to the distance from x to the cluster center as calculated on the previous pass. It also depends on a parameter m that controls how much weight is given to the closest center. The fuzzy c-means algorithm is very similar to the k-means algorithm:
1. Choose the number of clusters.
2. Assign coefficients randomly to each point for being in the clusters.
3. Repeat until the algorithm has converged (that is, the change in coefficients between two iterations is no more than ε, the given sensitivity threshold):
a) Compute the centroid for each cluster, using the formula above.
b) For each point, compute its coefficients of being in the clusters, using the formula above.
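A compact NumPy sketch of the steps above (a generic fuzzy c-means, not the authors' implementation; m = 2, the convergence threshold, and the use of pixel intensities as features are assumptions):

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Generic FCM: X has shape (n_points, n_features)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], n_clusters))
    W /= W.sum(axis=1, keepdims=True)                  # random initial memberships
    for _ in range(max_iter):
        Wm = W ** m
        centers = (Wm.T @ X) / Wm.sum(axis=0)[:, None]  # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        W_new = d ** (-2.0 / (m - 1))
        W_new /= W_new.sum(axis=1, keepdims=True)       # membership update
        if np.abs(W_new - W).max() < eps:               # convergence check (threshold eps)
            return centers, W_new
        W = W_new
    return centers, W
```

For lung region extraction, X would be the flattened CT intensities (img.reshape(-1, 1)) with, say, two clusters, thresholding the resulting memberships to obtain a binary lung mask.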
II. Extraction of Lung Region
In this technique, the lung region is obtained by an extraction method: the input is the segmented region, from which the lung region is extracted to produce the output image.
D. SEGMENTING THE LUNG REGION
I. Applying Filling Holes
In this method, after the lung region is extracted, holes in the lung region are filled using segmentation techniques.
E. LUNG CANCER DETECTION
I. Extract Features
Feature extraction simplifies the amount of resources required to describe a large set of data accurately. One of the main problems in analyzing complex data is the large number of variables involved: analyzing the data requires a lot of memory and computational power. Feature extraction overcomes these obstacles by constructing combinations of variables that still describe the data accurately. The best results are achieved when an expert constructs a set of application-dependent features.
II. Detect the Lung Cancer
After the features are extracted from the image, the cancer in the lungs is identified using those features.
F. LOCALIZE THE LUNG CANCER
In computer vision and image processing, "objects" are often processed differently from "texture," which requires a way of localizing the textured regions of an image. For this purpose, this paper suggests a working definition of texture: texture is described more accurately by local statistics than by the configuration of all its parts. Outliers, by contrast, are not local statistics and draw our attention. This definition suggests that to find texture we first extract certain basic features and compute their local statistics. We then compute a measure of saliency, the degree to which each portion of the image appears to be an outlier to the local feature distribution, and label as texture the regions with low saliency. This paper presents a method, based upon this idea, for labeling points in cancer images as belonging to texture regions. The method is based upon recent psychophysics results on the processing of texture.
I. Lung tissue classification and severity finding
The overall block diagram of the proposed method is shown in Fig. 3.1. After locating the cancerous lung nodules, the next step is to classify the lung disease and its severity based on feature extraction. Among the many available feature extraction methods, this paper uses the bag of visual words based on the histogram of oriented gradients, wavelet-transform-based features, the local binary pattern, SIFT, and Zernike moments. After the features are extracted, the Particle Swarm Optimization (PSO) algorithm is used to select the best features. Finally, these features are classified using the Extreme Learning Machine (ELM) method. To analyse the performance of the ELM method, it is compared with five commonly used classifiers: support vector machine (SVM), Bagging (Bag), Naive Bayes (NB), k-nearest neighbor (k-NN), and AdaBoost (Ada).
II. Methods
The features are extracted from the lung CT dataset. This project extracts four types of ROI features: the bag of visual words based on HOG (B-HOG), the wavelet features, the LBP, and the CVH. There are 18-D B-HOG features, 26-D wavelet features, 96-D LBP features, and 40-D CVH features, for a total of 180 features. The details of each type of feature are given as follows.
I. B-HOG:
The HOG feature is a texture descriptor describing the distribution of image gradients in different orientations. Following the HOG feature extraction scheme of Dalal and Triggs, we divide a ROI into rectangular blocks of 8 × 8 pixels and further divide each block into four cells of 4 × 4 pixels. An orientation histogram containing nine bins, covering a gradient orientation range of 0–180°, is computed for each cell. A block is then represented by concatenating the orientation histograms of its cells, so a 36-D HOG feature vector is extracted for each block.
The common image representation based on HOG features joins the feature vectors of all the blocks in the image in sequence, which requires all images to have the same size; otherwise the resulting feature vectors have different lengths. In CT lung images, however, ROI size differs from patient to patient and between lesions of the lungs, so this plain HOG representation is not applicable in this study. To solve this problem, we adopt the bag of visual words on HOG features as the ROI representation. Unlike the original bag-of-visual-words method, we use a clustering algorithm based on Gaussian mixture modeling (GMM), instead of the k-means algorithm, to generate more accurate visual words; in this paper, 18 visual words in total are obtained. The 36-D HOG feature vector of each block is mapped to the visual word with the highest likelihood for it. The number of HOG feature vectors assigned to each visual word is then accumulated and normalized by the total number of HOG feature vectors to form an 18-D histogram representation of the ROI.
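A sketch of this B-HOG pipeline with scikit-image and scikit-learn (a plausible reading of the scheme, not the authors' code; the diagonal-covariance GMM and the cropping of the ROI to a multiple of 8 are assumptions):

```python
import numpy as np
from skimage.feature import hog
from skimage.util import view_as_blocks
from sklearn.mixture import GaussianMixture

def block_hog_vectors(roi: np.ndarray) -> np.ndarray:
    """One 36-D HOG vector per 8x8 block (9 bins x four 4x4 cells)."""
    h, w = (d - d % 8 for d in roi.shape)            # crop to a multiple of 8
    blocks = view_as_blocks(roi[:h, :w], (8, 8))
    feats = [hog(b, orientations=9, pixels_per_cell=(4, 4),
                 cells_per_block=(2, 2), feature_vector=True)
             for row in blocks for b in row]
    return np.asarray(feats)

# Fit the 18-word codebook once, on HOG vectors pooled over all training ROIs:
# gmm = GaussianMixture(n_components=18, covariance_type="diag", random_state=0)
# gmm.fit(np.vstack([block_hog_vectors(r) for r in training_rois]))

def bhog_histogram(roi: np.ndarray, gmm: GaussianMixture) -> np.ndarray:
    """Map each block to its most likely visual word; return a normalized 18-D histogram."""
    words = gmm.predict(block_hog_vectors(roi))
    hist = np.bincount(words, minlength=gmm.n_components).astype(float)
    return hist / hist.sum()
```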
II. WAVELET FEATURES:
Wavelets are important and commonly used feature descriptors for texture analysis because of their effectiveness in capturing localized spatial and frequency information and their multiresolution characteristics. In this paper, the ROIs are decomposed to four levels using the 2-D symlet wavelet, because the symlet wavelet has better symmetry than the Daubechies wavelet and is more suitable for image processing. The horizontal, vertical, and diagonal detail coefficients are then extracted from the wavelet decomposition structure. Finally, we obtain the wavelet features by calculating the mean and variance of these wavelet coefficients.
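A sketch with PyWavelets (the symlet order "sym4" is an assumption; this version yields 24 values from the twelve detail subbands, and the paper's 26-D vector presumably adds two more statistics, e.g. from the approximation subband):

```python
import numpy as np
import pywt

def wavelet_features(roi: np.ndarray, wavelet: str = "sym4", levels: int = 4) -> np.ndarray:
    """Mean and variance of detail coefficients from a 4-level 2-D symlet decomposition."""
    coeffs = pywt.wavedec2(roi.astype(np.float64), wavelet, level=levels)
    feats = []
    for cH, cV, cD in coeffs[1:]:            # detail subbands, coarse to fine
        for band in (cH, cV, cD):
            feats += [band.mean(), band.var()]
    return np.asarray(feats)                 # 4 levels x 3 bands x 2 stats = 24-D
```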
III. LBP FEATURES:
The LBP feature is a compact texture descriptor in which each comparison between a center pixel and one of its surrounding neighbors is encoded as a bit; in this way, an integer is obtained for each pixel. The frequency of each integer is then computed over the ROI to obtain the corresponding feature vector. The neighbourhood in the LBP operator can be defined very flexibly using circular neighbourhoods and bilinear interpolation of pixel values. Such neighbourhoods are denoted by (P, R), meaning that P neighbors are evenly sampled on the circle of radius R around the center pixel; the corresponding LBP features are denoted LBP(P, R) in the following descriptions. We consider multiple values of P and R to get multiscale LBP features.
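A sketch with scikit-image (the (P, R) pairs and the uniform-pattern histogram encoding are assumptions; the paper does not specify the exact encoding that yields its 96 dimensions):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def multiscale_lbp(roi: np.ndarray, scales=((8, 1), (16, 2), (24, 3))) -> np.ndarray:
    """Concatenate uniform-LBP histograms over several (P, R) neighbourhoods."""
    feats = []
    for P, R in scales:
        codes = local_binary_pattern(roi, P, R, method="uniform")
        hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
        feats.append(hist)                   # 'uniform' codes take P + 2 values
    return np.concatenate(feats)
```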
IV. CVH FEATURES:
CVH means the histogram of CT values. In lung CT images, the CT values of pixels are expressed in Hounsfield units (HU). We compute the histogram of CT values over each ROI. The number of bins in the histogram is determined experimentally: various CVHs with different numbers of bins are computed, each CVH is tested for classification with a k-NN classifier, and the corresponding classification accuracy rate (CAR) is calculated. The number of bins that yields the highest CAR is then adopted, and this choice is kept unchanged for all the experiments.
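A sketch of the CVH computation and the experimental bin-count selection (the HU range and the candidate bin counts are assumptions; the paper's final CVH is 40-D):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def cvh(roi_hu: np.ndarray, bins: int, lo: float = -1000, hi: float = 400) -> np.ndarray:
    """Normalized histogram of Hounsfield-unit values over an ROI."""
    hist, _ = np.histogram(roi_hu, bins=bins, range=(lo, hi), density=True)
    return hist

def best_bin_count(rois, labels, candidates=(10, 20, 40, 80)) -> int:
    """Pick the bin count whose CVH gives the highest k-NN cross-validated accuracy."""
    scores = {}
    for b in candidates:
        X = np.stack([cvh(r, b) for r in rois])
        scores[b] = cross_val_score(KNeighborsClassifier(), X, labels, cv=5).mean()
    return max(scores, key=scores.get)
```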
V. SIFT FEATURES:
Scale-invariant feature transform (SIFT) is a computer vision algorithm for detecting and describing local image features. SIFT keypoints extracted from a set of reference images are stored in a database, and an object is recognized in a new image by matching its features against this database based on the Euclidean distance between feature vectors. Candidate matches are filtered by consistency of location, scale, and orientation; an efficient hash table is used so that consistent clusters are found rapidly. Each cluster of three or more features that agree on an object undergoes further verification, and outliers are discarded. Finally, the probability that a particular set of features indicates the object is computed, given the number of probable false matches. Object matches that pass all these tests can be identified as correct with high confidence.
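A sketch of SIFT keypoint matching with OpenCV (generic usage of cv2.SIFT_create, not the paper's pipeline; the ratio threshold is an assumption):

```python
import cv2

def sift_match(query_img, db_img, ratio: float = 0.75):
    """Detect SIFT keypoints/descriptors and keep matches passing Lowe's ratio test."""
    sift = cv2.SIFT_create()
    _, des1 = sift.detectAndCompute(query_img, None)   # grayscale uint8 images assumed
    _, des2 = sift.detectAndCompute(db_img, None)
    good = []
    for pair in cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])                        # ratio test rejects ambiguous matches
    return good
```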
VI. ZERNIKE MOMENT FEATURES:
Zernike moment features are computed as follows:
1. The two-dimensional complex Zernike moments of a digital image with pixel values P(x, y) are defined over the unit disk.
2. The Zernike polynomials are defined in polar coordinates.
3. The radial polynomial in ρ is real-valued.
4. Zernike moments can then be expressed using central and normalized regular moments.
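The standard textbook forms of these definitions, restated here for completeness (the paper's own normalization is assumed to match), are:

$$A_{nm} = \frac{n+1}{\pi}\sum_{x}\sum_{y} P(x,y)\,V_{nm}^{*}(\rho,\theta), \qquad x^{2}+y^{2}\le 1$$

$$V_{nm}(\rho,\theta) = R_{nm}(\rho)\,e^{jm\theta}$$

$$R_{nm}(\rho) = \sum_{s=0}^{(n-|m|)/2}\frac{(-1)^{s}\,(n-s)!}{s!\left(\frac{n+|m|}{2}-s\right)!\left(\frac{n-|m|}{2}-s\right)!}\,\rho^{\,n-2s}$$

for $n \ge 0$, $|m| \le n$, and $n - |m|$ even. Expanding $R_{nm}$ in powers of $\rho$ expresses $A_{nm}$ as a linear combination of geometric moments, which yields the central and normalized regular-moment form of step 4.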
VII. FEATURE SELECTION USING PSO:
In order to achieve good classification results, several types of features are usually used at the same time. Since different types of features may contain complementary information, selecting discriminative features from the various feature spaces can improve classification performance. This idea has attracted a lot of attention in related fields, including medical image classification. According to Guyon and Elisseeff, feature selection techniques can be organized into three main categories: filter, wrapper, and embedded methods; we follow their taxonomy in reviewing feature selection methods for medical image classification.
For feature selection this project uses PSO. Just as a bird seeking food communicates with its neighbouring birds, turning the search into a multidimensional one, PSO iteratively improves candidate solutions to a problem. It is a metaheuristic: it makes few or no assumptions about the problem and can search very wide spaces. It does not guarantee an optimal solution because no gradient is used to optimize the problem, but this is also an advantage, since classical methods such as the quasi-Newton method and gradient descent require gradients. Partially irregular, noisy, and variable problems can be handled well by PSO.
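A sketch of binary PSO feature selection in the wrapper style described above (the sigmoid transfer function, swarm parameters, and the k-NN wrapper fitness are assumptions, not taken from the paper):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def pso_select(X, y, n_particles=20, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over features; fitness = k-NN CV accuracy."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(float)   # 1 = feature kept
    vel = rng.normal(0.0, 0.1, (n_particles, d))

    def fitness(bits):
        mask = bits.astype(bool)
        if not mask.any():
            return 0.0
        return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        r1 = rng.random((n_particles, d))
        r2 = rng.random((n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = (rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]                    # update personal bests
        pbest_fit[better] = fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()       # update global best
    return gbest.astype(bool)                          # mask over the 180 candidate features
```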
III. Testing Phase
In the testing phase, the feature extraction and feature selection described above are performed on the given input image, and classification is then applied to determine the disease name for the input image.
I. SVM
Support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data and recognize patterns. Given a set of training examples, each assigned to one of two categories, the SVM algorithm builds a non-probabilistic linear binary classifier that assigns new examples to one category or the other.
II. Bag
The bag-of-words model is a simplifying representation used in natural language processing and information retrieval. In this model, a text (a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity. The bag-of-words model is commonly used in document classification, where the (frequency of) occurrence of each word is used as a feature for training a classifier.
III. NB
In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time.
IV. k-NN
The k-NN algorithm is used for classification and regression in pattern recognition. The input consists of the k closest training examples in the feature space; in k-NN classification, the output is a class membership. k-NN decision making is equivalent to a Bayesian decision procedure in which the number of neighbors of each type is used as an estimate of the relative posterior probabilities of class membership in the neighborhood of the sample to be classified. An object is classified by a majority vote of its neighbors, being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, the object is simply assigned to the class of its single nearest neighbor.
V. AdaBoost
AdaBoost, short for "Adaptive Boosting," is a machine learning meta-algorithm that can be used in conjunction with many other types of learning algorithms. The outputs of the weak learners are combined into a weighted sum. AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favour of the instances misclassified by previous classifiers.
VI. ELM
The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks (SLFNs) that randomly chooses the hidden nodes and analytically determines the output weights of the SLFN. The algorithm tends to provide good generalization performance at extremely fast learning speed.
The algorithm for ELM is explained below:
1. Randomly generate the hidden nodes and randomly assign the weights W_ji, for j = 1..m and i = 1..n.
2. Activate the input image values x_i, i = 1..n, and calculate the net input to the hidden layer using
   H_j = Σ_{i=1}^{n} x_i W_ji,  j = 1..m.
3. Repeat step 2 to calculate the net output from the hidden layer to the output layer.
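A minimal ELM sketch along these lines (the tanh activation, hidden-layer size, and one-hot label matrix T are assumptions; following the usual ELM formulation, the output weights are obtained analytically via the Moore–Penrose pseudoinverse rather than iterated as in step 3):

```python
import numpy as np

def train_elm(X, T, n_hidden=100, seed=0):
    """ELM sketch: random hidden weights, analytic least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input-to-hidden weights (step 1)
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations (step 2)
    beta = np.linalg.pinv(H) @ T                  # output weights via pseudoinverse
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Class scores; take argmax over columns for the predicted label."""
    return np.tanh(X @ W + b) @ beta
```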
IV. PERFORMANCE ANALYSIS
A. Experimental Images
Instances of nine categories of CISLs were collected from the Cancer Institute and Hospital. The lung CT images were acquired by GE LightSpeed VCT 64 and Toshiba Aquilion 64 CT scanners and saved in DICOM 3.0 format. The slice thickness is 5 mm, the image resolution is 512 × 512, and the in-plane pixel spacing ranges from 0.418 to 1 mm (mean: 0.664 mm). Sample images are shown in Fig. 2.
B. Performance Analysis
Several performance metrics are available to evaluate the proposed method. This paper uses the precision rate, recall rate, classification accuracy, error rate, and F-measure to analyse the performance.
Fig. 2. Experimental images (a)-(d).
1. Precision Rate
Precision is the fraction of retrieved instances that are relevant:
Precision = TP / (TP + FP)
where TP = true positive (equivalent to a hit) and FP = false positive (equivalent to a false alarm).
2. Recall Rate
Recall is the fraction of relevant instances that are retrieved:
Recall = TP / (TP + FN)
where FN = false negative (equivalent to a miss).
3. F-Measure
The F-measure combines precision and recall as twice their product divided by their sum:
F-Measure = 2 × (Precision × Recall) / (Precision + Recall)
4. Classification Accuracy
Accuracy measures the degree of closeness between the original disease and the detected disease:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TP = true positive (hit), FN = false negative (miss), TN = true negative (correct rejection), and FP = false positive (false alarm).
5. Error Rate
The error rate measures the number of falsely identified disease names from the given input CT scan images:
Error Rate = (FP + FN) / (TP + TN + FP + FN)
To analyse the performance of the proposed system, it is compared with the other techniques using the performance metrics mentioned above, as shown in the tables and graphs below. The comparison of the classification accuracy of the proposed method with the existing approaches (AdaBoost, k-NN, NB, Bag, and SVM) is shown in Table 1.
Table 1. Classification accuracy

Method      Classification Accuracy
AdaBoost    85%
k-NN        87%
NB          89%
Bag         92%
SVM         95%
ELM         97%
Table 1 compares the classification accuracy of the six methods, including the proposed method. The classification accuracy of the ELM method is higher than that of the other methods, as Fig. 3 clearly shows. Because of its higher classification accuracy, the ELM method outperforms the other existing approaches.
Table 2. Error rate

Method      Error Rate
AdaBoost    15%
k-NN        13%
NB          11%
Bag         8%
SVM         5%
ELM         3%
The error rate of the proposed method is compared with the existing approaches (AdaBoost, k-NN, NB, Bag, and SVM) in Table 2 and Fig. 4. The error rate of the ELM method is lower than that of the other methods. Because of its lower error rate, the proposed method outperforms the other existing approaches.
Table 3. Precision rate

Method      Precision Rate
AdaBoost    82%
k-NN        84%
NB          86%
Bag         89%
SVM         92%
ELM         94%
Table 3 shows that the precision rate of the ELM method is higher than that of the other existing approaches, so the ELM method outperforms them. The graph of the precision rate analysis is shown in Fig. 5. The comparison of the recall rate of the proposed method with the existing approaches is shown in the table below.
Table 4. Recall rate

Method      Recall Rate
AdaBoost    80%
k-NN        82%
NB          84%
Bag         86%
SVM         90%
ELM         92%
Table 4 shows that the recall rate of the ELM method is higher than that of the other existing approaches, so the ELM method outperforms them. The graph of the recall rate analysis is shown in Fig. 6. The comparison of the F-measure of the proposed method with the existing approaches is shown in the table below.
Table 5. F-measure

Method      F-Measure
AdaBoost    85%
k-NN        87%
NB          88%
Bag         91%
SVM         95%
ELM         97%
Table 5 shows that the F-measure of the ELM method is higher than that of the other existing approaches, so the ELM method outperforms them. The graph of the F-measure analysis is shown in Fig. 7. Because of its higher F-measure, the proposed method outperforms the other four existing approaches.
V. CONCLUSION
This study investigates a new segmentation pipeline based on the FCM algorithm in which, after feature extraction, PSO is used to select the best features. This combination significantly improves on the current state. To compare performance, five criteria are used: classification accuracy, error rate, precision rate, recall rate, and F-measure. Experimental results show that the ELM classifier performs better than the other classifiers.
References
[1] H. K. Weir et al., "Annual report to the nation on the status of cancer, 1975–2000," Journal of the National Cancer Institute, vol. 95, no. 17, pp. 1276–1299, 2003.
[2] T. Xu, M. Mandal, R. Long, I. Cheng, and A. Basu, "An edge-region force guided active shape approach for automatic lung field detection in chest radiographs," Computerized Medical Imaging and Graphics, 2012.
[3] D. Behara, "Lung cancer in India," Medicine Update 2012, vol. 22, pp. 401–407.
[4] T. Tanaka, K. Yuta, and Y. Kobayashi, "A study of the false-negative case in mass screening of lung cancer," Jpn. J. Thor. Med., vol. 43, pp. 832–838, 1984.
[5] J. Oda, S. Akita, and K. Shimada, "A study of false-negative cases in comparative reading of mass screening of lung cancer," Lung Cancer, vol. 29, pp. 271–278, 1989.
[6] K. Kanazawa, Y. Kawata, N. Niki, H. Satoh, H. Ohmatsu, R. Kakinuma, M. Kaneko, N. Moriyama, and K. Eguchi, "Computer-aided diagnosis for pulmonary nodules based on helical CT images," Comput. Med. Imag. Graph., vol. 22, no. 2, pp. 157–167, 1998.
[7] Y. Lee, T. Hara, H. Fujita, S. Itoh, and T. Ishigaki, "Automated detection of pulmonary nodules in helical CT images based on an improved template-matching technique," IEEE Transactions on Medical Imaging, vol. 20, no. 7, July 2001.
[8] B. van Ginneken, B. M. ter Haar Romeny, and M. A. Viergever, "Computer-aided diagnosis in chest radiography: A survey," IEEE Transactions on Medical Imaging, vol. 20, pp. 1228–1241, Dec. 2001.
[9] E. M. van Rikxoort and B. van Ginneken, "Automatic segmentation of the lungs and lobes from thoracic CT scans," Diagnostic Image Analysis Group, Department of Radiology, Netherlands, 2011.
[10] J. Lai and M. Xie, "Automatic segmentation for pulmonary vessels in plain thoracic CT scans," TELKOMNIKA, vol. 10, no. 4, pp. 743–751, Aug. 2012.
[11] N. Camarlinghi, I. Gori, A. Retico, R. Bellotti, P. Bosco, P. Cerello, G. Gargano, E. L. Torres, R. Megna, M. Peccarisi et al., "Combination of computer-aided detection algorithms for automatic lung nodule identification," International Journal of Computer Assisted Radiology and Surgery, vol. 7, no. 3, pp. 455–464, 2012.
[12] A. A. Abdullah and S. M. Shaharum, "Lung cancer cell classification method using artificial neural network," Information Engineering Letters, vol. 2, no. 1, pp. 49–59, 2012.
[13] J. Kuruvilla and K. Gunavathi, "Lung cancer classification using neural networks for CT images," Computer Methods and Programs in Biomedicine, vol. 113, no. 1, pp. 202–209, 2014.
[14] R. Bellotti, F. De Carlo, G. Gargano, S. Tangaro, D. Cascio, E. Catanzariti, P. Cerello, S. C. Cheran, P. Delogu, I. De Mitri et al., "A CAD system for nodule detection in low-dose lung CTs based on region growing and a new active contour model," Medical Physics, vol. 34, no. 12, pp. 4901–4910, 2007.
[15] B. van Ginneken, S. G. Armato, B. de Hoop, S. van Amelsvoort-van de Vorst, T. Duindam, M. Niemeijer, K. Murphy, A. Schilham, A. Retico, M. E. Fantacci et al., "Comparing and combining algorithms for computer-aided detection of pulmonary nodules in computed tomography scans: The ANODE09 study," Medical Image Analysis, vol. 14, no. 6, pp. 707–722, 2010.
[16] A. Riccardi, T. S. Petkov, G. Ferri, M. Masotti, and R. Campanini, "Computer-aided detection of lung nodules via 3D fast radial transform, scale space representation, and Zernike MIP classification," Medical Physics, vol. 38, no. 4, pp. 1962–1971, 2011.
[17] H.-I. Suk, S.-W. Lee, D. Shen, and the Alzheimer's Disease Neuroimaging Initiative et al., "Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis," NeuroImage, vol. 101, pp. 569–582, 2014.
[18] G. Wu, M. Kim, Q. Wang, Y. Gao, S. Liao, and D. Shen, "Unsupervised deep feature learning for deformable registration of MR brain images," in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2013, Springer, 2013, pp. 649–656.
[19] R. Fakoor, F. Ladhak, A. Nazi, and M. Huber, "Using deep learning to enhance cancer diagnosis and classification," in Proc. ICML Workshop on the Role of Machine Learning in Transforming Healthcare (WHEALTH), Atlanta, GA, 2013.
[20] D. Zinovev, J. Feigenbaum, J. Furst, and D. Raicu, "Probabilistic lung nodule classification with belief decision trees," in Proc. 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2011, pp. 4493–4498.
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X, IMPACT FACTOR: 5.258
IJCSMC, Vol. 5, Issue 7, July 2016, pg. 186–193

Traffic Flow Prediction with Big Data Using SAEs Algorithm

S. Piramu Kailasam (Ph.D.)¹, K. Aruna Anushiya², Dr. M. Mohammed Sathik³
¹Research Scholar, Bharathiar University, India
²Assistant Professor, MSU College, India
³Principal, Sadakathullah Appa College, Tirunelveli, India
1 [email protected]; 2 [email protected]
Abstract— An intelligent transportation system needs accurate and timely traffic flow information to perform well. In recent years traffic data have grown huge, and existing systems use shallow traffic prediction models that remain unsatisfactory. The proposed system uses a novel deep-learning-based traffic flow prediction method that inherently captures spatial and temporal correlations. A stacked autoencoder model is used to learn generic traffic flow features, and it is trained in a greedy layerwise fashion. This is the first time a deep architecture model using autoencoders has been proposed to represent traffic flow features for prediction.
Keywords— Deep learning, stacked autoencoders (SAEs), traffic flow prediction.
I. INTRODUCTION
Traffic flow information has the potential [1] to help road users make better travel decisions, alleviate traffic congestion, and reduce carbon emissions, thereby improving the efficiency of traffic operations. Nowadays transportation management and control are becoming more complicated and data driven. Most traffic flow prediction methods use shallow traffic models, which remain unsatisfactory. Deep learning, a type of machine learning method, has attracted a lot of academic and industrial interest.
Deep learning algorithms use multiple-layer architectures to extract inherent features in data from the lowest level to the highest level. Without prior knowledge, they can learn traffic feature representations that give good performance in traffic flow prediction.
II. LITERATURE REVIEW
Traffic flow prediction is a key functional component in ITSs. Countless traffic flow prediction models have been developed to assist in traffic management, controlling and improving transportation efficiency in tasks ranging from route guidance to vehicle routing. Traffic flow can be considered a temporal and spatial process. The traffic flow prediction problem can be stated as follows. Let X_i^t denote the observed traffic flow quantity during the t-th time interval at the i-th observation location in a transportation network. Given a sequence {X_i^t} of observed traffic flow data, i = 1, 2, . . . , m, t = 1, 2, . . . , T, the problem is to predict the traffic flow at time interval t + Δ for some prediction horizon Δ. As early as the 1970s, the autoregressive integrated moving average (ARIMA) model was used to predict short-term freeway traffic flow [3]. A variety of traffic flow prediction models have since been proposed by researchers from different areas, such as transportation engineering, statistics, machine learning, control engineering, and economics. Previous prediction approaches can be grouped into three categories: parametric techniques, nonparametric methods, and simulations. Parametric models include time-series models, Kalman filtering models, etc. Nonparametric models include k-nearest neighbor (k-NN) methods, artificial neural networks (ANNs), etc. Simulation approaches use traffic simulation tools to predict traffic flow.
A widely used approach to traffic flow prediction is based on time-series methods, such as a local linear regression model for short-term traffic forecasting [22]. A Bayesian network approach was proposed for traffic flow forecasting in [23]. An online learning weighted support vector regression (SVR) was presented in [24] for short-term traffic flow prediction. Various ANN models have been developed for predicting traffic flow. It is difficult to say that one method is clearly superior to the others in every situation; one reason is that each proposed model is developed with a small amount of separate, specific traffic data, and the accuracy of traffic flow prediction methods depends on the traffic flow features embedded in the collected spatiotemporal traffic data. In general, the literature shows promising results when using NNs, which have good prediction power and robustness. Although deep NN architectures can learn more powerful models than shallow networks, existing NN-based methods for traffic flow prediction usually have only one hidden layer, since it is hard to train a deep-layered hierarchical NN with a gradient-based training algorithm. Recent advances in deep learning since the breakthrough of Hinton have made training deep architectures feasible, and deep learning models have shown superior or comparable performance to state-of-the-art methods in several areas. In this paper, we explore a deep learning approach with SAEs for traffic flow prediction.
III. METHODOLOGY
The proposed system introduces the SAE (stacked autoencoder) model. An autoencoder is a neural network that attempts to reproduce its input. Fig. 1 gives the details of the autoencoder, which has one input layer, one hidden layer, and one output layer. Given a set of training samples {x(1), x(2), x(3), . . .}, where x(i) ∈ R^d, an autoencoder first encodes an input x(i) to a hidden representation y(x(i)) based on (1), and then decodes y(x(i)) back into a reconstruction z(x(i)) computed as in (2):
y(x) = f(W1 x + b)        (1)
z(x) = g(W2 y(x) + c)     (2)
where W1 is a weight matrix, b is an encoding bias vector, W2 is a decoding matrix, and c is a decoding bias vector.
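A NumPy sketch of equations (1) and (2) with plain gradient-descent training on the squared reconstruction error (f = g = sigmoid, inputs scaled to [0, 1]; the learning rate and epoch count are assumptions, and the sparsity penalty of the next subsection is omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=200, seed=0):
    """One-hidden-layer autoencoder: y(x) = f(W1 x + b), z(x) = g(W2 y + c)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape[1], len(X)
    W1 = rng.normal(0.0, 0.1, (d, n_hidden)); b = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d)); c = np.zeros(d)
    for _ in range(epochs):
        Y = sigmoid(X @ W1 + b)                  # encode, eq. (1)
        Z = sigmoid(Y @ W2 + c)                  # decode, eq. (2)
        dZ = (Z - X) * Z * (1.0 - Z)             # gradient through the output sigmoid
        dY = (dZ @ W2.T) * Y * (1.0 - Y)         # backpropagate into the hidden layer
        W2 -= lr * (Y.T @ dZ) / n; c -= lr * dZ.mean(axis=0)
        W1 -= lr * (X.T @ dY) / n; b -= lr * dY.mean(axis=0)
    return W1, b
```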
Fig.2. Layerwise training of SAEs.
When sparsity constraints are added to the objective function, an autoencoder becomes a sparse autoencoder, which seeks a sparse representation of the hidden layer. To obtain the sparse representation, we minimize the reconstruction error together with a sparsity penalty:

L = Σ_i ||x(i) − z(x(i))||² + γ Σ_j KL(ρ ‖ ρ̂_j)

where γ is the weight of the sparsity term, ρ is a sparsity parameter, ρ̂_j is the average activation of hidden unit j, and KL(ρ ‖ ρ̂_j) is the Kullback–Leibler (KL) divergence.
A SAE model is created by stacking autoencoders to form a deep network, taking the output of the autoencoder on the layer below as the input of the current layer. For a SAE with l layers, the first layer is trained as an autoencoder with the training set as input. After the first hidden layer is obtained, the output of the k-th hidden layer is used as the input of the (k + 1)-th hidden layer; in this way multiple autoencoders are stacked hierarchically, as shown in Fig. 2. To use the SAE network for traffic flow prediction, we add a standard predictor on the top layer: in this paper, a logistic regression layer is put on top of the network for supervised traffic flow prediction. The SAEs plus the predictor comprise the whole deep architecture model for traffic flow prediction, shown in Fig. 3.
Fig. 3. Deep architecture model for traffic flow prediction. A SAE model is used to extract traffic flow features, and a logistic regression layer is applied for prediction.
C. Training Algorithm
Applying the BP method with a gradient-based optimization technique in a purely top-down way does not train deep networks successfully. A greedy layerwise unsupervised learning algorithm, however, can train deep networks successfully. The key is to pretrain the deep network layer by layer in a bottom-up way; fine-tuning with BP is then applied to tune the model's parameters in a top-down direction to obtain better output. The training procedure is based on the works in [58] and [59] and can be stated as follows.
1) Train the first layer as an autoencoder by minimizing the objective function with the training set as input.
2) Train the second layer as an autoencoder, taking the first layer's output as input.
3) Iterate as in 2) for the desired number of layers.
4) Use the output of the last layer as the input of the prediction layer, and initialize its parameters randomly or by supervised training.
5) Fine-tune the parameters of all layers with the BP method in a supervised way.
This procedure is summarized in Algorithm 1.
Algorithm 1. Training SAEs
Given training samples X and the desired number of hidden layers l:
Step 1) Pretrain the SAE
— Set the sparsity weight γ and the sparsity parameter ρ; initialize the weight matrices and bias vectors randomly.
— Train the hidden layers greedily, layer by layer.
— Use the output of the k-th hidden layer as the input of the (k + 1)-th hidden layer; for the first hidden layer, the input is the training set.
— Find the encoding parameters of the (k + 1)-th hidden layer by minimizing the objective function.
Step 2) Fine-tune the whole network
— Initialize the prediction layer randomly or by supervised training.
— Use the BP method with a gradient-based optimization technique to adjust the whole network's parameters in a top–down fashion.
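Building on the autoencoder sketch above, Step 1 of Algorithm 1 can be outlined as follows (this reuses train_autoencoder and sigmoid from the earlier sketch; the layer sizes are assumptions, and the supervised fine-tuning of Step 2 is not shown):

```python
def pretrain_saes(X, hidden_sizes=(256, 128, 64)):
    """Greedy layerwise pretraining: each layer's code feeds the next layer."""
    params, H = [], X
    for n_hidden in hidden_sizes:
        W, b = train_autoencoder(H, n_hidden)   # train one layer as an autoencoder
        params.append((W, b))
        H = sigmoid(H @ W + b)                  # output of kth layer -> input of (k+1)th
    return params, H   # H feeds the top-level predictor; fine-tune the stack with BP
```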
Data Description
The proposed deep architecture model was applied to data collected from the Caltrans Performance Measurement System (PeMS) database as a numerical example. The traffic data are collected every 30 s from over 15,000 individual detectors deployed statewide in freeway systems, and the collected data are aggregated into 5-min intervals for each detector station. In this paper, traffic flow data collected on weekdays during the first three months of 2013 were used for the experiments: the data of the first two months were selected as the training set, and the remaining month's data as the testing set. For freeways with multiple detectors, the traffic data collected by the different detectors are aggregated to obtain the average traffic flow of the freeway. Note that the two directions of the same freeway are treated separately among all the freeways, of which three are one-way. Fig. 4 plots a typical freeway's traffic flow over time for the weekdays of one week.
Experiments and Results
To evaluate the effectiveness of the proposed model, we use three performance indexes: the mean absolute error (MAE), the mean relative error (MRE), and the RMS error (RMSE), defined as

MAE = (1/n) Σ_i |f_i − f̂_i|
MRE = (1/n) Σ_i |f_i − f̂_i| / f_i
RMSE = [ (1/n) Σ_i (f_i − f̂_i)² ]^(1/2)

where f_i is the observed traffic flow and f̂_i is the predicted traffic flow.
TRAFFIC FLOW ANALYSIS PATTERN FOR WEEKDAYS (NO. OF VEHICLES PER HOUR)

Hours (X)   Monday   Tuesday   Wednesday   Thursday   Friday
0             100      100       100         105        120
2              50       50        50          50         80
4              60       60        70          55         80
6             400      380       403         400        380
8             590      550       550         600        580
10            450      450       480         502         40
12            490      500       500         500        520
14            505      550       550         550        570
16            590      590       580         580        560
18            550      510       550         570        530
20            400      400       475         500        450
22            250      300       280         350        350
[Figures: traffic flow analysis plots for Monday, Tuesday, Wednesday, Thursday, and Friday.]
IV. CONCLUSIONS
The era of big data creates an urgent need for advanced data acquisition, management, and analysis. In this paper we presented the concept of big data and highlighted the big data value chain. The proposed method learns a traffic flow feature representation that captures the nonlinear spatial and temporal correlations in the traffic data. We used the greedy layerwise unsupervised learning algorithm to pretrain the large network and improve prediction performance. We assessed the performance of the proposed method against BP NN, RBF NN, RW, and SVM models, and the results show that the proposed method outperforms the other methods. In future work, it would be interesting to explore further deep learning algorithms for traffic flow prediction.
REFERENCES
[1] N. Zhang, F.-Y. Wang, F. Zhu, D. Zhao, and S. Tang, "DynaCAS: Computational experiments and decision support for ITS," IEEE Intell. Syst., vol. 23, no. 6, pp. 19–23, Nov./Dec. 2008.
[2] J. Zhang et al., "Data-driven intelligent transportation systems: A survey," IEEE Trans. Intell. Transp. Syst., vol. 12, no. 4, pp. 1624–1639, Dec. 2011.
[3] M. S. Ahmed and A. R. Cook, "Analysis of freeway traffic time-series data by using Box–Jenkins techniques," Transp. Res. Rec., no. 722, pp. 1–9, 1979.
[4] M. Levin and Y.-D. Tsao, "On forecasting freeway occupancies and volumes," Transp. Res. Rec., no. 773, pp. 47–49, 1980.
[5] M. Hamed, H. Al-Masaeid, and Z. Said, "Short-term prediction of traffic volume in urban arterials," J. Transp. Eng., vol. 121, no. 3, pp. 249–254, May 1995.
[6] M. van der Voort, M. Dougherty, and S. Watson, "Combining Kohonen maps with ARIMA time series models to forecast traffic flow," Transp. Res. C, Emerging Technol., vol. 4, no. 5, pp. 307–318, Oct. 1996.
[7] S. Lee and D. Fambro, "Application of subset autoregressive integrated moving average model for short-term freeway traffic volume forecasting," Transp. Res. Rec., vol. 1678, pp. 179–188, 1999.
[8] B. M. Williams, "Multivariate vehicular traffic flow prediction—Evaluation of ARIMAX modeling," Transp. Res. Rec., no. 1776, pp. 194–200, 2001.
[9] Y. Kamarianakis and P. Prastacos, "Forecasting traffic flow conditions in an urban network—Comparison of multivariate and univariate approaches," Transp. Res. Rec., no. 1857, pp. 74–84, 2003.
[10] H. Y. Sun, H. X. Liu, H. Xiao, R. R. He, and B. Ran, "Use of local linear regression model for short-term traffic forecasting," Transp. Res. Rec., no. 1836, pp. 143–150, 2003.
[11] S. Sun, C. Zhang, and Y. Guoqiang, "A Bayesian network approach to traffic flow forecasting," IEEE Intell. Transp. Syst. Mag., vol. 7, no. 1, pp. 124–132, Mar. 2006.
[12] Y. S. Jeong, Y. J. Byon, M. M. Castro-Neto, and S. M. Easa, "Supervised weighting-online learning algorithm for short-term traffic flow prediction," IEEE Trans. Intell. Transp. Syst., vol. 14, no. 4, pp. 1700–1707, Dec. 2013.
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X, IMPACT FACTOR: 5.258
IJCSMC, Vol. 5, Issue 7, July 2016, pg. 194–203

Guidance in Assisting the Identification / Interpretation of Lung Cancer Using a Bronchoscope

S. Piramu Kailasam¹, Dr. M. Mohammed Sathik²
¹Research Scholar, Bharathiar University, Coimbatore, India
²Principal, Sadakathullah Appa College, Tirunelveli, India
Abstract— Lung cancer is one of the leading diseases in the world, and its incidence has increased in many countries. The eight-year overall survival rate is just 17%. Lung cancer must be removed surgically at an early stage to ensure the survival of the patient. There are several methods to treat early-staged lung cancer, such as brachytherapy, cryotherapy, photodynamic therapy, argon plasma coagulation, thermal laser, microdebrider, and electrocautery. In this review we discuss the various types of bronchoscopy and their positive and negative sides.
Keywords— Lung cancer, bronchoscopy
I. INTRODUCTION
The first sign of the evolution of the bronchoscope appeared when a German physician, Gustav Killian, removed a pork bone from the right main bronchus of a Black Forest worker in 1897, using a device with limitations similar to those of a modern bronchoscope. Modern bronchoscopy includes two types of devices – rigid and flexible – each having its own positives and negatives.
II. CHOOSING BRONCHOSCOPY
Bronchoscopic treatment is advisable only for lung cancer at an early stage, but bronchoscopy is a cost-effective treatment, and the patient undergoing treatment should not have a low life expectancy. These are some criteria for choosing bronchoscopy. Next we must decide which bronchoscopy to choose: rigid or flexible. Rigid bronchoscopy is defined as the trans-oral or trans-tracheostomy passage of rigid instruments for diagnosis or therapy, aided by various light sources, telescopes, and instruments, and requiring a general anaesthetic.
Flexible bronchoscopy is defined as a technical procedure used to visualize the airway from the nasal opening to the end of the bronchial tree, usually carried out under conscious sedation.
A flexible bronchoscopy is usually preferable because rigid bronchoscopy requires a lot of skill, which the majority of doctors in the US and UK lack. There are, however, some circumstances in which rigid bronchoscopy is advisable.
Unstable patients, those with significant underlying cardiopulmonary disease, uncontrolled coagulopathy, or cervical instability need to be carefully assessed. Mortality from flexible bronchoscopy is rarer than with rigid bronchoscopy, and rigid bronchoscopy requires the support of a skilled anaesthesiologist.
A. Comparative study of different bronchoscopic methods

TABLE I. CHARACTERISTICS OF BRONCHOSCOPIC THERAPIES

Electrocautery
  Indication: hemoptysis; debulking of endobronchial tumor
  Effect: immediate
  Rigid or flexible: both
  Caution/contraindication: pacemakers or defibrillators; FiO2 > 0.4 (burns)

Argon plasma coagulation
  Indication: hemoptysis; debulking of endobronchial tumor
  Effect: immediate
  Rigid or flexible: both
  Caution/contraindication: as per electrocautery

Thermal laser
  Indication: debulking of endobronchial tumor
  Effect: immediate
  Rigid or flexible: both
  Caution/contraindication: FiO2 < 0.4; contraindicated when no exophytic lesion is visible, or in total airway obstruction with no functional distal airway open

Microdebrider
  Indication: debulking of endobronchial tumor
  Effect: immediate
  Rigid or flexible: rigid
  Caution/contraindication: distal lesion

Cryotherapy
  Indication: debulking of endobronchial tumor
  Effect: delayed
  Rigid or flexible: both
  Caution/contraindication: acute airway obstruction requiring immediate relief

Brachytherapy
  Indication: treatment of endobronchial or peribronchial tumor
  Effect: delayed
  Rigid or flexible: both
  Caution/contraindication: expensive; healthcare-worker shielding; treatment for malignant tracheoesophageal fistula

Photodynamic therapy
  Indication: treatment of endobronchial or peribronchial tumor
  Effect: delayed
  Rigid or flexible: both
  Caution/contraindication: skin photosensitivity for several weeks; massive hemoptysis; bronchopleural fistula; subepithelial fibrosis
B. Advances in Therapy Using Bronchoscopy (Immediate Effect)
1) Electrocautery: Electrocautery uses a high-frequency electrical current that heats the tissue in contact with the probe. Coagulation is achieved at low voltage, and tissue is vaporized at high voltage. The difficulties of endobronchial electrocautery include those of general bronchoscopy; applying deep electrocautery too close to the bronchial wall may result in perforation and pneumothorax. Electrocautery is a safe, cheap, and effective therapy for early-stage and advanced cancer with MAO (malignant airway obstruction).
2) Argon Plasma Coagulation: Argon plasma coagulation is used to resect MAO and to control endobronchial bleeding. A beam of argon gas carries the electrocautery effect without contact, which distinguishes APC from electrocautery. APC is a cheap, safe, and effective modality for treating both MAO and hemoptysis.
3) Laser: Lasers in bronchoscopy were introduced almost thirty years ago. Bronchoscopic laser resection is a useful modality in the treatment of MAO; airway obstruction from bronchogenic carcinoma is the most frequent indication for laser resection. The tumor is removed by two methods: resection and vaporization.
In resection, the laser is aimed at the target lesion, and after photocoagulation of the feeding blood vessels and amelioration of the lesion, the devitalized tissue is removed through the bronchoscope. In vaporization, the laser is aligned parallel to the bronchial wall and aimed at the edge of the intraluminal lesion. Laser pulses are limited to 1 s or less, and the laser is used parallel with the target lesion, not straight on or perpendicular, to reduce the chances of perforation.
A cheaper endobronchial laser, the neodymium:yttrium-aluminum perovskite (Nd:YAP) laser, is nowadays available. One study of 133 patients with various malignant and benign (7 patients) indications concluded that the Nd:YAP laser was a safe and effective tool for bronchoscopy.
Laser therapy through the bronchoscope is an effective approach for treating MAO. The Nd:YAG laser is expensive and needs trained staff to operate it safely and prevent injury to patients.
4) Microdebridement: A microdebrider operates with a powered rotating blade and a simultaneously operating suction device that facilitates the removal of debris. Its special feature is an elongated rotating tip of about 45 cm, which is accurate and allows high-flow oxygen. The complications of thermal therapy, such as airway injury, airway fires, and tracheoesophageal fistulas, can be avoided using microdebridement; its own complications are hemorrhage, perforation, and pneumothorax.
Delayed effect (Cryotherapy): Used medically since the Russian campaign of 1812, cryotherapy is a cheap and effective treatment for many medical conditions. Bronchoscopic cryotherapy is a procedure in which, through either rigid or flexible bronchoscopy, malignant or benign tissue is ablated by repeated freezing, thawing and refreezing. Cryotherapy may be considered for MAO in patients without critical airway narrowing. Its main advantage is that there is no limitation on oxygen delivery during the procedure.
Brachytherapy is an invasive two-step technique. The first step is to identify the target lesion; the second is, remotely under fluoroscopy and computer guidance, to place the radioactive source beside the target lesion and change its position in a staged process so as to irradiate the whole target lesion. Iridium-192 is the most commonly used isotope because it permits a dramatic reduction of treatment time, reduces costs, enhances the patient experience and allows an outpatient setting over several sessions.
Low dose rate (LDR) is less than 2 Gy/h, with a maximum of 1500-5000 cGy over 3 days; it is costly and cumbersome because of the radiation-protection measures required. High dose rate (HDR) is about 12 Gy/h, with the dose varying from 100 to 300 cGy. HDR is less costly than LDR and enhances the patient experience.
Brachytherapy is effective in the palliation of both endobronchial and peribronchial malignancy, with a delayed effect. A significant risk of hemoptysis has been found.
5) Photodynamic Therapy: Skin cancers can be treated by photodynamic therapy (PDT), and the therapy has a successful record in both malignant and benign conditions. PDT has been combined with other endobronchial therapies, including brachytherapy and laser therapy. Among the complications of PDT, sunburn for up to 6 weeks after HPD administration is a concern, and patients need to be fully informed to avoid sunlight and to wear appropriate clothing. PDT is effective in debulking and palliation, with a delayed effect and no limitation on oxygenation in MAO.
C. Airway stents
Airway stents, or tracheobronchial prostheses, are tube-shaped devices inserted into an airway to maintain airway patency. Made of metal, silicone or other materials, they can be used to relieve airway obstruction caused by malignant tumors. Stent therapy is indicated in both intraluminal and extraluminal major airway obstructions. After removal of an endobronchial obstruction, stents can be considered to maintain airway patency. Self-expandable metal stents should not be placed in patients for whom removal is contemplated in the future.
D. Techniques in Bronchoscopy
1) Genomic classifier for lung cancer by diagnostic bronchoscopy
Low-dose computed tomography (CT) screening results in a 20% relative mortality reduction in high-risk patients [2]. The gene expression profile of cytologically normal bronchial airway epithelial cells has been shown to be altered in patients with lung cancer. A gene expression classifier was derived from airway epithelial cells to detect the presence of cancer in smokers undergoing bronchoscopy for suspected lung cancer, and its sensitivity for detecting lung cancer was evaluated in patients from an independent cohort. Bronchoscopy is considered safer than other invasive sampling methods such as transthoracic needle biopsy (TTNB) or surgical techniques [3].
Lung-cancer-related genes were selected, and a classifier predicting the likelihood of lung cancer was derived from the combination of the cancer genes, the gene expression correlates and patient age. All aspects of the classifier development procedure were determined using cross validation and only data from the training-set samples.
A logistic regression model with lung cancer status (cancer positive = 1, cancer negative = 0) as the dependent variable was fit on the training data, using the clinical-factor gene expression correlates and patient age as predictors. This served as the baseline for subsequent gene analysis.
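As a hedged illustration of this kind of model (not the authors' exact pipeline; the data, feature counts and scoring below are placeholders), such a classifier can be fit and cross-validated as:

```python
# Minimal sketch of a logistic-regression lung-cancer classifier.
# Feature names and data are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# X: rows = patients; columns = gene-expression correlates plus age.
# y: 1 = cancer positive, 0 = cancer negative (as in the text).
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(147, 20)),          # 20 assumed gene correlates
               rng.integers(40, 80, (147, 1))])     # patient age
y = rng.integers(0, 2, 147)

clf = LogisticRegression(max_iter=1000)
# Cross-validation mirrors the "determined using cross validation" step.
print(cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean())
```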
The classifier adds substantial sensitivity to the bronchoscopy procedure, resulting in a high negative predictive value (NPV). It can therefore aid decision making when bronchoscopy is non-diagnostic, by identifying patients who are at low risk of having lung cancer. An ROC curve analysis of the training-set cohort using the finalized gene expression classifier was carried out, and classifier accuracy was measured by the area under the curve, sensitivity, specificity, NPV and PPV, with cross validation using a 10% sample hold-out set [4].
TABLE III
Sensitivity of bronchoscopy, the classifier, and the combined procedure, by lesion size

Mass size    N    Bronchoscopy sensitivity    Classifier sensitivity    Combined sensitivity
< 3 cm       99   44%                         87%                       93%
> 3 cm       48   58%                         94%                       98%
Infiltrate   16   38%                         100%                      100%
2) Stem-cell-associated markers in human airway obtained via fiberoptic bronchoscopy
The cancer stem cell (CSC) theory proposes that tumors contain a subpopulation of stem-like cells that drives tumor growth, invasion and metastasis [1]. CSCs and normal tissue stem cells share key properties: self-renewal, multipotency, unlimited proliferation, and potentially overlapping molecular mechanisms. The Chi-square test and the Mann-Whitney U test were applied to compare marker mRNA expression against lung cancer clinical factors. Malignant transformation may originate from adult stem cells, which may thus express the stem-cell-associated markers. The purpose is to investigate the differential expression and clinical significance of seven stem-cell-associated markers (Bmi1, CD133, CD44, Sox2, Nanog, OCT4 and Msi2) in lung cancer, providing new targets for the diagnosis and treatment of lung cancer [10].
TABLE IV
Specificity, accuracy and sensitivity of seven stem-cell-associated markers' mRNA in biopsy samples obtained from bronchoscopy

Marker   Specificity %   Accuracy %   Sensitivity %
Bmi1     33.3            80.8         88.4
CD133    44.4            80.0         85.7
CD44     11.1            86.2         98.2
Sox2     16.7            86.9         98.2
Nanog    66.7            63.8         63.4
OCT4     61.2            82.3         85.7
Msi2     5.6             83.8         96.4
Fig. 1 Classifiers graphical analysis
Table IV shows the specificity, accuracy and sensitivity values of the stem-cell-associated markers' mRNA in bronchoscopic biopsies taken from lung cancer and non-cancer patients. The highest sensitivities are those of CD44 (98.2%), Sox2 (98.2%) and Msi2 (96.4%), but their specificities are too low for them to be of clinical significance. The Nanog marker has the highest specificity (66.7%), with a sensitivity of 63.4%. So Bmi1, CD44 and CD133 are poor diagnostic markers for lung cancer, while Nanog may serve as a promising diagnostic marker of lung cancer and a potential therapeutic target.
3) Optimal procedure planning for peripheral bronchoscopy
With multidetector computed tomography scanners and ultrathin bronchoscopes, bronchoscopy has become an option for diagnosing peripheral lung cancer nodules [7]. An image-based planning and guidance system improves upon other systems.
TABLE V
Collection of target ROI sites

SITE TYPE         RUL  RML  RLL  LUL  LLL  TOTAL
Nodule (< 3 cm)    4    5    3    9    1    22
Mass (> 3 cm)      0    1    0    3    0     4
Infiltrate         5    2    4    4    1    16
GGO                0    0    2    1    0     3
BAL site           3    1    0    0    0     4
Lymph node         3    1    1    2    0     7
Other              3    2    2    3    3    13
Total             18   12   12   22    5    69
Fig. 2 Collection of Target RoI sites
The advances are:
Optimal route planning, which gives accurate airway routes to arbitrarily selected target sites.
An interactive pre-bronchoscopy report that enables the physician to preview the procedure in advance.
Integration of all views on one monitor during bronchoscopy.
Data fusion during bronchoscopic navigation and ROI localization.
Registration of the VB-based guidance route with the bronchoscopic video.
For comparison, an ultrathin bronchoscopy study reported a mean insertion depth of 5.7 airways and a mean time to first sample of 8:30. The automated system thus enabled bronchoscopy over 2 airways deeper into the airway tree periphery, with a sample time that was 2 min shorter on average.
4) Algorithm for video summarization of bronchoscopy procedures
The time needed to find lesions during bronchoscopy varies considerably depending on the diagnostic and therapeutic procedures [14]. In videobronchoscopy, the whole process can be recorded as a video sequence. The bronchoscopist initiates the recording process and usually chooses to archive only selected views and sequences. Recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images; such frames seem unavoidable given the relatively tight endobronchial space and the rapid movements of the respiratory tract.
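A hedged sketch of one way to screen out such frames, following the DCT coefficient-counting rule mentioned in the conclusion of this survey (the threshold values are assumptions, not published parameters):

```python
# Sketch of DCT-based informative/non-informative frame screening:
# blurry or unfocused frames concentrate energy in few DCT terms.
import numpy as np
from scipy.fft import dctn

def is_informative(gray_frame: np.ndarray,
                   coeff_eps: float = 1e-3,     # assumed relative cutoff
                   min_fraction: float = 0.05)  -> bool:
    """Count DCT coefficients that are non-negligible relative to the
    largest one; sharp frames spread energy over many coefficients."""
    coeffs = dctn(gray_frame.astype(float), norm="ortho")
    nonzero = np.count_nonzero(np.abs(coeffs) > coeff_eps * np.abs(coeffs).max())
    return nonzero / coeffs.size >= min_fraction
```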
5) Applications to biopsy path planning in virtual bronchoscopy in CT images
The volumetric data of the human body can be measured precisely by 3D medical imaging devices. CAD systems produce many output images in a short interval of time; they are expected to reduce the doctor's load and help in accurate diagnosis. When a user inputs the target zone where cancer is suspected, the CAD system displays a sequence of anatomical names of branches. The labelling accuracy was about 90%.
6) Automatic centerline extraction for 3D virtual bronchoscopy
This work gives an algorithm for the automatic determination of the centerline of the bronchial branches. The centerline is the approximate centre of the bronchial tube, where there is maximum space for the bronchoscope to move. First the centerline of the three-dimensional virtual bronchoscopy image is determined: a binary tree linking all candidate centre points is constructed, and the endpoints at the lung airway tips are extracted. The centerline algorithm reads all the endpoints and proposes the shortest paths from the start point to each of them. A modified Dijkstra [13] shortest-path algorithm is then applied to obtain the centerline of the bronchus.
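A minimal sketch of the shortest-path step (generic Dijkstra, not the authors' modified variant; building the graph of candidate centre points from the binary airway tree is assumed already done):

```python
# Dijkstra over a graph of candidate centre points; following the
# predecessor map from an airway-tip endpoint yields one centerline.
import heapq

def dijkstra(adj, start):
    """adj: {node: [(neighbour, edge_length), ...]}.
    Returns shortest distances and a predecessor map from start."""
    dist, prev = {start: 0.0}, {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_to(prev, end):
    # Walk predecessors back from an endpoint to the start point.
    path = [end]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path[::-1]
```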
7) ManiSMC - Manifold modelling and sequential Monte Carlo sampler for boosting navigated bronchoscopy
The bronchoscope's motion can be tracked using ManiSMC, improving the navigation of bronchoscopy. ManiSMC has two stages. First, the local and global regressive mapping method (LGRM) [16] is extended to embed bronchoscopic image sequences and construct their manifolds, which makes it possible to classify specific scenes to specific branches. Next, an SMC sampler integrates the stage-one data and refines the positions and orientations of the bronchoscope. Experimental results suggest that this method of navigation is highly effective, with the advantage that no additional position sensor is needed.
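The SMC update can be illustrated with a generic particle-filter step; the random-walk motion model, likelihood function and resampling threshold below are stand-in assumptions, not the paper's learned models:

```python
# One sequential Monte Carlo (particle filter) update of the kind
# ManiSMC uses to refine bronchoscope pose hypotheses.
import numpy as np

def smc_step(particles, weights, observe_loglik, motion_std=1.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    # Propagate: diffuse each pose hypothesis (random-walk assumption).
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Weight: score each hypothesis against the current video frame.
    weights = weights * np.exp(observe_loglik(particles))
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights**2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```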
8) High-definition bronchoscopy
Videobronchoscopy is an essential procedure in the bronchoscopic diagnosis of lung cancer [43]. High definition (HD) imaging and advanced real-time image enhancement techniques (i-scan) can be used to further enhance bronchoscopic navigation [15].
CONCLUSION
We have highlighted the bronchoscopic options for treatment of MAO, which include immediate-effect and delayed-effect modalities. Patient selection should exclude patients with short life expectancy, limited symptoms, or inability to visualize beyond the obstruction. The decision to treat should be made together with other modalities, such as external beam radiation, in a multidisciplinary mode. In the optical procedure, cancer is treated by comparing the bronchoscopy and MDCT videos and finalising the lesion sample. The NPV of the genomic classifier aids decisions when bronchoscopy is non-diagnostic.
In videobronchoscopy, calculating the DCT matrix of a grayscale frame and counting its nonzero elements can separate informative from non-informative frames. The centerline algorithm with the modified Dijkstra shortest-path algorithm is used to obtain the centerline of the bronchus. The SMC sampler can integrate the stage data and refine the orientation of the bronchoscope. Navigation of bronchoscopy can be enhanced by HD and i-scan.
REFERENCES
[1] Reya T, Morrison SJ, Clarke MF, Weissman IL, "Stem cells, cancer and cancer stem cells," Nature 2001, 414:105-111.
[2] The National Lung Screening Trial Research Team, "Reduced lung-cancer mortality with low-dose computed tomographic screening," N Engl J Med 2011; 365:395-409.
[3] Ernst A, Silvestri G, Johnstone D, "Interventional pulmonary procedures: guidelines from the American College of Chest Physicians," Chest 2003;123:1693-717.
[4] Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, et al.Molecular Classification of
cancer: class discovery and class prediction by gene expression monitoring. Science. 1999;286(5439):531–7.
[5] N. C. Dalrymple, S. R. Prasad, M. W. Freckleton, and K. N. Chintapalli, "Informatics in radiology (infoRAD): Introduction to the language of three-dimensional imaging with multidetector CT," Radiographics, vol. 25, no. 5, pp. 1409-1428, Sep/Oct 2005.
[6] M. W. Graham, J. D. Gibbs, D. C. Cornish, and W. E. Higgins, "Robust 3D Airway Tree Segmentation for Image-Guided Peripheral Bronchoscopy," IEEE Trans. Med. Imag., vol. 29, no. 4, pp. 982-997, Apr 2010.
[7] J. Ueno, T. Murase, K. Yoneda, T. Tsujikawa, S. Sakiyama and Kondoh, "3-dimensional imaging of thoracic disease with multidetector row CT," J. Med. Invest., vol. 51, no. 3-4, pp. 163-170, Aug 2004.
[8] M. W. Graham, J. D. Gibbs, and W. E. Higgins, "Computer-based route definition system for peripheral bronchoscopy," J. Digital Imag., vol. 25, no. 2, pp. 307-317, Apr. 2012.
[9] Patrick D. Mitchell, Marcus P. Kennedy, "Bronchoscopic Management of Malignant Airway Obstruction," Springer Healthcare, May 22, 2014.
[10] Laodong Li,Huina Yu,Xiaoyang Wang,”Expression of seven stem cell associated markers in human airway
biopsy specimens obtained via fiberoptic bronchoscopy”,JECCR,May 2013.
[11] Duncan H Whitney, Michael R Elashoff, Kate Porta Smith, Adam C Gower, Anil Vachani, "Derivation of a bronchial genomic classifier for lung cancer in a prospective study of patients undergoing diagnostic bronchoscopy," BMC Medical Genomics, May 2015.
[12] Kensaku Mori, Sinya Ema, Takayuki Kitasaka, Yoshito Mekada, Ichiro Ide, "Automated Nomenclature of Bronchial Branches Extracted from CT Images and Its Application to Biopsy Path Planning in Virtual Bronchoscopy," Springer, 2005.
[13] Tsui Ying Law and Pheng Ann Heng,” Automatic Centerline Extraction for 3D Virtual
Bronchoscopy”,Springer,2000.
[14] Mikołaj I Leszczuk and Mariusz Duplaga ,”Algorithm for Video Summarization of Bronchoscopy
Procedures”,BioMedical Engineering,2011
[15] Erik HFM van der Heijden, Wouter Hoefsloot, Hieronymus WH van Hees and Olga CJ Schuurbiers,High
definition bronchoscopy: a randomized exploratory study of diagnostic value compared to standard white light
bronchoscopy and autofluorescence bronchoscopy, DOI 10.1186/s12931-015-0193-7,2015.
[16] Xiongbiao Luo, Takayuki Kitasaka, and Kensaku Mori, "ManiSMC: A New Method Using Manifold Modeling and Sequential Monte Carlo Sampler for Boosting Navigated Bronchoscopy," Springer, 2011.
[17] Oh J, Hwang S, Lee J, Tavanapong W, Wong J, de Groen PC: “Informative frame classification for
endoscopy video,”Medical Image Analysis 2007, 11:110-127.
[18] Kender John R, Yeo Boon Lock: “Video scene segmentation via continuous video coherence,”. Proc IEEE
Int Conf on Computer Vision and Pattern Recognition 1998, 367-373.
[19] Adjeroh Donald A, Lee Moon-Chuen: “Scene-adaptive transform domain video partitioning,”. IEEE
Transactions on Multimedia 2004, 6(1):58-69.
[20] Aghbari Zaher Al, Kaneko Kunihiko, Akifumi Makinouchi: “Content-trajectory approach for searching
video databases,”. IEEE Transactions on Multimedia 2003, 5(4):516-531.
[21] Jianping Fan, Elmagarmid Ahmed K, Member Senior, Xingquan Zhu, Aref Walid G, Wu Lide: “Classview:
Hierarchical video shot classification, indexing, and accessing,”. IEEE Trans on Multimedia 2004, 6:70-86.
[21] Skarbek Władysław, Grzegorz Galiński, Wnukowicz Karol: “Tree Based Multimedia Indexing - a Survey,”.
Proc Krajowa Konferencja Radiokomunikacji, Radiofonii i Telewizji KKRRiT‟2004 Warszawa; 2004, 77-85,
Krajowa Konferencja Radiokomunikacji, Radiofonii i Telewizji.
[22] Chen Jau-Yuen, Taskiran Cuneyt, Albiol A, Delp E, Bouman C: “Vibe: A compressed video database
structured for active browsing and search,”. 2001.
[23] Lombardo Alfio, Morabito Giacomo, Giovanni Schembra: “Modeling intramedia and intermedia
relationships in multimedia network analysis through multiple timescale statistics,”. IEEE Transactions on Multimedia 2004,6(1):142-157.
[24] Xingquan Zhu, Aref Walid G, Fan Jianping, Catlin Ann C, Elmagarmid Ahmed K: “Medical video mining
for efficient database indexing, management and access,”. 2003.
[25] Xingquan Zhu, Jianping Fan, Aref Walid G, Elmagarmid Ahmed K: “Classminer: Mining medical video
content structure and events towards efficient access and scalable skimming,”. DMKD 2002.
[26] Vilarino F, Spyridonos P, Pujol O, Vitria J, Radeva P: “Automatic detection of intestinal juices in wireless
capsule video endoscopy,”. ICPR „06: Proceedings of the 18th International Conference on Pattern Recognition
Washington, DC, USA; 2006, 719-722, IEEE Computer Society.
[27] Choupo Anicet Kouomou, Laure Berti-Équille, Morin Annie: “Optimizing progressive query-by-example
over preclustered large image databases,”. CVDB „05: Proceedings of the 2nd international workshop on
Computer vision meets databases New York, NY, USA; 2005, 13-20, ACM.
[28] Arevalillo-Herráez Miguel, Ferri Francesc J, Domingo Juan: “A naive relevance feedback model for
content-based image retrieval using multiple similarity measures,”. Pattern Recogn 2010, 43(3):619-629.
[29] Leszczuk Mikołaj: “Analiza możliwości budowy internetowych aplikacji dostępu do cyfrowych bibliotek wideo,”. PhD thesis AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Electronics, al. Mickiewicza 30, PL-30059 Krakow, Poland; 2005.
[30] Grega Michał, Leszczuk Mikołaj: “The prototype software for video summary of bronchoscopy procedures
with the use of mechanisms designed to identify, index and search,”. chapter in BRONCHOVID: The Computer
System for Bronchoscopy Laboratory Integrating the CT-Based Bronchoscopy and Transbronchial Needle-
Aspiration Biopsy Planning,Virtual Bronchoscopy, Management of the Bronchoscopic Procedures Recordings,
Automatic Image-Based Data Retrieval,Artifact and Pathology Detection and Interactive 2D & 3D Visualization
Springer-Verlag; 2010, 587-598.
[31] Duplaga Mariusz, Juszkiewicz Krzysztof, Leszczuk Mikołaj, Marek M, Papir Zdzisław: “Design of
Medical Digital Video Library,”. Proc Fourth International Workshop on Content-Based Multimedia Indexing
CBMI‟2005 Riga, Latvia; 2005, Content Based Multimedia Indexing.
[32] Lee Ho Young, Lee Ho Keun, Ha Yeong Ho: “Spatial color descriptor for image retrieval and video segmentation,”.IEEE Transactions on Multimedia 2003, 5(3):358-367.
[33] Duplaga Mariusz, Jarosław Bułat, Leszczuk Mikołaj, Socha Mirosław, Turcza Paweł, Romaniak Piotr:
“Overview of the BRONCHOVID computer system supporting bronchoscopy laboratory – medical and
technical perspective,”. In chapter in BRONCHOVID: The Computer System for Bronchoscopy Laboratory
Integrating the CT-Based Bronchoscopy and Transbronchial Needle-Aspiration Biopsy Planning, Virtual
Bronchoscopy, Management of the Bronchoscopic Procedures Recordings, Automatic Image-Based Data Retrieval, Artifact and Pathology Detection and Interactive 2D & 3D Visualization. Volume 69.
Springer-Verlag; 2010:511-522.
[34] Grega Michał, Leszczuk Mikołaj, Rafał Frączek, Duplaga Mariusz: “Algorithms for automatic recognition
of noninformative frames in video recordings of bronchoscopic procedures,”. chapter in BRONCHOVID: The
Computer System for Bronchoscopy Laboratory Integrating the CT-Based Bronchoscopy and Transbronchial
Needle-Aspiration Biopsy Planning, Virtual Bronchoscopy, Management of the Bronchoscopic Procedures
Recordings, Automatic Image-Based Data Retrieval, Artifact and Pathology Detection and Interactive 2D & 3D Visualization. Springer-Verlag; 2010, 535-545.
[35] Park Sun Young, Sargent Dusty: "Colonoscopy video quality assessment using hidden Markov random fields," Proc. Medical Imaging 2011: Computer-Aided Diagnosis (SPIE 7963), 79632P, Lake Buena Vista, Florida, USA; 2011.
[36] Johnson Scott B: “Tracheobronchial injury,”. In Seminars in Thoracic and Cardiovascular Surgery. Volume
20. Spring; 2008:(1):52-57.
[37] Kavuru MS, Mehta AC: “Applied Anatomy of the Airways,”. In Flexible Bronchoscopy.. 2 edition. Edited
by: Ko-Peng Wang, Ahtul C. Mehta and J. Francis Turner. Blackwell Publishing Company; 2004:.
[38] Mikrut Zbigniew, Duplaga Mariusz: “Analysis of images of bronchial tree ramifications for
summarization of brochoscopic videos,”. In chapter in BRONCHOVID: The Computer System for
Bronchoscopy Laboratory Integrating the CTBased Bronchoscopy and Transbronchial Needle-Aspiration
Biopsy Planning, Virtual Bronchoscopy, Management of the Leszczuk and Duplaga BioMedical Engineering
OnLine 2011, 10:110 Page 16 of 17 Bronchoscopic Procedures Recordings, Automatic Image-Based Data
Retrieval, Artifact and Pathology Detection and Interactive 2D & 3D Visualization. Volume 69. Springer-
Verlag; 2010:583-586.
[39] ISO/IEC: "Information technology - multimedia content description interface," ISO/IEC 15938, 2002.
[40] Janowski Lucjan, Suwada Krzysztof, Duplaga Mariusz: "Fast MPEG-7 image features-based automatic
recognition of the lesion in the bronchoscopic recordings,”. chapter in BRONCHOVID: The Computer System
for Bronchoscopy Laboratory Integrating the CT-Based Bronchoscopy and Transbronchial Needle Aspiration
Biopsy Planning, Virtual Bronchoscopy, Management of the Bronchoscopic Procedures Recordings, Automatic
Image-Based Data Retrieval, Artifact and Pathology Detection and Interactive 2D & 3D Visualization Springer-
Verlag; 2010.
[41] Pawlik Piotr, Mariusz Duplaga: “Application of SIFT image analysis method to support of transbronchial
needleaspiration biopsy,”. In chapter in BRONCHOVID: The Computer System for Bronchoscopy Laboratory
Integrating the CTBased Bronchoscopy and Transbronchial Needle-Aspiration Biopsy Planning, Virtual
Bronchoscopy, Management of the Bronchoscopic Procedures Recordings, Automatic Image-Based Data
Retrieval, Artifact and Pathology Detection and
Interactive 2D & 3D Visualization. Volume 69. Springer-Verlag; 2010:571-582.
[42] Jóźwiak Rafał, Przelaskowski Artur, Duplaga Mariusz: “Diagnostically Useful Video Content Extraction
for Integrated Computer-aided Bronchoscopy Examination System,”. Advances in Intelligent and Soft
Computing, 57, Computer Recognition Systems 3 Springer; 2009, 151-158.
[43] Chen W, Gao X, Tian Q, Chen L. A comparison of autofluorescence bronchoscopy and white light bronchoscopy in detection of lung cancer and preneoplastic lesions: a meta-analysis. Lung Cancer. 2011;73:183–8.
International Journal of Engineering Research in Computer Science and Engineering (IJERCSE), ISSN (Online) 2394-2320, Vol 4, Issue 12, December 2017
Pulmonary Solid / Sub Solid Nodule Classification
in Thin Slice CT Images Using SVM
S. Piramu Kailasam (Research Scholar, Bharathiar University, Coimbatore, India)
Dr. M. Mohammed Sathik (Principal, Sadakathullah Appa College, Tirunelveli, India)
Abstract: Machine learning techniques are used in diagnosing cancerous lesions in medical images. The phenotype features of the pulmonary nodule in CT images are important cues for malignancy prediction; they can help radiologists make decisions that are otherwise difficult, improving accuracy and efficiency. Deep learning, or hierarchical learning, a major area of machine learning in medical imaging, promises to be faster and to give better results, and parallel computing techniques speed up the matrix operations over a large number of parameters. Compared to conventional machine learning methods, deep learning has shown superior performance on visual media. In this study we develop an EXHOG descriptor to characterize semantic features alongside a deep convolutional neural network. An SVM classifier then identifies the nodule types with higher accuracy on the LIDC lung medical image dataset.
Keywords: Deep Learning, Convolutional Neural Network, Histogram of Oriented Gradients, Extended Histogram of Oriented
Gradients, Support Vector Machine.
I. INTRODUCTION
Lung cancer is one of the most dangerous malignancies [1]. In the USA, 221,200 new lung cancer cases were reported in 2015, 13% of all cancers diagnosed [2]. Lung cancer occurs in smokers as well as non-smokers; early diagnosis can improve the 5-year survival rate. Pulmonary nodules are small, roughly spherical masses and a primary cause of cancer deaths in recent years. Computerized tomography (CT) is one of the most sensitive imaging techniques, with image quality superior to any other modality, and is widely used (Van Ginneken (2008)). In CT images the phenotype features of the pulmonary nodule are important, and the level of detail and resolution inside the lung parenchyma is comparatively better than in other modalities [3]. Lung nodules, small masses of tissue in the lung, appear as round white shadows in a CT scan, and a cancerous lung nodule requires treatment. The effective use of computer-aided diagnosis in clinical procedures can support the decision that a nodule is malignant [4]. The Lung Image Database Consortium data is used as the resource in this study. Each CAD system uses different steps to diagnose a benign or malignant nodule. The main steps are: 1. lung parenchyma extraction to segment the lung region; 2. feature extraction from solid nodule candidates; 3. final solid/subsolid classification. In this CAD, segmentation of the nodules is a complicated task, since it aims at the entire region of the nodule. Normally, image processing algorithms are used to extract features from images. The deep learning architecture of convolutional neural networks can recognize nodules hierarchically using multiple layers of convolution and max-pooling [5]. Deep learning has proved able to detect patterns and classify lung nodules through modifications of its architecture [6]. The heterogeneous features include deep-learning CNN features as well as HOG features [8], scored against the radiologists' ratings [7]. CNN convnets are highly suited to false-positive reduction in a CAD system [11], and the low computation time of ConvNet CADs makes them well suited as decision aids in lung cancer screening [12]. We propose a framework that leverages heterogeneous computational features derived from deep convolutional neural network (CNN) models as well as general low-level histogram of oriented gradients (HOG) and extended histogram of oriented gradients (EXHOG) features to diagnose the nodules in pulmonary CT images.
II. MATERIAL
2.1 Data set
The Lung Image Database Consortium (LIDC) and ANODE09 databases are publicly available references for the medical imaging research community [12]. The LIDC dataset contains 1018 pulmonary CT studies originating from a total of 1010 patients, in total 243,958 images. This database has been used over the years in many problems in pulmonary cancer diagnosis and nodule segmentation [13].
Fig 1. Sample input image and its thresholded image
Fig. 2. Input images of lung CT nodule cases, from left to right: (a) solid; (b) part solid; and (c) non-solid.
2.2 Feature Extraction
Feature extraction plays the major role in object learning from medical images. Both low-level and high-level features of an image are important to detect the exact pixel region [15]. Low-level features are color, texture, shape and spatial location. HOG supports low-level features and the translation to high-level features; ExHOG supports the strongest corner detection in pulmonary images; and the CNN convnet supports translation, scaling and rotation, which are termed high-level features. When both descriptors are hybridized, the resulting image shows excellent extraction that matches the trained nodule features, and in this heterogeneous scheme the result improves the accuracy.
2.3 Blocks of Convolutional Neural Network as Feature Descriptor
A convolutional neural network (CNN) is a type of feedforward neural network inspired by the modern understanding of the biological visual system. Convolution amounts to trying every possible match; it is a function mapping input data to an output. A CNN generally consists of convolutional layers, max- or sum-pooling layers, activation layers and a softmax layer. Mathematically a CNN can be represented as a function f that is the composition of layer functions, f = fL ∘ fL-1 ∘ ... ∘ f1, where each function represents a layer. The first convolutional layer of a CNN extracts edges, and subsequent convolutional layers act as high-level feature extractors [14]. The architecture of the CNN in this paper is shown in fig. 1. It is composed of multiple maps in each layer, each map composed of many neural units, and all neural units in the same map share one kernel (weight set). Each kernel represents a feature, such as a detector for edges in the image. This gives the network strong robustness to distortion of the input data. The multiscale convolutional image feature is formed by the kernel size and parameters.
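To make the layer composition f = fL ∘ ... ∘ f1 concrete, a minimal numpy sketch of one convolution-ReLU-max-pool stage; the kernel weights here are placeholder assumptions, whereas a trained CNN learns them from data:

```python
# One convolution -> ReLU -> max-pool stage, the building block whose
# composition the text describes.
import numpy as np

def conv2d(img, kernel):
    """Valid (no-padding) 2D convolution of one feature map."""
    kh, kw = kernel.shape
    H, W = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def maxpool(fm, s=2):
    """Non-overlapping s x s max pooling."""
    H, W = fm.shape[0] // s, fm.shape[1] // s
    return fm[:H*s, :W*s].reshape(H, s, W, s).max(axis=(1, 3))

edge_kernel = np.array([[1, 0, -1]] * 3, dtype=float)   # crude edge detector
feature_map = maxpool(np.maximum(conv2d(np.random.rand(64, 64), edge_kernel), 0))
```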
2.4 HOG (Histogram of Oriented Gradients)
The image is divided into small connected regions called cells. For the pixels within each cell, a histogram of gradient directions is compiled, and the descriptor is the concatenation of these histograms. For improved accuracy the local histograms can be contrast-normalized over a larger region of the image (a block) by measuring the intensity there; this normalization gives better invariance to illumination and shadowing and operates on local cells. HOG is well suited to tumor detection in images. To compute the gradient of the color or intensity data in the horizontal and vertical directions, the filters [-1 0 1] and [-1 0 1]T are applied. The steps followed in HOG are:
1. Set the number of HOG windows nwinx, nwiny per bounding box
2. Set the number of histogram bins b
3. [L, C] = size(Image), where L is the number of lines and C the number of columns
4. If (number of columns == 1), verify the size of the image
5. End
6. Hx = [-1 0 1]
7. Hy = Hx^T
8. Compute the gradients grad_xr, grad_yr
9. Angle = arctan2(grad_yr, grad_xr)
10. Magnitude = sqrt(grad_xr^2 + grad_yr^2)
11. for n = 0 to y-1
12. for m = 0 to x-1
13. compute the angle and magnitude at each pixel
14. k = max(size(angles))
15. assemble the histogram with 9 bins, a range of 20 degrees per bin
16. end
The HOG feature vector is arranged by HOG blocks. Each cell histogram H(Cyx) is 1 × numbins; H(C11) refers to the histogram of the first cell.
Fig.4 Sample hog shape feature of nodules
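As a runnable counterpart to these steps, a hedged sketch using scikit-image's hog function; the 9-bin, 20-degrees-per-bin setup matches step 15, while the cell and block sizes are assumptions:

```python
# Computing a HOG descriptor for a grayscale nodule ROI.
from skimage.feature import hog

def hog_descriptor(roi_gray):
    return hog(roi_gray,
               orientations=9,            # 9 bins of 20 degrees each
               pixels_per_cell=(8, 8),    # assumed cell size
               cells_per_block=(2, 2),    # assumed block size
               block_norm="L2-Hys")       # standard contrast normalization
```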
2.5 ExHOG (Extended HOG) Feature Descriptor
In this paper we use a method for whole-lung nodule detection based on our extended histogram of gradients (EXHOG) feature proposed in []. This descriptor resolves the larger intra-class variation caused by brightness reversal between object and background: ExHOG represents the image contour more discriminatively than HOG and has less intra-class variation than HG. The number of bins taken is 18 and the window size 6×6. The features of the lung images are contributed in three folds: first CNN, second HOG, and third ExHOG.
Given an image window I, first-order gradients are computed by convolving the mask [1 0 -1]^T in the x and y directions of the window. The gradient magnitude M of each pixel (x, y) and its direction theta are computed as:
M(x, y) = sqrt(Ix^2 + Iy^2)
theta(x, y) = arctan(Iy / Ix)
where Ix and Iy are the gradients in the x and y directions, and theta lies in the interval 0 to 2π. The image window is divided into non-overlapping cells, and histograms of gradient directions are computed for each cell within an overlapping block of cells. HG considers directions from 0 to 2π, in which a bright object against a dark background (or vice versa) makes the descriptor larger. Denote by i the bin of the quantized gradient direction theta, by hy(i) the bin value of HG, and by L the (even) number of HG bins. HOG is computed from HG by adding hy(i) and hy(i + L/2). To solve the brightness-reversal problem of HOG, we take the absolute difference between hy(i) and hy(i + L/2) of the HG of each cell to form a histogram of difference gradients, referred to as DiffHG. We concatenate the DiffHG and HOG histograms of L/2 bins each to form a histogram of L bins for each cell in the image window, called the extended histogram of gradients (EXHOG). The features are concatenated and normalized.
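A minimal numpy sketch of this folding-and-difference construction (the normalization scheme is an assumption):

```python
# Forming EXHOG from a signed-gradient histogram (HG) with L bins:
# HOG folds opposite bins together, DiffHG keeps their absolute
# difference, and the two halves are concatenated.
import numpy as np

def exhog_from_hg(hg: np.ndarray) -> np.ndarray:
    L = hg.shape[-1]                                       # even bin count
    half = L // 2
    hog_part = hg[..., :half] + hg[..., half:]             # hy(i) + hy(i+L/2)
    diff_part = np.abs(hg[..., :half] - hg[..., half:])    # |hy(i) - hy(i+L/2)|
    ex = np.concatenate([hog_part, diff_part], axis=-1)    # L bins per cell
    return ex / (np.linalg.norm(ex) + 1e-12)               # assumed L2 norm
```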
3. Classification with SVM Classifier
The LIDC benchmark dataset contains details of patient ID, nodule, nodule ID, slice number, z-position and diameter. The trained, shape-extracted nodules are classified by the hyperplane distance equation of the SVM classifier; the TN and TP values are obtained and the performance analysed. The support vector machine is a supervised learning model, suited for extreme cases, effective in the high-dimensional LIDC dataset space and memory-efficient. It can also use different kernel functions for various decision functions. The SVM analyzed the lung images and classified nodule versus non-nodule (nodule = 1, else 0) in Matlab 2017b.
Fig.5 Block diagram of proposed work
Support vector machines are supervised learning models that analyze data for classification and regression analysis. A maximum-margin hyperplane with its margins is trained from samples of the two classes; the samples on the margin are called the support vectors. The two margin hyperplanes have the equations
w · x - b = 1 and w · x - b = -1.
The distance between the two hyperplanes is 2/||w||, so to maximize the distance between the planes we want to minimize ||w||. To prevent data points from falling into the margin, we add the following constraint for each i:
w · xi - b >= 1 if yi = 1, and w · xi - b <= -1 if yi = -1.
These constraints state that each data point must lie on the correct side of the margin. The optimal separating hyperplane is obtained by maximizing the distance of the closest points to the hyperplane. Given training examples labeled either "yes" or "no", a maximum-margin hyperplane splits the "yes" and "no" examples such that the distance from the closest examples to the hyperplane is maximized.
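A minimal sketch of this training/classification step using scikit-learn; the kernel choice and the synthetic placeholder data are assumptions, and in practice the CNN/HOG/EXHOG feature vectors described above would replace them:

```python
# Training a maximum-margin SVM on nodule feature vectors.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = np.random.rand(200, 36)          # placeholder EXHOG feature vectors
y = np.random.randint(0, 2, 200)     # 1 = nodule, 0 = non-nodule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
svm = SVC(kernel="rbf", C=1.0)       # assumed RBF kernel
svm.fit(X_tr, y_tr)
print("test accuracy:", svm.score(X_te, y_te))
```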
IV. RESULT AND DISCUSSION
In this study, solid and subsolid nodules are classified from the pulmonary CT images using an SVM classifier implemented in Matlab 2017b. The performance of the SVM classifier is calculated using various metrics: accuracy, sensitivity, specificity, Cohen's kappa, F-score, precision and recall.
4.1 Accuracy
It is noted that the SVM classifier is improved by the ExHOG feature descriptor.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Fig.6 Accuracy of various classifiers
4.2 Specificity
Specificity is the ability of the classifier to predict the true negatives, given by:
Specificity = TN / (TN + FP)
The specificity of the SVM classifier with the ExHOG descriptor is good, since true negatives are mostly identified: the specificity value of ExHOG is 80.09%, reflecting a good true negative rate.
Fig. 7 Sensitivity and specificity comparison with other papers
4.3 Sensitivity
Sensitivity improves by 0.01% in SVM_EXHOG over SVM_CNN.
Sensitivity = TP / (TP + FN)
4.4 Cohen's Kappa
The value of Cohen's kappa for SVM_CNN is 58.36 and for SVM_EXHOG 78.71; this metric improves by 20.35% with SVM_EXHOG.
Kappa coefficient K = (P0 - Pe) / (1 - Pe)
where P0 is the overall observed accuracy and the expected agreement is Pe = Σk (nk,pred × nk,actual) / n², summed over the classes.
4.5 F-Score
The F-score is a benchmark metric that measures accuracy by considering both recall and precision; it is a measure relative to sensitivity. SVM_CNN scores lower than SVM_EXHOG, with a difference of 4.39.
F-score = 2 × (Precision × Recall) / (Precision + Recall)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
Here TP (cancer pixels) is the number of true positives, i.e. correctly classified foreground pixels; FP counts background pixels incorrectly classified as foreground; FN counts foreground pixels incorrectly classified as background; and TN counts correctly classified background pixels.
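These definitions can be wrapped into a single helper; a minimal sketch computing every reported metric from the four counts (the formulas follow the equations above):

```python
# All reported metrics from the TP/FP/FN/TN counts (binary case).
def metrics(tp, fp, fn, tn):
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    sens = tp / (tp + fn)                       # recall / sensitivity
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    # Cohen's kappa: observed agreement versus chance agreement.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (acc - pe) / (1 - pe)
    return dict(accuracy=acc, sensitivity=sens, specificity=spec,
                precision=prec, fscore=f1, kappa=kappa)
```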
V. CONCLUSION
The proposed Extended HOG feature descriptor with an SVM classifier in a deep learning setting shows improved accuracy and specificity and a reduced FPR with reduced computation time. The SVM classifier classifies solid, part-solid and non-solid nodules. The remaining challenge is to decrease the false positive rate further and to generalize the method to other medical images, leading towards content-based retrieval in web search engines.
REFERENCES
[1] A. McWilliams, et al., "Probability of cancer in
pulmonary nodules detected on first screening CT," New
England Journal of Medicine, vol. 369, pp. 910-919, 2013.
[2] V. K. Patel, et al., "A practical algorithmic approach to
the diagnosis and management of solitary pulmonary
nodules: part 2: pretest probability and algorithm," Chest,
vol. 143, pp. 840-846, 2013.
[3] D. P. Naidich, et al., "Recommendations for the
management of subsolid pulmonary nodules detected at
CT: a statement from the Fleischner Society," Radiology,
vol. 266, pp. 304-317, 2013.
[4] M. K. Gould, et al., "Evaluation of individuals with
pulmonary nodules: When is it lung cancer?: Diagnosis and
management of lung cancer: American College of Chest
Physicians evidence-based clinical practice guidelines,"
Chest, vol. 143, p. e93S, 2013.
[5] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, D. Wang, and G. Wang, "Recent advances in convolutional neural networks," arXiv preprint arXiv:1512.07108, 2015.
[6] Qing Zeng Song, Lei Zhao et al., "Using Deep Learning for Classification of Lung Nodules on Computed Tomography Images," JHE, vol. 2017.
[7]Sihong Chen, Jing Qin,Xing Ji,”Automatic Scoring of
Multiple Semantic Attributes with Multi-task Feature
Leverage: A study on Pulmonary Nodules in CT
Images”,IEEE,2016
[8] N. Dalal and B. Triggs, "Histograms of oriented
gradients for human detection," in CVPR 2005. IEEE, 2005,
pp. 886-893.
[9] Jie-Zhi Cheng, Dong Ni, Yi-Hong Chou, Jing Qin et al., "Computer-Aided Diagnosis with Deep Learning Architecture: Applications to Breast Lesions in US Images and Pulmonary Nodules in CT Scans," Scientific Reports, DOI:10.1038/sre/24454.
[10]Juan-Juan Zhao and Guohua Ji,“Cavitary nodule
segmentation in computed tomography images based on self
generating neural networks and particle swarm
optimization”,inderscience,2015
[11] Arnaud A. A. Setio, Francesco Ciompi et al., "Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks," IEEE, 2015.
[12] Armato,S.G.,The Lung image database consortium
(LIDC) and image database resource initiative (IDRI), a
completed reference database of lung nodules on CT scans,
Medical Physics,38,915-931
[13]Novo, Rouco,Mendonca and Campilho,”Reliable lung
segmentation methodology by including justaplueral
nodules”,2014
[14] HE Yang,Hengyong Yu,”Deep Learning for the
classification of lung nodules”,article,2016
[15] Alif Hayat M, 2015, "Classification of membrane protein types using voting feature interval in combination with Chou's pseudo amino acid composition," Theoretical Biology 384, 78-83.
A Reliable Computer Aided Lung Cancer Classification System Using Curvelet Features and Ensemble Classifier
S. Piramu Kailasam (Research Scholar, Research and Development Centre, Bharathiar University, Coimbatore, India), M. Mohammed Sathik (Principal, Sathakathullah Appa College, Tirunelveli, India)
Abstract - Lung cancer is one of the dreadful diseases, showing high mortality rates. The lifespan of patients can be improved when the disease is detected early. With the advancement of medical science, several screening modalities are available to society; Computed Tomography (CT) is one of the most prominent for detecting lung cancer. It is beneficial for the medical expert when reliable computer-based assistance is rendered. Taking this into account, this article proposes a computer-aided classification system for lung cancer detection, dividing the complete work into four phases. Initially, the contrast of the CT lung images is improved by Adaptive Histogram Equalization (AHE), and image segmentation is carried out by Kernelized Fuzzy C-Means (KFCM). Curvelet features are extracted from the segmented CT image and an ensemble classifier is applied over them. This approach to benign/malignant classification yields reliable and accurate results. The performance of the proposed approach is evaluated with respect to accuracy, sensitivity, specificity and time consumption, and shows better results than the existing approaches.
Keywords – Lung cancer, pre-processing, CT image, segmentation, ensemble classification.
1. Introduction
Lung cancer is the fourth most commonly occurring cancer, after breast, cervical and oral cavity cancer. Lung cancer is observed more in men than in women, as per the medical reports; this dreadful disease is the second and sixth most common cancer in men and women respectively [1]. Mortality rates for lung cancer are rising every year, the main reason being ignorance and the neglect of periodic health checkups. Early detection of any cancerous growth supports an increased lifespan of the patient. With the advancement of medical science and computer technology, computer-based diagnostic systems have come into the picture. As diagnostic systems assist the physician in making a decision, the reliability of the system is very important: the system is said to be reliable when its accuracy rates are reasonable. However, achieving better accuracy rates is a challenging task, as the images contain many unnecessary details. In spite of the presence of numerous lung cancer detection systems, there is a constant demand for a reliable system. Taking this challenge into consideration, this work proposes a novel computer-based diagnostic system for lung cancer detection, incorporating advanced image processing techniques over Computed Tomography (CT) images. The entire work is decomposed and organised into four important phases: CT image pre-processing, segmentation, feature extraction and classification. The image pre-processing phase makes the CT images fitter for the forthcoming processes; usually it enhances the quality of the image through noise removal, contrast enhancement and so on. The segmentation phase focuses on extracting the regions of interest from the whole image, such that the segmented regions alone are processed; the efficiency of the classification system improves when specific regions of the CT images are focussed on. The features of the segmented regions are then extracted, rich enough for the classifiers to detect the abnormalities present in the CT images.
The rest of this paper is organized as follows. A short review of the state-of-the-art literature on lung cancer detection is presented in section 2. Section 3 presents the proposed lung cancer detection approach in detail. The performance of the proposed approach is analysed through several comparative analyses in section 4. The concluding points of this article are summed up in section 5.
2. Review of Literature
The purpose of this section is to review the recent literature on lung cancer detection systems. Several image processing techniques for the detection of lung cancer using CT images are reviewed in [1]; the review is split into aspects such as pre-processing, nodule segmentation and classification. The recent trends in lung nodule detection are presented in [2], where the performance of recent lung nodule detection techniques is also compared. In [3], a lung cancer classification system is proposed on the basis of a wavelet recurrent neural network: a wavelet removes the noise from the input image and a recurrent neural network is utilized for classification. However, this work cannot achieve better specificity rates, which implies that its false positive rates are high. A lung cancer detection technique based on the Local Energy-based Shape Histogram (LESH) and machine learning techniques is introduced in [4]; it pre-processes the CT images by Contrast Limited Adaptive Histogram Equalization (CLAHE) and extracts the LESH features. A lung cancer detection technique based on the Mumford-Shah algorithm is proposed in [5]: Gaussian noise is removed by a sigma filter, the regions of interest are segmented by Otsu's thresholding, and the Mumford-Shah model is applied; texture features are extracted from the regions by a spectral texture extraction technique and classification is done by a multi-level slice classifier. An automatic lung nodule segmentation and classification technique is proposed in [6], where the areas of interest alone are extracted by means of a priori
information and Hounsfield Units. SVM is employed for the task of classification. The results of this work can still be improved in terms of sensitivity and specificity. Motivated by these existing research works, this paper aims to present a reliable lung cancer detection algorithm with better sensitivity and specificity rates and minimal time complexity. The following section elaborates the proposed approach along with an overview of the work.
3. Proposed Lung Cancer Detection Approach
The proposed lung cancer detection approach consists of several phases, and all the involved phases are explained one after the other.
A. CT Image Pre-processing by AHE
Most medical images suffer from poor contrast, so the details of the images cannot be distinguished easily. Here the role of the contrast enhancement algorithm comes into the picture. The CT images are pre-processed with the AHE technique; the major reason for using AHE is that it computes several histograms over separate areas of the CT image, so the contrast of each particular area is enhanced. The computed histogram is adaptive, which ensures a balanced distribution of values. The AHE technique improves the edges and the local contrast of the CT image. Some sample pre-processed images are presented in figure 1.
Fig. 1. (a) to (d): input images; (a1) to (d1): the contrast-enhanced images
Consider a CT image with N pixels to which the contrast enhancement process is applied. AHE processes each pixel and alters it by considering the intensities of its immediate neighbourhood. Unlike the standard HE technique, AHE processes the regions of an image separately; this enhances the contrast of each region, which in turn enhances the contrast of the complete image.
B. KFCM based CT Image Segmentation
This phase aims to segment or divide the CT image into several segments, such that each segment can be examined with utmost care. This idea conserves time and computational power, so the process of segmentation renders several benefits to the classification system. FCM is one of the well-known segmentation algorithms; it groups pixels that show similar characteristic features. The input of the standard FCM algorithm is the image along with the total count of clusters. In order to improve the clustering capability, this work introduces a kernel function that can also deal with noise, employing the Gaussian Radial Basis Function (GRBF) kernel. The kernel function is represented in the feature space as
K(x, y) = ⟨Φ(x), Φ(y)⟩   (1)
where ⟨Φ(x), Φ(y)⟩ is the inner product. The GRBF kernel is denoted as
K(x, y) = exp(-||x - y||² / σ²)   (2)
The objective function of the KFCM is
J = Σi Σj uij^m ||Φ(xj) - Φ(vi)||²   (3)
where c and n are the total counts of clusters and data points respectively, uij is the membership of pixel j in cluster i, m is the degree of fuzziness, and Φ indicates the non-linear mapping implicitly. Using the kernel,
||Φ(xj) - Φ(vi)||² = K(xj, xj) + K(vi, vi) - 2K(xj, vi)   (4)
which for the GRBF kernel equals 2(1 - K(xj, vi)). Based on this equation, the objective function is modified as
J = 2 Σi Σj uij^m (1 - K(xj, vi))   (5)
with the update equations
uij = (1 - K(xj, vi))^(-1/(m-1)) / Σk (1 - K(xj, vk))^(-1/(m-1))   (6)
vi = Σj uij^m K(xj, vi) xj / Σj uij^m K(xj, vi)   (7)
C. Feature Extraction by Curvelet
The curvelet is a multidimensional and multiresolution analysis tool proposed by Candes and Donoho in the year 2000 [7]. Curvelets are simple and easy to implement; additionally, they are fast, with minimal redundancy. Initially, the Fast Fourier Transform (FFT) is applied over the input image. The obtained Fourier frequency plane is sliced into several wedges, resulting in circular and angular slices: the circular slices decompose the image into several scales, and the angular slices divide it into different angles, so every wedge corresponds to a specific angle and scale. When the inverse FFT (IFFT) is applied over a wedge, the curvelet coefficients are obtained for that particular scale and angle. Curvelets also handle edge discontinuities well. The CT lung images are considered at three different scales and the GLCM coefficients are extracted. GLCM features are effective in determining the spatial relationship between pixels. The GLCM can be represented as a square G × G matrix, where G is the total number of gray levels available in the image; each entry of the GLCM approximates the probability of a pair of intensities co-occurring. The (i, j) entry of the GLCM is formed when a pixel with coordinates (x, y) has gray level i and the pixel at the chosen offset from (x, y) has intensity j. The entries are computed considering different scales and angles. The features are computed by applying a weight operation to every entry of the matrix and adding the weighted values. The features used in this work are energy, contrast, correlation, homogeneity, autocorrelation, dissimilarity and inertia. The co-occurrence is computed at three different angles: 0, 45 and 90 degrees.
D. Ensemble Classification
In supervised learning, classifiers are given knowledge of the features with which classification is done; the process normally requires prior knowledge about the dataset being used. Classification is achieved in two major phases, training and testing. In the training phase the classifier is imparted knowledge by means of the extracted features; in this way the classifier equips itself to differentiate the objects. The classification problem can be binary or multiclass: a binary classification system involves only two classes, whereas a multiclass problem involves several. This work employs an ensemble classifier, as the result of a single classifier may introduce false positives or false negatives. To ensure reliability, this work employs k-NN, SVM and ELM, and the final decision is taken by majority vote over their classification results; a sketch of this voting rule is given after the results discussion below. As three different classifiers are utilized, eight different scenarios may
occur. For instance, when two or three classifiers produce the same result, that result is considered dominating and is declared final. As this work does not rely on a single classifier, the reliability of the system is improved. Additionally, to reduce time and computational complexity, this work exploits efficient classifiers. The false positive and false negative rates are considerably reduced, which in turn improves the sensitivity and specificity rates. The performance of the proposed approach is analysed in the following section.
4. Results and Discussion
The functionality of the proposed approach is evaluated by implementing this work in Matlab version 8.1 on a standalone computer with 8 GB RAM. The CT images were collected from different laboratories to carry out this research; the tailored dataset contains 100 images, comprising 37 benign and 63 malignant images. The initial round of comparative analysis justifies the choice of the incorporated techniques: AHE, KFCM, curvelet features and ensemble classification. In the second round, the comparative analysis compares the performance of the proposed classification approach with the existing literature, LESH + CC [4], the Mumford-Shah algorithm [5], and apriori + SVM [6], in terms of classification accuracy, sensitivity, specificity and time consumption.
Fig 2. Performance analysis
The reason for the better performance of the proposed approach is that all the techniques involved operate by themselves without any manual setting. Besides this, well pre-processed and segmented inputs improve the quality of the classification process; when the initial image processing steps are optimal, the outcome of the system is efficient. Hence, a better classification approach to distinguish between malignant and benign tumours in lung CT scans is presented. The experimental results of the comparison are presented in Table 1.
Table 1. Performance analysis in terms of accuracy, sensitivity, specificity and time consumption
The experimental results of the proposed approach justify the choice of the techniques incorporated in it. The second to fourth columns of the table signify the comparative techniques for pre-processing, segmentation and feature extraction respectively; columns five to seven indicate the performance of each classifier employed alone; finally, the results of the proposed approach are presented. The proposed approach shows maximum accuracy, sensitivity and specificity; the time consumption is a bit greater than ELM's, yet reasonable, as the proposed approach performs ensemble classification and does not rely on a single classifier. The attained time consumption is tolerable, as the degrees of sensitivity and specificity are greater.
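To make the majority-vote rule of Section 3.D concrete, a minimal sketch; scikit-learn's MLPClassifier stands in here for the ELM (an assumption), since the voting logic is what matters:

```python
# k-NN / SVM / (ELM stand-in) majority vote over binary decisions
# (benign = 0, malignant = 1).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def ensemble_predict(X_train, y_train, X_test):
    models = [KNeighborsClassifier(5), SVC(), MLPClassifier(max_iter=500)]
    votes = np.array([m.fit(X_train, y_train).predict(X_test) for m in models])
    # The dominating result: at least 2 of the 3 classifiers agree.
    return (votes.sum(axis=0) >= 2).astype(int)
```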
5. Conclusion
This article presents a reliable computer-aided lung cancer classification system based on curvelet features and an ensemble classifier. The proposed classification approach is applied over pre-processed and segmented images, accomplished by AHE and KFCM respectively. The features of the segmented regions are extracted by means of curvelets, which are fast with high discrimination ability. An ensemble classifier formed by k-NN, SVM and ELM is proposed, the decisions of all the classifiers are obtained, and the dominating result is chosen as the final decision. The performance of this work is tested against three different existing techniques; the proposed classification approach serves better in terms of accuracy, sensitivity and specificity rates with minimal time consumption. In future, this work is planned to be extended by including feature normalization and selection techniques.
References
[1] G. Niranjana; M. Ponnavaikko, "A Review on Image
Processing Methods in Detecting Lung Cancer Using CT Images," International Conference on Technical Advancements in Computers and Communications, 10-11 April, Melmaruvathur, India, 2017.
[2] Rabia Naseem; Khurram Saleem Alimgeer; Tariq Bashir, "Recent trends in Computer Aided diagnosis of lung nodules in thorax CT scans," International Conference on Innovations in Electrical Engineering and Computational Technologies, 5-7 April, Karachi, Pakistan, 2017.
[3] Devi Nurtiyasari; Dedi Rosadi; Abdurakhman, "The application of Wavelet Recurrent Neural Network for lung cancer classification," International Conference on Science and Technology - Computer, 11-12 July, Yogyakarta, Indonesia, 2017.
[4] Summrina Kanwal Wajid; Amir Hussain; Kaizhu Huang; Wadii Boulila, "Lung cancer detection using Local Energy-based Shape Histogram (LESH) feature extraction and cognitive machine learning techniques," International Conference on Cognitive Informatics & Cognitive Computing, 22-23 Aug, CA, USA, 2016.
[5] R. Janudhivya; S. Gomathi; P. Madhu Mathi; J. Seetha, "A new approach for lung cancer cell detection using Mumford-Shah algorithm," International Conference on Communication and Signal Processing, 6-8 Apr, Melmaruvathur, India, 2016.
[6] Elmar Rendon-Gonzalez; Volodymyr Ponomaryov, "Automatic Lung nodule segmentation and classification in CT images based on SVM," International Kharkiv Symposium on Physics and Engineering of Microwaves, Millimeter and Submillimeter Waves, 20-24 June, Kharkiv, Ukraine, 2016.
[7] Candes, E. J. & Donoho, D. L. (2000). "Curvelets - A surprisingly effective non-adaptive representation for objects with edges," Vanderbilt University Press, Nashville, TN, 2000.
Classification of Solitary pulmonary Nodule/Blob Images Using
Improved Local Binary Pattern [Z⊕ŻLBP]
S.Piramu Kailasam1, Dr.M.Mohamed Sathik2
Abstract
The Solitary Pulmonary Nodule (SPN) in chest radiographs of 3D CT of an at-risk population is an alert signal for lung cancer. In this paper, an improved feature descriptor named Z with Ż Local Binary Pattern [Z⊕ŻLBP] is proposed for effectively extracting features of solitary pulmonary blob images. The goal is to reduce the complexity of the Local Binary Pattern (LBP) by reducing the size of the feature vector. In the proposed work, the neighbourhood pixels are divided into non-overlapping groups of Z and Ż (tilted Z), from which the texture features of pulmonary images are extracted. The single classifier KNN with a distance metric has been used for classification. The performance evaluation metrics accuracy and F-measure are used to assess the proposed system.
Keywords: Classification, Pulmonary Image, Feature Extraction, Active
Contour, KNN Classifier
Introduction
Pulmonary images are very important for detecting lung diseases using the CT imaging modality, which supports the assessment of pulmonary blobs, the regions of interest. The feature types of the pulmonary blob (ROI) in CT images are important cues for malignancy prediction [1-2], diagnosis and advanced management [3-4]. The texture feature of blob solidity and the semantic morphology feature of spiculation are critical for differentiating pulmonary blob regions and their subtypes. Meanwhile, other semantic features such as calcification pattern, roundedness and margin clearness have been shown to be helpful for the evaluation of blob classification. The region of interest may be found inside or outside the bronchial tubes. If the blob is <= 3 mm, the detection of malignancy is difficult. The determination of clinical characteristics may differ from patient to patient and also depends on the experience of the observer. Computer aided diagnosis is an assistive software package that provides computational diagnostic references for clinical image reading and decision making support. The histogram features for high level texture analysis help to extract blob features. Ciompi et al. [5] developed the bag of frequencies descriptor, which can successfully distinguish 51 spiculated blobs from the other 204 non-spiculated blobs. However, the mapping from low level image features toward high level semantic features in the domain of clinical terms is not a straightforward task. This semantic feature assessment may be useful for clinical analysis. The Lung Image Database Consortium (LIDC) dataset, with its rich annotation database, supports the training and testing of CAD schemes [6-7]. Blobs with diameters larger than 3 mm are further rated by radiologists with the semantic features of "subtlety", "calcification", "sphericity", "margin", "spiculation", "texture", "lobulation", "internal structure" and "malignancy" [8].

1 Research Scholar, Bharathiar University, Coimbatore. 2 Principal, Sadakathullah Appa College, Tirunelveli.
Absorption and scattering of illumination rays are the two major issues that reduce the quality of images. Several methods have been proposed to enhance the quality of pulmonary images. Histogram equalization and contrast stretching methods are capable of enhancing image quality. Contrast Limited Adaptive Histogram Equalization (CLAHE) has been applied to improve the image contrast. Otsu's adaptive thresholding method for image segmentation has been effective for many applications [9]; it provides bright backgrounds for images. Various thresholding techniques such as local, global and multilevel thresholding have been applied for the segmentation of pulmonary blob images. The slice thickness of the LIDC CT scans ranges from 1.25 mm to 3 mm [10,11]. The texture feature descriptor that has been widely used for image classification is the Local Binary Pattern. Pican et al. have used twenty-four types of GLCM features for extraction, and for each image suitable features have to be chosen. Hence there is a need for an efficient feature descriptor for the classification process. In past years, neural networks were used for classification, but obtaining results is time consuming. K-Nearest Neighbour has been used as a classifier with Euclidean distance to classify blobs. Padmavathi et al. [15] have classified images using a probabilistic neural network, which gives better results than the SIFT algorithm on a dataset with three classes. Eduardo et al. have classified images using nine machine learning algorithms: Decision Trees, Random Forest, Extremely Randomised Trees, Boosting, Gradient Boosted Trees, Normal Bayes Classifier, Expectation Maximization, NN and SVM. Bhuvaneswari P. et al. [16] have classified coral and texture images using KNN with K=1, and the accuracy was reported as 90.35%.
The main tasks in this paper are the following four steps:

1. The features of blobs are enhanced from the LIDC dataset.

2. The dimension of the LBP feature vector is condensed by the novel feature descriptor Z with tilted Z Local Binary Pattern [Z⊕ŻLBP].

3. The computational complexity is reasonably reduced and the accuracy is raised compared with the existing Center Symmetric Local Binary Pattern and Orthogonal Combination of Local Binary Pattern.

4. KNN with a distance metric has been used for classification.
Proposed Work
To handle artifacts in the image, the Adaptive Histogram Equalization (AHE) method is applied to homogeneous regions of the pulmonary blob. The resulting blur problem is overcome by the Contrast Limited Adaptive Histogram Equalization (CLAHE) method, which avoids the magnification of noise and thereby limits the artifacts.
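As an illustration of this CLAHE step, the following sketch applies scikit-image's CLAHE implementation (equalize_adapthist) to a grayscale slice; the file name and clip limit are assumptions for illustration, not values from the paper.

```python
# A small sketch of contrast enhancement with CLAHE, assuming a grayscale
# CT slice loaded as a 2-D array; equalize_adapthist is scikit-image's CLAHE.
from skimage import io, exposure, img_as_float

ct = img_as_float(io.imread('ct_slice.png', as_gray=True))  # hypothetical file
# clip_limit bounds local contrast amplification, which limits noise blow-up.
enhanced = exposure.equalize_adapthist(ct, clip_limit=0.02)
io.imsave('ct_slice_clahe.png', (enhanced * 255).astype('uint8'))
```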
Figure 1: Pulmonary image
Figure 2: Blob detection in a pulmonary image. (a) Lung blob segmented in the red area, (b) target blob in the bronchial tube, (c) path to reach the blob.
For image segmentation, edge based and region based active contours are available. Edge based segmentation depends on the image gradient, which acts as an edge detector to initiate the boundaries of sub regions by outlining the contours of detected boundaries. The region based active contour technique is similar to region based segmentation and depends on statistical information of the image intensity inside each subset. Compared to other methods, the active contour method delivers continuous boundaries of sub regions and merges multiple contours into one contour. It works well for both global and local cluster regions.
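The following sketch illustrates a region-based active contour of the kind described above, using scikit-image's morphological Chan-Vese implementation; the input file, iteration count and smoothing weight are illustrative assumptions.

```python
# Illustrative region-based active contour (morphological Chan-Vese) on a CT
# slice, a minimal sketch assuming scikit-image; the parameters are
# illustrative, not tuned values from the paper.
import numpy as np
from skimage import io, img_as_float
from skimage.segmentation import morphological_chan_vese

ct = img_as_float(io.imread('ct_slice.png', as_gray=True))  # hypothetical file
# 35 iterations from a checkerboard initial level set; smoothing=3 keeps the
# evolving contour regular while it converges to intensity-homogeneous regions.
mask = morphological_chan_vese(ct, 35, init_level_set='checkerboard', smoothing=3)
print('segmented foreground pixels:', int(np.sum(mask)))
```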
Figure 3: Sample trained data
Texture features carry important information for discrimination purposes. The [Z⊕ŻLBP] reduces the dimensionality of the original LBP histogram while preserving its discriminative power.
Figure 4: Block Diagram of Proposed Method
The Proposed Feature Descriptor [Z⊕ŻLBP]
The Computed Tomography LIDC [11] dataset is trained with groups of five slices. The five slices are taken as p-2, p-1, p, p+1, p+2, where p is the present slice. The LBP texture value of each 3D segmented [9,12] blob is calculated for every five slices, and this is applied to all trained data. The trained class is checked against the tested data class. In CS-LBP (Centre Symmetric Local Binary Pattern), the information of the centre pixel is not considered, and when the number of neighbours increases, the CS-LBP feature vector size also increases. In OC-LBP (Orthogonal Combination of Local Binary Pattern), diagonal information is not
considered. In [Z⊕ŻLBP], however, diagonal information is also considered through the Z and Ż groups. The [Z⊕ŻLBP] operator is proposed to reduce the dimension of the feature vector and to increase the classification accuracy. According to [Z⊕ŻLBP], a 3 x 3 neighbourhood is viewed as two non-overlapping groups, Z and Ż. The Z and Ż feature vector groups are formed separately and finally concatenated.
Figure 5: Block Bk of matrix 3 x 3 with the centre pixel Xij and its neighbour pixels
Figure 6: Illustration of [Z⊕ŻLBP] template as z neighbours in 3 x 3 region.
Figure 7: Ż neighbours of 3 x 3 neighbourhood region.
Figure 8: Neighbours considered for [Z⊕ŻLBP] combinations: (a) Z neighbours, (b) Ż neighbours are highlighted.
In a pulmonary image, take block Bk, where 1 ≤ k ≤ N and N is the total number of blocks in the image. Let Xij be the centre pixel of a 3 x 3 block Bk, as shown in Fig. 5. Among the eight neighbours, four are considered as the Z neighbours and the other four as the Ż neighbours, as illustrated in Fig. 6. The Z and Ż group neighbours are shown in green and red in Fig. 8. In the proposed work the centre pixel is taken as one of the neighbours rather than being given special importance as in LBP. The resultant LBP for the Z group is given in Eq. (6):
D1 = X(i-m, j+n) - X(i+m, j+n)          (2)

D2 = X(i-m, j+n) - X(i+m-1, j+n-1)      (3)

D3 = X(i-m-1, j+n-1) - X(i-m, j-n)      (4)

D4 = X(i-m, j-n) - X(i+m, j-n)          (5)

where (m, n) = (1, 1).

LBP_Z = (D1, D2, D3, D4)                (6)

Directional code DC_Z(i) = 1 if LBP_Z(i) > 0, and 0 otherwise, where 1 ≤ i ≤ 4   (7)

DDC_Z = Decimal(DC_Z) = Σ_{j=1}^{4} DC_Z(j) · 2^(j-1)   (8)
The Ż neighbours of the 3 x 3 neighbourhood are considered as shown in Fig. 7. The Ż group never overlaps the Z group. Therefore the resultant LBP for the Ż group is given by Eq. (13):

D1' = X(i-m, j+n-1) - X(i+m-1, j+n)     (9)

D2' = X(i-m-1, j+n) - X(i+m-1, j+n-1)   (10)

D3' = X(i-m-1, j+n-1) - X(i+m-1, j-n)   (11)

D4' = X(i-m, j-n) - X(i+m, j+n-1)       (12)

LBP_Ż = (D1', D2', D3', D4')            (13)

Directional code DC_Ż(i) = 1 if LBP_Ż(i) > 0, and 0 otherwise, where 1 ≤ i ≤ 4   (14)

The directional code is assigned depending on negative and non-negative values.

DDC_Ż = Decimal(DC_Ż) = Σ_{j=1}^{4} DC_Ż(j) · 2^(j-1)   (15)

The resultant feature vector is formed by concatenating the decimal values pertaining to the Z and Ż neighbours of every block Bk, 1 ≤ k ≤ N:

FV = < DDC_Z1, DDC_Ż1, ..., DDC_ZN, DDC_ŻN >   (16)

where N is the total number of blocks in the image.
Algorithm: Texture feature extraction using the proposed feature vector

Input: Segmented 3D CT pulmonary image slices
Output: Feature vector FV

Do for segmented CT slices (p-2, p-1, p, p+1, p+2) of every block Bk
    Compute D for the Z neighbours using Eq. (2) to Eq. (5)
    Calculate the LBP_Z value using Eq. (6)
    Calculate the directional code DC_Z using Eq. (7)
    Calculate the DDC_Z value using Eq. (8)
    Compute D' for the Ż neighbours using Eq. (9) to Eq. (12)
    Calculate LBP_Ż using Eq. (13)
    Calculate the directional code DC_Ż using Eq. (14)
    Calculate DDC_Ż using Eq. (15)
    Form the resultant feature vector of [Z⊕ŻLBP] using Eq. (16)
End
Store the values of the feature vector for classification.
Figure 9: Algorithm of the proposed [Z⊕ŻLBP] operator for pulmonary images
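The following Python sketch illustrates the spirit of Fig. 9. Because the index arithmetic transcribed in Eqs. (2)-(12) is partly garbled, the sketch assumes the Z group traces the corner pixels plus the centre in a "Z" stroke, and the Ż group traces the edge-midpoint pixels plus the centre; each path yields four signed differences that are thresholded and packed into a 4-bit decimal code per Eqs. (7)-(8) and (14)-(15). The grouping itself is an assumption, not the authors' exact definition.

```python
# A rough numpy sketch of the [Z xor Ż LBP] idea in Fig. 9. The assumed Z group
# is the four corners plus the centre; the assumed Ż group is the four edge
# midpoints plus the centre.
import numpy as np

def zzlbp_block(b):
    """b: 3x3 block. Returns (DDC_Z, DDC_Zbar), each a 4-bit code in 0..15."""
    c = b[1, 1]
    # Z stroke: top-left -> top-right -> centre -> bottom-left -> bottom-right
    z_path = [b[0, 0], b[0, 2], c, b[2, 0], b[2, 2]]
    # Tilted-Z stroke over edge midpoints: top -> right -> centre -> left -> bottom
    zb_path = [b[0, 1], b[1, 2], c, b[1, 0], b[2, 1]]
    codes = []
    for path in (z_path, zb_path):
        d = np.diff(path) * -1           # D_i = earlier pixel minus later pixel
        bits = (d > 0).astype(int)       # directional code, Eq. (7)/(14)
        codes.append(int(np.dot(bits, 2 ** np.arange(4))))  # decimal, Eq. (8)/(15)
    return tuple(codes)

def zzlbp_features(img):
    """Concatenate (DDC_Z, DDC_Zbar) over all non-overlapping 3x3 blocks, Eq. (16)."""
    h, w = img.shape
    fv = []
    for i in range(0, h - 2, 3):
        for j in range(0, w - 2, 3):
            fv.extend(zzlbp_block(img[i:i + 3, j:j + 3].astype(float)))
    return np.asarray(fv)

fv = zzlbp_features(np.random.randint(0, 256, (30, 30)))  # toy segmented slice
print(fv.shape, fv[:6])
```

Note that each block contributes two 4-bit codes rather than one 8-bit LBP code, which is where the claimed reduction in histogram size comes from.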
K-Nearest Neighbour Classification
Classification is learning a function that maps a data item into one of several predefined classes. The classification accuracy of an individual program depends on the number of samples correctly classified. In pattern recognition, the K-Nearest Neighbour algorithm is a non-parametric method for classification. K-NN is a type of instance-based learning; in KNN classification [13], the output is a class membership. The proposed improved [Z⊕ŻLBP] operator with K-NN classification was implemented in MATLAB R2016a.
The classification method used is the KNN classifier with various distance metrics; it finds the correlation between quantitative, continuous variables. Classifying small SPN images is a high-risk task. An image is classified by a majority vote of its k nearest neighbours. If u = (u1, u2, ..., un) and v = (v1, v2, ..., vn), then the Euclidean distance between the feature vectors u and v is

D(u, v) = sqrt((u1 - v1)^2 + (u2 - v2)^2 + ... + (un - vn)^2)   (17)

where u and v are training and testing samples generated from the histograms of the features.
The steps involved in the KNN algorithm are:

Step 1: Start
Step 2: Initialize and define the parameter K
Step 3: Find the distance between the test sample and all the training samples
Step 4: Sort the distances
Step 5: Take the K nearest neighbours
Step 6: Assign the class by a simple majority vote among the K nearest neighbours
Step 7: End
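A minimal sketch of these steps with a Euclidean metric follows, assuming the [Z⊕ŻLBP] feature vectors and labels are already extracted; the data shapes, label meanings and K value are illustrative assumptions.

```python
# Minimal sketch of the KNN steps above with a Euclidean metric, assuming the
# [Z xor Ż LBP] feature vectors and class labels are already extracted.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.random.rand(40, 200)        # placeholder training feature vectors
y_train = np.random.randint(0, 2, 40)    # 0 = benign, 1 = malignant (assumed)
X_test = np.random.rand(10, 200)

knn = KNeighborsClassifier(n_neighbors=5, metric='euclidean')  # steps 1-2
knn.fit(X_train, y_train)
print(knn.predict(X_test))               # steps 3-6: distances, sort, majority vote
```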
The LBP feature descriptor forms a histogram of size 2^N, whereas the proposed feature descriptor forms a histogram of size 2 x N. The average time taken by LBP is 0.53 s, while the average time taken by the proposed descriptor is 0.38 s.
Quantitative Comparison
1. F-measure
The F-measure is a standard metric which measures image classification and segmentation accuracy by combining both precision and recall values:

F-measure = 2 x (Precision x Recall) / (Precision + Recall)   (18)

where Precision = TP / (TP + FP)   (19)

Recall = TP / (TP + FN)   (20)

with TP, FP and FN denoting the numbers of true positives, false positives and false negatives.
2. Overall Accuracy
The overall accuracy (OA) of the classified LIDC dataset is derived as the ratio of the number of correctly classified tested samples to the total number of tested samples:

Overall Accuracy (OA) = (number of correctly classified tested samples) / (total number of tested samples)   (21)
3. Cohen’s Kappa
LBP_KNN and [Z⊕ŻLBP]_KNN obtained the same Cohen's Kappa value of 74.14%.

Kappa Coefficient (K) = (na - ne) / (n - ne)   (22)

where na is the number of agreements, ne is the number of agreements due to chance and n is the number of samples.
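For reference, Eqs. (18)-(22) can be computed directly with scikit-learn's metric functions, as in the following sketch with placeholder labels.

```python
# Sketch of Eqs. (18)-(22) with scikit-learn, assuming predicted and true
# labels from the classifier above; the label values here are placeholders.
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             accuracy_score, cohen_kappa_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]

print('Precision:', precision_score(y_true, y_pred))    # Eq. (19)
print('Recall   :', recall_score(y_true, y_pred))       # Eq. (20)
print('F-measure:', f1_score(y_true, y_pred))           # Eq. (18)
print('OA       :', accuracy_score(y_true, y_pred))     # Eq. (21)
print('Kappa    :', cohen_kappa_score(y_true, y_pred))  # Eq. (22)
```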
Figure 10: Performance metrics analysis of LBP and [Z⊕ŻLBP] with the KNN classifier

Metric      LBP_KNN   [Z⊕ŻLBP]_KNN
F-score     0.9462    0.9582
Precision   0.9317    0.9417
Recall      0.9414    0.9614
Kappa       0.7414    0.7414
Conclusion
The proposed [Z⊕ŻLBP] feature descriptor is well suited for solitary pulmonary nodule/blob images and texture datasets. This feature descriptor extracts more detailed information while dropping the feature vector to a bin size of 8. The F-measure, precision, recall and Kappa coefficient are compared with the uniform pattern. Compared to the existing feature descriptors, the proposed [Z⊕ŻLBP] with the KNN classifier provides an improvement of 3.5% in accuracy, reaching 95.48%. In future, shape features and other texture features of pulmonary images can be calculated for content based image retrieval from web search engines.
References
[1]. McWilliams, A., et al., "Probability of cancer in pulmonary nodules detected on first screening CT", New England Journal of Medicine, Vol. 369, pp. 910-919, 2013.

[2]. V. K. Patel et al., "A practical algorithmic approach to the diagnosis and management of solitary pulmonary nodules part 2: pretest probability and algorithm", Chest, vol. 143, pp. 840-846, 2013.

[3]. D. P. Naidich, et al., "Recommendations for the management of subsolid pulmonary nodules detected at CT: a statement from the Fleischner society", Radiology, vol. 266, 2013.

[4]. M. K. Gould et al., "Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer: American College of Chest Physicians evidence based clinical practice guidelines", Chest, vol. 143, 2013.

[5]. F. Ciompi, et al., "Bag of frequencies: a descriptor of pulmonary nodules in Computed Tomography images", IEEE Transactions on Medical Imaging, Vol. 34, pp. 962-973, 2015.

[6]. S. G. Armato III, et al., "The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans", Medical Physics, vol. 38, pp. 915-931, 2011.

[7]. S. G. Armato III, et al., "The Lung Image Database Consortium: developing a resource for the medical imaging research community", Radiology, vol. 232, pp. 739-748, 2004.

[8]. Sihong Chen, Jing Qin, Xing Ji et al., "Automatic scoring of multiple semantic attributes with multi-task feature leverage: a study on pulmonary nodules in CT images", IEEE, 2016.

[9]. Juan-Juan Zhao and Guohua Ji, "Cavitary nodule segmentation in computed tomography images based on self generating neural networks and particle swarm optimization", Inderscience, 2015.

[10]. Arnaud A. A. Setio, Francesco Ciompi et al., "Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks", IEEE, 2015.

[11]. Armato, S. G., "The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans", Medical Physics, 38, 915-931.
[12]. Novo, Rouco, Mendonca and Campilho, "Reliable lung segmentation methodology by including juxtapleural nodules", 2014.

[13]. Alif Hayat, M., "Classification of membrane protein types using voting feature interval in combination with Chou's pseudo amino acid composition", Theory Biology, 384, 78-83, 2015.

[14]. A. Sherley Alphonse, Dejey Dharma, "Novel directional patterns and a generalized supervised dimension reduction system (GSDRS) for facial emotion recognition", JMTA, 2017.

[15]. Padmavathi et al., "Kernel principal component analysis feature detection and classification for underwater images", CISP, 2010.

[16]. P. Bhuvaneswari, Dr. A. Brintha Therese, "Detection of cancer in lung with KNN classification using genetic algorithm", Elsevier, 2015.
International Journal of Multidisciplinary Approach and Studies, ISSN No: 2348-537X, Volume 06, No. 4, July-Aug 2019
A Survey on Pulmonary CT Image Classification in Deep learning
Dr. M. Mohamed Sathik* & S. Piramu Kailasam**

*Principal, Sadakathullah Appa College, Tirunelveli, India
**Assistant Professor, Sadakathullah Appa College, Tirunelveli, India
ABSTRACT
Image classification with deep learning is a recently developed soft computing approach that classifies image data from pixel information; hybrid techniques can exploit the pixel data to obtain better performance measures. Classification is used to predict unknown data. Deep learning is a cascade of nonlinear transformations in a hierarchical model, and feed forward deep architectures can work on any image size.

Lung cancer is one of the leading diseases in the world, and its incidence has increased in many countries. The eight-year overall survival rate is just 17%. Removing lung cancer surgically at an early stage helps ensure the survival of the patient. Before surgery, a doctor needs the radiologist's suggestions. In the digital era, fast CAD systems play a vital role in surgery planning. At this junction, medical image analysis, image classification, image detection and diagnosis have contributed much over the past decades. Pre-processing and segmentation are the preliminary basic steps for binary classification.
KEYWORDS: Image Pre-processing, Image Enhancement, Descriptor, Image
Classification
I. INTRODUCTION
The purpose of this section is to review the recent literature with respect to lung cancer
detection systems.
1.1 Pre-Processing
Several image processing techniques for the detection of lung cancer using CT images are reviewed in (Niranjana, G. et al., 2017). The review of lung cancer detection is split into different aspects such as pre-processing and nodule segmentation. The recent trends in lung nodule detection are presented in (Rabia Naseem et al., 2017); additionally, the performance of recent lung nodule detection techniques is compared. In (Devi Nurtiyasari et al., 2017), a lung cancer classification system is proposed on the basis of a wavelet recurrent neural network. The authors employ wavelets to remove the noise from the input image, and the recurrent neural network is utilized for classification. However, they could not achieve good specificity rates, which implies that the false positive rates of the work are high.

A lung cancer detection algorithm based on FCM and Bayesian classification is presented in (Bhagyarekha, U. et al., 2016). In this paper, FCM is applied to achieve segmentation and the GLCM features are extracted. Based on the feature set, the Bayesian classifier is employed to distinguish between normal and cancer affected CT images. Yet, the results of this work
are not convincing in terms of sensitivity and specificity rates. In (Manasee Kurkure et al., 2016), a lung cancer detection technique that relies on a genetic approach is proposed; however, this work involves considerable time complexity. In (Nilesh Bhaskarrao Bahadur et al., 2017), the number of connected objects is calculated by assigning 1 to the inside and 0 to the outside of the object in a brain MRI image, based on a thresholding technique, to improve the skull stripping performance.
1.2 Segmentation Techniques
Image segmentation by different methods has been carried out by many researchers. Some of these methods are local
entropy image segmentation (Ali Shojaee Bakhtiari et al., 2007; L.Goncalves, J.Novo, 2016),
Discrete cosine texture feature (Chi-man pun et al., 2010), parallel algorithm for grey scale
image (Harvey et al., 1996), Clustering of spatial patterns and watershed algorithm (Kai-Jian
et al., 2010), Medical image segmentation (Prasantha H.S. et al., 2010), Finite bivariate
doubly truncated Gaussian Mixture model (Rajkumar 2010), Random set of view of texture
(Ramana Reddy 2010), Sobel operator technique, Prewitt technique, Kiresh technique,
Laplacian technique, Canny technique, Roberts technique and Edge maximization Technique
(Salem Saleh 2010), Mathematical image processing (Sun hee kim 2010), Simple algorithm
for image denoising (Vijaya 2010), Iterative regularized likelihood learning algorithm (Zhiwu
Lu 2006), Automatic model selection and unsupervised image segmentation by Zhi Wu Lu
(2007), finite mixtures and entropy regularization by Zhi wu Lu (2008).
As described by Sun Hee Kim, the following steps are followed in image processing for counting the number of mosquitoes in a room: (1) acquisition of the image; (2) extraction of the region of the mosquitoes, considering the intensity and the location property of the mosquitoes in the image; (3) reduction of the image to a smaller size to process in Matlab, where image processing is applied to separate the mosquito regions from the background and the smaller size is obtained by processing in Paint; (4) conversion of the image to values with the array editor, whereby the mosquitoes are recognised through the array editor conversion in the Matlab software; (5) identification of the clusters where mosquitoes are present. The places where mosquitoes are present show lower values because of darkness. Here the raw figure is converted into values that can be processed by computer algorithms: the figure is reduced to a smaller size and converted to array editor values (Harvey A. et al., 1996).
In (Xia Li et al., 2017), an algorithm is proposed to detect pulmonary nodules based on a cascade classifier. This work detects the pulmonary nodules and classifies them into normal and benign. A learning method based on a cascade classifier is applied over the detected pulmonary nodules. This work focuses more on accuracy rates than on sensitivity and specificity rates. A technique to detect lung nodules from a series of CT slices is presented in (May Phu Paing, 2017). This work segments the lung nodules by applying Otsu's threshold along with some morphological operations. The geometric, histogram and texture features are extracted from the segmented nodules to carry out the classification. The Multilayer Perceptron (MLP) is employed as the classifier, and this work involves computational overhead. In (S. Avinash et al., 2016), a lung cancer detection technique based on a Gabor filter and watershed segmentation is proposed. The segmentation is carried out by the watershed approach and the Gabor features are extracted from the CT images. This technique does not include a classification stage and stops with segmentation.
A large number of lung segmentation methods have been proposed. Among them, a bidirectional differential chain code combined with a machine learning framework is able to correctly include juxtapleural nodules in the lung tissue while minimizing over- and under-segmentation; this method handles low-dose scans and is capable of identifying concave/convex regions (Shiwen Shen, 2015). In the Hessian-based approach, 3D lung nodules have been segmented in a multiscale process through the combination of Shape Index and Curvedness methods. Image characteristics, nodule position, nodule size and nodule characteristics are included in this approach, and eigenvalues are computed from the 3 x 3 Hessian matrix (L. Goncalves et al., 2016).
2 Feature Vectors
The feature types of the pulmonary nodule in CT images are important cues for malignancy prediction (A. McWilliams et al., 2013; V. K. Patel et al., 2013), diagnosis and advanced management (D. P. Naidich et al., 2013; M. K. Gould et al., 2013). The texture feature of nodule solidity and the semantic morphology feature of spiculation are critical for differentiating pulmonary nodules and their subtypes. Meanwhile, other semantic features like calcification pattern, roundedness and margin clearness are shown to be helpful for the evaluation of nodule classification. The nodule may be found in the bronchial tubes or outside of the bronchial tube. If the nodule is <= 3 mm, the detection of malignancy is difficult. The determination of clinical characteristics may differ from patient to patient and depends on the experience of the observer.

Computer-aided diagnosis (Shiwen Shen, Alex A. T. Bui et al., 2015) is an assistive software package that provides computational diagnostic references for clinical image reading and decision making support. The histogram feature (Sherly Alphonse, Dejey Dharma, 2016) for high-level texture analysis helps to extract nodule features. The bag of frequencies descriptor has been developed, which can successfully distinguish 51 spiculated nodules from the other 204 non-spiculated nodules (Ciompi et al., 2015). However, the mapping from low-level image features toward high-level semantic features in the domain of clinical terms is not a straightforward task. This semantic feature assessment may be useful for clinical analysis. The Lung Image Database Consortium (LIDC) dataset, with its rich annotation database, supports the training and testing of CAD schemes (S. G. Armato III et al., 2011).
Absorption and scattering of light rays are the two major issues that reduce the quality of images. Several methods have been proposed to enhance the quality of pulmonary images. Histogram equalization and contrast stretching methods are capable of enhancing image quality. Contrast Limited Adaptive Histogram Equalization (CLAHE) has been applied to improve the image contrast. Otsu's adaptive thresholding method for image segmentation has been effective in many applications; it provides bright backgrounds for images. Various thresholding techniques such as local, global and multilevel thresholding have been applied for the segmentation of pulmonary nodule images. The texture feature descriptor that has been widely used for image classification is the Local Binary Pattern. Pican et al. have used twenty-four types of GLCM features for extraction, and for each image suitable features have to be chosen. Hence there is a need for an efficient feature descriptor for the classification process. In past years, neural networks were used for classification, but the process is time-consuming. K-Nearest Neighbour as a classifier with Euclidean distance was used to classify nodules. Padmavathi et al. (2010) have classified
images using a probabilistic neural network, which gives better results than the SIFT algorithm on a dataset with three classes. Eduardo et al. have classified images using nine machine learning algorithms: Decision Trees, Random Forest, Extremely Randomised Trees, Boosting, Gradient Boosted Trees, Normal Bayes Classifier, Expectation Maximization, NN and SVM. Bhuvaneswari, P. et al. (2015) have classified coral and texture images using KNN with K=1, and the accuracy was reported as 90.35%.
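As an aside on the Otsu step referred to above, the following sketch shows global Otsu thresholding on an enhanced slice with scikit-image; the input file name is a hypothetical placeholder.

```python
# Sketch of Otsu's adaptive threshold for segmentation, assuming a
# CLAHE-enhanced grayscale slice as input (scikit-image).
from skimage import io, img_as_float
from skimage.filters import threshold_otsu

img = img_as_float(io.imread('ct_slice_clahe.png', as_gray=True))  # hypothetical
t = threshold_otsu(img)          # global Otsu threshold
binary = img > t                 # bright structures vs. background
print('threshold:', t)
```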
In (K. Gopi et al., 2017), a technique based on the EK-mean algorithm and the Support Vector Machine (SVM) is presented to recognize and classify lung tumors. This work pre-processes the CT images to remove unwanted information by means of a thresholding approach; the threshold value is calculated with Zack's algorithm, as used for white blood cell segmentation by F. Sadglian et al., 2009. The regions of interest alone are extracted, and the Gray Level Co-occurrence Matrix (GLCM) features are computed. GLCM texture features of CT images of small solitary pulmonary nodules (Vishal K. Patel, K. Naik, 2015) can support earlier diagnosis of lung cancer through five combined patient-level features: inertia, entropy, correlation, difference-mean and sum-entropy (Huan Wang et al., 2010). Finally, SVM is utilized to distinguish between cancerous and non-cancerous areas. An early lung cancer detection mechanism is proposed in (Rachid Sammouda et al., 2016), which exploits a Hopfield Neural Network classifier for extracting the lung areas from the CT images. The edges of the lung region lobes are detected by bit planes, and diagnostic rules are framed to detect the abnormality.
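As an illustration of the GLCM features mentioned above, the following sketch computes inertia (GLCM contrast), correlation and entropy for a nodule ROI with scikit-image; the ROI, distance and angle settings are illustrative assumptions, and entropy is computed by hand since graycoprops does not expose it.

```python
# Sketch of GLCM texture features for a nodule ROI, assuming scikit-image
# (>= 0.19, where the functions are spelled graycomatrix/graycoprops).
import numpy as np
from skimage.feature import graycomatrix, graycoprops

roi = np.random.randint(0, 256, (32, 32), dtype=np.uint8)  # placeholder ROI
glcm = graycomatrix(roi, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)

inertia = graycoprops(glcm, 'contrast')[0, 0]       # "inertia" = GLCM contrast
correlation = graycoprops(glcm, 'correlation')[0, 0]
p = glcm[:, :, 0, 0]                                # normalized co-occurrence matrix
entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))     # GLCM entropy

print(inertia, correlation, entropy)
```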
A lung cancer detection technique based on the Local Energy-based Shape Histogram (LESH) and machine learning has been introduced. Initially, this work pre-processes the CT images by Contrast Limited Adaptive Histogram Equalization (CLAHE), and the LESH features are extracted. Machine learning algorithms such as the Extreme Learning Machine (ELM) and SVM are applied. This work is efficient, but the computational overhead could still be decreased by altering the feature extraction technique. Shape features (Ashis et al., 2016) are easily extracted by the HOG (Histogram of Oriented Gradients) feature descriptor (Chen J. et al., 2014). An extension of HOG, the EXHOG (Extended Histogram of Gradients) feature descriptor, has been applied to whole-human detection (Amit et al., 2011). The HOG method always deals with a 180-degree angle range, whereas the EXHOG method considers 360 degrees.

A variety of shape-based, margin-based and texture-based features are analyzed to improve the accuracy of classification; these results are evaluated in terms of the area under the receiver-operating-characteristic curve in the paper of Ashis Kumar et al. (2016). Nodules are categorized into rank 1 to rank 5 malignancy ratings and into solid, part-solid and non-solid types. The nodule features are extracted based on 2D and 3D shape-based features. Han et al. introduced 3D Haralick features; fourteen Haralick features (1973) were computed from each GLCM matrix. The maximum correlation coefficient is not considered in their experiment, as it is computationally expensive (Han et al., 2014). Several shape based, margin based and texture based features are computed to represent the pulmonary nodules.
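The following sketch extracts standard HOG shape features with scikit-image. Note that this standard HOG uses unsigned gradients (the 180-degree case discussed above); the 360-degree EXHOG variant is not available in common libraries, and the patch file and cell sizes are illustrative assumptions.

```python
# Sketch of HOG shape features for a nodule patch, assuming scikit-image.
from skimage import io, img_as_float
from skimage.feature import hog

patch = img_as_float(io.imread('nodule_patch.png', as_gray=True))  # hypothetical
features = hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
print(features.shape)
```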
A set of relevant features is determined for the efficient representation of nodules in the feature space. Han et al. introduced 3D Haralick features considering nine directions, which provide better classification performance than five directions. The performance of classification is evaluated in terms of the area under the receiver operating characteristic curve (Ashis Kumar Dhara et al., 2016). Predicting membrane protein sequences in
imbalanced data sets is often handled by decision tree, AdaBoost, random and rotation forest, SVM and naive Bayes classifiers; the random forest has good performance except for classes with fewer samples (E. Siva Shankari et al., 2017). The Local Binary Pattern is an extremely well-known texture feature descriptor (D. Jeyabharathi, Dejey, 2016) that has been used in many applications such as facial recognition (Ani Brown et al., 2017), texture classification and coral image classification. Contrast Limited Adaptive Histogram Equalization (CLAHE) (Ani Brown et al., 2017) has been applied to coral images to improve the image contrast and equalize the image histogram effectively.

The relational feature provides information about the relational and hierarchical structure of the regions related to a single object or a group of objects; neural network classifiers and SVM classifiers are commonly applied to such features.
2.1 DCNN as Feature Vector
Eduardo et al. have used wavelet filters from the OpenCV library to extract texture features (Shiela et al.) and have determined counts of corals by extracting texture features using the LBP descriptor, because it uses the information from eight directions around a pixel (Huan Wang, 2010). A deep feed-forward ANN analyses visual imagery. Artificial Neural Networks constitute a method of solving scientific problems through simple models that mimic the human brain and its behaviour. An ANN is built from small modules which simulate the operation of a neuron. Classical ANNs use a minimal number of layers because their capacity is very limited, whereas a convolutional neural network can have hundreds of hidden layers and can extract the data using filters. When the output reaches the target value, the weights are no longer changed. For sentiment classification, the RNN architecture is highly preferable. A Recurrent Neural Network uses loops to translate sequence to sequence, or a sequence to a single output. In a feed-forward network, neurons within the same layer are not connected; in an RNN, by contrast, the neurons in the layers are recurrently connected. RNNs are well suited to reinforcement learning. ImageNet models are suitable for visual recognition challenges. Instead of an autoencoder, a Conditional Random Field (CRF) model or a deep sequential model can be used.
Handwritten digit recognition using backpropagation over a convolutional network brought a significant advance. AlexNet, ZFNet, VGGNet, GoogLeNet and MSResNet are other neural models. The CNN was introduced by Lecun et al., 1998; a CNN allows multiple features to be extracted at each hidden layer. Convolutional neural networks are used to classify tumors seen in lung cancer screening; they have special properties such as spatial invariance and allow multiple feature extraction. It has been shown widely that deep CNNs increase prediction accuracy dramatically (Prajwal Rao et al., 2016). Lecun et al. have shown that CNNs eliminate the necessity of handcrafted feature extractors in gradient-based learning applied to document recognition (Lecun et al., 1998). Hence, multiple characteristics can be extracted and the learning process assigns weights appropriately to significant features, automatically performing the difficult task of feature engineering hierarchically. AlexNet with binary classification handles challenging visual tasks such as face recognition.
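To make the idea of hierarchical feature extraction concrete, the following is a minimal PyTorch sketch of a CNN for binary nodule-patch classification; the architecture and sizes are illustrative assumptions rather than a network from any of the surveyed papers.

```python
# Minimal CNN sketch for binary nodule-patch classification (PyTorch);
# layer sizes are illustrative assumptions, not from the surveyed works.
import torch
import torch.nn as nn

class SmallNoduleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(           # each conv layer extracts
            nn.Conv2d(1, 16, 3, padding=1),      # progressively higher-level
            nn.ReLU(), nn.MaxPool2d(2),          # features from the patch
            nn.Conv2d(16, 32, 3, padding=1),
            nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Linear(32 * 8 * 8, 2)  # benign vs. malignant

    def forward(self, x):                        # x: (batch, 1, 32, 32)
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = SmallNoduleCNN()
logits = model(torch.randn(4, 1, 32, 32))        # four dummy 32x32 CT patches
print(logits.shape)                              # -> torch.Size([4, 2])
```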
Convolutional Neural Networks are an alternative type of neural network. They can be used to reduce spectral variations and model correlations in signals. Speech signals exhibit both properties; hence CNNs are a more effective model for speech compared to Deep
Neural Networks. First, the number of convolutional layers, the filter size and the pooling size are decided. Secondly, an appropriate number of hidden units and the best pooling strategy are investigated. Then it is determined how to incorporate speaker-adapted features. Finally, the importance of sequence training for speech tasks is established. During Hessian-free sequence training of CNNs, ReLU+dropout can be used (Prajwal Rao et al., 2016). Convolutional neural networks have been used to recognize faces by combining a Self-Organizing Map (SOM) neural network with a CNN. MatConvNet is an open source toolbox for convolutional neural networks. In XNOR networks on ImageNet, both the filters and the inputs to convolutional layers are binary; the filters are approximated with binary values, resulting in 32x memory savings.

The paper of Arnaud A. A. Setio et al., 2015 presented the multi-view ConvNet, which is very well suited for false positive reduction and achieved good results on the nodule detection task for lung CT images. A CNN learns its own feature representations, so there is little need for separate preprocessing. In deep learning (Qing Zeng Song, Lei Zhao, Xing Ke, 2017), images or videos may still profit from preprocessing, which can make the network's job much easier (Jodogne and Piater, 2007; Legenstein et al., 2010; Cuccu et al., 2011). A dominant dictionary-like texture descriptor, the texton, has been proposed as a feature (Beijbom et al., 2012). Deep learning has proved a popular and powerful method in many medical imaging areas. In the paper (QingZeng Song et al., 2017), CNN, DNN and SAE architectures are designed for lung cancer classification; the experimental results show that the CNN achieved the best accuracy among the three architectures.
3 Classification
The prerequisite of classification is designing a suitable image processing procedure. For remote sensor data, a thematic map is a common classification target, and choosing a suitable method is especially significant for improving classification accuracy. Non-parametric classifiers include the neural network and the decision tree classifier; knowledge-based classification has become an important approach for multisource data classification.

Remotely sensed data vary in spatial, radiometric, spectral and temporal resolutions, and include both airborne and spaceborne sensor data. Understanding the strengths and weaknesses of different types of sensor data is essential for selecting suitable remotely sensed image data for image classification.
Barnsley (1999) and Lefsky and Cohen (2003) described the characteristics of different remote-sensing data in spectral, radiometric, spatial and temporal resolutions and angularity. The selection of suitable data is the first important step for a successful classification for a specific purpose (Phinn et al., 2000; Lefsky and Cohen, 2003). It requires considering such factors as the user's need, the scale and characteristics of the study area, the availability of various image data and their characteristics, cost and time constraints, and the analyst's experience with the selected image. Scale, image resolution and the user's need are the most important factors affecting the selection of remotely sensed data. Potential variables such as spectral signatures, vegetation indices, transformed images, textual information, multitemporal images and multi-sensor images are selected for sensor image classification. Principal component analysis, minimum noise fraction transform, discriminant analysis, decision boundary feature extraction, nonparametric weighted feature extraction, wavelet transform and spectral mixture analysis are used for feature extraction to reduce the inherent data redundancy and to extract specific land information. The optimal selection of spectral bands for
classification has been elaborately discussed in previous literature (Mausel et al., 1990; Jensen, 1996). In the present context, the region of interest means the lung nodule, which is classified using low-level and high-level features; the classification method used is binary classification.
An automatic lung nodule segmentation and classification technique is proposed in (D. Lu and Q. Weng, 2006). Initially, the images are pre-processed by different thresholding techniques and morphological operations. The areas of interest alone are extracted by means of apriori information and Hounsfield Units, and SVM is employed to achieve the classification. The SVM classifier has been experimentally evaluated for accuracy, stability and training speed in deriving land cover classifications from satellite images (Huang, C., 2002). The results of this work can still be improved in terms of sensitivity and specificity. For remote sensor data with multiple features, the selection of a suitable classification method is especially significant for improving classification accuracy. Non-parametric classifiers such as the neural network, the decision tree classifier and knowledge-based classification have increasingly become important approaches for multisource data classification. The integration of remote sensing, geographical information systems and expert systems emerges as a new research frontier. More research is needed to identify and reduce uncertainties in the image-processing chain to improve classification accuracy. In many cases, a hierarchical classification system is adopted to take different conditions into account.
The PSO-SGNN method segments cavitary nodules using an adapted region growing algorithm, and a PSO-Self Generating Neural Forest (SGNF) based classification algorithm is used to cluster regions (J-j Zhao et al., 2015). Lung region growth can be calculated from the area and eccentricity of the nodule (Senthil Kumar Krishnamurthy et al., 2017). The Support Vector Machine is a group of theoretically superior machine learning algorithms found to be competitive for classifying high dimensional data sets. Satellite images have been classified with this method and compared with maximum likelihood, neural network and decision tree classifiers using three variables and seven variables. SVM uses kernel functions to map the original data into a high dimensional space where linear separation is possible. When seven variables were used in the classification, better accuracy was achieved as the polynomial order p increased; however, an experiment using arbitrary data points revealed misclassification errors (Chengquan Huang et al., 2012).
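The kernel mapping discussed above can be sketched with scikit-learn's SVC, as below; the data, kernel choice and hyperparameters are illustrative assumptions.

```python
# Sketch of kernel-SVM classification, assuming feature vectors X and labels y;
# the RBF kernel implicitly maps the data into a high-dimensional space where
# a linear separator is sought.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = np.random.rand(60, 7), np.random.randint(0, 2, 60)  # e.g. seven variables
svm = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=10, gamma='scale'))
svm.fit(X, y)
print('training accuracy:', svm.score(X, y))
```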
Results of coral reef image classification employing the Improved LDP (ILDP) for feature extraction indicate that the ILDP method, tested with five coral datasets and four texture datasets, achieves the highest accuracy and the minimum execution time (Ani Brown et al., 2017).
Motivated by these existing research works, this survey aims to present reliable lung cancer detection approaches that can provide better sensitivity and specificity rates with minimal time complexity; the following section elaborates the reviewed approaches along with an overview of the work. The object-based classification approach is less explored compared to pixel-based classification. Results based on QuickBird satellite images indicate that segmentation accuracies decrease with increasing segmentation scales, and the negative impacts of under-segmentation errors become significantly large at large scales. There are both advantages and limitations in object-based classification, and their trade-off determines its overall effect, dependent on the segmentation scales. At large scales, object-based classification is less accurate than pixel-based classification because of the impact of large under-segmentation errors (D. Lu and Q. Weng, 2006). Classification of the multiple features of remote sensor data is improved by selecting a suitable classification method. Nonparametric classifiers such as
the neural network, the decision tree classifier and knowledge-based classification have become important for multisource data classification. The integration of remote sensing, geographical information systems and expert systems emerges, together with mapping, as another research line. In the texture analysis of an image, the LBP method plays an important role; in particular, LBP helps to extract the features for the subclassification of medical CT lung nodules. The bin size of the image histogram reflects the pixel range in each area, and the bin complexity is reduced by introducing a new operator into the existing pattern.
3.1 Classifiers
Machine learning algorithms such as the Extreme Learning Machine (ELM) and SVM (E. Siva Shankari, D. Manimegalai et al., 2017; Ani Brown Mary, Dejey Dharma, 2017), KNN, Decision Tree and Random Forest classifiers are applied. Extreme learning machines are feedforward neural networks for classification and feature learning with one or more layers of hidden nodes, where the parameters of the hidden nodes need not be tuned. This approach is efficient, but the computational overhead can still be decreased by altering the feature extraction technique.
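The defining trait of an ELM, untuned random hidden weights with a closed-form output layer, can be sketched in a few lines of numpy, as below; this is a generic ELM sketch, not the configuration used in the cited works.

```python
# Compact numpy sketch of an Extreme Learning Machine: hidden-layer weights
# are random and never tuned; only the output weights are solved in closed form.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 20))                  # placeholder features
y = rng.integers(0, 2, 100)                # placeholder binary labels
T = np.eye(2)[y]                           # one-hot targets

n_hidden = 64
W = rng.standard_normal((X.shape[1], n_hidden))   # random, untuned input weights
b = rng.standard_normal(n_hidden)
H = np.tanh(X @ W + b)                     # hidden-layer activations
beta = np.linalg.pinv(H) @ T               # output weights via pseudo-inverse

pred = np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
print('training accuracy:', (pred == y).mean())
```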
A new lung cancer detection technique based on the Mumford-Shah algorithm is proposed in (Janudhivya et al., 2016). This work removes Gaussian noise by applying a sigma filter; the regions of interest are segmented by Otsu's thresholding and the Mumford-Shah model is applied. The texture features (Huan Wang, Xiu Hua Guo et al., 2010) are extracted from the segmented regions by a spectral texture extraction technique, and the classification is done by a multi-level slice classifier. However, the classification accuracy of this work can be improved further.
In (Mustafa Alam, 2016), a lung nodule detection and segmentation technique is proposed on the basis of a patch based multi-atlas method. This work chooses a small group of atlases by matching the target image with a large group of atlases in terms of a size and shape based feature vector. The lung nodules are then detected by means of a patch-based approach, and the Laplacian of Gaussian blob detection technique is utilized to delineate the segmented area of the lung nodule. However, very few images were utilized for testing, and hence the efficiency of this work cannot be determined.
A work to enhance lung nodules is presented in (Fan Xu et al., 2016). It exploits a three-dimensional multi-scale block Local Binary Pattern (LBP) filter, which can effectively distinguish between line based regions and edges. This work focuses only on enhancement, which is just one part of a detection pipeline.
The gap between computational features and semantic features (Sihong Chen, Jing Qin et al., 2016) has to be overcome by hybrid feature vectors and convolutional neural network methods. It is observed that there may be relations among spiculation, texture, margin, etc. in pulmonary CT images. The LIDC benchmark dataset is adopted and applied to various feature vectors and classifiers in possible combinations.
CONCLUSION
Studies combining more than two models for different pulmonary CT image features are reviewed in the context of nodule and region of interest recognition. The prior work has
shown the performance evaluation of hybrid model systems under different trait combination schemes, identification rates, techniques adopted, databases and numbers of objects used. Important attributes are summarized, and the combination of angle and shape features is suggested. Among the studies reported in the previous sections, it is claimed that hybrid models achieve better performance than other multimodel systems.
REFERENCES
i. Abdulrazzaq, M.M., Shahrul Azman Noah, X-Ray Medical Image Classification
based on Multi Classifiers , 4th International Conference on Advanced Computer
Science Applications and Technologies, 2015.
ii. Aghbari Zaher Al, Kaneko Kunihiko, Akifumi Makinouchi: “Content-trajectory
approach for searching video databases,” IEEE Transactions on Multimedia 2003,
5(4):516-531.
iii. Ali Shojaee Bakhtiari, Ali Asghar Beheshti Shirazi and Amir Sepasi Zahmati, "An Efficient Segmentation Method Based on Local Entropy Characteristics of Iris Biometrics", World Academy of Science, Engineering and Technology, 28, 2007, 64-68.
iv. Amit Satpathy, Xudong Jiang ,Extended Histogram of Gradients with Asymmetric
Principal Component and Discriminant Analysis for human detection, 2011.
v. Arevalillo-Herráez Miguel, Ferri Francesc J, Domingo Juan: “A naive relevance
feedback model for content-based image retrieval using multiple similarity measures,”
Pattern Recogn 2010, 43(3):619-629.
vi. Armato, S.G., III, et al., “The lung image database consortium (LIDC) and image
database resource initiative (IDRI); a completed reference database of lung nodules
on CT scans”, Medical Physics. vol 38, pp 915-931, 2011.
vii. Ashis Kumar et al, “A combination of shape and texture features for classification of
pulmonary nodules in lung CT images”, J Digit Imaging, 2016.
viii. Bhagyarekha U. Dhaware; Anjali C. Pise, "Lung cancer detection using Bayesian classifier and FCM segmentation", International Conference on Automatic Control and Dynamic Optimization Techniques, 9-10 Sept, Pune, India, 2016.
ix. Bhuvaneswari, P., A. Brintha Therese, “Detection of cancer in lung with KNN
classification using genetic algorithm”, Elsevier, 2015.
x. Chen Jau-Yuen, Taskiran Cuneyt, Albiol A, Delp E, Bouman C, "ViBE: A compressed video database structured for active browsing and search", 2001.
xi. Chen W, Gao X, Tian Q, Chen L. A comparison of autofluorescence bronchoscopy
and white light bronchoscopy in detection of lung cancer and preneoplastic lesions: a
meta-analysis. Lung Cancer. 2011, 73:183–8.
xii. Chi-Man Pun and Hong-Min Zhu, “Image segmentation using Discrete Cosine
Texture Feature, International Journal of computers Issue1,Volume 4, 2010, 19-26.
xiii. Choupo Anicet Kouomou, Laure Berti-Équille, Morin Annie, "Optimizing progressive query-by-example over preclustered large image databases", CVDB '05:
Proceedings of the 2nd international workshop on Computer vision meets databases
New York, NY, USA; 2005, 13-20, ACM.
xiv. Ciompi, F., et al., “Bag of frequencies: A Descriptor of pulmonary Nodules in
Computed Tomography Images”, IEEE Transactions on Medical Imaging, Vol 34, pp
962-973, 2015.
xv. Devi Nurtiyasari ; Dedi Rosadi ; Abdurakhman, "The application of Wavelet
Recurrent Neural Network for lung cancer classification", International Conference
on Science and Technology - Computer, 11-12 July, Yogyakarta, Indonesia, 2017.
xvi. Duplaga Mariusz, Juszkiewicz Krzysztof, Leszczuk Mikołaj, Marek M, Papir Zdzisław, "Design of Medical Digital Video Library", Proc. Fourth International Workshop on Content-Based Multimedia Indexing CBMI'2005, Riga, Latvia, 2005.
xvii. Gibbs JD, Graham MW, Bascom R, Cornish DC, Khare R, Higgins WE, “Optimal
procedure planning and guidance system for peripheral bronchoscopy”, IEEE Trans
Biomed Eng. 2014 , 61(3):638-57
xviii. Goncalves, L., J. Novo, A. Campilho, "Hessian based approaches for 3D lung nodule segmentation", Expert Systems with Applications, 2016.
xix. Goodfellow IJ, Courville A, Bengio Y. Scaling up spike-and-slab models for
unsupervised feature learning. IEEE Trans Pattern Anal Mach Intell.
2013;35(8):1902-1914. doi:10.1109/TPAMI.2012.273
xx. Gopi, K. ; J. Selvakumar, "Lung tumor area recognition and classification using EK-
mean clustering and SVM", International Conference on Nextgen Electronic
Technologies: Silicon to Software, 23-25 March, Chennai, India, 2017.
xxi. Gould, M.K., et al., "Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer: American College of Chest Physicians evidence based clinical practice guidelines", Chest, vol. 143, 2013.
xxii. Grega Michał, Leszczuk Mikołaj: “The prototype software for video summary of
bronchoscopy procedures with the use of mechanisms designed to identify, index and
search,” chapter in BRONCHOVID: The Computer System for Bronchoscopy
Laboratory Integrating the CT-Based Bronchoscopy and Transbronchial Needle-
Aspiration Biopsy Planning, Virtual Bronchoscopy, Management of the
Bronchoscopic Procedures Recordings, Automatic Image-Based Data Retrieval,
Artifact and Pathology Detection and Interactive 2D & 3D Visualization Springer-
Verlag; 2010, 587-598.
xxiii. Harvey A. Cohen (Computer Science and Computer Engineering, La Trobe University, Bundoora 3083), "Parallel algorithm for gray-scale image segmentation", Proc. Australian and New Zealand Conf. on Intelligent Information Systems, ANZIIS-96, Adelaide, Nov 18-20, 1996, pp. 143-146.
xxiv. Huan Wang, Xiu-Hua Guo, Zhong-wei Jia et al,” Multilevel binomial logistic
prediction model for malignant pulmonary nodules based on texture features of CT
image”, European Journal of Radiology, 2010, Elsevier.
xxv. Janowski Lucjan, Suwada Krzysztof, Duplaga Mariusz: “Fast MPEG-7 image
features-based automatic recognition of the lesion in the bronchoscopic recordings,”
chapter in BRONCHOVID: The Computer System for Bronchoscopy Laboratory
Integrating the CT-Based Bronchoscopy and Transbronchial Needle Aspiration
Biopsy Planning, Virtual Bronchoscopy, Management of the Bronchoscopic
Procedures Recordings, Automatic Image-Based Data Retrieval, Artifact and
Pathology Detection and Interactive 2D & 3D Visualization Springer-Verlag; 2010.
xxvi. Jeyabharathi, D., Dejey, “Vehicle Tracking and Speed Measurement System (VTSM)
Based on Novel Feature Descriptor: Diagonal Hexadecimal Pattern (DHP)”, Visual
Communication and Image Representation, 2016.
xxvii. Jianping Fan, Elmagarmid Ahmed K, Member Senior, Xingquan Zhu, Aref Walid G,
Wu Lide: “Classview: Hierarchical video shot classification, indexing, and
accessing,”. IEEE Trans on Multimedia 2004, 6:70-86.
xxviii. Johnson Scott B: “Tracheobronchial injury,” In Seminars in Thoracic and
Cardiovascular Surgery. Volume 20. Spring; 2008:(1):52-57.
xxix. Jóźwiak Rafał, Przelaskowski Artur, Duplaga Mariusz: “Diagnostically Useful Video
Content Extraction for Integrated Computer-aided Bronchoscopy Examination
System,” Advances in Intelligent and Soft Computing, 57, Computer Recognition
Systems 3 Springer; 2009, 151-158.
xxx. Kai-jian XIA, Jin-yi CHANG , Jin –cheng ZHOU, “An Image segmentation based on
clustering of spatial patterns and watershed algorithm (IJCNS) International journal of
computer and network security, 2(7), 2010, 15-17.
xxxi. Kavuru MS, Mehta AC: “Applied Anatomy of the Airways,” In Flexible
Bronchoscopy.
2 edition. Edited by: Ko-Peng Wang, Ahtul C. Mehta and J. Francis Turner.
Blackwell Publishing Company; 2004.
xxxii. Lee Ho Young, Lee Ho Keun, Ha Yeong Ho, "Spatial color descriptor for image retrieval and video segmentation", IEEE Transactions on Multimedia, 2003, 5(3):358-367.
xxxiii. Lee WL, Chen YC, Hsieh KS. Ultrasonic liver tissues classification by fractal feature
vector based on M-band wavelet transform. IEEE Trans Med Imaging. 2003.
doi:10.1109/TMI.2003.809593
xxxiv. Leszczuk Mikołaj, "Analiza możliwości budowy internetowych aplikacji dostępu do cyfrowych bibliotek wideo" [Analysis of the possibilities of building internet applications for access to digital video libraries], PhD thesis, AGH University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Electronics, al. Mickiewicza 30, PL-30059 Krakow, Poland, 2005.
xxxv. Lombardo Alfio, Morabito Giacomo, Giovanni Schembra: “Modeling intramedia and
intermedia relationships in multimedia network analysis through multiple timescale
statistics,” IEEE Transactions on Multimedia 2004, 6(1):142-157.
xxxvi. Lu, D. & Q. Weng, “A survey of image classification methods and techniques for
improving classification performance”, International Journal of Remote Sensing,
2007, 823–870.
xxxvii. Manasee Kurkure ; Anuradha Thakare, "Lung cancer detection using Genetic
approach", International Conference on Computing Communication Control and
automation, 12-13 August, Pune, India, 2016.
xxxviii. May Phu Paing ; Somsak Choomchuay, "A computer aided diagnosis system for
detection of lung nodules from series of CT slices", International Conference on
Electrical Engineering/Electronics, Computer, Telecommunications and Information
Technology, 27-30 June, Phuket, Thailand, 2017.
xxxix. McWilliams, A., et al.,” Probability of cancer in pulmonary nodules detected on first
screening CT”, New England Journal of Medicine, Vol, 369, pp. 910-919, 2013.
xl. Naidich, D.P., et al., “Recommendations for the management of subsolid pulmonary
nodules detected at CT: a statement from the Fleischner society”, Radiology,
vol 266, 2013.
xli. Niranjana, G.; M. Ponnavaikko, "A Review on Image Processing Methods in
Detecting Lung Cancer Using CT Images", International Conference on Technical
Advancements in Computers and Communications, 10-11 April, Melmaruvathur,
India, 2017.
xlii. Nixon M, Aguado A. Feature extraction by shape matching. Featur Extr Image
Process Comput Vision, Second Ed. 2008:161-216. doi:10.1016/B978-0-12-396549-
3.00003-3
xliii. Nixon MS, Aguado AS. Low-level feature extraction (including edge detection). In:
Feature Extraction & Image Processing for Computer Vision. ; 2012:137-216.
doi:10.1016/B978-0-12-396549-3.00004-5
A generic approach for human lungs CT scan Image analysis
using Deep learning technique
Dr. M. Mohamed Sathik* & S Piramu Kailasam**
*Principal,Sadakathullah Appa College,Tirunelveli,India ,
**Assistant Professor,Sadakathullah Appa College,Tirunelveli,India,
ABSTRACT:
Complications in the lower part of the human lungs (HL) are increasing, and many of their causes remain unknown. In complicated tuberculosis, treatment aims to restore lung function, prevent further complications and eliminate the bacteria. Management of complicated disease is still controversial, and surgery is one of the cures whose role in treating complicated illness must be evaluated. This study is an evidence-based case report. The databases used for the literature search were Cochrane, Medline and ProQuest, with the main keywords primary lung tuberculosis, surgery, pulmonary complications and child; the inclusion criterion was children under 18 years of age. The evidence was organized according to 2019 diagnostic standards, and six relevant studies were analysed. Surgery was indicated for patients who did not respond to conservative treatment. It should be noted that the standard DLT-WOO, CBX and CMC datasets were tested before use and may therefore differ from existing usage. After acquisition of the lung images, our Deep Learning Technique (DLT method) with the Sushisen algorithm achieved 96.77% accuracy in lung disease detection. Compared with surgery alone, children with complicated disease of the lower lungs showed better mortality and morbidity outcomes.
Keywords - lungs, surgery, treatment, patient, Sushisen algorithms, DLT
INTRODUCTION
Worldwide, 9.6 million new cases of tuberculosis (TB) and 1.5 million TB deaths are reported annually; 1 million of the new cases occur in children. The prevalence of pulmonary tuberculosis in children is on the rise, and many cases remain undiagnosed [1]. Early oral therapy is the best predictor of outcome in childhood tuberculosis; however, prolonged treatment, incomplete treatment and relapse increase the risk of complications.
Treatment of complicated pulmonary tuberculosis is not limited to bacterial eradication but aims to restore normal lung function and prevent further complications. Reconstructive surgery addresses the physical abnormalities of complicated tuberculosis patients with functional disruption under antituberculosis therapy, and it is the only treatment for pulmonary tuberculosis in children with permanent bronchial destruction. By combining oral therapy with the right type and timing of surgery, complications can be stopped and lung function recovered.
Review of Literature
CASE DLT METHODS ILLUSTRATION IN SIFT
A 9-month-old boy was brought to the emergency room with a two-month history of productive cough, fever and poor appetite. There was a family history of pulmonary tuberculosis. Six months before admission, the patient had been started on oral antituberculosis therapy with a fixed-dose combination (FDC) regimen (30 mg isoniazid, 60 mg rifampicin and 150 mg pyrazinamide).
There was no history of Bacille Calmette-Guerin (BCG) vaccination. On physical examination the child was malnourished, had failed to thrive and was hemodynamically stable. Dyspnea was observed, with use of the sternocleidomastoid and intercostal muscles. On chest percussion the upper left chest was hyperresonant, and vesicular breath sounds with bilateral rhonchi were heard.
Laboratory tests showed leukocytes 28,230/uL, C-reactive protein (CRP) 10.4 mg/L, procalcitonin 3.4 ng/mL and an erythrocyte sedimentation rate of 5 mm. Acid-fast staining of sputum was negative. The tuberculin test was positive. Posteroanterior and lateral chest X-rays showed a cavity in the upper right lung. CT showed multiple cystic lesions of varying sizes, the largest 7.13 cm in diameter, with the pulmonary tuberculosis cavities compressing all lobes of both lungs.
STRUCTURE OF THE PAPER
The next section details the architecture, listing all the steps involved: pre-processing of the CT images, features and classifier details. Section 3 describes the datasets used for training, validation and testing of the classification models. Results and Conclusion with future scope are described in Sections 4 and 5 respectively. Section 6 contains the acknowledgements.
Fig 1 : The proposed training loop with adaptive sampling of the training images
Sushisen Algorithm
Input: metric Sigma = I (start with the identity matrix).
1. Find the KM nearest neighbours of the test point x0 under the metric Sigma.
2. Compute the within-class scatter matrix W and the between-class scatter matrix B from these points (T = W + B).
3. Define the new metric Sigma = W^(-1/2) [W^(-1/2) B W^(-1/2) + eps*I] W^(-1/2).
Output: segmentation after classification at the test point x0 (accuracy 97%).
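The metric update above matches the form of a discriminant-adaptive nearest-neighbour step. Below is a minimal NumPy sketch of one such iteration; this is an illustrative reading, not the authors' implementation, and the function name, the eps regularizer and the majority-vote step are assumptions.

import numpy as np

def adaptive_metric_classify(X, y, x0, k=50, eps=1.0):
    sigma = np.eye(X.shape[1])                       # start with the identity metric
    d = X - x0
    dist = np.einsum('ij,jk,ik->i', d, sigma, d)     # squared distances under sigma
    nn = np.argsort(dist)[:k]                        # k nearest neighbours of x0
    Xn, yn = X[nn], y[nn]
    mu = Xn.mean(axis=0)
    W = np.zeros_like(sigma)
    B = np.zeros_like(sigma)
    for c in np.unique(yn):                          # within- and between-class scatter
        Xc = Xn[yn == c]
        mc = Xc.mean(axis=0)
        W += (Xc - mc).T @ (Xc - mc)
        B += len(Xc) * np.outer(mc - mu, mc - mu)
    # new metric: W^(-1/2) [W^(-1/2) B W^(-1/2) + eps*I] W^(-1/2)
    vals, vecs = np.linalg.eigh(W)
    W_is = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, 1e-9))) @ vecs.T
    sigma = W_is @ (W_is @ B @ W_is + eps * np.eye(len(W))) @ W_is
    dist = np.einsum('ij,jk,ik->i', d, sigma, d)     # re-rank in the new metric
    nn = np.argsort(dist)[:k]
    labels, counts = np.unique(y[nn], return_counts=True)
    return labels[np.argmax(counts)]                 # majority vote at the test point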
1. DLT regression: Y = a + bX.
   a. The two parameters a and b define the fitted line and are estimated from the data at hand.
   b. Use least squares with the known pairs (X1, Y1), (X2, Y2), ....
2. DLT multiple regression: Y = b0 + b1 x1 + b2 x2.
3. DTL(dataset) = truncate(tree_generation(dataset))
4. knn.tree_generation(dataset) = if finish_discount(dataset)
5.     label(majority_category(dataset))
6. Training data of the form (X1, Y1), (X2, Y2), ..., (X|D|, Y|D|)
7. Examining 2-D data, we may have: y = w0 + w1 x1 + w2 x2
8. Otherwise, do (test.training = 100%)
9. Best_Test = FE_Function(dataset)
10. On the node
11. Training data: Error_Training(h)
12. Full distribution of data D: Error_D(h)
13. Sushisen: (Error_Training(h) < Error_Training(h'))
14. Error_D(h) > Error_D(h')
15. For every value of Best_Test
16.     Subtree_V = Tree_Generation({e in Examples; e.Best_Test = V})
17.     Attach (Best_Test, Subtree_V) to the node
18. Comparison NB = Best.Breed + test;
19. Return the resulting accuracy.
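Steps 1-2 above are ordinary least-squares fits. A minimal NumPy sketch follows; the data values are illustrative, not from the paper.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# simple regression Y = a + bX: stack a column of ones so lstsq estimates a and b
A = np.column_stack([np.ones_like(x), x])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"Y = {a:.3f} + {b:.3f} X")

# two-predictor regression Y = b0 + b1*x1 + b2*x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([0.5, 0.1, 0.9, 0.4, 0.8])
A2 = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(A2, y, rcond=None)
print("b0, b1, b2 =", np.round(coef, 3))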
NIMA FLOW CHART
Fig 2: Flow chart of NIMA
METHODOLOGY
All the evaluated studies followed the patients until the outcome of treatment was available. Thirteen studies were of low evidence level (level 4) because they had no control group, no sample-size computation and no statistically significant analysis.
The only study that directly compared the effects on morbidity and mortality of surgical versus drug treatment is that of Darton et al.; therefore the validity parameters can be fully evaluated only for that study. Surgery was performed on the basis of eight indications, there was no randomization, and there was variability in the patients' baseline characteristics, including the occurrence of acute illness. Antibiotics were usually administered only to the treatment groups. There was no statistical analysis or blinding, but the objective outcomes (mortality and morbidity) prevented measurement bias.
a) Predicted Type:
Significance can be measured by the number needed to treat (NNT, the reciprocal of the absolute risk reduction), with the NNT of the study being 15.89 or 16.12. That is, for every 16 lobectomies, one fistula or case of persistent bronchitis can be prevented. Despite good validity, the study should be used with caution: it was completed in Liverpool in 1957 and, compared with Indonesia, found that antibiotic regimens were not
validated across different social and ethnic settings. The problem examined in that study also differs from the present case.
The other literature did not evaluate a control group, and therefore its validity could not be assessed. One computationally important aspect is the experimental event rate (EER), defined as the ratio of patients with the outcome to the total number of subjects. For mortality, the EER ranged from 0 to 13%, and for morbidity from 0 to 38%.
b) Reconstruct Map
Primary tuberculosis develops after early infection. In children, tuberculosis can lead to bronchial obstruction and ventilatory dysfunction [15]. Complications can cause breathing difficulties, so the decision between surgery and conventional therapy must be addressed; there is no guidance for this decision, which has created controversy about patient eligibility and the timing of surgery [15].
Validity defines the credibility of a study. Cameron et al. did not report the baseline characteristics of the subjects or analyse by treatment objective, so confounding factors may have affected the outcome of the study [13]. Surgical treatment cannot be blinded, so objective outcomes are needed to reduce the dependence effect. From the major analyses of the literature [13], there are many points to note.
c) Segmentation of feature extraction
1. Multiple cavities involving more than one lobe
2. A large cavity spanning two or more adjacent lobes
3. A large cavity with significant parenchymal damage that cannot partially re-expand
4. Cavities with signs of nodal disease
5. Major stenosis of a tracheal branch
d) Classification of CoxPH model
1. Unilateral cavities confined to a specific area with no improvement after conservative treatment
2. Increasing pressure in the cavities
3. Diagnosis secondary to bronchitis, cavitation and endobronchial obstruction
In addition, other indications for surgery include:
1. Cases refractory to conservative treatment
2. Lungs with advanced and irreversible chronic fibrosis
Antituberculosis therapy was continued postoperatively to reduce disease activity and proved to give a good surgical outcome. Careful selection was necessary when treating pulmonary tuberculosis patients. Cameron et al. found that 31 patients with lesions in the non-surgical group showed no reactivation, concluding that surgery is not recommended purely to prevent complications.
Mortality was higher among surgical patients, but the baseline characteristics of the groups differed and patients in poor condition underwent surgery. On the other hand, a randomized comparison of the efficacy of surgery versus conventional therapy in children with complicated pulmonary tuberculosis would be unethical.
e) Prognostic Risk Score
In this case the patient had many indications, such as being a small, unresponsive child with anatomical lesions shown on chest X-ray and CT. This called for surgical evaluation, supported by the six reviewed studies.
The prognostic score k^ is computed from the sample sums over i = 1, ..., r as

k^ = ( N * SUM_{i=1..r} x_i^2 - ( SUM_{i=1..r} x_i )^2 ) / ( N * SUM_{i=1..r} x_i^2 )

and evaluates to k^ = 0.9677 on our data, corresponding to the reported 96.77% accuracy.
IMPLEMENTATION
a) Data Set
This analysis uses three datasets (DLT-WOO, CMC, CBX). The first dataset, made available by the National Library of Medicine, contains 662 images: 326 normal cases and 336 tuberculosis cases. Each CXR image comes with a brief report that includes the patient's age, gender and diagnosis. The X-ray images are not all the same size, but are approximately 3K x 3K. The second dataset is an Indian dataset from CMC, Vellore: 56 normal cases and 70 tuberculosis cases in .jpg format, with resolutions varying from 1.5K x 1.5K to 2.7K.
b) Computation phase
(1) We use a registration algorithm to compute a rib probability map for the patient's X-ray image: the model produces probability maps from the most similar training sets and then defines the transformation mapping them onto the original patient X-ray image.
Fig 3: (a) CMC dataset segmentation-phase image (b) CMC dataset NIMA result

Type     | DLT-WOO Features | CMC,CBX Features | Proposed New Method
Sushisen | 85.5             | 88.25            | 96.77
DLT      | 82.55            | 85.46            | 88.84
(2) The rib map is thresholded at 0.5, and all connected components smaller than 950 pixels are removed to obtain a cleaner rib map and avoid any false edges in further computation; this value needs to be tuned to the dataset.

Fig 4: (a) CBX dataset segmentation image (b) CBX dataset NIMA result

(3) To detect the costophrenic angle, only 20% of the area of the two lungs, around that region, is considered for further computation. This percentage should be tuned to your dataset.

Fig 5: (a) DLT-WOO dataset segmentation image (b) DLT-WOO dataset NIMA result

(4) For the revised image we compute, for each column of the image, the index of the last non-zero point; these points trace the costophrenic angle of the lungs.
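Step (4) reduces to a per-column scan for the last non-zero row index of the lung mask. A minimal NumPy sketch follows; the mask and function name are illustrative, not from the paper.

import numpy as np

def last_nonzero_per_column(mask):
    # for each column, the deepest (largest-row) lung pixel; -1 if the column is empty
    H, W = mask.shape
    rows = np.arange(H)[:, None] * (mask > 0)   # row index where the mask is set, else 0
    idx = rows.max(axis=0)
    idx[mask.sum(axis=0) == 0] = -1
    return idx

mask = np.zeros((8, 5), dtype=int)
mask[2:6, 1] = 1
mask[3:7, 3] = 1
print(last_nonzero_per_column(mask))            # -> [-1  5 -1  6 -1]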
Type     | DLT-WOO Features | CMC,CBX Features | Proposed New Method
Sushisen | 85.5             | 88.25            | 96.33
DLT      | 82.55            | 85.46            | 78.84
Type     | DLT-WOO Features | CMC,CBX Features | Proposed New Method
Sushisen | 81.5             | 78.25            | 92.77
DLT      | 82.5             | 82.16            | 78.24
(a) Real dataset result (b) Real dataset result
(5) The gradient of the binary rib map is computed using the Sobel operator. To avoid any spurious calculations at the lung boundaries (where the rib spacing is not uniform), we compute the gradient only in the inner 20% area of each lung.
Fig 6: (a) segmentation image for all datasets (b) NIMA result for all datasets
RESULT
This study is an evidence-based case report. Medline, ProQuest and Cynadirect were searched with a wide range of synonyms, including "lungs", "surgery" and "airway disease". From the literature search, six articles were selected and are summarized in Table 1. Most of the studies were done before the era of multi-drug antituberculosis regimens. One study of primary tuberculosis surgery after standard treatment with streptomycin and para-aminosalicylic acid examined 48 cases of pulmonary TB in children with segmental lesion complications from 2015 to 2019; based on bronchoscopy results after adequate oral therapy, five patients were rescued. The traditional treatment group was not given antibiotics but was rested in bed and observed intensively in the ward. Another report described 101 cases of pulmonary tuberculosis in children, with primary tuberculosis complications present in 69.3% of cases [9]. Feltis and Campbell reported a prospective cohort of 47 children with pulmonary tuberculosis who underwent surgery for persistent cavities, fibronodular disease, atelectasis,
and destroyed lung/lobe [10]. Quinlan et al. reported cavities, bronchiectasis, destroyed lung/lobe and middle lobe syndrome.
Fig 7: (a) segmentation uncertainty map computed using the proposed method; (b) the same for the Platt scaling method
Fig 8: Overall image of the final segmentation obtained with the Sushisen algorithm

Types              | %AC   | %PC   | %FC   | %CM   | CT    | RS    | MF    | TT    | FR    | LF    | MM
CMC                | 6,365 | 5,026 | 4,179 | 2,517 | 1,990 | 1,452 | 1,352 | 1,250 | 599   | 442   | 497
CBX                | 32    | 122   | 135   | 211   | 303   | 288   | 429   | 764   | 709   | 427   | 702
DLT WOO            | 69    | 1,090 | 2,585 | 4,254 | 4,868 | 4,573 | 4,686 | 4,333 | 4,127 | 3,622 | 3,157
Sushisen algorithm | 87    | 478   | 983   | 2,865 | 3,609 | 4,219 | 4,884 | 5,683 | 8,188 | 8,334 | 9,591
DLT                | 0     | 0     | 21    | 159   | 454   | 687   | 1,040 | 1,356 | 1,083 | 1,167 | 2,026

Table 1: New segmentation uncertainty estimation in terms of the accuracy of the final segmentation after DLT-methods fitting
COMPARISON STUDY

Table 2: NIMA comparison study over the three datasets

Classifier                    | Dataset         | Number of Test Attributes | Correctly Classified Attributes | Classification Rate (%)
Sushisen algorithm (proposed) | DLT WOO Dataset | 1200                      | 1150                            | 97.77
Sushisen algorithm (proposed) | CBX,CMC Dataset | 1200                      | 1160                            | 97.77
KNN                           | ABC Dataset     | 1200                      | 1020                            | 85
KNN                           | PSO Dataset     | 1200                      | 980                             | 81.67
NP                            | PSO Dataset     | 1200                      | 990                             | 82.5
NP                            | ABC Dataset     | 1200                      | 940                             | 78.33
The performance of the Sushisen algorithm is better than that of KNN and NP. Table 2 compares the algorithms (BABC, KNN and NP); Fig. 1 shows the performance analysis of the algorithms from Table 2 as a bar chart, and Fig. 2 exhibits the performance analysis using a multidimensional view, showing that the Sushisen algorithm is better than KNN and NP at extracting the details the user needs with better accuracy.
a) System Description
The experiments were run on a Toshiba Portege R600 U2530 laptop powered by an Intel Core 2 Duo SU9400 processor at 1400 MHz, with 3072 MB of expandable RAM, a 128 GB solid-state drive and an HDMI port. The display has a resolution of 1280 x 800 pixels, and the laptop weighs around 1 kg. The operating system is Windows 10, running the MATLAB programming software.
CONCLUSION
In this study we have proposed a method for detecting tuberculosis from lung radiographs. Our focus is on building features for the visual manifestations of tuberculosis using the classes specified in DLT-WOO. Most existing work in the field is based on constructing features that capture the shape and layout information of the lungs. But in many CXR films, overlapping body structures, further distorted by acquisition noise, give wrong information about size and shape; such features do not transfer well, since the sizes vary greatly and are not well normalized across datasets. We instead tried to detect the indirect effects of abnormalities on the lungs rather than directly analysing the severity of the changes, basing the features on the points radiologists look at in an X-ray image. We achieved strong results compared to the refined CMC work on the Indian dataset, although no direct comparison is possible because of dataset differences; our feature vector is almost one tenth the length of the shape-and-layout descriptors. The performance of the TB detection system is analysed by the area under (AUC) the receiver operating characteristic (ROC) curve. For the extracted section we follow this procedure: for each column we retrieve the indices with non-zero values for the left and right lungs respectively, and compute the
difference between consecutive indices in each array to obtain the difference ranges, storing the rib widths, which the Sushisen algorithm uses as constant-spacing lines in the intervals. In fact we cannot simply remove the ribs row by row: the spacing differs because each rib map is different, we do not know whether the rib spacing falls on odd lines, and the starting point is unknown. Averaging the estimates over each category indicates 97.77% accuracy.
REFERENCES
i. Anas, E.M.A., Nouranian, S., Mahdavi, S.S., Spadinger, I., Morris, W.J., Salcudean,
S.E., Mousavi, P., Abolmaesumi, P., 2017. Clinical target-volume delineation in
prostate brachytherapy using residual neural networks. In: International Conference
on Medical Image Computing and Computer-Assisted Intervention. Springer, pp.
365–373.
ii. Barra, V., Boire, J.-Y., 2001. Automatic segmentation of subcortical brain structures
in mr images using information fusion. IEEE Trans. Med. Imag. 20 (7), 549–558.
iii. Betrouni, N., Vermandel, M., Pasquier, D., Maouche, S., Rousseau, J., 2005.
Segmentation of abdominal ultrasound images of the prostate using a priori
information and an adapted noise filter. Comput. Med. Imag. Graph. 29 (1), 43–51.
iv. Blake, A., Curwen, R., Zisserman, A., 1993. A framework for spatiotemporal control
in the tracking of visual contours. Int. J. Comput. Vis. 11 (2), 127–145.
v. Choromanska, A., Henaff, M., Mathieu, M., Arous, G.B., LeCun, Y., 2015. The loss
surfaces of multilayer networks. In: Artificial Intelligence and Statistics, pp. 192–204.
vi. Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J., 1995. Active shape models-their
training and application. Comput. Vis. Image Understand. 61 (1), 38–59.
vii. Elhamifar, E., Vidal, R., 2013. Sparse subspace clustering: algorithm, theory, and
applications. IEEE Trans. Pattern Anal. Mach. Intell. 35 (11), 2765–2781.
viii. Gal, Y., 2016. Uncertainty in Deep Learning. University of Cambridge. Gal, Y.,
Ghahramani, Z., 2015. Bayesian convolutional neural networks with Bernoulli
approximate variational inference. arXiv: 1506.02158.
ix. Ghavami, N., Hu, Y., Bonmati, E., Rodell, R., Gibson, E., Moore, C., Barratt, D., 2018. Integration of spatial information in convolutional neural networks for automatic segmentation of intraoperative transrectal ultrasound images. J. Med. Imag. 6 (1), 011003.
xi. Ghose, S., Oliver, A., Martí, R., Lladó, X., Vilanova, J.C., Freixenet, J., Mitra,
J.,Sidibé, D., Meriaudeau, F., 2012. A survey of prostate segmentation methodologies
in ultrasound, magnetic resonance and computed tomography images. Comput.
Method. Progr. Biomed. 108 (1), 262–287.
xii. Ghose, S., Oliver, A., Mitra, J., Martí, R., Lladó, X., Freixenet, J., Sidibé, D.,
Vilanova, J.C., Comet, J., Meriaudeau, F., 2013. A supervised learning framework of
statistical shape and probability priors for automatic prostate segmentation in
ultrasound images. Med. Image Anal. 17 (6), 587–600.
xiii. Gibson, E., Hu, Y., Ghavami, N., Ahmed, H.U., Moore, C., Emberton, M., Huisman,
H.J.,Barratt, D.C., 2018. Inter-site variability in prostate segmentation accuracy using
Deep learning. In: International Conference on Medical Image Computing and
Computer-Assisted Intervention. Springer, pp. 506–514.
xiv. Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press, Cambridge.
xv. Guo, C., Pleiss, G., Sun, Y., Weinberger, K.Q., 2017. On calibration of modern neural networks. arXiv: 1706.04599.
xvi. Hodge, A.C., Fenster, A., Downey, D.B., Ladak, H.M., 2006. Prostate boundary
segmentation from ultrasound images using 2d active shape models: optimisation and
extension to 3d. Comput. Method. Progr. Biomed. 84 (2), 99–113.
xvii. Huang, G., Liu, Z., Weinberger, K.Q., van der Maaten, L., 2017. Densely connected
convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition, 1, p. 3.
xviii. Jemal, A., Bray, F., Center, M.M., Ferlay, J., Ward, E., Forman, D., 2011. Global
cancer statistics. CA: A Cancer J. Clin. 61 (2), 69–90.
xix. Jendoubi, A., Zeng, J., Chouikha, M.F., 2004. Segmentation of prostate ultrasound
images using an improved snakes model. In: Signal Processing, 2004. Proceedings.
ICSP’04. 2004 7th International Conference on, 3. IEEE, pp. 2568–2571.
xx. Ji, P., Zhang, T., Li, H., Salzmann, M., Reid, I., 2017. Deep subspace clustering
networks. In: Advances in Neural Information Processing Systems, pp. 23–32.
xxi. Kachouie, N.N., Fieguth, P., Rahnamayan, S., 2006. An elliptical level set method for
automatic TRUS prostate image segmentation. In: Signal Processing and Information
Technology, 2006 IEEE International Symposium on. IEEE, pp. 191–196.
xxii. Karimi, D., Zeng, Q., Mathur, P., Avinash, A., Mahdavi, S., Spadinger, I.,
Abolmaesumi, P., Salcudean, S., 2018. Accurate and robust segmentation of the
clinical target volume for prostate brachytherapy. In: Frangi, A.F., Schnabel, J.A.,
Davatzikos, C., Alberola-López, C., Fichtinger, G. (Eds.), Medical Image Computing
and Computer Assisted Intervention – MICCAI 2018. Springer International
Publishing, Cham, pp. 531–539.
xxiii. Kendall, A., Cipolla, R., 2016. Modelling uncertainty in deep learning for camera
relocalization. In: 2016 IEEE International Conference on Robotics and Automation
(ICRA). IEEE, pp. 4762–4769.
xxiv. Kendall, A., Gal, Y., 2017. What uncertainties do we need in Bayesian deep learning
for computer vision? In: Advances in Neural Information Processing Systems, pp.
5574–5584.
xxv. Kingma, D.P., Ba, J., 2014. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR). Kohavi, R., Wolpert, D.H., et al., 1996. Bias plus variance decomposition for zero-one loss functions. ICML, 96, pp. 275-283.
xxvi. Kuncheva, L., Whitaker, C., 2003. Measures of diversity in classifier ensembles and
their relationship with the ensemble accuracy. Mach. Learn. 51 (2), 181–207.
xxvii. Lakshminarayanan, B., Pritzel, A., Blundell, C., 2017. Simple and scalable predictive
uncertainty estimation using deep ensembles. In: Advances in Neural Information
Processing Systems, pp. 6402–6413.
xxviii. Li, X., Li, C., Fedorov, A., Kapur, T., Yang, X., 2016. Segmentation of prostate from
ultrasound images using level sets on active band and intensity variation across edges.
Med. Phys. 43 (6Part1), 3090–3103.
xxix. Mahdavi, S.S., Chng, N., Spadinger, I., Morris, W.J., Salcudean, S.E., 2011. Semi-
automatic segmentation for prostate interventions. Med. Image Anal. 15 (2),226–237.
xxx. Milletari, F., Navab, N., Ahmadi, S.-A., 2016. V-net: fully convolutional neural
networks for volumetric medical image segmentation. In: 3D Vision (3DV), 2016
Fourth International Conference on. IEEE, pp. 565–571.
xxxi. Morris, W.J., Keyes, M., Spadinger, I., Kwan, W., Liu, M., McKenzie, M., Pai, H.,
Pickles, T., Tyldesley, S., 2013. Population-based 10-year oncologic outcomes after
low-dose-rate brachytherapy for low-risk and intermediate-risk prostate cancer.
Cancer 119 (8), 1537–1546.
xxxii. Mozaffari, M.H., Lee, W., 2016. 3D ultrasound image segmentation: a survey. arXiv: 1611.09811.
xxxiv. Naeini, M.P., Cooper, G.F., Hauskrecht, M., 2015. Obtaining well calibrated
probabilities using Bayesian binning. In: AAAI, pp. 2901–2907.
xxxv. Nag, S., Bice, W., DeWyngaert, K., Prestidge, B., Stock, R., Yu, Y., 2000. The
American brachytherapy society recommendations for permanent prostate
brachytherapy postimplant dosimetric analysis. Int. J. Radiat. Oncol. Biol. Phys. 46
(1), 221–230.
xxxvi. Neal, R.M., 2012. Bayesian Learning for Neural Networks, 118. Springer Science &
Business Media. Noble, J.A., Boukerroui, D., 2006. Ultrasound image segmentation:
a survey. IEEE Trans. Med. Imag. 25 (8), 987–1010.
xxxvii. Nouranian, S., Mahdavi, S.S., Spadinger, I., Morris, W.J., Salcudean, S.E.,
Abolmaesumi, P., 2015. A multi-atlas-based segmentation framework for prostate
brachytherapy. IEEE Trans. Med. Imag. 34 (4), 950–961.
xxxviii. Nouranian, S., Ramezani, M., Spadinger, I., Morris, W.J., Salcudean, S.E.,
Abolmaesumi, P., 2016. Learning-based multi-label segmentation of transrectal
ultrasound images for prostate brachytherapy. IEEE Trans. Med. Imag. 35 (3), 921–
932.
xxxix. Pawlowski, N., Brock, A., Lee, M.C., Rajchl, M., Glocker, B., 2017. Implicit weight
uncertainty in neural networks. arXiv: 1711.01297.
© 2019 IJRAR June 2019, Volume 6, Issue 2 www.ijrar.org (E-ISSN 2348-1269, P- ISSN 2349-5138)
IJRAR19K4195 International Journal of Research and Analytical Reviews (IJRAR)www.ijrar.org 436
Image fusion on PET/ MRI Images Using Inverse
Wavelet decomposition Transform
ABSTRACT: In this paper we present a simple and improved medical fusion approach for PET/MRI images. Image fusion merges two or more images to obtain the most relevant characteristics of each. The input images are processed with the discrete wavelet transform to obtain a synthesized image. This paper focuses on applying the wavelet decomposition method and the inverse discrete wavelet decomposition method to PET/MRI images at pixel and structure level. The resultant fused image preserves the important information from the input images, does not introduce artefacts, and reduces noise to a large extent. The parameters MSE, PSNR and RMSE showed good results for PET/MRI images.

Keywords: Image Decomposition, Positron Emission Tomography, iDWT, Fused image

INTRODUCTION
Positron emission tomography (PET) has been utilized in various medical diagnosis applications [1-3]. PET or SPECT data require an activity map of the radiotracer distribution in the lung or brain scan images. PET images suffer from notable noise because of the dosage and the seconds-long image acquisition. In CT images, Hounsfield values greater than 600 were considered bone after thresholding, and the attenuation maps were helpful for identifying bone tissue. High noise in PET images complicates diagnosing the type of disease. PET scans generate 3D images of the ROI: a short-lived radiotracer is injected and traced through the blood circulation until it reaches the target of interest. When a positron interacts with an electron, two gamma photons are generated in opposite directions, and the PET detector array measures these photons, making it possible to locate the currently active region and reveal organ activity. A common radionuclide is (18F) fluorodeoxyglucose (FDG), which is short-lived in the PET scanner. Due to the low resolution of PET images, they are combined with CT or MRI data to enhance the functional data. This requires registration of the generated volumetric PET/CT/MRI datasets. Registration is difficult because the scanned object has to be moved to another scanner, which changes the position and orientation of the object and complicates the matching of ROIs; this is aggravated by the differing resolution levels of the CT/MRI/PET/SPECT data, and PET scanning is costlier than other modalities. PET scanners are good at characterizing metabolic activity and at searching for cancer metastases. For reconstructing the diagnosed region of the PET image, expectation maximization is adopted, giving a maximum-likelihood estimate. The ML-EM accurately finds the noise characteristics over many iterations of PET data, but this increases the noise in PET and hence decreases image quality. To reduce noise, prior knowledge of the image data is needed for the reconstruction process. Post-filtering has been applied in some PET devices, typically Gaussian filtering: the Gaussian filter is a linear low-pass filter whose mask has a bell-shaped curve with a centre and symmetric tapering sections on either side, and its weights are calculated using the Gaussian function. The NLM and BM3D algorithms have frequently been suggested for denoising in recent times. In clinical diagnosis, problems regarding combined images such as MRI-PET, CT-PET and CT-MRI are discussed frequently in the medical field.
To provide more useful information for clinical diagnosis, there is a need to combine useful information from the images of various devices. SPECT-CT is useful for studying vascular blood flow in abdominal and ultrasound images [1]. Integrated scanners perform in immediate sequence without changing the patient's position. Improved images will enable doctors to see the condition of astronauts' hearts, a NASA project in space [2]. Image fusion can be categorized into pixel level, feature level, signal level and decision level; pixel-level image fusion is discussed in the upcoming sections. Image fusion based on the wavelet multiresolution technique followed by the PCA technique is given in the next section, followed by a comparative analysis and the Conclusion section.

Dr. M. Mohamed Sathik, Principal, Sadakathullah Appa College, Tirunelveli, India
S Piramu Kailasam, Assistant Professor, Sadakathullah Appa College, Tirunelveli, India

II. Material and Methods
Pixel or Spatial Level Image Fusion
Image fusion can be done by many methods, such as technical fusion, multiplicative fusion, subtractive fusion, PCA fusion, IHS fusion, Brovey, colour-related techniques and the wavelet technique. In pixel-level fusion, the source images are combined into a single image.
Compared with feature- or decision-level fusion, pixel-level fusion preserves more of the original information. In this study the images were taken from the Arthi scan lab located in Tamil Nadu. First, image A and image B are analysed separately; then fusion is performed by any one of the above methods. The resultant image is a fused and more informative image. Image fusion can also be applied to more than two images.

Fig 1. Block diagram

In multiplicative image fusion, low- and high-resolution images are combined and their information is multiplied:

ML_ijk = sqrt(XS_ijk x PN_ijk)    (1)

Subtractive image fusion reflects the corresponding mathematical difference function. Wavelet image fusion, introduced around 1980, is a multiresolution technique; the resultant output image may be at a partial resolution level, with details and edges distorted. Compared to wavelets, principal component analysis (PCA) transforms interrelated variables into unrelated variables in the spatial domain; here the pixel values are finalized, and the redundancy of information is lower. IHS is a standard image fusion method involving three bands, used for sharpening, feature enhancement and colour enhancement of datasets. The complex-wavelet-index image fusion method is best for structural similarity. Fusion techniques can also be based on high-pass filtering and the texture of the image. Object identification, classification and change detection are some of the main applications of image fusion. If the image is very small, the pan (black and white) image is fused with another MSS image to produce a new, high-resolution image; this method is applied in remote sensing and GIS.

a. Average of intensity values
The traditional and simple method of fusing image information is averaging the pixel intensity values; the resultant fused image may have reduced contrast. This technique is basic and straightforward and can be chosen according to the needs of the application:

If(x,y) = (I1(x,y) + I2(x,y)) / 2    (2)
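Below is a minimal sketch of the average-intensity rule of Eq. (2), assuming two co-registered images of equal size; the array values are illustrative.

import numpy as np

def average_fusion(img1, img2):
    # fused pixel = mean of the two co-registered input pixels (Eq. 2)
    return (img1.astype(np.float64) + img2.astype(np.float64)) / 2.0

a = np.array([[100, 120], [80, 60]], dtype=np.uint8)
b = np.array([[110, 100], [90, 70]], dtype=np.uint8)
print(average_fusion(a, b))    # [[105. 110.] [ 85.  65.]]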
b. Knowledge based image fusion
Fuzzy models have proven efficient at handling imprecise information in various image types, and various image fusion techniques based on fuzzy logic are described in [3]. Knowledge-based fuzzy approaches to multimodal medical images are often used by researchers. A fuzzy inference system for image fusion works in three main stages: fuzzification of the input images, implementation of the fusion rules and, finally, defuzzification of the output. Fuzzification divides an image into different fuzzy sets; finding the membership function from the pixel conditions is most important in forming these fuzzy sets. Very High (VH), High (H), Medium (M), Low (L) and Very Low (VL) are membership functions based on the grey levels of the image. The rules can then be of the form [4]:
1. If (CT is VL) and (MRI is VL) then (Fused image is VL)
2. If (CT is VL) and (MRI is H) then (Fused image is H)
3. If (PET is H) and (CT is VL) then (Fused image is H)
Fuzzification and defuzzification reduce the processing cost; owing to the simplicity of defuzzification, this approach can offer more at the computer-diagnosis level than other methods [5]. An NFIS (neuro-fuzzy inference system) calculates the output grey value using fuzzy if-then rules. With proper training and membership functions, NFIS can give the best results with low time complexity, but it requires a large amount of training data.

Fig 2. Trained data of MRI brain image
Fig 3. Testing data of MRI brain image

Computational intelligence techniques like fuzzy logic, neural networks and genetic algorithms, combined with multiscale decomposition, can add more efficiency to the fusion result. The degree of fuzziness of fuzzy sets is called fuzzy entropy and is normally used to measure the amount of fuzzy information in fuzzy sets; fuzziness refers to the imperfect information resulting from elements partially belonging, or not belonging, to sets [13][14].

III. Image Fusion Based on Transforms
Pixel-level image fusion is simple but introduces loss of information and blurring of pixels at corners [3-4]. Multiscale decomposition improves the image more than the pixel-level method. Multiscale decomposition is a mathematical method applied to a signal to retrieve hidden values: the time domain is transformed to the frequency domain, achieved by Fourier, Laplacian-pyramid, wavelet, contourlet and curvelet transforms in medical image fusion. The wavelet transform decomposes images into low-level and high-level sub-bands. The problem of angular direction can be solved by complex wavelet transforms (CWT) [7][8], which can extend in six directions. Even though the contourlet method achieves good image representation, it has a redundancy problem; hence the wavelet-contourlet method was proposed. New methods like bandlets, wedgelets and quadtrees have been produced to improve edge detection.

FUSION RULES
Fusion rules depend mainly on the imaging modality and the fusion application.
a) Activity level measurement: The amount of information at each pixel is measured by the energy at that pixel or the energy of neighbouring pixels. This is a window-level measurement with an n x n window, where n is odd; the edge-level smoothness is good.

E(i,j) = |x(i,j)|  or  E(i,j) = |x(i,j)|^2    (3)

where x(i,j) is the grey value of the pixel at position (i,j). Another window-based measure is the spatial frequency (SF), given by [9]:

SF = sqrt(RF^2 + CF^2)    (4)

where RF and CF are the row frequency and column frequency respectively, calculated as
RF = sqrt( (1 / (M(N-1))) * SUM_{m=0..M-1} SUM_{n=0..N-2} ( x(m, n+1) - x(m, n) )^2 )    (5)

CF = sqrt( (1 / ((M-1)N)) * SUM_{m=0..M-2} SUM_{n=0..N-1} ( x(m+1, n) - x(m, n) )^2 )    (6)
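A minimal NumPy sketch of Eqs. (4)-(6), computing RF, CF and SF for a grey-level array x, follows; it is an illustration of the definitions above, not code from the paper.

import numpy as np

def spatial_frequency(x):
    x = x.astype(np.float64)
    M, N = x.shape
    rf = np.sqrt(np.sum((x[:, 1:] - x[:, :-1]) ** 2) / (M * (N - 1)))  # horizontal differences
    cf = np.sqrt(np.sum((x[1:, :] - x[:-1, :]) ** 2) / ((M - 1) * N))  # vertical differences
    return np.sqrt(rf ** 2 + cf ** 2)                                  # SF = sqrt(RF^2 + CF^2)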
A high value of the activity level indicates a larger amount of information in the input image.

b) Principal Component Analysis
Principal component analysis is a mathematical tool. The fusion is done by computing a weighted average of the images to be fused. If p1 and p2 are the normalized components of the eigenvector, the fusion rule can be written as [15][16]:

If = p1 * I1 + p2 * I2    (7)

The input images to be fused, I1(x,y) and I2(x,y), are arranged in two column vectors and their empirical means are subtracted.
Multiscale Decomposition Wavelet-Based Image Fusion
In the first step the contrast of the image is enhanced using the histogram equalization method, adopting a scaling factor. The enhanced images are then fused using pixel or transform methods.

Inverse discrete wavelet transform based image fusion
The left image XL and the right image XR of the input are taken and decomposed with the DWT using Haar features. The decomposed left and right images are then recombined, with max and mean detail rules, through the IDWT transform. The generated synthesized image preserves the essential character of the input images.

PERFORMANCE EVALUATION METRICS
a) Fusion Index (FI)
The fusion index combines the factors IAF and IBF, where IAF is the mutual-information index between the multispectral image and the fused image, and IBF is the mutual-information index between the panchromatic image and the fused image. This measure reflects the quality of the image, which depends on the degree of the fusion index:

FI = IAF + IBF    (8)

b) Entropy
Entropy is the amount of information present in the image, where p(x_i) is the probability of pixel value x_i:

E = - SUM_{i=1..N} p(x_i) log2 p(x_i)    (9)
c) Fusion Symmetry (FS)
Fusion symmetry indicates the degree of symmetry of the information content drawn from both images; a lower value of FS indicates better symmetry in the fused image:

FS = | IAF / (IAF + IBF) - 0.5 |    (10)

d) Peak Signal to Noise Ratio (PSNR)
The peak signal-to-noise ratio is the ratio between the maximum possible value of an image and the magnitude of the background noise:

PSNR = 10 log10( (2^n - 1)^2 / MSE )    (11)

e) Mean Square Error (MSE)
This is one of the most widely used measures of the deviation between the original and the fused image data:

MSE = (1 / (MN)) SUM_{i=1..M} SUM_{j=1..N} [ R(i,j) - F(i,j) ]^2    (12)

where i, j are the spatial positions of the pixels and M, N are the dimensions of the image.

f) Root Mean Square Error (RMSE)
This measurement is used to evaluate the fused image. If F is the fused image and R is the reference input image, then

RMSE = sqrt( (1 / (MN)) SUM_{i=1..M} SUM_{j=1..N} [ R(i,j) - F(i,j) ]^2 )    (13)
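The metrics of Eqs. (9), (11), (12) and (13) can be sketched as follows for an n-bit reference image R and fused image F; the helper names are illustrative.

import numpy as np

def entropy(img, bins=256):
    # Eq. (9): Shannon entropy of the grey-level distribution
    p, _ = np.histogram(img, bins=bins, range=(0, bins), density=True)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mse(R, F):
    # Eq. (12)
    return np.mean((R.astype(np.float64) - F.astype(np.float64)) ** 2)

def psnr(R, F, n_bits=8):
    # Eq. (11)
    m = mse(R, F)
    return np.inf if m == 0 else 10 * np.log10((2 ** n_bits - 1) ** 2 / m)

def rmse(R, F):
    # Eq. (13)
    return np.sqrt(mse(R, F))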
Result and Discussion
The two-dimensional wavelet technique is popular in the image fusion area. It merges two images, and it is simple and fast compared to previous methods for medical images. In the wavelet decomposition of the two original images, the fusion methods are applied to the approximation coefficients and the detailed coefficients. The wavelet transform divides the original image into low- and high-frequency levels: the low level contains the outline information, and the high level reflects the luminance changes and the edge information of the image. The fused image should contain the information of both ROIs, and the two images should be of the same size and resolution.

Peak Signal to Noise Ratio (PSNR): a good quality measure in image fusion, indicating the similarity between the two medical images; the higher the PSNR value, the better the fused image.

Mean Square Error (MSE): computed as the squared error divided by the total number of pixels; the smaller the value, the better the fused image. It is, however, difficult to detect blur and blocking artefacts with it.
Root Mean Square Error (RMSE): presents the amount of deviation in the fused image compared to the reference image.

Fig 4. a. PET lung Image1  b. PET lung Image2  c. Fused PET lung image  d. Decomposition of image1  e. Decomposition of image2  f. Fusion of decomposed images

In the above figure, two input images are loaded and decomposed using the discrete wavelet transform method, and the resultant decomposed image is further processed using the inverse discrete wavelet transform to obtain the synthesized image. The work was done in the MATLAB 2012b image processing tool.

Wavelet Decomposition Tree
The decomposition tree shows the splitting structure of the decomposition; with it we get a clear idea of the hierarchical flow throughout the image.
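As a hedged illustration of this DWT-then-IDWT workflow (the original work was done in MATLAB), the same pipeline can be sketched with the PyWavelets package, assuming a Haar wavelet, a mean rule for the approximation band and a max-magnitude rule for the detail bands.

import numpy as np
import pywt

def dwt_fuse(img1, img2, wavelet='haar'):
    # one-level 2D DWT of both co-registered inputs
    cA1, (cH1, cV1, cD1) = pywt.dwt2(img1.astype(np.float64), wavelet)
    cA2, (cH2, cV2, cD2) = pywt.dwt2(img2.astype(np.float64), wavelet)
    cA = (cA1 + cA2) / 2.0                                       # mean rule for approximation
    details = tuple(np.where(np.abs(d1) >= np.abs(d2), d1, d2)   # max-magnitude rule for details
                    for d1, d2 in zip((cH1, cV1, cD1), (cH2, cV2, cD2)))
    return pywt.idwt2((cA, details), wavelet)                    # synthesized (fused) image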
Fig 5. Wavelet decomposition tree for the PET synthesized image
Fig 6. a. MRI Brain Image1  b. MRI Brain Image2  c. Fused MRI image  d. Decomposition of image1  e. Decomposition of image2  f. Fusion of decomposed images
Fig 7. Wavelet decomposition tree for the synthesized MRI brain image
The inverse discrete wavelet transform (IDWT) reconstructs signals from sub-bands with smaller bandwidths and slower sample rates. The IDWT block is the same as the Dyadic Synthesis Filter Bank block in the Multirate Filters library, but with different default settings.
Fig 8. a. MRI Brain Image1  b. MRI Brain Image2  c. Fused MRI image  d. Decomposition of image1  e. Decomposition of image2  f. Fusion of decomposed images
Fig 9. Wavelet decomposition tree for the fused brain image
Fig 10. Fourier Curve fitting for MRI brain images
Fig 11. Gaussian curve-fitting graph for MRI brain images. The Gaussian equation is given as

f(x) = a1 * exp(-((x - b1) / c1)^2)    (14)
The Fourier equation is

f(x) = a + a1 * cos(x * w) + b1 * sin(x * w)    (15)

with the coefficients given at the 95% confidence level. The values of a1, b1, c1 and w vary from image to image. The data were trained well, and the test results are tabulated in the tables below. Different sets of test data with different types of curve fitting were analysed over several iterations. The graphs show that the Fourier fit is best for the brain images and the polynomial curve fit is best for the PET image.

Fusion Result of MRI image
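Below is a minimal SciPy sketch of fitting the Gaussian model of Eq. (14); the sample data and starting values are illustrative, not the paper's data.

import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a1, b1, c1):
    # Eq. (14): f(x) = a1 * exp(-((x - b1)/c1)^2)
    return a1 * np.exp(-((x - b1) / c1) ** 2)

x = np.linspace(0, 10, 50)
y = gauss(x, 2.0, 5.0, 1.5) + 0.05 * np.random.default_rng(0).normal(size=x.size)
(a1, b1, c1), _ = curve_fit(gauss, x, y, p0=[1.0, 4.0, 1.0])
print(f"a1={a1:.2f}, b1={b1:.2f}, c1={c1:.2f}")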
Fused results of image B and the synthesized image with mathematical operations: a) original image b) subtracted image c) min d) max e) average f) difference g) OR h) XOR

Table 1. Comparison of statistical values for individual images before fusion
Image Area Mean StdDev Median
Brain Images Before Process
32900 90.1 69.8 42
33300 94.7 70.3 53
bs.PNG 32800 86.8 70.2 40
ave.PNG 32800 94 76.6 42
diff.PNG 32400 30.7 40.3 15
max.PNG 33800 99.2 78.1 45
min.PNG 33900 83.9 75.2 34
or.PNG 32600 115 85.9 59
sub.PNG 34100 132 35.5 129
transparent.PNG 32000 92 72.7 44
xor.PNG 32700 56.7 65.2 39
lungpet1.PNG 38600 44.4 49.2 27
lungpet2.PNG 35600 46.4 46 26
lungpet3.PNG 34300 43.8 39.6 33
Ave1.PNG 32800 90 71.2 52
Diff1.PNG 32100 28.4 39.9 13
Max1.PNG 33400 99.2 74.8 67
Min1.PNG 33200 191 40.8 206
Minnew1.PNG 33200 81.2 72.2 33
Or1.PNG 32100 120 83.7 106
Sub1.PNG 33400 82.4 73.1 33
Trans1.PNG 33300 94.7 73 64
Xor1.PNG 32700 60.5 68.6 41
Table 1 shows the area, mean, standard deviation and median of each image, calculated according to the corresponding equations. The mean value is highest in the ORed and subtracted images and very low for the difference operation, while the standard deviation is notably high in the ORed image and low in the subtracted image.

Table 2. Comparison results for the MRI brain image and the synthesized image
Images 1&7 1&2 2&3 2&4 2&5 2&6 2&7 2&8 2&9 2&10
MSE .21 .16 .18 .21 .24 .23 .20 .22 .21 .20
PSNR 54.87 56.22 55.54 54.97 54.30 54.51 55.04 54.76 55.01 55.24
A higher PSNR (peak signal-to-noise ratio) value indicates better image fusion; according to the table, all image pairs have more than 50% similarity with the input image. In particular, images 1 and 2 have a PSNR of 56.22 and, notably, an MSE of .16, the lowest in the table; this pair therefore gives the least error on both the MSE and the PSNR parameters.
Table 3. Comparison results for brain MRI and CT images with T1 and T2 contrast

Image Name   | Curve fit type | SSE (x10^-2) | R-square | Coef | RMSE (x10^-7)
Brain image1 | Gaussian       | 2.75         | .88      | 3    | 20.52
Brain image1 | Fourier        | 2.23         | .90      | 4    | 18.44
PET image1   | Polynomial     | 1.40         | 1        | 2    | 4.37
PET image1   | Gaussian       | 6.22         | .97      | 3    | 8.94
Brain image2 | Gaussian       | 2.9          | .99      | 3    | 5.41
Brain image2 | Fourier        | 6.2          | 1        | 4    | 2.52
Brain image3 | Fourier        | 6.5          | 1        | 4    | 11.61
Brain image3 | Fourier        | 7.0          | 1        | 4    | 2.66
Brain image3 | Gaussian       | 3.26         | .99      | 3    | 5.75
Prediction errors can be quantified by the root mean square error. Residuals measure how far the data points lie from the fitted line, i.e. how well the data sit on the line of best fit. Root mean square error is commonly used in climatology and forecasting to verify experimental results. In the table above, all RMSE values are on the order of x10^-7, showing that the data contain only very minor errors; hence the DWT + iDWT fusion method is successful. The SSE (sum of squared errors) is also very low.

CONCLUSION
In this study, pixel-level and multiresolution-level fusion were applied to CT and MRI images. Each fusion technique was discussed with its merits and demerits. The wavelet image fusion method was applied successfully to a newly formed database with various feature techniques, and the results were shown and discussed with different parameters. The database was obtained from the Arthi scan centre, Tirunelveli. It is impossible to design a universal image fusion technique for all types of images in medical applications. In future this work could be extended with the PCA and CWT techniques, which could further improve the performance of the fusion method.
REFERENCES
1. Nemir Al-Azzawi and Wan Ahmed K. Wan Abdullah, Medical Image Fusion Schemes using Contourlet Transform and PCA Based, Annual International Conference of the IEEE, Engineering in Medicine and Biology Society,2009, pp.5813-5816.
2. http://www.nasa.gov/centers/ames/research/2007/imagefusion.html 3. Irshad, H.; Kamran, M.; Siddiqui, A.B.; Hussain, A., Image fusion using computational intelligence:A survey,
Environmental and Computer Science, ICECS ’09. Second International Conference ,2009, pp.128- 132. 4. Jionghua Teng; Suhuan Wang; Jingzhou Zhang; Xue Wang, Fusion algorithm of medical images based on fuzzy
logic, Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference, vol.2, pp.546-550.
5. Jionghua Teng; Suhuan Wang; Jingzhou Zhang; Xue Wang, Neuro fuzzy logic based fusion algorithm of medical images, Image and Signal Processing (CISP), 2010, 3rd International Congress, vol.4, pp.1552- 1556.
6. R. C. Gonzalea and R. E. Woods, Wavelets and Multiresolution Processing, Digital Image Processing, 2004, 2nd Ed., Prentice Hal.
7. Ivan W. Selesnick, Richard G. Baraniuk, and Nick G. Kingsbury, The Dual-Tree Complex Wavelet Transform, IEEE Signal Processing Magazine, 2005, vol. 22, no. 6, pp. 123-151.
8. Jingyu Yang, Yao Wang, Wenli Xu, and Qionghai Dai, Image Coding Using Dual-Tree Discrete Wavelet Transform, IEEE Trans. Image Processing, 2008, vol. 17, no. 9, pp. 1555-1569.
9. Wang Xue-jun; Mu Ying, A Medical Image Fusion Algorithm Based on Lifting Wavelet Transform, Artificial Intelligence and Computational Intelligence (AICI), 2010 International Conference, vol.3, pp.474-476.
10. Gonzalo Pajares, Jes us Manuel de la Cruz, A wavelet-based image fusion tutorial, Pattern Recognition, 2004, pp.1855 1872.
11. Yong Chai; You He; Chaolong Ying, CT and MRI Image Fusion Based on Contourlet Using a Novel Rule, Bioinformatics and Biomedical Engineering, 2nd International Conference, 2008, vol., no., pp.2064- 2067.
12. L.A.Zadeh, Probability measure of fuzzy events,J.Mathematical analysis and applications,1968, pp.421-427. 13. Zhuanzheng Zhao, YunXiangLi, A Novel Image Fusion Method Based on Fuzzy Contourlet Transform,Computer
and Communication Tech- nologies in Agriculture Engineering (CCTAE), 2010 International Conference, vol.1, pp.404-407.
14. Zhuanzheng Zhao; Yunxiang Li, A novel image fusion method based on fuzzy contourlet transform, Computer and Communication Technologies in Agriculture Engineering (CCTAE), 2010 International Conference,1, no., pp.404,407.
15. Changtao Hea, Quanxi Liub, Hongliang Lia, Haixu Wangb, Multimodal medical image fusion based on IHS and PCA,Procedia Engineering, 2010, pp. 280285.
16. V.P.S. Naidu and J.R. Raol, Pixel-level Image Fusion using Wavelets and Principal Component Analysis, Defence Science Journa, 2008, Vol. 58, No. 3, pp. 338-352.
17. Shen, R.; Cheng, I.; Basu, A., Cross scale coefficient selection for volumetric medical image fusion, IEEE Transactions on Biomedical Engineering, 2012, Volume: PP , Issue: 99.
18. Zhou Wang, Alan C. Bovik., A Universal Image Quality Index, IEEE Signal Processing Letters, 2002, Vol. 9, No. 3. 19. Manjusha Deshmukh, Udhav Bhosale, Image Fusion and Image Quality
20. J.Jenith chritinal,T.Jeemina Jebaselvi,A novel color image fusion for multi sensor night vision images,IJCATR,2013,Vol.2,Issue 2
Pulmonary CT Image analysis for Nodule
Detection using Inspired FCM Clustering
Dr. M. Mohamed Sathik
Principal,
Sadakathullah Appa College,
Tirunelveli,India
S Piramu Kailasam,
Assistant Professor ,
PG & Research Department of Computer Science ,
Sadakathullah Appa College, Tirunelveli,
India
Abstract—The segmentation of white blob nodule areas from pulmonary CT images is a tedious and time-consuming task, and the accuracy of identifying nodules as tumours depends on the radiologist's experience alone. Hence CAD is still necessary to overcome these limitations. In this study, to improve the performance and reduce the complexity of segmentation, the adaptive histogram equalization method and Fuzzy C-Means are applied with a high number of iterations. To further improve the accuracy, local and global thresholds are also used. The resultant parameters, mean and median, showed good accuracy results.

Keywords—Pre-processing; Global Threshold; Fuzzy C-Means (FCM); Segmentation
I. INTRODUCTION
Image segmentation is the process of dividing an image into many disjoint small areas based on certain principles, and it is one of the most basic areas of research in image processing. Image segmentation is used in many applications; with image retrieval it is possible to analyse images in geology information systems, medical image analysis, film and photo analysis, etc.
II. RELATED WORKS
Many image processing techniques for the detection of lung cancer using CT images are reviewed in [1], where lung cancer detection is surveyed in different aspects such as pre-processing, nodule segmentation and classification. Recent lung nodule detection techniques are presented and their performance compared in [2]. A lung cancer classification system [3] based on a wavelet recurrent neural network employs wavelets to remove noise from the input image and uses the recurrent neural network for classification; however, this work could not achieve better specificity rates, implying that its false positive rates are considerably higher. In [4], FCM is applied to achieve segmentation and GLCM (Gray Level Co-occurrence Matrix) features are extracted. Manasee [5] proposed a lung cancer detection technique that relies on a genetic approach; this work involves more time complexity, and the number of connected objects is calculated by assigning 1 to the inside and 0 to the outside of the object, a threshold-based analysis of the medical image.
A. Segmentation Techniques
Some of the image segmentation methods are local entropy image segmentation [6] and a discrete cosine texture feature applied by [7]. A parallel algorithm for grey-scale images is clearly described in [8], and clustering of spatial patterns with the watershed algorithm in [9]. Medical image segmentation was performed in [5] in 2010. The iterative regularized likelihood learning algorithm [11] performs well for Zhi Lu; automatic model selection with unsupervised image segmentation, and finite mixtures with entropy regularization, are well dealt with by Zhiwu Lu [12].
As described by Sun Hee Kim [13], the acquisition of the image, the extraction of the region, and the intensity and noise of the image affect accuracy. Resizing the image for processing in Matlab is another measurable factor. Image processing is applied to separate the nodule regions from the background, and a smaller image is then obtained by processing it in Matlab R2012a. The array editor converts the image into values, and the cluster where nodules are present is identified through iterations; locations containing nodules show higher values because of their brightness. The conversion of the image to array-editor values was done by Harvey A. et al. [14].
Among the large number of lung segmentation methods, a bidirectional differential chain code combined with a machine learning framework is able to correctly include juxtapleural nodules in the lung tissue while minimizing over-segmentation; this method identifies low-dose, concave/convex regions [15]. In the Hessian-based matrix approach, 3D lung nodules are segmented in a multiscale process through the combination of shape index and curvedness methods. Image characteristics, nodule position, nodule size and nodule characteristics are included, and eigenvalues are computed from the 3 x 3 Hessian matrix.
B. CT Image Pre-processing by AHE
The contrast of the CT images is enhanced by employing the adaptive histogram equalization technique, which is applicable to both greyscale and colour images. Kernelized Fuzzy C-Means (FCM) is utilized to segment the CT images and shows better performance than standard FCM. Unlike histogram equalization, adaptive histogram equalization works on regions of the image. The FCM
algorithm groups pixels that share similar characteristics.
III. MATERIALS AND METHODS
This section describes the materials, the source of the pulmonary nodule image dataset, and the methods applied to the pulmonary images. The LIDC dataset, a benchmark dataset available online in the pulmonary image database, is taken as the source.
A. Pre-processing
Noise is a random variation of image intensity, visible as dots or grains in the image. Noisy pixels show intensity values different from the true pixel values.

Fig 1. Flow Diagram

Fig 2 a. Gaussian Noise b. Wiener Filter

Common types of noise arise across the image area. The imaging sensor may be affected by climatic conditions during the acquisition period, or by various hardware devices. Insufficient light and sensor temperature may introduce noise into the image, and interference in the transmission channel may also corrupt it. Dust particles on the scanner screen can likewise introduce noise. Fig 1 shows the flow diagram of the work, and fig 2 shows Gaussian-noised and Wiener-filtered CT pulmonary images.
B. Denoising in Medical Images
The main properties of a good image denoising model are that it removes noise while preserving edges. Subband decomposition of an image is performed by a mathematical transform; familiar transforms include the discrete wavelet transform (DWT) and the bandelet, contourlet, curvelet and ridgelet transforms. The curvelet and ridgelet transforms use basis elements that are very highly anisotropic; the curvelet transform therefore represents edges better than wavelets and is well suited for multiscale edge enhancement. Curvelets have variable width and length, and hence variable anisotropy.

Most medical images suffer from poor contrast, so the details of the images cannot be distinguished easily. At this juncture, the role of a contrast enhancement algorithm comes into the picture.
The CT images are pre-processed by employing the Adaptive Histogram Equalization (AHE) technique. The major reason for using AHE is that it computes several histograms over separate areas of the CT image, so the contrast of every particular area is enhanced. The method considers each pixel and alters it according to the intensities of its nearest neighbourhood. Unlike the standard histogram equalization method, it focuses on regions of the image, which ensures a balanced distribution of values.

AHE improves edges and regions: the computed histograms are adaptive, the distribution of values is balanced, and the local contrast of the CT image is improved. Some sample pre-processed images are presented in fig 2 (a) & (b) for the Gaussian-noised and Wiener-filtered cases.

Consider a CT image to which the contrast enhancement process is applied. AHE processes each and every pixel, altering it by considering the intensities of its immediate neighbourhood. Unlike the standard HE technique, AHE processes the regions of an image separately; this enhances the contrast of each region and, in turn, of the complete image. In this way the contrast of the CT image is enhanced (a minimal sketch follows), and the contrast-enhanced images are segmented as described next.
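A minimal sketch of this pre-processing step using scikit-image's contrast-limited AHE; the file name, kernel size and clip limit are illustrative assumptions, not values from the paper.

import numpy as np
from skimage import io, exposure, img_as_float

ct = img_as_float(io.imread("lidc_slice.png", as_gray=True))  # hypothetical slice

# Standard (global) histogram equalization, for comparison
he = exposure.equalize_hist(ct)

# Adaptive variant: histograms are computed over local regions, so the
# contrast of each region of the slice is enhanced separately.
ahe = exposure.equalize_adapthist(ct, kernel_size=64, clip_limit=0.02)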
C. Types of Image Segmentation Techniques
Image segmentation has been approached by many authors with different methods. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze, and it is a crucial task in medical image processing. Examples include local entropy image segmentation by Ali Shojaee Bakhtiari [6]; the discrete cosine texture feature by Chi-Man Pun; a parallel algorithm for grey-scale images by Harvey; clustering of spatial patterns and the watershed algorithm by Kai-Jian; medical image segmentation by Prasantha; the finite bivariate doubly truncated Gaussian mixture model by Rajkumar [16]; a random set view of texture by Ramana Reddy [17]; the Sobel, Prewitt, Kirsch, Laplacian, Canny and Roberts techniques and the Edge Maximization Technique by Salem Saleh [18];
mathematical image processing by Sun Hee Kim; a simple algorithm for image denoising by Vijaya [19]; the iterative regularized likelihood learning algorithm by Zhiwu Lu; automatic model selection and unsupervised image segmentation by Zhiwu Lu; and finite mixtures with entropy regularization by Zhiwu Lu [11].
The various algorithms used for image segmentation, such as the watershed, the Sobel operator, the Roberts operator, the Edge Maximization Technique (EMT), the bivariate finite doubly truncated Gaussian mixture model, the Expectation Maximization (EM) algorithm, the K-Means algorithm, the iterative entropy regularized likelihood learning algorithm, partial differential equations (PDE), mean shift (MS) segmentation, wavelet-based segmentation, genetic-algorithm-based segmentation, the parallel algorithm and the discrete cosine texture feature, are explained in the following.
Fig 3. Manual Segmentation
In fig 3, the red colour shows the manual segmentation of lung nodules. The traditional watershed algorithm is a mathematical morphology segmentation algorithm based on topology theory. The basic idea is that the image is regarded as topological terrain, as in geodesy, with the gray-scale value of each pixel representing the altitude of that point. Every local minimum and its influence area is called a catchment basin, and the boundary between basins forms a watershed. The watershed concept and its formation can be illustrated by simulating immersion: pierce a small hole in every local minimum of the surface and slowly immerse the model in water; as the immersion deepens, the influence area of every local minimum gradually expands outward, and dams are constructed where two catchment basins meet. These dams are precisely the watershed.
The computation of the partial derivatives in the gradient may be approximated in digital images using the Sobel, Prewitt and Canny operators; the Sobel masks are shown in fig 4.

Gx:           Gy:
-1  0  1       1  2  1
-2  0  2       0  0  0
-1  0  1      -1 -2 -1

Fig 4. The Sobel Masks
The Prewitt operator uses the same arrangement as the Sobel operator, except with constant value one. The Prewitt operator (figs. 5 and 6) measures two components, the vertical edge component (kernel Gx) and the horizontal edge component (kernel Gy); |Gx| + |Gy| gives the current pixel's intensity gradient.

Gx:           Gy:
-1  0  1       1  1  1
-1  0  1       0  0  0
-1  0  1      -1 -1 -1

Fig 5. Gx   Fig 6. Gy
Prewitt Technique
The digital implementation of the Laplacian function is usually made through the mask below (fig 7):

 0 -1  0
-1  4 -1
 0 -1  0

Fig 7. The Laplacian mask
The Laplacian is used to establish whether a pixel lies on the dark or light side of an edge. Edge detection of nodules can also be performed with the Canny operator, which applies an optimal edge detector based on a set of conditions, finding the most edges at a low error rate. A short sketch of these gradient operators follows.
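A minimal sketch applying the Sobel masks of fig 4 and combining |Gx| + |Gy| into an edge map, as described above; the use of scipy.ndimage for the 2D convolution is an implementation choice, not from the paper.

import numpy as np
from scipy.ndimage import convolve

sobel_gx = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]], dtype=float)
sobel_gy = sobel_gx.T  # the vertical mask is the transpose of the horizontal one

def edge_map(image):
    gx = convolve(image, sobel_gx)   # horizontal gradient component
    gy = convolve(image, sobel_gy)   # vertical gradient component
    return np.abs(gx) + np.abs(gy)   # |Gx| + |Gy| approximation of the magnitude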
Fig 8. Canny Edge Detection of Nodules (spherical, ellipsoid, lobulated, spiculated)
The Roberts cross operator performs a simple, fast computation. It measures the 2D spatial gradient of an image and hence highlights regions of high spatial frequency, which correspond to edges.

Images may contain more than one homogeneous region or different gray levels, in which case parts of the objects may be merged with the background. An automatic threshold selection technique performs much better on images with large, well-separated homogeneous regions.
Fig 9. Edge Detection using prewitt
The vertical and horizontal edge components of the image are calculated using the Prewitt operator; hence the lung edge is detected in fig 9 above.

An image is a matrix in which each element is a pixel, and the pixel value is a number representing the intensity or colour of the image. Pixel intensities are assigned to their corresponding segments based on probability values under a GMM (Gaussian Mixture Model).
The EM (Expectation Maximization) algorithm is highly dependent on the initialization of its variables. A common approach is to use multiple random starts and choose the final estimate with the highest likelihood, which greatly increases the computational complexity. Using the K-Means algorithm, the number of image regions is identified. The number k of clusters in the data set is usually assumed to be known in advance; however, since image databases in the context of image retrieval are often huge, setting the cluster number for each image beforehand is no longer feasible.
The segmentation consists of two steps. The first step is mixture estimation: the EM plus MML (Minimum Message Length) method is used to estimate the mixture parameters and determine the number of Gaussian components. In the second step, pixel clustering, segmentation is carried out by assigning each pixel to a proper cluster according to the Maximum Likelihood (ML) estimation, as sketched below.
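A compact sketch of this two-step scheme using scikit-learn's EM-based Gaussian mixture; the fixed component count of 3 is an assumption, since the paper's MML criterion for choosing it is not reproduced here.

import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_segment(image, n_components=3):
    x = image.reshape(-1, 1).astype(float)   # pixel intensities as samples
    gmm = GaussianMixture(n_components=n_components, random_state=0).fit(x)
    labels = gmm.predict(x)                  # ML assignment of each pixel
    return labels.reshape(image.shape)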
As for unsupervised image segmentation, one important application is content-based image retrieval. In this context, the key problem is to automatically determine the number of regions (i.e., clusters) for each image so that a query can then be performed on the region of interest. An iterative Entropy Regularized Likelihood (ERL) learning algorithm for cluster analysis based on a mixture model is used to solve this problem. The image can also be segmented with the Gibou-Fedkiw algorithm, a 2-means clustering method based on a level set function.
The MS algorithm is a robust feature-space analysis approach which can be applied to discontinuity-preserving smoothing and image segmentation problems. It can significantly reduce the number of basic image entities and, owing to its good discontinuity-preserving filtering, the salient features of the overall image are retained. The latter property is particularly important in partitioning natural images, in which only a few distinct regions represent different scenes such as sky, lake, sand beach, person and animal, whereas other information within a region is often less important and can be neglected. MS is an unsupervised clustering-based segmentation method in which the number and shape of the data clusters are unknown a priori. Moreover, the termination of the segmentation process is based on a region-merging strategy applied to the filtered image, and the number of regions in the segmented image is mainly determined by a minimum region size of M pixels: regions smaller than M pixels are eliminated and merged into their neighbouring regions.
For a z1 x z2 colour image to be segmented, we consider an 8-dimensional vector consisting of colour, texture and position features for each pixel. The three colour features are the coordinates in the Lab colour space, and we smooth these features to avoid over-segmentation arising from local colour variations due to texture. The three texture features are contrast, anisotropy and polarity, extracted at an automatically selected scale. The position features are simply the [x, y] position of the pixel; including position generally decreases over-segmentation and leads to smoother regions. We thus obtain a sample set S of z = z1 . z2 samples for each image in the database.
For images in which all boundaries arise from neighbouring patches of different texture, we compute pairwise texture similarities by comparing windowed texton histograms.
D. Threshold based Image Segmentation
The thresholding transformation sets each gray level that is less than or equal to some prescribed value T, called the threshold, to 0, and each gray level greater than T to K-1. Thresholding is useful when one wants to separate bright objects of interest from a darker background or vice versa, although it does not by itself produce edges that correspond to the edges of the objects in the scene. The global-threshold image is better than the manual-threshold one (fig. 10 a, b, c). For nodules, segmentation with global thresholding is recommended, but for non-nodules the adaptive threshold is better; on the whole, the adaptive threshold is best across nodule and non-nodule images, as the sketch below illustrates.
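A minimal sketch contrasting the two strategies with scikit-image; Otsu's method stands in for the unspecified global selection rule, and the block size of the local method is an assumption.

from skimage.filters import threshold_otsu, threshold_local

def global_mask(image):
    t = threshold_otsu(image)            # one threshold T for the whole slice
    return image > t

def adaptive_mask(image, block_size=35):
    t = threshold_local(image, block_size, method="gaussian")
    return image > t                     # per-pixel threshold map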
Fig 10 (a) Threshold based Image Segmentation for Nodule
Fig 10 (b) Threshold based Image Segmentation for Non-Nodule 1
Fig 10 (c) Global threshold and manual threshold samples of nodules
Fig 11. Parenchyma with threshold value 128

Fig 12. Region of interest "sphericity" by threshold value 128 using the fuzzy edge detection method
Edge detection is a basic step in image segmentation, enhancement, extraction, selection, restoration, registration and compression. Its aim is to identify the physical phenomena in an image and localize the variations in its intensity. Edge detection is an easy way of extracting the most important features in all types of binary images (fig 12).
Fig 13. Line Graphs of Nodule Edges
Edge detection should be efficient, because it must provide all significant information about the image. However, differentiation of an image is sensitive to electronic, semantic and discretization or quantification effects, so it is difficult to design a general edge detection algorithm that performs well in all respects. Fig 13 shows the line graph of edges in (x, y) pixels for nodules. Types of edges are step edges, ramp edges, line edges and roof edges. Because of low-frequency components or the smoothing applied by sensing devices, sharp discontinuities such as step and line edges rarely exist in real signals/images: step edges become ramp edges and line edges become roof edges, and intensity changes are not instantaneous but occur over a finite distance.
Table I. Statistical Result of Nodules

Image               3nn-75-1.bmp  3-44-1.bmp  3-115-1.bmp  3-n-71.bmp  240-10-1.bmp  24-69-1.bmp  476-217-1.bmp  11-115-1.bmp
Mean                     103          54          60           62          120           115           115            119
Standard deviation        16          32          28           15           68            62            76             68
The mean of an image is calculated by summing the pixel values of the nodule image and dividing by the total number of pixels; a minimal check appears below. In the table above, the mean value of image 11-115-1.bmp is 119, the maximum mean value. The probability distribution and intensity values of image 476-217-1.bmp indicate a better intensity level and high contrast of edges. The standard deviation serves as the measure of the inhomogeneity of a nodule.
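A minimal check of the Table I statistics; the file name comes from the table, and the quoted values naturally depend on loading the same LIDC crop used by the authors.

import numpy as np
from skimage import io

nodule = io.imread("11-115-1.bmp").astype(float)
mean = nodule.sum() / nodule.size   # Table I reports 119 for this image
std = nodule.std()                  # Table I reports 68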
Fig 14. Histogram of trained nodule
A histogram is a graphical representation of the pixels per grey level. Fig 14 shows how the pixels fall into bins; the minimum value is zero and the maximum value is 226, as shown clearly in the graph. The mean and standard deviation are also calculated for segmentation.
Fig 15. (a) FCM Result from Cluster 1
Fig 15. (b) FCM Result from Cluster 2
Fig 15. (c) FCM Result from Cluster 3
Fig 15. (d) FCM Result from Cluster 4
The FCM result with cluster 2 required the fewest iterations for clustering the pixels of the pre-processed nodule and performed best among the clusters. Clustering algorithms achieve region segmentation by partitioning the image into sets, or clusters, of pixels that have strong similarity in the feature space. In hard clustering, data are divided into distinct clusters, where each data element belongs to exactly one cluster. In fuzzy clustering, data elements can belong to more than one cluster: each element is associated with a set of membership levels, which indicate the strength of the association between that data element and a particular cluster. Fuzzy clustering is thus the process of assigning membership levels and then assigning data elements to one or more clusters. One of the most widely used fuzzy clustering algorithms is the FCM algorithm (sketched below), which is applied in many fields such as pattern recognition, fuzzy identification and feature extraction.
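A compact sketch of the FCM iteration on pixel intensities; the cluster count mirrors the four-cluster experiment above, while the fuzzifier m, iteration cap and tolerance are conventional assumptions.

import numpy as np

def fcm(x, c=4, m=2.0, n_iter=100, eps=1e-5, seed=0):
    # x: 1-D array of pixel intensities; c clusters; m is the fuzzifier.
    rng = np.random.default_rng(seed)
    u = rng.random((c, x.size))
    u /= u.sum(axis=0)                        # fuzzy membership of each pixel
    for _ in range(n_iter):
        um = u ** m
        centers = (um @ x) / um.sum(axis=1)   # membership-weighted cluster means
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        u_new = d ** (-2.0 / (m - 1.0))
        u_new /= u_new.sum(axis=0)            # renormalise memberships
        if np.abs(u_new - u).max() < eps:     # converged: memberships stable
            u = u_new
            break
        u = u_new
    return centers, u

# labels = fcm(img.ravel().astype(float))[1].argmax(axis=0).reshape(img.shape)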
IV. CONCLUSION
Segmentation types have been discussed: edge-based, threshold-based, region-based and soft-computing-based methods. The Gaussian Radial Basis Function (GRBF) used with the fuzzy method works well for CT images. Computer-aided diagnosis of lung cancer is used to segment pathologically changed tissues quickly and accurately. The proposed method, HE+FCM, successfully segments the pulmonary nodule from the CT image with an accuracy of 91.6%. For lesions, the average volume error obtained is 0.968%. The coefficient of similarity, spatial overlap, RMSE, and average over- and under-segmentation ratios are 0.914, 0.584, 0.10, 0.63% and 0.015% respectively. The experimental results indicate that the proposed method can segment nodules with blood vessel adhesion and pleura adhesion quickly and exactly, and performs better than traditional methods.
REFERENCES
[1] Niranjana, G.; Ponnavaikko, M., "A Review on Image Processing Methods in Detecting Lung Cancer Using CT Images", International Conference on Technical Advancements in Computers and Communications, 10-11 April, Melmaruvathur.
[2] Rabia Naseem; Khurram Saleem Alimgeer; Tariq Bashir, "Recent trends in Computer Aided diagnosis of lung nodules in thorax CT scans", International Conference on Innovations in Electrical Engineering and Computational Technologies, 5-7 April, Karachi, Pakistan, 2017.
[3] Devi Nurtiyasari; Dedi Rosadi; Abdurakhman, "The application of Wavelet Recurrent Neural Network for lung cancer classification", International Conference on Science and Technology - Computer, 11-12 July, Yogyakarta, Indonesia, 2017.
[4] Bhagyarekha U. Dhaware; Anjali C. Pise, "Lung cancer detection using Bayesian classifier and FCM segmentation", International Conference on Automatic Control and Dynamic Optimization Techniques, 9-10 Sept, Pune, India, 2016.
[5] Manasee Kurkure; Anuradha Thakare, "Lung cancer detection using Genetic approach", International Conference on Computing Communication Control and Automation, 12-13 August, Pune, India, 2016.
[6] Ali Shojaee Bakhtiari, Ali Asghar Beheshti Shirazi and Amir Sepasi Zahmati, "An Efficient Segmentation Method Based on Local Entropy Characteristics of Iris Biometrics", World Academy of Science, Engineering and Technology, 28, 2007, 64-68.
[7] Chi-Man Pun and Hong-Min Zhu, "Image Segmentation using Discrete Cosine Texture Feature", International Journal of Computers, Issue 1, Volume 4, 2010, 19-26.
[8] Harvey A. Cohen, "Parallel algorithm for gray-scale image segmentation", Proc. Australian and New Zealand Conf. on Intelligent Information Systems, ANZIIS-96, Adelaide, Nov 18-20, 1996, pp. 143-146.
[9] Prasantha, H.S., Shashidhara, H., Murthy, K.N.B., Madhavi Lata, "Medical Image Segmentation", (IJCSE) International Journal on Computer Science and Engineering, 02(04), 2010, 1209-1218.
[10] Zhiwu Lu, "Unsupervised Image Segmentation using an Iterative Entropy Regularized Likelihood Learning Algorithm", J. Wang et al. (Eds.): ISNN 2006, LNCS 3972, pp. 492-497, Springer-Verlag Berlin Heidelberg, 2006.
[11] Zhiwu Lu, Xiaoqing Lu, and Zhiyuan Ye, "Entropy Regularization, Automatic Model Selection, and Unsupervised Image Segmentation", Z.-H. Zhou, H. Li, and Q. Yang (Eds.): PAKDD 2007, LNAI 4426, pp. 173-182, Springer-Verlag Berlin Heidelberg, 2007.
[12] Zhiwu Lu, Yuxin Peng and Jianguo Xiao, "Unsupervised Learning of Finite Mixtures Using Entropy Regularization and its Application to Image Segmentation", Institute of Computer Science and Technology, Peking University, Beijing, China, IEEE, 2008.
[13] Sunhee Kim, Seungmi Oh and Myungjoo Kang, "Mathematical Image Processing for Automatic Number Plate Recognition System", 14(1), 57-66, 2010.
[14] Harvey A. Cohen, "Parallel algorithm for gray-scale image segmentation", Proc. Australian and New Zealand Conf. on Intelligent Information Systems, ANZIIS-96, Adelaide, Nov 18-20, 1996, pp. 143-146.
[15] Raj Kumar, G.V.S., Srinivas Yarramalle, Tejaswi, A., Naveen Kumar, K., Chhajar Sunanda, "Image Segmentation by using Finite Bivariate Doubly Truncated Gaussian Mixture Model", International Journal of Engineering Science and Technology, 2(4), 2010, 694-703.
[16] Ramana Reddy, B.V., M. Radhika Mani, V. Vijaya Kumar, "A Random Set View of Texture Segmentation", International Journal of Signal and Image Processing, Vol. 1, Iss. 1, 2010, pp. 24-30.
[17] Salem Saleh Al-amri, N.V. Kalyankar and Khamitkar S.D., "Image Segmentation by using Edge Detection", (IJCSE) International Journal on Computer Science and Engineering, 02(03), 2010, 804-807.
[18] Vijaya, G., V. Vasudevan, "A Simple Algorithm for Image Denoising Based on MS Segmentation", International Journal of Computer Applications (0975-8887), 2(6), 2010.
RESEARCH ARTICLE

A Novel Hybrid Feature Extraction Model for Classification on Pulmonary Nodules

S Piramu Kailasam1*, M Mohamed Sathik2

1Research Scholar, Research and Development Centre, Bharathiar University, Coimbatore, 2Principal, Sadakathullah Appa College, Tirunelveli, Tamil Nadu, India. *For Correspondence: [email protected]

Editorial Process: Submission: 04/16/2018; Acceptance: 01/10/2019

Asian Pac J Cancer Prev, 20 (2), 457-468. DOI: 10.31557/APJCP.2019.20.2.457
Abstract
In this paper an improved Computer Aided Diagnosis system offers a second opinion to radiologists for the early diagnosis of pulmonary nodules in CT (Computed Tomography) images. A Deep Convolutional Neural Network (DCNN) method is used for feature extraction and hybridized as a combination of Convolutional Neural Network (CNN), Histogram of Oriented Gradients (HOG), Extended Histogram of Oriented Gradients (ExHOG) and Local Binary Pattern (LBP) features. A combination of shape, texture, scaling, rotation and translation features is extracted using HOG, LBP and CNN. The homogeneous descriptors used to extract the features of lung images from the Lung Image Database Consortium (LIDC) are given to the classifiers Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Decision Tree and Random Forest to classify nodules and non-nodules. Experimental results demonstrate the effectiveness of the proposed method in terms of accuracy, which is better than that of the competing methods.
Keywords: deep learning; convolutional neural network; feature descriptor; classification
Introduction

Lung cancer is one of the riskiest malignancies, killing more people than breast, colon and prostate cancers combined (Niranjana and Ponnavaikko, 2017). In Japan, lung cancer was the leading cause of cancer deaths among men (25%) in 2015; globally, statistical reports attribute roughly 80% of cases to cigarette smoking as the root cause. Early-stage diagnosis can improve the survival period. Pulmonary nodules, spherical or oval masses, are the origin of cancer. Computed Tomography (CT) is better in image quality, level of detail and resolution than other modalities, and a cancerous lung nodule, a small mass of tissue that appears round in shape, requires treatment. Computer-aided diagnosis in clinical procedures can support the decision on nodule malignancy (Rabia et al., 2017). The Lung Image Database Consortium dataset is used. Pre-processing of the nodule is a complicated task; a Convolutional Neural Network in deep learning pre-processes the lung CT images automatically, leading to superior performance. A good feature extraction technique and feature descriptor should be capable of extracting the features required for nodule recognition, and the deep learning convolutional neural network architecture extracts nodule features hierarchically by using multiple layers of convolution and max pooling. Deep learning has been shown to extract feature patterns and classify nodules, and modifying its architecture improves classification time. Deep learning techniques
can potentially change the design paradigm of such systems without the need for explicit design (Jie-Zhi et al., 2016). The heterogeneous features of the CNN, as well as the feature descriptors, score well against radiologist ratings (Sihong et al., 2016). The deep learning ConvNet technique is well matched to the false-positive reduction stage of a CAD system (Goharian et al., 2003), and the low computation time of ConvNets makes them a useful decision aid in lung cancer screening (Apte et al., 1997). In the proposed framework, in order to understand lung nodule features, heterogeneous computational features derived from the convolutional neural network (CNN), as well as the general low-level histogram of oriented gradients (HOG), Extended Histogram of Oriented Gradients (ExHOG) and Local Binary Pattern features, exactly detect the nodules in CT pulmonary images. Furthermore, the features extracted from the hybridized descriptors are classified using the SVM, KNN, Decision Tree and Random Forest classifiers.
Related Works
Pulmonary lung images are addressed in four stages, in order of occurrence: Deep learning Convolutional Neural Network (DCNN) architecture formation, image enhancement, feature extraction and classification.
The DCNN is an automated pre-processing architecture, so no separate pre-processing is needed. In deep learning, images or videos may profit from pre-processing, which makes the network's job much easier (Jodogne and Piater, 2007).
A dominant, dictionary-like texture descriptor, the texton, has been proposed as a feature (Beijbom et al., 2012). Eduardo et al. used wavelet filters from the OpenCV library to extract texture features, and counts of corals have been determined by extracting texture features with the LBP descriptor, which uses information from the eight directions around a pixel (Huan et al., 2010). CNNs were introduced by LeCun et al. (1998); a CNN allows multiple features to be extracted at each hidden layer, so multiple characteristics can be extracted while the learning process assigns weights appropriately to significant features, automatically performing the difficult task of feature engineering hierarchically. Shape features (Ashis et al., 2016) can easily be extracted by the HOG (Histogram of Oriented Gradients) feature descriptor (Chen et al., 2014); another method performs whole-human detection using the ExHOG (Extended Histogram of Gradients) feature descriptor (Amit et al., 2011). The Support Vector Machine (SVM) produces good classification for medium-sized data sets. K-Nearest Neighbour (KNN) with Euclidean distance has been used as a classifier and has reported an accuracy of 80%; upon comparison, the decision tree yields the most accurate performance in that setting while SVM performs poorly. Decision Trees (Golmohammadi et al., 2007) and Random Forests (Golmohammadi et al., 2007; Hayat and Khan, 2012a) are other classification methods. In this paper various feature descriptors and classifiers are used to extract the lung image nodules and classify them with improved accuracy in a deep learning environment.
Materials and Methods
Data set
The proposed classification framework and the competing deep learning technique are evaluated over a data set of 467 nodules from the Lung Image Database Consortium (LIDC) public database. Each lung CT slice consists of 512 x 512 pixels, with pixel resolution from 0.5 to 0.8 mm. The nodule, nodule id, slice number and diameter are available. The LIDC dataset is publicly accessible to the medical imaging research community (Apte et al., 1997) and contains 1,018 pulmonary CT scans originating from 1,010 patients, with 243,958 images in total.
Deep Learning
Deep learning is a cascade of nonlinear transformations in hierarchical mode. The convolutional network (ConvNet), a feedforward deep architecture with end-to-end learning, is applied here. In a convolutional net each layer can visualize features and has high variance (Tara et al., 2013). At test time a ConvNet runs only in forward mode, and all layers are differentiable. A ConvNet tries to match every possible piece of the image: the convolution layer linearizes the manifold and the pooling layer collapses it. In a convolutional layer the size of the output is proportional to the number of filters and depends on the stride (Prajwal et al., 2016). Assume a filter is an eye detector. If the kernels have size C x C, the input has size F x F, the stride is 1 and there are M input feature maps and N output feature maps, then the input has size M@F x F, the output has size N@(F-C+1) x (F-C+1), the kernels have M x N x C x C coefficients to be learnt, and the cost is M x C x C x N x (F-C+1) x (F-C+1). The size of the filters has to match the size and scale of the patterns we want to detect. The output size of a pooling layer depends on the stride between the pools: if pools do not overlap, have size C x C, and the input has size F x F with M input feature maps, then the output is M@(F/C) x (F/C) (Qingzheng et al., 2017). A small worked check of these size rules follows.
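A minimal transcription of the two size rules quoted above (stride 1, no padding); the example numbers match the paper's 50 x 50 input and 5 x 5 kernel.

def conv_out(f, c, stride=1):
    # C x C kernel over an F x F map yields (F - C + 1) x (F - C + 1) at stride 1
    return (f - c) // stride + 1

def pool_out(f, c):
    # non-overlapping C x C pools yield (F / C) x (F / C)
    return f // c

assert conv_out(50, 5) == 46   # 5 x 5 filter on a 50 x 50 input
assert pool_out(46, 2) == 23   # 2 x 2 non-overlapping pooling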
$f = f_L \circ f_{L-1} \circ \dots \circ f_1$   (1)

Each function $f_l$ represents a layer which takes the output of the previous layer, denoted by $x_{l-1}$, and computes the output $x_l$ using the parameters $w_l$ of that layer, i.e. $x_l = f_l(x_{l-1}; w_l)$ for $l = 2, 3, \dots, L$.
Feature Extraction
Shape-based features are computed from the biggest axial slice of the nodule (Rangayan et al., 1997). The low-level shape-based features are area, perimeter, compactness, major axis length, minor axis length, equivalent diameter, convex area, convex perimeter and circularity. Low-level shape features are calculated by the HOG and ExHOG feature descriptors, and texture features by the LBP feature descriptor. The high-level features of translation, scaling and rotation are calculated by the convolutional neural network (ConvNet). When the descriptors are hybridized to obtain both low- and high-level features, the resultant representation shows excellent performance in matching trained nodule features.
Blocks of Convolutional Neural Network as Feature Descriptor
In this paper, for the 2D CNN on the LIDC dataset, the first layer of the CNN is a convolutional layer with 20 filters and stride 1, followed by a max pooling layer of size 2 x 2 with stride 1. The third layer is also a convolutional layer, with 32 filters and the same stride as layer 1. The first six layers alternate convolution layer 1, max pooling layer 1, convolution layer 2, max pooling layer 2, convolution layer 3, max pooling layer 3, the fifth layer also having 32 filters. The seventh layer is an activation layer with ReLU (Rectified Linear Unit) and the eighth layer is again a convolutional layer with 40 filters of size 4 x 4 x 32. The last layer is a softmax operator. The parameters of the filters in the convolutional layers and the sizes of the max pooling operators must be consistent to allow meaningful computations. For our datasets, each input image of size 50 x 50 x 1 leads to an output of size 1 x 1 x 2 after forward propagation through the 9 layers (Yu et al., 2016; Qingzeng et al., 2017). A sketch of this layer sequence follows.
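One plausible Keras reading of the layer sequence described above, not the authors' exact network: the 5 x 5 kernels follow the "kernel size is 5" remark in the Results section, and the final global-average-pooling step is an assumption added so that the model ends in the stated two-way softmax.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(50, 50, 1)),
    layers.Conv2D(20, 5, strides=1),        # layer 1: 20 filters
    layers.MaxPooling2D(2, strides=1),      # layer 2: 2 x 2 max pooling
    layers.Conv2D(32, 5, strides=1),        # layer 3: 32 filters
    layers.MaxPooling2D(2, strides=1),      # layer 4
    layers.Conv2D(32, 5, strides=1),        # layer 5: 32 filters
    layers.MaxPooling2D(2, strides=1),      # layer 6
    layers.Activation("relu"),              # layer 7: ReLU activation
    layers.Conv2D(40, 4),                   # layer 8: 40 filters of 4 x 4
    layers.GlobalAveragePooling2D(),        # assumption: collapse to a vector
    layers.Dense(2, activation="softmax"),  # layer 9: softmax over 2 classes
])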
A CNN consists of convolutional layers characterized by an input image I and a bank of filters K of dimension k1 x k2, with height h, width w and biases b. The output of the convolution procedure is

$(I * K)_{x,y} = \sum_{i=1}^{h} \sum_{j=1}^{w} K_{ij}\, I_{x+i-1,\, y+j-1} + b$   (2)
Classification
The Support Vector Machine is a supervised learning model that is trained with nodule and non-nodule CT images by means of the extracted feature values. The two classes are divided by a hyperplane, derived through a kernel that separates the classes; SVM is also a memory-efficient classifier. The K-Nearest Neighbour classifier (K-NN) is one of the simplest classifiers: it identifies unknown data from previously known data and classifies by a voting system, using more than one nearest neighbour (Wang et al., 2005; 2008; 2010a; 2010b). In KNN classification the output is a class membership; if k = 1 the object is simply assigned to the class of its nearest neighbour.
A Decision Tree predicts responses to data in a tree structure. Each node denotes a test on an attribute value and each branch represents an outcome of the test; decisions follow the tree from the parent node down to a child node, which contains the response. The tree predicts nominal classifications, such as '1' or '2', based on the predictors in the feature-extracted nodule images (Qi et al., 2016). This approach is widely used since it is efficient, deals with both continuous and categorical variables, and generates understandable rules (Apte, 1997). It can also handle missing values in the training data and tolerate some errors in the data; however, decision trees can incur large errors if the number of training examples per class is small. A Random Forest (RF) is a collection of decorrelated decision trees and is good for large data sets; RF is an ensemble of classification and regression trees (CART) (He et al., 2016). The error rate of classification over all the test sets is the out-of-bag estimate of the generalization error (Alif Hayat, 2015). A minimal classifier-evaluation sketch follows.
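A minimal scikit-learn sketch of this classification stage with the paper's 5-fold validation; X and y are placeholders for the stacked hybrid feature matrix and the nodule/non-nodule labels, and the hyperparameters shown are assumptions.

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

classifiers = {
    "SVM": SVC(kernel="rbf"),                   # RBF kernel, as suggested later
    "KNN": KNeighborsClassifier(n_neighbors=1), # K = 1, as in the text
    "DT": DecisionTreeClassifier(),
    "RF": RandomForestClassifier(n_estimators=100),
}

def evaluate(X, y):
    # mean 5-fold cross-validated accuracy per classifier
    return {name: cross_val_score(clf, X, y, cv=5).mean()
            for name, clf in classifiers.items()}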
Pseudo Code of Proposed Method
// Consider CT slice i with the given nodule region (i-2, i-1, i, i+1, i+2).
// Train nodules 1...n from the 3D CT images of LIDC. Extract the features of
// each nodule using LBP, HOG, CNN and ExHOG.
Begin
  d(s) <- 1...n
  while (d(s) != eof) {
    HOGfea   <- HOG(nod_reg(i-2, i-1, i, i+1, i+2))
    EXHOGfea <- EXHOG(nod_reg(i-2, i-1, i, i+1, i+2))
    CNNfea   <- 2DCNN(f_L o f_(L-1) o ... o f_1)
HOG (Histogram of Oriented Gradients) Feature Descriptor
The image, resized to 28 x 28, is decomposed into 6 x 6 overlapping blocks, where each block is a 2 x 2 cell arrangement with a stride of 4 pixels. The number of bins is 9, giving a total of 1,296 HOG features computed as low-level shape features. To improve accuracy the local histograms can be contrast-normalized: each block, a larger region of the image, accumulates the intensity measurement over its local cells, so the pulmonary image gains good intensity and shadowing behaviour under normalization. Opposite directions are put into the same bin and counted as the same orientation, so the angle range is 0 to 180 degrees. To compute the image gradient in the horizontal and vertical directions, the filters (-1 0 1) and (-1 0 1)^T are used (Amit Satpathy, 2011).
The gradient magnitude M of each pixel (x, y) and its direction theta are computed as

$M(x,y) = \sqrt{I_x^2 + I_y^2}$   (3)

$\theta = \tan^{-1}\!\left(\frac{I_y}{I_x}\right)$   (4)

where theta lies between 0 and $2\pi$, and $I_x$ and $I_y$ are the gradients in the x and y directions. A short sketch follows.
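A minimal scikit-image sketch of this step; the cell/block choices are picked so that the feature count matches the 1,296 quoted above, but the exact block layout of the paper is not fully specified.

from skimage.feature import hog
from skimage.transform import resize

def hog_features(nodule):
    # 28 x 28 image with 4 x 4 cells gives 7 x 7 cells; sliding 2 x 2-cell
    # blocks give 6 x 6 blocks, hence 6 * 6 * 2 * 2 * 9 = 1,296 features.
    patch = resize(nodule, (28, 28))
    return hog(patch, orientations=9,
               pixels_per_cell=(4, 4), cells_per_block=(2, 2))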
ExHOG (Extended HOG) Feature Descriptor
ExHOG is a feature descriptor which extracts HOG features around corner points. The nodule image is read and divided into two halves, A and B. Through the histogram image, the values A+B and A-B are calculated; the strongest corners in the image are detected, and the difference of (A+B) and (A-B) gives the ExHOG feature descriptor, with (I, Strongest) corresponding to (ExHOG, Validpoints, p). The bin value is 18 and the window is 6 x 6, so 648 bins are formed in total. The Extended Histogram of Gradients (ExHOG) is an excellent feature descriptor which concatenates the Difference of HOG and the HOG, of L/2 bins each, to form a histogram of L bins for each cell in the image window (Amit, 2011). The feature block is then concatenated and normalized by a clipping method.
The equation summarizing the steps is

$h_x(i) = \begin{cases} \lvert h_y(i) + h_y(i + L/2) \rvert, & 1 \le i \le L/2 \\ \lvert h_y(i - L/2) - h_y(i) \rvert, & L/2 + 1 \le i \le L \end{cases}$   (5)

where $h_x(i)$ is the i-th bin value of ExHOG, $h_y(i)$ is the i-th bin value of HG, and L is the number of bins in HG and ExHOG respectively. The features from all the blocks are collected over the image window to form a combined feature vector, as in the sketch below.
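A direct transcription of Eq. (5) for one cell's histogram; the input hy is assumed to be a signed-gradient HOG histogram with an even number of bins.

import numpy as np

def exhog_from_hog(hy):
    # hy: HG histogram of L bins (L even) for one cell
    L = hy.size
    half = L // 2
    sums = np.abs(hy[:half] + hy[half:])    # |h(i) + h(i + L/2)|, 1 <= i <= L/2
    diffs = np.abs(hy[:half] - hy[half:])   # |h(i - L/2) - h(i)|, L/2 < i <= L
    return np.concatenate([sums, diffs])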
Local Binary Pattern
Local Binary Pattern (LBP) is an efficient technique for texture feature extraction. Each pixel is compared with its neighbour pixels, and the binary effect of the local neighbours is encoded by a simple threshold function (Ani Brown Mary et al., 2017): if the gray value of a neighbouring pixel is higher than or equal to that of the centre pixel, the value is set to one, otherwise to zero. If k is the total number of neighbouring pixels, the LBP feature has $2^k$ dimensions; this gives 256 patterns for eight neighbour pixels and 65,536 patterns for 16 neighbour pixels, so the feature vector grows with the neighbourhood size. The histogram of the binary patterns obtained over a region of the pulmonary image is taken as the feature descriptor (Armato et al., 2000):

$LBP = \sum_{p=0}^{k-1} s(g_p - g_c)\, 2^p, \quad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$   (6)

where $g_p$ is a neighbour pixel and $g_c$ is the central pixel. A minimal sketch follows.
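A minimal scikit-image sketch of this descriptor: per-pixel codes over 8 neighbours, histogrammed into the 256 bins mentioned above.

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(image, p=8, r=1):
    codes = local_binary_pattern(image, P=p, R=r)   # per-pixel LBP codes
    hist, _ = np.histogram(codes, bins=2 ** p, range=(0, 2 ** p))
    return hist / hist.sum()                        # normalised descriptor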
    LBPfea <- LBP(nod_reg(i-2, i-1, i, i+1, i+2))
    // Store features of nodules in FV
    FV  <- (LBPfea, HOGfea, CNNfea, EXHOGfea)
    FV1 <- (CNNfea+LBPfea, CNNfea+HOGfea, CNNfea+EXHOGfea, ...)
    CL  <- AssignClassLabel(nodule region 1...n)
    // Test the data using k-fold validation (k = 5) and classify
    Classify <- (SVMClassify(FV1, CL, testnodreg), RF(FV1, CL, testnodreg),
                 DT(FV1, CL, testnodreg), KNN(FV1, CL, testnodreg))
    for j <- 1 to n
      if Classify(j) = 1 print nodule
      else print nonnodule
    endfor
  }
End
Results
In this study the convolutional neural network was built in a deep learning framework and its accuracy was calculated. A total of 467 lung nodule images and 131 non-nodule images were used, with the properties learnt on the training set applied to the test data. A five-fold validation is used over the combinations {(10,90), (20,80), (30,70), (40,60), (50,50)}, where each pair denotes (train, test) percentages. All pulmonary images are tested with the CNN, and the same is applied to the HOG, ExHOG and LBP descriptors.
The total execution times of the hybridized techniques LBP, HOG, ExHOG and CNN with the classifiers SVM, KNN, DT and RF are tabulated separately in Tables 1 to 4. The time for SVM_ExHOG_CNN is 5.332 s, which is higher than the KNN_LBP execution time of 2.233 s, the least value.
Tables 5 to 8 display the confusion matrices obtained. A sample image from each combination is used to create the test set, while the remaining samples construct the training set; in total 598 samples are used for training and 420 for testing, and the same measures are calculated. From these matrices the numbers of instances predicted accurately and inaccurately are easily read. From Table 5, SVM with the HOG, HOG_CNN and CNN_LBP descriptors obtained the least false positive value of 14. The proposed best hybridized feature descriptor, EXHOG_CNN with the SVM classifier, is highlighted, and the corresponding TP, TN, FP and FN values are 452, 118, 15 and 13 respectively. The other confusion matrix results are displayed similarly.
Comparing Different Combinations of Methods
The highest precision value is 97.23%, for SVM_ExHOG_CNN. The highest recall value among the 36 methods is 99.78%, for the KNN_HOG method.
Convolutional Neural Network
In the deep network, during the training stage the convolution layers consist of 20 filters. The kernel size is 5 and the pooling layer has a kernel size of 2; convolution and pooling layers are arranged alternately three times.
Figure 1. Nodule ≥ 3 mm
Figure 2. Non-Nodule ≥ 3 mm
Table 1. Time Taken by SVM Classifier Using LBP, HOG, ExHOG, CNN

SVM_LBP            4.265 s
SVM_HOG            4.199 s
SVM_ExHOG          5.936 s
SVM_CNN            3.969 s
SVM_HOG_CNN        4.111 s
SVM_ExHOG_CNN      5.332 s
SVM_CNN_LBP        4.059 s
SVM_ExHOG_CNN_LBP  5.908 s
SVM_HOG_CNN_LBP    12.386 s
Table 2. Time Taken by KNN Classifier Using LBP, HOG, ExHOG, CNN

KNN_LBP            2.233 s
KNN_HOG            3.414 s
KNN_ExHOG          4.902 s
KNN_CNN            4.006 s
KNN_HOG_CNN        7.518 s
KNN_ExHOG_CNN      4.994 s
KNN_CNN_LBP        2.387 s
KNN_ExHOG_CNN_LBP  5.296 s
KNN_HOG_CNN_LBP    3.837 s
Table 3. Time Taken by RF Classifier Using LBP, HOG, ExHOG, CNN

RF_LBP            11.999 s
RF_HOG            22.758 s
RF_ExHOG          34.978 s
RF_CNN            15.496 s
RF_HOG_CNN        25.398 s
RF_ExHOG_CNN      36.273 s
RF_CNN_LBP        15.630 s
RF_ExHOG_CNN_LBP  38.139 s
RF_HOG_CNN_LBP    25.398 s
The ReLU layer is the activation layer, and the softmax output layer extracts the nodule features as shown in Figure 19. The average accuracy is 84.59% with a maximum of 1,770 iterations in CNN feature extraction.
Hybridized Feature Descriptors
The HOG, ExHOG and LBP feature descriptors provide low-level features, namely shape and texture. The high-level features of scaling, translation, shearing and rotation are taken from the CNN. By combining the low-level and high-level features we obtain the shape features of the nodule as a heterogeneous feature computation (Sihong et al., 2016). The hybridized feature descriptor outputs of CNN+HOG, CNN+ExHOG and CNN+LBP are taken and then classified by the SVM, KNN, Random Forest and Decision Tree classifiers.
The combination of CNN and LBP is written as

$CNN\_LBP = \sum_{i=1}^{n} \sum_{j=1}^{w} K_{ij}\, I_{x+i-1,\, y+j-1} + \sum_{p=0}^{k-1} s(g_p - g_c)\, 2^p$   (7)

$LBP_{P,R} = \frac{(I * K)_{x,y} - (\hat{I} * K)_{x,y}}{K_{x_c - x + 1,\, y_c - y + 1}}$   (8)

$LBP_{P,R} = \sum_{p=0}^{k-1} s(g_p - g_c)\, 2^p = I(x_c, y_c), \quad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$   (9)
Figure 3. Sample Trained Data
Figure 4. CNN Architecture from the Input Image of Size 48X48
Figure 5. Four Diagonal Pairs of Neighbours of Centre Pixel Xij
Figure 6. Block Bk of Matrix 5X5 with the Pixel Xij and Its Neighbours
Table 4. Time Taken by DT Classifier Using LBP, HOG, ExHOG, CNN

DT_LBP            2.372 s
DT_HOG            4.234 s
DT_ExHOG          7.229 s
DT_CNN            4.185 s
DT_HOG_CNN        4.579 s
DT_ExHOG_CNN      7.218 s
DT_CNN_LBP        7.390 s
DT_ExHOG_CNN_LBP  7.725 s
DT_HOG_CNN_LBP    4.923 s
The hybridized CNN and HOG becomes

$(I_{m,\theta} * K)_{x,y} = \sum_{i=1}^{n} \sum_{j=1}^{w} K_{ij}\, M_{x+i-1,\, y+j-1}\; e^{i\theta(x+i-1,\, y+j-1)}$   (10)

Next, HOG and LBP are expressed mathematically as

$I_{x,y} = M(x,y)\, e^{i\theta(x,y)}$   (11)

$LBP = I_{x_c, y_c} = I\big(M(x_c, y_c),\, \theta(x_c, y_c)\big)$   (12)

$LBP = M(x_c, y_c)\, e^{i\theta(x_c, y_c)}$   (13)

where the LBP of an image can be converted into a magnitude M and an angle θ.
Table 5. Confusion Matrix of SVM Classifier with the Feature Descriptors (each method: top row and bottom row of its 2 x 2 confusion matrix)

SVM_LBP            430  37 /  27 104
SVM_CNN            429  38 /  45  86
SVM_ExHOG          451  16 /  26 105
SVM_HOG            453  14 /  20 111
SVM_HOG_CNN        453  14 /  15 116
SVM_HOG_CNN_LBP    435  32 /  24 107
SVM_ExHOG_CNN_LBP  434  33 /  24 107
SVM_CNN_LBP        453  14 /  15 116
SVM_ExHOG_CNN      452  15 /  13 118
Figure 7. Multi Classifiers with Hybrid Descriptors
Table 6. Confusion Matrix of Random Forest Classifier with the Feature Descriptors (each method: top row and bottom row of its 2 x 2 confusion matrix)

RF_LBP            464   3 /  41  90
RF_CNN            448  19 /  23 108
RF_ExHOG          466   1 /  55  76
RF_HOG            464   3 /  29 102
RF_HOG_CNN        456  11 /  28 103
RF_HOG_CNN_LBP    435  32 /  24 107
RF_ExHOG_CNN_LBP  457  10 /  29 102
RF_CNN_LBP        453  14 /  20 111
RF_ExHOG_CNN      456  11 /  33  98
Figure 8. HOG - Histogram of Nodule Image
Figure 9. ExHOG-Histogram of Nodule Image
Figure 10. LBP-Histogram of Trained Nodule
Analysis of Classification with Different Combinations of Feature Descriptors
The classifiers SVM, K-NN, Decision Tree and Random Forest are used for the classification of nodules and non-nodules. The patient id, nodule, nodule id, slice number, z-position and diameter are the data collected from the LIDC dataset. The TN and TP values are determined by the hyperplane distance in the SVM classifier. All data are tested and trained: separate training and test data sets are constructed using the k-fold cross-validation approach with k = 5. In this validation 467 nodules and 131 non-nodules are validated, implemented in Matlab 2017a. A small round or oval growth in the lung of diameter less than or equal to 3 mm is a nodule; in the region of interest (ROI), a nodule <= 3 mm is assigned the value one, otherwise zero. The nodule intensity patch is 21 x 21. In the CT input DICOM images, the nodule region of each of five slices, taken as p2, p1, n, n1, n2, is trained, where n is the region of interest. The Positive Predictive Value (PPV) and Negative Predictive Value (NPV) are calculated from the confusion matrix. The PPV value of SVM_LBP is 92%. The PPV value of SVM_CNN is
Table 7. Confusion Matrix of Decision Tree Classifier with the Feature Descriptors (each method: top row and bottom row of its 2 x 2 confusion matrix)

DT_LBP            424  43 /  31 100
DT_CNN            422  45 /  33  98
DT_ExHOG          423  44 /  26 105
DT_HOG            422  45 /  27 104
DT_HOG_CNN        424  43 /  34  97
DT_HOG_CNN_LBP    429  38 /  37  94
DT_ExHOG_CNN_LBP  417  50 /  35  96
DT_CNN_LBP        418  49 /  35  96
DT_ExHOG_CNN      418  49 /  35  96
Table 8. Confusion Matrix of KNN Classifier with the Feature Descriptors (each method: top row and bottom row of its 2 x 2 confusion matrix)

KNN_LBP            449  18 /  33  98
KNN_CNN            431  36 /  24 107
KNN_ExHOG          451  16 /  24 107
KNN_HOG            466   1 /  34  97
KNN_HOG_CNN        445  12 /  29 102
KNN_HOG_CNN_LBP    449  18 /  33  98
KNN_ExHOG_CNN_LBP  449  18 /  33  98
KNN_CNN_LBP        443  24 /  26 105
KNN_ExHOG_CNN      451  16 /  24 107
Figure 11. Performance Metrics Analysis of CNN with Classifiers
Figure 12. Performance Metrics Analysis of LBP with Classifiers
Figure 13. Performance Metrics Analysis of HOG with Classifiers
Figure 14. Performance Metrics Analysis of ExHOG With Classifiers
91.8%, less than that of SVM_LBP. The PPV values of SVM_HOG_CNN, SVM_ExHOG and SVM_HOG_CNN_LBP are 97%, 96.5% and 93% respectively. The NPV value of SVM_LBP is 79.3%; SVM_CNN, SVM_LBP, SVM_ExHOG and SVM_HOG_CNN_LBP have NPV values of 65.64%, 79.3%, 80.15% and 81.6% respectively.
The outputs of the hybridized feature descriptors CNN, LBP, HOG and ExHOG, combined in various patterns, are classified by the SVM classifier as shown in the confusion matrices (Tables 5 to 8), and the results show that the proposed method performs well. The TP, TN, FP and FN values are taken from these confusion matrices. The performance of the SVM classifier is calculated using various metrics such as accuracy, sensitivity, specificity, Cohen's kappa, F-score, precision and recall. The accuracy of classification based on the SVM classifier for both training and test data was evaluated by a chi-square test at a fixed confidence level. The highest accuracy value, for SVM_ExHOG_CNN, is 95.32%. The performance of the SVM classifier is compared with the KNN, Decision Tree and Random Forest
Figure 15. Performance metrics analysis of HOG_CNN with Classifiers
Figure 16. Performance Metrics Analysis of CNN_LBP with Classifiers
Figure 17. Performance Metrics Analysis of ExHOG_CNN with Classifiers
Figure 18. Performance Metrics Analysis of ExHOG_CNN_LBP with Classifiers
Figure 19. The Output Layers of CNN for a Given Nodule
Classifier. The KNN classifier classifies nodules and non-nodules with a K value of one, obtaining an accuracy of 94.15%. A decision tree is a non-parametric supervised learning method generally used in data mining; the goal is to create a model that predicts the target value using decision rules, with pruning, inferred from the data features. The Decision Tree does not work well for this simple classification. The Random Forest uses a number of decision trees, depending on the values of a random vector for all trees in the forest, in order to improve the classification rate for pulmonary nodule images. K-NN performs at a medium level in the classification of the two classes.
Time
Among the KNN classifier techniques the minimum time, 2.233 s for KNN_LBP, is tabulated with the minimum value in bold. DT_LBP gives a time of 2.372 s, lower than the SVM classifier. RF_LBP gives a time of 11.999 s, higher than the other classifiers.
Accuracy
The overall accuracy is calculated for the combinations SVM_LBP, SVM_HOG, SVM_ExHOG, SVM_CNN, SVM_ExHOG_CNN, SVM_CNN_LBP and SVM_HOG_CNN. The results show that the accuracy values of SVM_HOG_CNN and SVM_CNN_LBP are the same, 95.15%, and the highest accuracy is that of SVM_ExHOG_CNN at 95.32%. The SVM_ExHOG_CNN value is greater than those of the SVM_HOG, SVM_ExHOG, SVM_CNN, SVM_CNN_LBP and SVM_HOG_CNN combinations; it is noted that the SVM classifier improves with CNN+ExHOG over CNN+HOG. The accuracy of the KNN classifier is good with the HOG descriptor, at 94.15%. The Random Forest classifier's best accuracy, with HOG, is 94.65%, and its other combinations exceed 90.64%. The Decision Tree classifier with the ExHOG combination gives its best accuracy of 88.3%.
$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$   (14)

where TP (true positive) counts correctly classified foreground pixels, FP counts background pixels incorrectly classified as foreground, FN (false negative) counts foreground pixels incorrectly classified as background, and TN counts correctly classified background pixels.

Specificity
This is the ability of the classifier to predict the true
Figure 20. Overall Accuracy
Figure 21. Accuracy Vs Iteration for Proposed Convolutional Neural Network (ConvNet)
negatives, given by

$\text{Specificity} = \frac{TN}{FP + TN}$   (15)

The specificity of the SVM classifier is good, since true negatives are mostly identified. The specificity of the SVM_ExHOG_CNN combination is 95.03%, higher than that of the other classifiers KNN, Decision Tree and Random Forest in their combinations; the value for RF_ExHOG is 57.95%, owing to its misclassification rate. For specificity, the maximum value among the results is taken: the specificity values of KNN, Decision Tree and Random Forest are 81.77%, 80.17% and 79.52% respectively.
Sensitivity
The sensitivity of both RF_ExHOG and KNN_HOG is 99.78%, higher than that of the other classifiers.

$\text{Sensitivity} = \frac{TP}{TP + FN}$   (16)
Cohen's Kappa
The Cohen's kappa values of SVM_HOG_CNN and SVM_CNN_LBP are the same, 85.78%, and the value for RF_HOG is 82.9%.

$\text{Kappa coefficient } (K) = \frac{n_a - n_e}{n - n_e}$   (17)

where $n_a$ is the number of agreements, $n_e$ is the number of agreements due to chance, and n is the number of samples.
F score
The F score is a benchmark metric which measures image classification accuracy by considering both recall and precision; its best value is one and its worst is zero.

$\text{F score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$   (18)

$\text{Precision} = \frac{\text{Number of images classified accurately}}{\text{Total number of images classified}}$   (19)

$\text{Recall} = \frac{\text{Number of images classified accurately}}{\text{Total number of images in the database}}$   (20)

The F score is a measure related to sensitivity; it is generally lower for classes with fewer samples and higher for classes with larger samples. The F score of the SVM_ExHOG_CNN classifier is 97%, and the F scores of SVM_CNN_LBP and SVM_HOG_CNN are equal at 96.9%. The Random Forest classifier scores a highest value of 96.88%, while SVM_CNN obtains the least value, 65.78%. A small sketch recomputing these metrics follows the figure captions below.
Figure 22. Comparison of Specificity
Figure 23. Comparison of F Score
Discussion
Many researchers have used the LIDC-IDRI dataset to classify nodules. In one approach, a combination of shape, margin and texture based features is used for classifying pulmonary nodules, which are segmented with a semi-automated technique and classified with an SVM classifier to attain good accuracy; however, Ashis Kumar et al. (2016) did not consider the maximum correlation coefficient and provided only a seed point for the segmentation of nodules. Prajwal et al. (2016) dealt with the ConvNet architecture. ConvNets, a type of CNN, have special properties such as spatial invariance and allow multiple feature extraction with improved accuracy; ConvNets are much better than ANN architectures in accuracy. The results of Qing Zeng et al. (2017) show that the CNN network achieved the best accuracy of 84.15%, the best result among the CNN, DNN and SAE networks. Huan et al. (2010) extracted 14 GLCM textural features to analyze CT images of benign and malignant pulmonary nodules; the authors suggested that additional variables are needed to explain the texture features in CT images of benign and malignant nodules, and used a multilevel binomial logistic prediction model on pulmonary features of CT images. Goncalves et al. (2016) approached 3D lung nodule segmentation with a Hessian based matrix; the proposed method, validated with 569 solid nodules, provided accurate pulmonary nodule volumes for posterior characterization, and shape index and curvedness measures are calculated for juxtavascular nodules. Arnaud et al. (2016) detected nodules with a multiview convolutional neural network that automatically learnt discriminative features from the LIDC data, with a sensitivity of 85.4% at 1 false positive per scan and low computation time. Multitask learning has been used to leverage heterogeneous computational features derived from the deep learning models Stacked Denoising Autoencoder (SDAE) and CNN, as well as hand-crafted Haar-like and HOG features, for the description of the semantic features of a lung nodule. The gap between clinical remark concepts and low-level image features is the major factor that limits retrieval accuracy; a computational mapping between clinical remark terms and image content, defining similarity measures at the semantic level, can help to improve CBIR performance. In this method the retrieval is based on overall similarity or on each specific semantic feature.
In the proposed work, it is observed that the tree-based classifiers, Decision Tree and Random Forest, with hybrid feature vectors in a deep-learning setting produce a lower classification rate, approximately 86-94%, compared to the SVM and KNN classifiers. SVM handles the classification of nodule images well, with an accuracy of 95.32%, and KNN also reaches an accuracy of 94.15%. It is noticed that SVM with an RBF kernel can be an appropriate classifier for large datasets.
In conclusion, in a deep-learning environment the proposed hybridized feature descriptor is found to be the best for the LIDC pulmonary image dataset. This paper presents the test results of classification on the pulmonary image dataset. The feature descriptors CNN, LBP, HOG and ExHOG are hybridized in two- or three-way combinations with the single classifiers SVM, KNN, DT and RF. It may be noted that SVM_CNN_LBP and SVM_HOG_CNN have a classification accuracy of 95.15%, comparable to that of SVM_ExHOG_CNN. SVM_ExHOG_CNN achieves the highest classification accuracy of 95.32%, with an execution time about 3 s more than KNN_LBP's 2.233 s; in terms of accuracy, SVM_ExHOG_CNN is 10.04% higher than DT_HOG_CNN. Among the 36 combinations tried, the execution time of KNN_LBP is the least, at 2.233 s. Considering all the above points, the SVM classifier is better than the KNN, DT and RF classifiers for the classification of the two classes owing to its higher accuracy. In future, better detection of malignant lung nodules can be achieved by moving from 2D CNN to 3D CNN training with other datasets.
References
Alif Hayat M (2015). Classification of membrane protein types using voting feature interval in combination with Choo’s pseudo amino acid composition. Theor Biol, 384, 78-83.
Amit S, Xudong J, How-Lung E (2011). Extended histogram of gradients with asymmetric principal component and discriminant analyses for human detection. IEEE Canadian Conference on Computer and Robot Vision, pp 64-71.
Ani Brown Mary N, Dejey D (2017). Coral reef image classification employing improved LDP for feature extraction. IVCIR, 49, 225-42.
Apte C, Weiss SM (1997). Data mining with decision trees and decision rules. Future Gener Comput Syst, 13, 197-210.
Armato SG, McLennan G, Bidaut L (2011). The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med Phys, 38, 915-31.
Ashis kumar D, Sudipta M, Anirvan D, Mandeep G, Niranjan K (2016). A combination of shape and texture features for classification of pulmonary nodules in lung CT images. Springer, 29, 466-75.
Beijbom O, Edmunds PJ, Kline DI, Mitchell BG, Krigman D (2012). Automated annotation of coral reef survey images,in: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), providence, Rhode Island, pp 16-21.
Chen J, Chen Z, Ch Z, Fu H (2014). Facial expression recognition based on facial components detection and Hog features. in international workshops on ECE sub fields, pp 884-8.
Corinna C, Vladimir V (1995). Support vector networks. Machine Learning, 20, 273-97.
Eduardo T, Alan R, David ML, Hyxia V, Antonio B (2014).Implementation of a fast coral detector using a supervised machine learning and gabor wavelet feature descriptors, Sensor Systems for a Changing Ocean (SSCO). IEEE, pp1-6.
Golmohammadi KSK, Crowley L, Reformat M (2007). Classification of cell mem- brane proteins. Front Convergence Biosci Inf Technol, 20, 153 -8.
Goncalves L, Novo J, Campilho A (2016). Hessian based approaches for 3D lung nodule segmentation. Exp Sys Appl, 61, 1-15.
Gould MK, Donington J (2013). Evaluation of individuals with pulmonary nodules: When is it lung cancer?: Diagnosis and management of lung cancer: American College of Chest Physicians evidence-based clinical practice guidelines. Chest, 143, 93 –120.
Hayat M, Khan A (2012a). Mem- PHybrid: hybrid features-based prediction system for classifying membrane protein types.
Anal Biochem, 424, 35-44.
He Y, Hengyong Yu (2016). Deep learning for the classification of lung nodules, Conference Paper, Cornell University.
Huan W, Xiu H, Zhong WJ, et al (2010). Multilevel binomial logistic prediction model for malignant pulmonary nodules based on texture features of CT image. Eur J Radiol, 74, 124-9.
Jie-Zhi C, Dong N, Yi-Hong C, et al (2016). Computer aided diagnosis with deep learning architecture:Applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep, 6, 1-13
McWilliams A, Martin MB,Tammemagi C, et al (2013). Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med, 369, 910-9.
Niranjana G, Ponnavaikko M (2017). A review on image processing methods in detecting lung cancer using CT images. International Conference on Technical Advancements in Computers and Communications, pp10-1.
Prajwal R, Nishal A, Perieira RS (2016). Convolutional neural networks for lung cancer screening in computed tomography (CT)scans. IEEE, DOI:10.1109/IC3I.2016.7918014.
Qi D, Hao C, Lequan YU, et al(2016). Multi-level contextual 3D CNNs for false positive reduction in pulmonary nodule detection. Biomed Eng, 64, 1558-67.
Qingzeng S, Lei Z, Xingke L, Xuechen D (2017). Using deep learning for classification of lung nodules on computed tomography images. J Health Care Eng, 2017, 7.
Rabia N, Khurram SA, Tariq B (2017). Recent trends in Computer Aided diagnosis of lung nodules in thorax CT scans. International Conference on Innovations in Electrical EngineeringAnd Computational Technologies, pp 5-7.
Rangayyan RM, El-Faramwy NM, Desautel JL, Alim OA (1997).Measures of acutance and shape for classification of breast tumors. IEEE Trans Med Imaging, 16, 799 -810.
Sihong C, Jing Q ,Xing J (2016). Automatic scoring of multiple semantic attributes with multi-task feature leverage: A study on pulmonary nodules in CT images. IEEE, 36, 802-14.
Tara NS, Abdel-Rahman M, Brian K, Bhuvana R (2013). Deep convolutional neural networks for LVCSR. IEEE Conference on Acoustics, Speech and Signal Processing, DOI:10.1109/ICASSP.2013.6639347.
Vishal KP, Sagar KN, David PN, et al (2013). A practical algorithmic approach to the diagnosis and management of solitary pulmonary nodules. Chest J, 143, 840-6.
This work is licensed under a Creative Commons Attribution-Non Commercial 4.0 International License.
Efficient HAAR Wavelet Transform with Embedded Zerotrees of Wavelet Compression for Color Images
S. Piramu Kailasam, Assistant Professor, Sadakathullah Appa College, India (e-mail: [email protected])
Abstract—This study is expected to compress true color images
with compression algorithms in different color spaces to provide high compression rates. A high compression ratio is needed to reduce storage space; an alternative aim is to rank the compression algorithms in a suitable color space. The dataset is a sequence of true color images of size 128 x 128. The HAAR wavelet, one of the well-known wavelet transforms, has great potential and maintains the quality of color images. A HAAR wavelet transform framework using the Set Partitioning in Hierarchical Trees (SPIHT) algorithm with different color spaces is applied to compress sequences of images taken at different angles. Embedded Zerotrees of Wavelet (EZW) is a powerful standard method for sequence data. Hence the proposed compression framework of HAAR wavelet, xyz color space and morphological gradient with EZW compression obtained improvements over the other methods in terms of Compression Ratio, Mean Square Error, Peak Signal to Noise Ratio and Bits Per Pixel quality measures.
Keywords—Color Spaces, HAAR Wavelet, Morphological
Gradient, Embedded Zerotrees Wavelet Compression.
I. INTRODUCTION
RECENT techniques in the discipline of image analysis produce large numbers of images all the time. Processed images and video occupy a large storage area during download or upload. Similarly, the storage and transmission of such data is a big problem, and there is a necessity to take essential steps. As stated by Parkinson's first law [13], as the need for storage increases, the required transmission capacity also increases. Though cost grows, storage capacity keeps pace with the technology. Data compression dates back to Morse code, later employed in the famous Huffman code. To overcome this challenge, data compression [17] has been introduced to reduce the size of data being stored or transmitted. Eliminating redundancies is the state-of-the-art method for representing information in binary form. The speed of data transfer over WiFi has become important these days because of the millions of users on social media, and storage capacity and computation speed are not in direct proportion. As with a sparse matrix, redundant records are compressed by suitable compression algorithms such as EZW, SPIHT, Spatial Orientation Tree Wavelet (STW), Wavelet Difference Reduction (WDR), Adaptively Scanned Wavelet Difference Reduction (ASWDR) and SPIHT 3-D.
II. RELATED WORKS
In this section, existing work in the discipline of image compression is reviewed. The wavelet transform [5] has an important role in image compression, and the wavelet outperforms the Discrete Cosine Transform (DCT) [10]. Similarly, wavelet based compression is best at limiting decoding errors in the transmission of images because of its high resolution nature [7] and degradation tolerance. In [1], the Huber–Markov random field model with objective artifact reduction achieved good Bits Per Pixel (BPP) and PSNR values.
Alzahir and Borici [4] compressed discrete color images using a codebook row-column reduction coding method, evaluated on maps and binary images.
Babu et al. [2] overcame the drawback of Run Length Coding (RLC) with a matrix-based mapped-pair approach. In addition, the authors analysed the performance metrics of compressed .tiff, .jpg and .gif images and text documents using RMSE, SNR, PSNR and CR. Robustness and compression efficiency [9] are achieved in lossless compression [2] techniques for color images and raster images. Consequently, the DCT domain has been adopted in most video compression techniques [2]; however, the overall performance of DCT degrades at high compression ratios [14]. Moreover, many researchers substantially target the Discrete Wavelet Transform (DWT). DWT [14] suits the human visual system because of its adaptive spatial frequency. At the same time, another compression algorithm, the embedded zerotree, does not need prior knowledge of the image; EZW is good enough in encoding and accuracy for true color images.
In embedded image coding, the use of zerotrees of wavelet coefficients (EZW) in a stream of bits [6] is an efficient lossy image compression technique. Larger EZW coefficients are more important than smaller coefficients regardless of their scale [16]. An embedded code is produced in order of importance and defines a string of binary decisions. In EZW, encoding and decoding may be stopped when the target is reached. EZW [17] is the basic method of the tree structure approach and is an effective embedded image compression that generates a bit stream. SPIHT [11] is ultimately a refinement of EZW and uses the same principle of operation; in evaluations against EZW, SPIHT is high in terms of compression parameters [12]. Alternatively, SPIHT coding operates at different scales on the same spatial region within the wavelet subbands, which yields embedded bits with less MSE. Similarly, SPIHT can also be used as a lossless algorithm [15].
Jamel [21] pointed out that the SPIHT algorithm is computationally simple. Work on the SPIHT algorithm [18] shows that the execution time is reduced when the compression ratio is raised. DWT-SPIHT gives good quality with high PSNR. The wavelet based contourlet transform (WBCT)
using a SPIHT-like algorithm is visually superior to the wavelet SPIHT in preserving data and texture details.
Ajala et al. worked on hybrid compression: the LZW and Huffman image compression strategies help to build hybrid compression [1] for clinical images, and the concatenation of LZW code words with Huffman error-detection techniques is implemented to obtain size reduction, a high compression ratio and a high signal to noise ratio.
The SPIHT algorithm is a powerful wavelet based image compression algorithm in which a tree structure, identified by the coefficients of each node, is chosen to represent the transformed image. Existing work has noted that Run Length Encoding [17] exploits highly redundant data in lossless compression. Pearlman pointed out that the SPIHT [15] method is efficient for wavelet image compression [3]; SPIHT uses arithmetic coding to improve image quality and is suitable for hardware implementations. SPIHT has since become a benchmark algorithm and is one of the best wavelet based coding algorithms. 3D SPIHT coding is an excellent technique for color image compression compared with conventional methods such as EZW, and SPIHT always generates locally optimal values. 3D SPIHT [11], [19] is the modern octal tree technique for three-dimensional true color image compression. In this study, 3D SPIHT with different color conversion methods is applied to different types of images, and the SPIHT algorithm is applied to sequences of images at consecutive angles; it is experimented with on true color as well as gray scale images.
The LZW and Run Length Encoding compression methods are excellent for Tiff documents; LZW is especially suitable for text files. Another technique, LVL_MMC (subband thresholding of coefficients with Huffman encoding), is a level-thresholding wavelet compression method whose subband thresholding parameters are calculated from the differences-between-means method. One of the most important papers on image compression, published in 1977 by Ziv and Lempel, notably addressed compression of all varieties of data. In [20], LZ coding, a dictionary based coding with lossless strategies, is applied to Tiff, Gif, Pdf, Gzip, Zip, V.42 and Png files.
III. MATERIAL AND METHOD
A sequence of portable network graphics (png) files is taken as the input images. A png file contains a single image or multiple images in an extensible structure of chunks, encoding the basic pixels and other textual information. The image size is 128x128.
A. Color Space Based Compression Algorithm
In this study, the three goals are as follows. The first objective is to discover the best color space with the HAAR wavelet based compression techniques EZW, SPIHT, SPIHT 3D and LVL-MMC using true color images. The second objective is to find the best compression method, in terms of compression ratio, for sequences of true color images as well as gray scale images. The third objective is to apply the morphological gradient to the input image to get the best compression ratio and PSNR using the proposed framework.
Fig. 1 Sample True Color Images
The motive of this work is to lessen the storage space and improve the compression ratio using a HAAR wavelet compression approach. The proposed work deals with series of .png images; the png, bmp and tiff file formats are further checked using the color space conversion techniques rgb, yuv, klt, yiq and xyz, where yuv represents the YUV colour space transform, klt the Karhunen-Loeve transform, yiq the YIQ color space transform and xyz the CIE-XYZ color space transform. The XYZ color space transform proves suitable for true color single-object images.
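For illustration, here is a minimal sketch of the CIE-XYZ conversion using the standard sRGB/D65 matrix (gamma linearization is omitted for brevity, and the image is a random placeholder):

# Minimal sketch: converting an RGB image to the CIE-XYZ color space with the
# standard sRGB/D65 conversion matrix; the image below is a random placeholder.
import numpy as np

M_RGB2XYZ = np.array([[0.4124, 0.3576, 0.1805],
                      [0.2126, 0.7152, 0.0722],
                      [0.0193, 0.1192, 0.9505]])

def rgb_to_xyz(img):
    """img: HxWx3 array with values in [0, 1]; returns the HxWx3 XYZ image."""
    return img @ M_RGB2XYZ.T   # apply the 3x3 matrix to every pixel

rgb = np.random.rand(128, 128, 3)   # stand-in for a 128x128 true color image
xyz = rgb_to_xyz(rgb)
print(xyz.shape)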
There are two established categories of compression strategies: progressive coefficient significance methods (PCSM) and coefficient thresholding methods. Embedded Zerotree Wavelet, SPIHT, Spatial Orientation Tree Wavelet, Wavelet Difference Reduction, Adaptively Scanned Wavelet Difference Reduction and SPIHT 3-D are compression techniques that come under the PCSM category. Among these, the proposed work uses the EZW, SPIHT, SPIHT 3-D and LVL-MMC methods, which are suitable for png-type single-object files.
In this section, the technique for assessing the reconstruction and overall performance of the HAAR wavelet when compressing different file formats has been investigated using MATLAB. The schematic diagram of the experiment is shown in Fig. 2. During the evaluation, EZW, SPIHT, SPIHT 3-D and LVL-MMC are examined on gray scale and true color single-object sequence images, and the same strategies are tried on the Tiff, Bmp, Jpg and Png file formats. The Compression Ratio, PSNR and Saving Space % measure the reconstruction performance of an image: the compression ratio represents the compressed image in terms of storage, while PSNR is based on the maximum pixel magnitude of the original image.
B. Morphological Gradient
In order to achieve a good PSNR and compression ratio, the morphological internal gradient method is applied. This internal gradient (Gi) operation forms an edge-enhanced image with 3x3, 6x6 or 12x12 neighbourhoods along the x and y axes. The transitions of image edges
from dark to bright and from bright to dark regions are captured by the morphological gradient procedure, which is as follows:
Step 1. Create a duplicate of the input image; let it be Q.
Step 2. Apply the erosion operation to the input image (P).
Step 3. Edge = Input Image − Eroded Image, i.e., Gi = P − (P ⊖ Q).
Step 4. Display the gradient image.
Step 5. Exit.
Here the image may be either a true color image or a gray scale image.
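A minimal sketch of this internal gradient, assuming a flat 3x3 structuring element and a placeholder grayscale image:

# Minimal sketch: morphological internal gradient Gi = P - (P eroded), assuming
# a flat 3x3 structuring element; the input image is a random placeholder.
import numpy as np
from scipy.ndimage import grey_erosion

def internal_gradient(p, size=(3, 3)):
    """Edge-enhanced image: the original minus its grayscale erosion."""
    return p - grey_erosion(p, size=size)

img = np.random.rand(128, 128)   # stand-in for a 128x128 gray scale image
edges = internal_gradient(img)
print(edges.min(), edges.max())  # gradient is non-negative for a flat element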
IV. RESULT AND DISCUSSION
The SPIHT algorithm gives a good PSNR value, and the values are plotted in the graph below. Fig. 3 shows that as the angle increases from 5 to 85 degrees the PSNR value decreases, and at 90 degrees the curve drops. In addition, from 90 to 160 degrees the PSNR values are in increasing order for the gray scale image, as shown in Fig. 3.
Fig. 2 Schematic diagram (image → apply color spaces to the true color and gray scale image → apply HAAR wavelet and morphological gradient → apply the compression methods EZW, SPIHT, SPIHT 3D and LVL-MMC → apply the best color space and select the best compression method → reconstruct the image with the best color space and compression method)
Fig. 3 PSNR attained for Gray Scale image
Fig. 4 MSE attained for Gray Scale Image
Fig. 4 shows that the MSE value of the gray scale images slowly increases from 5 to 80 degrees and suddenly
increases between 20 and 70 degrees. Then there is a decrement, and it slowly decreases from 100 to 160 degrees. The true color image's MSE value is higher than the gray scale image's with the SPIHT method.
In the SPIHT method, the compression ratio is lower in true color images than in gray scale images. The compression ratio also displays a result similar to the PSNR graph of Fig. 3.
Fig. 5 Space Saving for True Color Image
Even though the lossy EZW coding algorithm reduces the number of bits, it cannot fully avoid loss of information.
SpaceSaving = 1 − (compressed size in bits / uncompressed size in bits) (1)
Space saving is a second quality measure for a compression method, derived from the compression ratio; it gives the numerical value of the storage space saved by the HAAR-EZW method. Equation (1) shows that one minus the ratio between the number of bits in the compressed data and the number of bits in the uncompressed data gives the SpaceSaving value.
The SPIHT algorithm has been applied to true color images. The notable point is that the PSNR values obtained from true color images using SPIHT are greater than those obtained from gray scale images with the SPIHT method.
Fig. 6 Comparison of PSNR and BPP for True Color Image
Fig. 7 Sample compressed gray scale images of SPIHT Method and true color images O1,O2,O3,O4,O5
In order to achieve a good PSNR and compression ratio, the morphological internal gradient method is applied. Fig. 8 shows the results for the test gray scale images. The LVL-MMC compression method is driven by the morphological gradient to increase the PSNR value. The results for the true color
images show that when the BPP value is increased by 0.2, the MSE value decreases and the PSNR value increases markedly; along with this, the compression ratio is reasonably increased. Hence the LVL-MMC subband thresholding method attains a commendable PSNR and compression ratio. Several methods were thus applied to both gray scale and color images, taken in sequence and single forms. The results showed that the gray scale image compresses best, so in the proposed work the true color images are converted to gray scale images of the same size.
Fig. 8 Morphological gradient applied true color and gray scale images
A. Proposed Morphological Gradient with EZW Compression
EZW is an embedded zerotree wavelet compression algorithm in which, after the transformation, most coefficients are close to zero. Among compression methods, the Embedded Zerotrees Wavelet method is regarded as a conventional compression method.
Fig. 9 Morphological Gradient with EZW Compressed Sequence images
Compared to other recent methods, EZW gives appreciable results on sequences of gray scale images. Producing a fully embedded bit stream is one of the good properties of EZW compression, and no training data of any kind are required. EZW is a lossy compression method, shown here again to give a good compression ratio. Here the number of HAAR decomposition levels is seven and the number of encoding loops is taken as nine.
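A full EZW coder is beyond a short listing, but the following sketch (using the PyWavelets package on a placeholder image) shows the seven-level HAAR decomposition on which it operates, together with the coefficient magnitude thresholding that zerotree coding exploits; it is an illustration, not the EZW algorithm itself:

# Minimal sketch: 7-level Haar wavelet decomposition plus coefficient
# thresholding -- the front end that EZW-style zerotree coding builds on.
# This is not a full EZW implementation; the image is a random placeholder.
import numpy as np
import pywt

img = np.random.rand(128, 128)                      # stand-in 128x128 image
coeffs = pywt.wavedec2(img, 'haar', level=7)        # 7 levels, as in the paper

threshold = 0.05
kept, total = 0, 0
new_coeffs = [coeffs[0]]                            # keep the approximation band
for detail_level in coeffs[1:]:
    new_level = []
    for band in detail_level:                       # horizontal, vertical, diagonal
        mask = np.abs(band) >= threshold            # zero out insignificant coeffs
        kept += mask.sum(); total += band.size
        new_level.append(band * mask)
    new_coeffs.append(tuple(new_level))

recon = pywt.waverec2(new_coeffs, 'haar')           # reconstruct from kept coeffs
print(f"kept {kept}/{total} detail coefficients")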
The morphological gradient applied images of Fig. 9 give a good compression ratio, and the quality of the images is also viewable.
In this work, the normalized histogram of the original image, Fig. 10 (c), shows that the storage requirement is higher than that of the compressed image histogram, Fig. 10 (d).
Fig. 10 (a) Original Image, (b) Compressed Gray Scale Image, (c) Normalized Histogram of Original Image, (d) Normalized Histogram
of Compressed Gray Scale Image
V. PERFORMANCE EVALUATION
Firstly, the proposed HAAR-MG-EZW compression method is compared with the SPIHT, SPIHT-3D and LVL-MMC methods for gray scale images in terms of PSNR (dB) and compression ratio. The EZW with morphological gradient compression scheme shows better results than the other methods (Fig. 11).
PSNR = 20 log10(Xmax / RMSE) (2)
where Xmax is the maximum value of the original data Xi and RMSE is the Root Mean Square Error. The Peak Signal to Noise Ratio (PSNR) computes the value in decibels between the true color and gray scale compressed images.
Fig. 11 Evaluation of wavelet based compression methods for HAAR wavelet on gray scale images
The most popular measure of the efficiency of a compression algorithm is the compression ratio (CR). It is defined as the ratio of the number of bits needed to store the uncompressed data to the total number of bits needed to store the compressed data:
CR = uncompressed size in bits / compressed size in bits (3)
Secondly, the HAAR-MG-EZW scheme is compared in terms of BPP for all test images in Fig. 12. The EZW BPP value is 1.67, so for a true color image the BPP ratio is 24:1.67, which is notably higher than for the other methods; the smallest BPP value, 0.45, belongs to the LVL-MMC method, with a BPP ratio of 24:0.45. Here BPP is an absolute value representing the average number of bits needed to encode the image pixel information with the HAAR-MG-EZW compression scheme.
Fig. 12 Bits Per Pixel attained for test images
In Fig. 12, A1, A2, B1, B2, B3, B4, B5 are gray scale sequence images as per Fig. 7.
MSE = (1 / N) Σi (Xi − Yi)² (4)
where Xi is the original data, Yi is the reconstructed data and N is the number of samples.
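As a worked illustration, the quality measures of equations (1)-(4) can be computed as below; the image, its reconstruction and the bit counts are placeholders:

# Minimal sketch: the quality measures of equations (1)-(4) -- space saving,
# PSNR, CR and MSE -- for an 8-bit image and its reconstruction (placeholders).
import numpy as np

def mse(x, y):
    return np.mean((x.astype(float) - y.astype(float)) ** 2)     # eq. (4)

def psnr(x, y):
    return 20 * np.log10(x.max() / np.sqrt(mse(x, y)))           # eq. (2)

def compression_ratio(uncompressed_bits, compressed_bits):
    return uncompressed_bits / compressed_bits                   # eq. (3)

def space_saving(uncompressed_bits, compressed_bits):
    return 1 - compressed_bits / uncompressed_bits               # eq. (1)

orig = np.random.randint(0, 256, (128, 128), dtype=np.uint8)     # placeholder
recon = np.clip(orig + np.random.randint(-3, 4, orig.shape), 0, 255).astype(np.uint8)
print(psnr(orig, recon), compression_ratio(131072, 27000), space_saving(131072, 27000))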
TABLE I
MSE FOR SEQUENCE OF GRAY SCALE IMAGES

Images  HAAR-MG-EZW  HAAR-MG-SPIHT  HAAR-MG-SPIHT 3D  HAAR-MG-LVL_MMC
O1      14.72        73.94          73.94             168.5
O2      14.31        69.04          69.07             167.8
O3      14.67        72.62          72.62             173.4
O4      14.24        73.81          73.81             169.9
O5      12.90        64.90          64.90             156.0

CR% FOR SEQUENCE OF GRAY SCALE IMAGES

Images  HAAR-MG-EZW  HAAR-MG-SPIHT  HAAR-MG-SPIHT 3D  HAAR-MG-LVL_MMC
O1      20.92        8.36           8.15              6.6
O2      20.27        8.5            8.34              6.54
O3      20.43        8.39           8.26              6.34
O4      20.39        8.29           8.13              6.06
O5      19.09        8.07           7.87              6
The Mean Square Error is used to compare image quality: when the MSE value is low, the quality of the compressed image is high. Table I shows that the HAAR-MG-EZW method gives the lowest MSE value, 12.90, while the HAAR-MG-LVL-MMC method shows the highest error value, 173.4.
The compression ratio percentage of EZW is higher than those of the SPIHT, SPIHT-3D and LVL_MMC methods. In this work, the results of the YUV and YIQ color space methods gave high MSE values, whereas the KLT and XYZ methods showed low MSE and Max Error values.
VI. CONCLUSION
Image compression is one of the exceptional applications of data compression, reducing information redundancy efficiently. The image reconstruction performance of the HAAR wavelet is vital because series of coefficients are used at different instants of time. The best choice of color space, xyz with png images, is applied to series images, and other file formats with varying dimensions are also evaluated. The results confirmed that the HAAR wavelet method is comparatively faster and less complex than other wavelet methods. Experimental results additionally show that HAAR-EZW is best in compression ratio. Image quality depends on the image type and compression ratio, and this study confirms that a high compression ratio produces low image quality. HAAR_LVL_MMC is good in PSNR value. For consecutive sequence images, SPIHT is the best technique in terms of compression errors and compression ratio.
REFERENCES [1] Ajala F.A., Adigun A.A, Oke A.O, ”Development of Hybrid
Compression Algorithm for Medical Images using Lempel-Ziv-Welch and Huffman Encoding”, IJRTE,Vol.7, 2018
[2] Anantha Babu, S., Eswaran, P., Senthil Kumar, C., “Lossless compression algorithm using improved RLC for grayscale image”, Arab. J. Sci. Eng. 41, 3061–3070. https://doi.org/10.1007/s13369-016-2082-x., 2016
[3] Ali Kadhim, Al-Janabi, ”Efficient and simple scalable image compression algorithms”, Ain Shams Engineering Journal,vol.10,463-470 (wavelet image compression),2019
[4] Alzahir, S., Borici, A., “An innovative lossless compression method for discrete color Images”, IEEE Trans. Image Process. 24, 44–56,2015
[5] Gonzalez R and Wood R, "Digital Image Processing", 2nd Edition, Pearson Education Inc., London, England, 2002.
[6] Jerome, S., 1993.
[7] Kang LW, Hsu CC, Zhuang B, Lin CW, and Yeh CH, “Learning-based
joint super-resolution and deblocking for a highly compressed image”, IEEE Transactions on Multimedia, 17(7), 2015, 921-934.
[8] Kim BJ , Pearlman WA , “An embedded wavelet video coder using three dimensional set partitioning in hierarchical trees (3D-SPIHT)”, In the Proceeding of Data Compression Conference 1997, Snowbird, Utah, USA:,1997, 251−260.
[9] Khan. A, Khan. A., Khan. M, Uzir. M, “Lossless Image Compression: application of bi-level burrows wheeler compression algorithm (BBWCA) to 2d data”, Multimedia Tools Application,2016
[10] Luo, J., Chen, C.W., Parker, K.J., Huang, T.S., “Artifact reduction in low bit rate DCT-based image compression”, IEEE Trans. Image Process. 5, 1363–1368,1996
[11] Mohammed Pooyan, Ali Taheri, “Wavelet Compression of ECG Signals Using SPIHT Algorithm”, International Journal of Signal Processing, 2004
[12] S. Nirmalraj, “SPIHT: A Set Partitioning in Hierarchical Trees
Algorithm for Image Compression”, Contemporary Engineering Sciences, Vol.8, 2015, 263-270.
[13] Parkinson, C.N., “Park Parkinson’s First Law: Work expands so as to fill the time available.” In: Parkinson’s Law and Other Studies in Administration, Ballantine Books, New York,1957
[14] Ranjeet Kumar, Utpreksh Patbhaje, A. Kumar, (2019). “An efficient technique for image compression and quality retrieval using matrix completion”, Journal of King Saud University Computer and Information sciences,2019
[15] A. Said and W. A. Pearlman, (1996). “A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees,” IEEE trans. On Circuits and Systems for Video Technology, vol. 6, 1996, pp. 243-250
[16] J.M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE trans. on Signal Processing (Special Issue, Wavelets and Signal Processing), vol. 41,1993, pp. 3445-3462
[17] J. Udhayakumar, T. Vengattaraman, P. Dhavachelvan, “A Survey on Data Compression techniques: from the perspective of data quality, coding schemes, data type and applications”, Journal of King Saud University- Computer and Information Sciences, 2018.
[18] Vijayakumar Sajjan, Mohammed Parvez Ali, “Design and Implementation of SPIHT algorithm for Image Compression”, IJESC, 2017.
[19] M.R. Zala, S.S. Parmar, “3D Wavelet transform with SPIHT algorithm for image compression”, IJAIEM, 2013.
[20] Ziv, J., Lempel, A., “A universal algorithm for data compression”, IEEE Trans. Inf. Theory 23, 1977, 337–343.
[21] Enas M.Jamel, “Efficiency Spiht in compression and quality of image”, J. of college education for women, vol. 22,2011.
Pixel level Image Fusion using Wavelet decomposition technique in Color Images
Mrs. Dr. S. Piramu Kailasam
Assistant Professor
Department of Computer Applications
Sadakathullah Appa College (Autonomous)
Tirunelveli, India.
Abstract
Color image fusion is a design method to improve image content by fusing images. The aim of image fusion is to merge the most useful information and remove outliers from the registered images. The pixel-level wavelet image fusion method is suitable for color images of equal size and resolution from a visual perspective. Quantitative analysis is done using RMSE, entropy, DFE, SSE and R-square measurements. The analysis reports obtained from various image file types give better results for the inverse discrete wavelet transform applied image fusion method.
Keywords Image Fusion · Pixel level Image Fusion · Color Images ·
Wavelet Image Fusion · Resolution
1 Introduction
The use of colour in image processing is driven by two principal factors: first, colour is an effective descriptor that often identifies objects; second, it supports the manual analysis made by human beings. Colour image processing is classified into two major areas, full color and pseudo colour processing. The results of full color image processing are now used in visualization and on the internet, while digital pseudo colour processing assigns colours to intensity levels. Colour image fusion is especially useful at the security level in authenticated communication via fast network services. In the defence sector, the transmitted images are hidden with encoded colours in the Lab or CMYK color model, and they are retrieved by applying certain algorithms in terms of the DWT technique. The probability of interception is obtained by inverting the number of images to be fused, and the total number of possible combinations of fused images [7] over the input images is NCr, where N is the number of images and r is the number of images required for a combination.
The important point is that both input images should be of the same size and resolution. In color image fusion, the image is split into the R, G and B colors, which are calculated in the Brovey color normalization technique by
Rnew = R / (R + G + B) x Pan1 (1)
Gnew = G / (R + G + B) x Pan2 (2)
Bnew = B / (R + G + B) x Pan3 (3)
where R is red, G is green, B is blue and Pan is the pan image.
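A minimal sketch of this Brovey normalization, assuming a single panchromatic band is reused for Pan1, Pan2 and Pan3 and using placeholder inputs:

# Minimal sketch of Brovey color normalization (eqs. 1-3), assuming one
# panchromatic band is used for all three channels; inputs are placeholders.
import numpy as np

def brovey_fusion(rgb, pan, eps=1e-8):
    """rgb: HxWx3 multispectral image; pan: HxW panchromatic image."""
    total = rgb.sum(axis=2, keepdims=True) + eps   # R + G + B per pixel
    return rgb / total * pan[..., None]            # Rnew, Gnew, Bnew

rgb = np.random.rand(128, 128, 3)                  # stand-in color image
pan = np.random.rand(128, 128)                     # stand-in pan image
fused = brovey_fusion(rgb, pan)
print(fused.shape)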
2 Related works
A multifocus image fusion [5] using SWT and content-adaptive blurring was presented by Tapasmini Sahoo et al. (2018). It combines multiple images of the same scene, with different resolutions and intensities, to obtain an improved fused image. The suggested process conserves the spectral characteristics of the resulting uniformly focused fused images, and the authors note that a low value of the deviation index is creditable. The correlation coefficient measures the similarity between two images [1].
The fusion rule is formed based upon content-adaptive blurring. Another paper deals with the image fusion of underwater objects using DWT and SWT to enhance the resolution for image enrichment [3]. Discrete wavelet transform based image fusion [2,11,12] and denoising in FPGA by A. Anoop Suraj and Mathew Fransis (2014) is characterized by the parameters edge strength, fusion factor and fusion symmetry using the UWT and DTCWT methods [2]. The Sobel edge method gives higher edge strength values, which represent richer edge information. The fusion factor gives the mutual information between the source images and the fused image, and a higher value represents better data density. In the case of fusion symmetry, however, a lower value gives a better image fusion result [6].
In a novel color image fusion for multi-sensor night vision images, the visible image is fused with the infrared image; a scaling factor is introduced into the transfer equation of the 'b' channel, which gives multiband fused images with a natural daytime color appearance.
3 Materials and methods
An image security algorithm using DWT with a subband exchange based color image fusion scheme is developed for enhanced security. Fusion rules [1] can be formed by the following patterns: (min, min), (min, max), (max, min), (min, mean), (mean, min), (mean, median), (median, max), (median, min) and (max, mean). Fusion rules depend on the fusion application, and image fusion can give better results by choosing the best fusion rule. Fusion also has subtypes: up-down fusion, down-up fusion, right-left fusion, left-right fusion, user-defined fusion and triangular fusion in linear form.
Fig. 3.1 a. Blur image 1, b. Blur image 2, c. Original image (128x128)
Color image processing is motivated by colour descriptors and colour intensities, which are important in image analysis. Color image processing divides into two subcategories, full color and pseudo color processing; the problem in the second is assigning a color to a particular intensity or range of intensities. The color spectrum seen by passing white light through a prism gives the VIBGYOR colors. If the light is achromatic, its only attribute is intensity. The primary colors are R, G and B, and they can be added to produce the secondary colors of light: magenta, cyan and yellow. The characteristics that distinguish one color from another are brightness, hue and saturation.
Hue and saturation together are called chromaticity, so every color is characterized by tristimulus values denoted x, y and z:
x + y + z = 1, where x, y and z correspond to red, green and blue (4)
The color models normally used are the RGB model, the CMYK color model and the HSI color model. Pseudo color (false color) image processing is another milestone, consisting of assigning colors to gray values. Functions FR, FG and FB are fed into the corresponding red, green and blue inputs of an RGB color monitor for the color image; thus f(x, y) is transformed into the three colors R, G and B, and the output functions are fR(x, y), fG(x, y) and fB(x, y). The color transformations are further categorized into RGB-HSI and HSI-RGB conversions. Color complements are hues opposite one another on the color circle. Color slicing is useful in separating objects from their surroundings. Histogram processing transformations can be used to improve color images in another way; the uniform histogram of intensity values method is an alternative highly suitable for the HSI color space.
Fig. 3.2 Complete flow diagram (images A and B, each 128x128, are decomposed by DWT into LL, LH, HL and HH rgb subbands; a fusion rule combines them, and the iDWT produces the fused image)
Mukt Shabd Journal
Volume IX, Issue X, OCTOBER/2020
ISSN NO : 2347-3150
Page No : 862
The input images A and B are registered colour images (.png) with dimension 128 x 128. After pre-processing, the Discrete Wavelet Transform technique is applied to the images, and the lower and higher frequency detail pixels are taken from the four decomposition portions of the specific targets. The best fusion rule is then chosen and applied to the rgb images. Next, the inverse DWT is applied for further improvement in clarity, and the proposed fused image is achieved. The resulting synthesized image is clarified with the fusion tree structure. The datasets are online multi-focus colour images with the same dimensions. A minimal sketch of this pipeline appears below.
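The sketch uses a single-level Haar DWT and a max-magnitude fusion rule per subband (one possible rule among those listed above); the inputs are placeholders for the registered images, and for rgb images the same fusion would be applied to each channel:

# Minimal sketch: pixel-level DWT fusion of two registered images with a
# max-magnitude rule per subband, then the inverse DWT. Inputs are placeholders.
import numpy as np
import pywt

def fuse_band(a, b):
    """Keep, per pixel, the coefficient with the larger magnitude."""
    return np.where(np.abs(a) >= np.abs(b), a, b)

def dwt_fuse(img_a, img_b, wavelet='haar'):
    cA1, (cH1, cV1, cD1) = pywt.dwt2(img_a, wavelet)
    cA2, (cH2, cV2, cD2) = pywt.dwt2(img_b, wavelet)
    fused = (fuse_band(cA1, cA2),
             (fuse_band(cH1, cH2), fuse_band(cV1, cV2), fuse_band(cD1, cD2)))
    return pywt.idwt2(fused, wavelet)

a = np.random.rand(128, 128)    # stand-in for blurred image A
b = np.random.rand(128, 128)    # stand-in for blurred image B
print(dwt_fuse(a, b).shape)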
Fig. 3.3 A three-scale wavelet packet analysis tree (D – detail filtering; A – approximation filtering)
The Haar transform is a multi-resolution analysis whose basis functions are orthogonal wavelets [12]. The Haar transform is both separable and symmetric.
Fig. 3.4 Haar wavelet decomposition 1 & 2 with max and mean fusion method
In matrix form, T = HFH, where F is the N x N image, H is the N x N transformation matrix and T is the resulting N x N transform.
The other transforms are the wavelet transform in one dimension, the discrete wavelet transform (DWT), the continuous wavelet transform (CWT), the stationary wavelet transform (SWT) [1], the undecimated wavelet transform (UWT) [2], the inverse discrete wavelet transform (iDWT) and the fast wavelet transform (FWT). Haar, Daubechies, symlet and biorthogonal are types of wavelets. The properties of the Haar function in the DWT make it easy for users to access low-quality versions of images during searches and to retrieve data for refinement. The subimages are scaled to make their underlying structure more visible. The image sizes are 64x64, 128x128, 256x256, etc., and a perfect 512x512 reconstruction of the original image is also possible.
The relationship between the scaling and wavelet function spaces is
V1 = V0 ⊕ W0 (5)
where the ⊕ operator denotes the direct sum of spaces, and the orthogonal complement of V0 in V1 is W0. The wavelet transform [4] in one dimension generalizes to the wavelet series expansion, the discrete wavelet transform and the continuous wavelet transform [12].
Fig. 3.5 Wavelet Decomposition Tree for two blurred images with fused image
Fig. 3.6 Fourier curve fit for Blue Channel
Fig 3.7 Fourier curve fit for Blue Channel
Fig 3.8 Polynomial fit for Green channel graph of soap image 65
Fig 3.9 Gaussian Curve fitting between HLr and HLg
Figs. 3.6 to 3.9 above show different curve fitting graphs with coefficient value 1. The dark dots are pixels, and the dots lying on the blue curve fit well. Thus image fusion is analysed with Fourier, polynomial and Gaussian fits for the LLr, LHr, LLb, LHg, HLr and HLg subbands.
Like the Fourier series expansion [9], the wavelet series expansion maps a function of a continuous variable into a sequence of wavelet coefficients; the resulting coefficients are called the discrete wavelet transform (DWT) of f(x).
Fig.3.10 First decomposition of a 2 dimensional FWT spectrum
The natural extension of the discrete wavelet transform is the continuous wavelet transform (CWT), a redundant function of two continuous variables, translation and scale. The fast wavelet transform (FWT) is an efficient DWT with good computational properties [8]. An inverse transform for the reconstruction of f(x) from the DWT/FWT approximation and detail coefficients can also be formulated; it is called the inverse fast wavelet transform and uses the scaling and wavelet vectors employed in the forward transform.
A DWT can also be modified for edge detection, and selected coefficients can be deleted for better image reconstruction. Coefficients can be replaced by their subspace, and the result is the image analysis tree. The analysis tree supports unique decompositions and is also an efficient mechanism for representing wavelet packets [12]. A wavelet packet tree supports 26 different decompositions [11, 14].
Fig. 3.7 Subspace analysis tree
In the subspace analysis tree above, as in the one-dimensional case, P scale transforms are generated for j = J−1, J−2, ..., J−P. Owing to the two-dimensional nature of the input, the wavelet subspaces are denoted WH(j−1), WV(j−1) and WD(j−1), where H, V and D are the horizontal, vertical and diagonal directions.
4 Experimental Results and Performance Comparison
4.1 Fusion Factor (FF)
Fusion Factor is the combination of the IFA and IFB factors, where IFA is the mutual information index between image A and the fused image, and IFB is the mutual information index between image B and the fused image. The Fusion Factor affects the quality of an image: a higher value represents better data density.
FF = IFA + IFB (7)
4.2 Edge Strength (ES)
Edge strength can be computed using the Sobel edge operator, which calculates the gradients Gx and Gy for images A and B:
G = √(Gx² + Gy²) (8)
Higher edge strength means more edge information.
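A minimal sketch of this edge strength measure, using the Sobel operator from scipy on a placeholder image:

# Minimal sketch: edge strength via the Sobel operator (eq. 8); the input
# image is a random placeholder.
import numpy as np
from scipy.ndimage import sobel

def edge_strength(img):
    gx = sobel(img, axis=1)          # horizontal gradient
    gy = sobel(img, axis=0)          # vertical gradient
    return np.sqrt(gx ** 2 + gy ** 2)

img = np.random.rand(128, 128)
print(edge_strength(img).mean())     # average edge strength of the image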
4.3 Entropy
Entropy is the amount of information present in the image. Here P(xi) is the probability of pixel value xi:
E = − Σi=1..N P(xi) log2 P(xi) (9)
Fig. 4.1 Comparison of entropy
In the above graph, the blue and red bars show the entropy of image 1 and image 2, while the green bar shows the entropy of the iDWT-applied wavelet based fused image. The high entropy value shows that the proposed method is better than the plain DWT methods.
4.4 Fusion Symmetry (FS)
Fusion symmetry is an indicator of the degree of symmetry in the information content drawn from both images:
FS = | IFA / (IFA + IFB) − 0.5 | (10)
A lower value of FS indicates better symmetry in the fused image.
Table 4.1 Result of MSE, PSNR values

Images            MSE    PSNR
b1_65 & b2_65     0.01   70.72
b1_85 & b2_85     0.01   69.27
b1_260 & b2_260   0.01   67.19
t1_4 & T2_4       0      74.28
t1_2 & T2_2       0      74.29
o1_340 & o2_340   0.02   65.42
O1_345 & O2_345   0.02   65.94
O1_350 & O2_350   0      73.23
4.5 Mean Square Error (MSE)
MSE is one of the most used measures to calculate the deviation between the original and fused image data:
MSE = (1 / MN) Σi=1..M Σj=1..N [I(i,j) − F(i,j)]² (11)
where i, j are the spatial positions of pixels and M, N are the dimensions of the images I and F.
4.6 Peak Signal to Noise Ratio (PSNR)
The peak signal to noise ratio (PSNR) is the ratio between the maximum value of an image and the magnitude of the background noise:
PSNR = 10 log10( (2^n − 1)² / MSE ) (12)
4.7 Root Mean Square Error (RMSE)
This measurement is used to evaluate the fused image. Let I be the input image and F the fused image; then
RMSE = √( Σi Σj [I(i,j) − F(i,j)]² / MN ) (13)
The MSE results give the minimum value of zero for the combinations t1_4 & T2_4, t1_2 & T2_2 and O1_350 & O2_350. After fusion the size of the images varies, so the images are resized again using the preprocessing algorithm. The MSE and PSNR values for the soap image are 0.09 and 58.51, for the tomato image 0.01 and 69.48, and for the OralB image 0.03 and 74.28. It is notable that an increase in the MSE value corresponds to a decrease in the PSNR value, which is useful for distinguishing the original and fused images in defence or secured sectors. Restoration of the images can be done using fuzzy logic [10].
SSE is the sum of the squared differences between each observation and its group's mean. It can be used as a measure of variation within a cluster; if all cases within a cluster are identical, the SSE equals zero.
SSE = Σi=1..n (xi − x̄)² (14)
where n is the number of observations, xi is the value of the i-th observation and x̄ is the mean of all observations. In cluster analysis, SSE is minimized with
SSEtotal = SSE1 + SSE2 + SSE3 + ... + SSEn (15)
Table 4.2 Result of RMSE, R-square, SSE values of true color images

Image Name        Curve fit type  R-square  No. of Terms  RMSE (x10^-7)
b1_65 & b2_65     Fourier         1         4             0.03
                  Fourier         1         8             0.1522
                  Gaussian        0.99      1             5.62
                  Gaussian        0.99      8             2.32
t1_2 & t2_2       Fourier         1         18            0.57
                  Fourier         1         1             0.0254
                  Gaussian        0.992     3             5.26
                  Gaussian        0.994     8             1.45
o1_340 & o2_340   Fourier         1         4             0.2083
                  Fourier         0.99      8             0.8938
                  Gaussian        0.97      3             1.22
                  Gaussian        0.99      8             1.09
o1_350 & o2_350   Fourier         1         1             0.05
                  Fourier         1         8             0.2262
                  Gaussian        0.99      1             5.12
                  Gaussian        1         8             2.84
b1_65 & sb2_65    Fourier         1         1             0.06
                  Fourier         1         8             0.72
                  Gaussian        0.99      1             5.22
                  Gaussian        1         8             1.44
t1_2 & sT2_3      Fourier         0.94      1             0.14
                  Fourier         0.93      8             0.15
                  Gaussian        0.99      1             5.36
                  Gaussian        1         8             5.94
R-square is a goodness-of-fit measure for statistical regression models. An R-square of 1 indicates that the fit matches the image data perfectly, and R-square values within the range from zero to one help to decide the fitness of the pixel data of an image. It is noted that when the number of iterations increases, the RMSE value also increases; particularly in the (Max, Mean) fusion method for the true color image, the Fourier fit increases by 0.02 with each iteration.
Conclusion
The resulting multi-fusion process of DWT and iDWT shows the fusion characteristics of the proposed method. Wavelets and wavelet transforms are relatively new imaging tools that are being rapidly applied to multifocused colour images. The MSE and PSNR statistical reports analyse the source and fused images, and the RMSE and SSE values from different curve fitting methods are evaluated. In future, the proposed method can be improved using SWT/FWT transform methods and further extended to compressed images with remote sensing, video processing and computer vision techniques.
REFERENCES
1. Tapasmini Sahoo, Richa Acharya & Kunal Kumar Das, An innovative approach of multifocus image fusion using SWT and content adaptive blurring, IJET, Vol 7, 2018.
2. A.Anoop Suraj, Mathew Fransis, Discrete Wavelet Transform based Image Fusion
and Denoising in FPGA ,JESIT,2014,72-81
3. Kumudham,Tarun,Avinash Rajendran, Resolution Enrichment using wavelet based
interpolation method
4. Amolins, K., Zhang, Y., Dare, P., 2007. Wavelet based image fusion techniques—
an introduction, review and comparison. ISPRS J. Photogram.Remote Sens. 62
(September (4)), 249–263.
5. Baradarani, A., Jonathan Wu, Q.M., Ahmadi, M., Mendapara, P., 2012. Tunable
halfband-pair wavelet filter banks and application to multifocusimage fusion.
Pattern Recogn. 45 (February (2)), 657–671.
6. VPS Naidu and J.R. Raol “Pixel-Level Image Fusion using Wavelets and Principal Component Analysis
A Comparative Analysis”, Defense Science Journal,Vol 58, No.3, pp.338-352, May 2008.
7. http://www.nasa.gov/centers/ames/research/2007/imagefusion.html
8. Irshad, H.; Kamran, M.; Siddiqui, A.B.; Hussain, A., Image fusion using computational intelligence: A survey, Environmental and Computer Science, ICECS '09, Second International Conference, 2009, pp. 128-132.
9. http://www.mathworks.fr/matlabcentral/fileexchange/28923-image-fusion
10. Jionghua Teng; Suhuan Wang; Jingzhou Zhang; Xue Wang, Neuro fuzzy logic
based fusion algorithm of medical images, Image and Signal Processing (CISP),
2010, 3rd International Congress, vol.4, pp.1552- 1556.
11. R. C. Gonzalea and R. E. Woods, Wavelets and Multiresolution Processing, Digital
Image Processing, 2004, 2nd Ed., Prentice Hal.
12. Ivan W. Selesnick, Richard G. Baraniuk, and Nick G. Kingsbury, The Dual-
Tree Complex Wavelet Transform, IEEE Signal Processing Magazine, 2005, vol.
22, no. 6, pp. 123-151.
13. Jingyu Yang, Yao Wang, Wenli Xu, and Qionghai Dai, Image Coding Using Dual-
Tree Discrete Wavelet Transform, IEEE Trans. Image Processing, 2008, vol. 17, no.
9, pp. 1555-1569.
14. Gonzalo Pajares, Jes us Manuel de la Cruz, A wavelet-based image fusion tutorial,
Pattern Recognition, 2004, pp.1855 1872.
PREDICTION OF LIVER DISEASES USING DECISION TREES AND MACHINE
LEARNING ALGORITHMS
S. Piramu Kailasam, Assistant Lecturer, Sadakathullah Appa College, Tirunelveli
Abstract
The human body consists of various important organs that process the entire metabolism of the human being; some of them are the kidneys, brain, heart and liver. The proper functioning of these organs lets a person lead a healthy life, and identifying problems with these organs at the right point of time is equally important. In the current article, a GUI model has been developed to identify the level of damage to the liver based on data given by users. The model also provides details, through the GUI, about various problems that affect the liver. The saying in the medical field is that predicting a disease in its earlier stages is always better than at a crucial period, so as to reduce the damage to that particular organ; the symptoms also become more varied and more difficult to analyze at later stages. In the current tool, an attempt has been made to utilize machine learning algorithms such as K-means, ANN and SVM to separate liver patients from a group of normal patients. The machine learning algorithms are also compared on performance factors for identifying liver problems. The developed GUI can be a good source for doctors to analyze liver diseases.
Keywords: Liver, Diseases, Machine learning, ANN, SVM, K-means
1. Introduction
The human body has a versatile number of parts that play a key role in the healthy life of any human being; some of these parts are the brain, heart, liver and kidneys. The successful functioning of these parts leads to a successful and happy life [Dimitris Lipara et al., Kulwinder Kaur et al., Pooja Sharma et al.]. Diseases or problems occurring in these parts may greatly damage the functioning of the human body. The major concern with these components is that the working metabolism of each part depends on the functioning of the other parts of the body. Identifying problems with these parts and diagnosing them at the right intervals and at the right point of time is therefore important, so that further damage to the parts, and the spread of the problem to other parts, can be reduced to some level. When the diseases of these organs are in severe condition, the symptoms change markedly from the regular types of symptoms, and identification becomes difficult once the diseases are in critical stages. With proper medication and other measures taken by the patients, these problems can be reduced, limiting the damage to the other parts of the body. The currently developed application is a graphical user interface with which doctors or medical practitioners can enter symptoms as input and identify liver disease problems at various stages [Hao Jia, Manmohan Shukla et al., V.V. Ramalingam et al.].
The current model is also used to identify the number of people with liver problems from the available list of patients. Blood samples and values from a group of patients are stored in a database, the data is given as input to the current model, and the list of patients with liver problems is identified from the total patients in the selected database.
About Liver
The liver is one of the important organs in the human body [G. Ignisha Rajathi et al., Esraa Mamdouh Mai et al., Islam MM et al., Mei-ying Ren et al.]. The liver is large, and its place in the human body is on the right side of the belly. Its colour is red and brown, and it feels like a rubbery material to the touch. The liver performs around 500 tasks in the human body [Abhijitsingh Putu et al., MoonSun Shin et al., Jung-euk Ahn et al.]. The best feature of this organ is that it can regenerate at any point of time; due to this unique property, damage to this organ can be repaired, and problems occurring in it can be reduced and resolved relatively easily. Liver cancer, liver failure, gallstones and ascites are some of the common and more serious liver problems that happen to most people.
Fig.1: CT liver segments [Google]
Machine Learning
Machine learning is one of the applications of artificial intelligence, and it plays a key role in enabling machines to learn from previous data and take decisions based on that data [Mei-ying Ren et al.]. The data given as input is the training data, from which the machine learns the facts or the conditions present in it; based on such data, the machine tries to take decisions in various scenarios. If data regarding a particular problem does not exist in the database, the machine faces a hypothetical situation. Hence, it is the duty of the developers to ensure that data related to all the problems or issues the machine is going to face is present in the training data; otherwise, hypothetical errors might arise [Abhijitsingh Putu et al.]. Several algorithms and techniques have been developed under this application and are widely used in the research world today. The important algorithms used for the implementation of the current model are:
Support Vector Machine (SVM) [16]
Artificial Neural Network (ANN) [16,17]
Random Forest
a. Support Vector Machine (SVM)
In machine learning, supervised algorithms are an important class, and the support vector machine (SVM) is an important algorithm in this category. This algorithm can be used for both regression and classification problems and their analysis [Islam MM et al.]. In order to separate two classes of data, several candidate hyperplanes are available to choose from, and in most cases the objective is to identify, among all candidate planes, the one with the highest margin. Example hyperplanes for an SVM algorithm can be seen in fig. 2.
Fig.2: Hyperplane example of an SVM algorithm [Tom Bromley et al.]
A high margin value means that the distance between the points of the two classes is good; the greater this distance, the more easily future samples can be classified. These hyperplanes can also be considered decision boundaries for identifying and classifying points of data in various databases. When the program is run, the data points fall on both sides of the plane, and the points on each side can be taken to belong to two different classes of data.
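As an illustration, a minimal Python sketch of such a maximum-margin classifier is given below (scikit-learn); the synthetic data is a stand-in for the liver dataset, which is not reproduced here, and all parameter choices are illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the liver data; 583 samples matches the record
# count reported later, but the features here are random, not patient data.
X, y = make_classification(n_samples=583, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# A linear-kernel SVM searches for the separating hyperplane with the
# largest margin between the two classes, as described above.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))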
b. Artificial Neural Networks
Neural networks are one of the important applications of machine learning. The key idea in this model is that it works on the mechanism of neurons in the human brain, so as to resemble the working of the neural system in the brain. The major help users get from these networks is in clustering and classifying data. Some of the major applications of these neural network models are face detection, identifying objects in images, identifying expressions on faces, identifying gestures in images and videos, voice detection, speech translation, etc. The artificial neural network model can be seen clearly in fig. 3.
Fig.3: Artificial Neural network model [Tom Bromley]
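As an illustration, a minimal scikit-learn sketch of such a network for this kind of classification task follows; the hidden-layer size and the synthetic data are assumptions for illustration only, not the configuration used in this study:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=583, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# One hidden layer of 32 neurons; each neuron computes a weighted sum of
# its inputs followed by a non-linear activation, loosely mimicking the brain.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0)
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))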
c. Random Forest
Classification is one of the major considerations for any machine learning application, and performing this classification task precisely gives more accurate and better performance than expected. Typical tasks are predicting whether a customer will purchase a product, or predicting the mood of customers to check whether the product will be purchased by a person or not; all these sorts of applications are key use cases for random forest algorithms. Basically, this algorithm belongs to the family of decision trees, which are used very frequently for obtaining results in these sorts of applications. For a better understanding of the random forest, an example model is presented in fig. 4.
Fig.4: Random Forest Example [4]
In the working of the current algorithm, a number of decision trees are present, built in two steps. First, every single tree is built from the data available or drawn from the original database. Next, a subset of features is selected randomly at each node, and the new result set of classified nodes is given as output.
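The two steps above correspond to the bootstrap sampling and random feature selection sketched below; the data and all parameters are illustrative, not the study's actual configuration:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=583, n_features=10, random_state=0)

# Each tree is grown on a bootstrap sample of the rows (step one), and at
# every split only a random subset of the features is considered (step two).
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            oob_score=True, random_state=0)
rf.fit(X, y)
print("out-of-bag accuracy:", rf.oob_score_)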
ROC Curve
The receiver operating characteristic (ROC) curve is basically used to represent data trends in terms of true positives, false positives, etc. The plot is drawn using the true positive rate and the false positive rate at different threshold values.
Fig.5: An example of ROC curve model [5]
The mode of classification and the success rate of data classification can also be observed from this curve as an output. Most researchers nowadays use this kind of curve for representing various data classification techniques and models. For a better understanding of the ROC curve model, an example curve is presented in fig. 5.
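A minimal sketch of how such a curve is computed from true labels and classifier scores follows (scikit-learn); the labels and scores below are made up purely for illustration:

import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up true labels and classifier scores, for illustration only.
y_test = np.array([0, 0, 1, 1, 1, 0, 1, 0])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.55])

# One (false positive rate, true positive rate) point per threshold.
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("AUC:", auc(fpr, tpr))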
2. Literature Review
Several authors have done various works in the area related to the current article. Some of these articles and authors are discussed in the current section to identify similar works.
V.V. Ramalingam, A. Pandian and R. Ragavendran discussed various algorithms that can be used for identifying liver disease problems in human beings. They discussed in detail various machine learning and decision tree algorithms, and also explained how these algorithms can be used for further analysis in finding diseases of the liver and its related components.
G. Ignisha Rajathi and G. Wiselin Jiji discussed in detail the chronic liver diseases that might occur with various symptoms and several causes. All the symptoms and disease issues were discussed in detail with the help of algorithms such as simulated annealing and an ensemble classifier. The results are very good and impressive, and these serious diseases can be identified easily by utilizing this kind of technique further for achieving better results.
Esraa Mamdouh Mai and Mabrouk Mai Mabrouk discussed diabetes and chronic liver diseases, and the reasons for their presence in human bodies are also discussed in detail. The SVM algorithm gave good results when their model was implemented on their own dataset.
Islam MM, Wu CC, Poly TN, Yang HC and Li YJ discussed fatty liver diseases, their symptoms, and the problems and solutions for the same diseases. They used various machine learning algorithms to identify the symptoms from the pool of items noted in various databases. The results and the utilization of the method are described in detail, and the results are interesting for further analysis and extension of the work.
From the above works, almost all use various machine learning algorithms alone; very few works combine both machine learning and decision tree algorithms. Hence, in the current work an attempt has been made to analyze our new dataset with a combination of both these types of algorithms and to analyze their performance. The results and representations are discussed in detail in the form of ROC curves and other values.
3. Proposed System
Several authors have considered several algorithms and methods to identify the presence of liver-related issues and diseases in human beings from various symptoms. If the detection of these diseases is done at the right intervals of time, the disease and its related issues can be reduced, to the great relief of the patients. In the current work, an attempt has been made to analyze the performance of two sets of algorithms: machine learning algorithms and decision tree algorithms. From the pool of machine learning algorithms we considered the SVM and ANN algorithms, and from the pool of decision tree algorithms the Random Forest algorithm was chosen. The performance was analyzed on sets of samples collected from various hospitals in Visakhapatnam, and the results are displayed in the results section.
4. Implementation and Results of the Current Methods
The implementation of the current model was done in the following process, explained in detail below. For convenient submission of data to analyze and process, an integrated graphical user interface was developed. The interface was developed with PHP, and the algorithms used in the current model are connected to the GUI so that the user can select the type of algorithm to run, or check the data with a particular algorithm. A drop-down menu was created for the three algorithms; the user can select his choice of algorithm, or run all three algorithms individually and check the results one after the other.
The disease symptoms and their values may vary to some extent between males and females. Hence, it is always better to first separate the data for male and female patients, then apply an algorithm selected from the above three, obtain the results, and conclude from the set of results obtained from the interface with each algorithm implementation. The detailed architecture of the entire process is presented in fig. 6.
Fig.6: Architecture model of the current method and flow of execution
Whenever data is submitted by the user for checking with the developed model, the machine draws on the knowledge already stored in the database of previous results and symptoms. The machine learning algorithms need existing patient data to train on before taking decisions. Hence, a set of patient data was collected from various hospitals and laboratories in and around Visakhapatnam city, and a dataset of both male and female patients of various ages was prepared. Once the data was collected and the dataset prepared, cleaning and correlation of the data were done, and the errors and noise in the database were removed for fast and easy working of the machine with these algorithms. Once this data analysis was completed, the data was brought into a standard format, ready to be analyzed by the algorithms to get more accurate results based on the various symptoms and their values. Whenever a new patient's data is to be verified, the data is checked and then the record is added to the existing dataset. Once the data is cleaned and ready for checking, it is distributed on a categorical basis; then data correlation and cleaning are completed, the algorithms are run, and the results are represented in the form of ROC curves and result presentations.
With various patients' data and attributes present in the database, the dataset is distributed over various numerical features, such as age, Albumin, Alamine_Aminotransferase, Albumin and Globulin Ratio, Alkaline Phosphotase, Direct Bilirubin, Total Bilirubin, etc. The user interface of the current model is presented in fig. 7 for better understanding.
Fig.7: The user interface of the currently developed model
In order to understand the data and its components properly, exploratory data analysis and a study of the distribution of the numerical features were done, for a better understanding of the trends in the data. The exploratory data analysis is presented in fig. 8, and the distribution of the numerical values can be observed in detail in fig. 9.
Fig.8: Exploratory Data Analysis
Fig.9: Distribution of Numerical Features
The resultant values after the distribution of the parameters and their components are shown in fig. 10.
Fig.10: Resultant values after the proper distribution of values or the symptoms
Gender: Count 583 | Unique 2 | Top Male | Frequency 441
Fig.11: Distribution of categorical data
There seems to be an outlier in Aspartate_Aminotransferase, as its maximum value is much higher than its mean value. In the dataset the output value is '1' for liver disease and '2' for no liver disease, so for convenience the no-disease label is remapped to 0.
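A minimal pandas sketch of this remapping, together with the Male/Female encoding applied below, is given here; the frame and its column names are illustrative stand-ins for the actual dataset:

import pandas as pd

# Illustrative frame mirroring the columns discussed in the text.
df = pd.DataFrame({"Gender": ["Male", "Female", "Male"],
                   "Dataset": [1, 2, 1]})

df["Dataset"] = df["Dataset"].map({1: 1, 2: 0})        # 1 = disease, 0 = no disease
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})
print(df)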
From the above table, the distribution of the categories and features is plotted in the form of bar graphs for a better understanding of the values and symptoms, presented in fig. 11 and fig. 12.
Fig.12: Bar plots for categorical Features
Number of patients that are male: 441
Number of patients that are female: 142
There are more male patients than female patients. Male is labelled as 0 and Female as 1. Once the data labelling format is decided, the format of the data is represented in the dataset table I.
Table I. Dataset format with the representation of the labelling
The correlations between the various parameters, symptoms and their values are represented in table format. The parameters considered, with various symptoms and their correlated values, are presented in table II.
Table II. Correlated values representation of symptom values
Once the data in the dataset was stabilized and the values represented, the data was cleaned and the outliers removed from the dataset; the resulting data is shown in the graph in fig. 13.
Fig.13: Data representation after cleaning the data
Once the data was finalized for testing and implementation, the standard procedure of data standardization was applied, and the algorithms were then used on the same dataset for further verification of their performance. The training dataset prepared before implementation can be seen in table III.
Table III. Standardizing the data for implementation with algorithms
In the next step, the standard deviation method was also applied to standardize the test data in the dataset, to get better results. The training process for table III was carried out, and the resultant table is shown in table IV.
Table IV. The dataset table after implementing the Standard Deviation to the training dataset
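A minimal sketch of this standardization step with scikit-learn follows; the values are illustrative, and the key point is that the scaler is fitted on the training split only and then applied to the test split:

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[65.0, 0.7], [62.0, 10.9], [40.0, 1.0]])  # illustrative values
X_test = np.array([[58.0, 1.2]])

# Fit the mean and standard deviation on the training split only, then
# apply the same transform to the test split.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)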
First, the support vector machine was implemented on the selected dataset, and the results were obtained in the form of both the ROC curve and the actual values. The other two algorithms followed the same process; the results obtained are shown, and final observations about the performance of all three algorithms are given.
Case 1.
The results obtained for the SVM algorithm, by calculating the ROC curve from y_test and the predictions, can be observed in fig. 14.
Fig.14: ROC curve for SVM algorithm
The accuracy of the above algorithm is observed as follows.
Case 2.
The results obtained for the Random Forest algorithm, by calculating the ROC curve, are presented in fig. 15.
Fig. 15: ROC curve for the second algorithm
The output result from the current algorithm is as follows.
5. Conclusions
In the current article, an attempt has been made to analyze and diagnose liver diseases in patients using various machine learning and decision tree algorithms. The machine learning and decision tree algorithms work well on the selected and prepared dataset of patients suffering from liver diseases at various levels. Of the three algorithms considered, the ANN model has the best performance compared to the other two methods. The GUI developed for performing the current application and its tasks is very useful and helpful even for ordinary people to work with; any educated person can work with it with minimal knowledge of computer operations.
References [1] https://www.webmd.com/digestive- disorders/picture-of-the-liver#1 [Last Accessed on 10-03-2019]
[2] https://towardsdatascience.com/support-vector- machine-introduction-to-machine-learning- algorithms-
934a444fca47. [Last Accessed on 10- 03-2019]
[3] A blog written by Tom Bromley in Xanadu AI entitled "Making a Neural Network, Quantum" in February 2018. [Last Accessed on 10-03-2019]
[4] Stavros I Dimitriadis, Dimitris Lipara (2018). How random is the random forest? Random forest algorithm on
the service of structural imaging biomarkers for Alzheimer's disease: from Alzheimer's disease neuroimaging
initiative (ADNI) database. Neural Regeneration Research, 13, 6, 962-970.
[5] A blog written by Jason Brownlee in Python Machine Learning in 2018 entitled “How and When to Use ROC
Curves and Precision-Recall Curves for Classification in Python”. [Last Accessed on 10-03-2019]
[6] Kulwinder Kaur and Vikas Zandu, (2016). A Secure Data Classification Model in Cloud Computing Using
Machine Learning Approach. International Journal of Grid and Distributed Computing, 9, 8, 13-22, SERSC
Australia.
[7] Pooja Sharma and A K Mishra (2017). Analysis and Classification of Plant microRNAs using Machine
Learning Approach. Asia-Pacific Journal of Neural Networks and Its Applications, 1, 2, 1-6, GVPress.
[8] Hao Jia (2015). Large-Scale Data Classification Method Based on Machine Learning Model. International
Journal of Database Theory and Application, 8, 2, 71-80, SERSC Australia.
[9] Manmohan Shukla, and B.K. Tripathi (2017). On the Efficient Machine Learning of the Fundamental Complex-
Valued Neurons. International Journal of Neural Systems Engineering, 1, 1, 21-26, GVPress.
[10] V.V. Ramalingam, A.Pandian, R. Ragavendran (2018). Machine Learning Techniques on Liver Disease - A
Survey. International Journal of Engineering & Technology, 7, 4.19, 485-495.
[11] G. Ignisha Rajathi, and G. Wiselin Jiji (2019). Chronic Liver Disease Classification Using Hybrid Whale
Optimization with Simulated Annealing and Ensemble Classifier. Symmetry, 11, 33, 1-21.
[12] Esraa Mamdouh Mai, Mabrouk Mai Mabrouk (2014). A Study of support vector machine algorithm for liver disease diagnosis. American Journal of Intelligent Systems, 4, 1, 9-14.
[13] Islam MM, Wu CC, Poly TN, Yang HC, Li YJ (2018). Applications of Machine Learning in Fatty Liver Disease Prediction. Studies in Health Technology and Informatics, 247, 166-170.
[14] Mei-ying Ren and Sin-jae Kang (2017). Korean 5W1H Extraction Using Rule-based and Machine Learning Method. International Journal of Artificial Intelligence and Applications for Smart Devices, 5, 2, 1-10, GVPress.
[15] Abhijitsingh Putu Gaonkar, Ritwik Kulkarni, Ronnie D. Caytiles and N.Ch.S.N Iyengar (2017). Classification
of Lower Back Pain Disorder Using Multiple Machine Learning Techniques and Identifying Degree of Importance
of Each Parameter. International Journal of Advanced Science and Technology, 105, 11-24, SERSC Australia.
[16] MoonSun Shin, HaYoung Lee, JinSoo Kim and JungHwan Kim (2018). Design of Intelligent Healthcare
Support System Applying Machine Learning. International Journal of Advanced Science and Technology, 111, 73-
84, SERSC Australia.
[17] Jung-euk Ahn, Eun-Surk Yi, Ji-Youn Kim and Byung Mun Lee (2018). A Multiracial Data-Based Machine
Learning Model for Exercise Motion Recognition. International Journal of Control and Automation, 11, 2, 89-102,
SERSC Australia.
Mukt Shabd Journal, ISSN NO : 2347-3150, Volume X, Issue IV, APRIL/2021
Prediction of Tuberculosis Diagnosis Using Weighted KNN Classifier
S.Piramu Kailasam,
Assistant Professor, Department of Computer Applications,
Sadakathullah Appa College(Autonomous),
Tirunelveli,Tamil Nadu
Abstract
In this paper, a machine learning model is developed for predicting tuberculosis diagnosis using classifiers. The model uses Principal Component Analysis for feature extraction and classifies with ensemble KNN classifiers. The data is turned into knowledge about which symptoms are the significant ones in diagnosing tuberculosis. The presented results show that the ensemble KNN classification accuracy for TB diagnosis is 91.4% with a training time of 1.4661 s; the true positive rate and false positive rate are also analyzed.
Keywords: Tuberculosis, Principal Component Analysis, Machine learning classifiers, Classification
1.Introduction
The Tuberculosis Chemotherapy Centre (as the NIRT was then known) was
set up in 1956 as a result of collaboration between ICMR (Indian Council of
Medical Research), BMRC (British Medical Research Council), USPHS (United
States Public Health Service), and the government of Tamil Nadu. It became the
Tuberculosis Research Centre (TRC) in the 1970s, and was renamed the National
Institute for Research in Tuberculosis (NIRT) recently. NIRT’s vision is to
undertake high quality research in Tuberculosis that is relevant to both national and
global programmatic needs.
The first research activity undertaken in 1956 ('The Madras Study') was meant to compare the efficacy of drug treatment at home with isolation in sanatoria for the duration of treatment of TB patients. The study found that cure rates were the same in both settings, and that there was no greater contact risk from leaving patients at home. The results of this trial prompted OPD treatment of TB patients.
1.1 Tuberculosis
Tuberculosis[1,2] is an infectious disease caused by bacteria whose scientific
name is Mycobacterium tuberculosis. It was first isolated in 1882 by a German
physician named Robert Koch who received the Nobel Prize for this discovery. TB
most commonly affects the lungs but also can involve almost any organ of the
body. Tuberculosis (TB) is one of the most prevalent infections of human beings
and contributes considerably to illness and death around the world. It is estimated
by the World Health Organization (WHO) that one third of the global population is
infected with TB and that seven to eight million new cases of TB occur each year.
1.2 Cause of tuberculosis
Tuberculosis is not a hereditary disease; it is an infectious disease, and any person can get afflicted with TB. Whenever a patient with active tuberculosis coughs or sneezes in the open, TB-causing bacteria come out in an aerosol, which can infect any person who happens to inhale it.
1.3 Symptoms of TB
Characteristic symptoms of TB are persistent cough of more than three
weeks duration, cough with expectoration of sputum, fever, weight loss or loss of
appetite etc. If any of these symptoms persist beyond three weeks, the person
concerned should visit the nearest DOTS TB Center or Health Center and get his
sputum examined.
1.4 Treatment of TB
If a full course of anti-tubercular drugs is taken on a regular basis, this
disease is fully curable. A TB patient has to take medicines for a minimum period
of six months continuously. The drugs may continue up to one year in some cases.
It is important that the drugs are discontinued only on the advice of the doctor.
For patients who do not take the complete treatment or take the drugs irregularly, the disease can turn incurable or even life-threatening.
1.5 Prevention
TB is a preventable disease, even in those who have been exposed to an
infected person. Skin testing (PPD) for TB is used in high risk populations or in
people who may have been exposed to TB, such as health care workers. A positive
skin test indicates TB exposure and an inactive infection. Discuss preventive
therapy[3] with your doctor. People who have been exposed to TB should be skin
tested immediately and have a follow-up test at a later date, if the first test is
negative. Prompt treatment is extremely important in controlling the spread of TB
from those who have active TB disease[3] to those who have never been infected
with TB[4,5].
2.Material and Methodology
Fig. 1 System Architecture
The TB input dataset is taken as a .csv file. Feature extraction is performed with the PCA approach, and classification with SVM, decision tree and KNN ensemble classifiers.
For the purpose of the study and analysis, the data was taken from the records of the National Institute for Research in Tuberculosis.
Number of records: 430
Features: cough, fever, haemoptysis, chest pain, weight loss, dyspnoea, smear
The incidence of tuberculosis is taken as the output feature, while the other six features are taken as input features. The value attached to the title of each column represents the maximum score a patient can receive for that symptom. As a patient is examined for each of the symptoms, the score given by the medical practitioner is recorded for the patient, and based on all the recorded values the patient is diagnosed. The dataset is divided randomly into two sets: 70% training data and 30% testing data. In this work, MATLAB 2016 on an Intel Core i3 laptop with 6 GB RAM is used.
Fig. 1 (pipeline): TB dataset -> feature extraction using PCA algorithm -> classification using SVM, decision tree, logistic regression and ensemble KNN classifiers (ROC method) -> TB disease diagnosis predictor result -> final result.
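A minimal Python sketch of loading the records and making the random 70/30 split follows; the file name and column names are placeholder assumptions, since the NIRT records are not public, and the study itself used MATLAB:

import pandas as pd
from sklearn.model_selection import train_test_split

# "tb.csv" and the column names below are placeholder assumptions.
df = pd.read_csv("tb.csv")
X = df[["cough", "fever", "haemoptysis", "chestpain",
        "weightloss", "dyspnoea", "smear"]]
y = df["tb"]  # assumed name of the output (incidence) column

# Random 70% training / 30% testing split, as in the study.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7,
                                                    random_state=1)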
2.2 Artificial neural networks
An artificial neural network consists of many node processing units analogous to neurons in the brain. Each node has a node function associated with it, which, along with a set of local parameters, determines the output of the node given an input; modifying the local parameters alters the node function. An artificial neural network is an information processing system in which elements called neurons process the information. Signals are transmitted by means of connection links. The links possess an associated weight, which is multiplied with the incoming signal, as in any typical neural net. The output signal is obtained by applying an activation to the net input.
A neural net can generally be a single-layer or a multilayer net. Artificial neural networks thus represent a major extension of computation. They perform operations similar to those of the human brain; hence it is reasonable to expect a rapid increase in our understanding of artificial neural networks, leading to improved neural network paradigms and a host of application opportunities.
A simple artificial neural net has two input neurons (x1, x2) and one output neuron (y); the interconnection weights are given by w1 and w2. In a single-layer net there is a single layer of weighted interconnections.
2.3 Machine learning
A supervised machine learning method is used to classify the TB data. Feature extraction and dimension reduction are combined using the Principal Component Analysis technique. The K nearest neighbour classifier finds the K neighbours most similar to a point, and for distance it can use the Euclidean distance or cosine similarity. KNN is suitable for low-dimensional spaces; in high-dimensional spaces, such as video frame classification, KNN is not suitable. In the KNN classification model[6], the Euclidean distance metric and the k value can be altered depending on the data.
The ensemble KNN classifiers used in this work are as follows: the fine KNN classifier, which classifies the data with a neighbour value of 1; the medium classifier, which classifies with 10 neighbours; the coarse KNN classifier[6], which classifies with 100 neighbours; the cosine classifier[6], which uses the cosine distance metric; the cubic KNN classifier, which uses the cubic distance metric; and the weighted KNN classifier[6], which uses distance weighting. All the KNN classifiers other than the cubic classifier measure the Euclidean distance; the cubic KNN classifier uses the Minkowski distance.
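A minimal Python sketch of the PCA-plus-weighted-KNN pipeline follows; scikit-learn's weights="distance" plays the role of distance weighting, the synthetic data and parameter choices are illustrative assumptions, and the study itself used MATLAB:

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in: 430 records with six symptom features, as in the study.
X, y = make_classification(n_samples=430, n_features=6, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7,
                                                    random_state=0)

# PCA for feature extraction, then distance-weighted KNN: nearer
# neighbours receive larger votes.
model = make_pipeline(PCA(n_components=3),
                      KNeighborsClassifier(n_neighbors=10, weights="distance"))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))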
3.Results and Discussion
Figs. 2-4 show the ROC curves of the classification models. The observed accuracies are:

Logistic Regression | 89.4%
SVM | 88.5%
Medium KNN | 89.1%
Coarse KNN | 72%
Cosine KNN | 88.2%
Cubic KNN | 89.2%
Fine KNN | 84.1%
Weighted KNN | 91.4%
Simple Decision Tree | 87.9%

Figs. 2-4: ROC curves and accuracy of the classification models
The ROC curve method helps to identify the ratio between the true positive rate and the false positive rate. The accuracy value of 91.4% clearly shows that weighted KNN classifies the TB data better than the other classifiers.
Fig. 5 Comparison graph of Training Time
The time graph shows that weighted KNN is the best KNN classifier, classifying the data in the minimum time of 1.4661 s.
4.Conclusion
Using a recent, improved machine learning algorithm to establish a classification model for the diagnosis of TB, the weighted KNN classifier shows the best diagnostic performance. In this respect, the accuracy of the weighted KNN classifier is superior to the ANN, logistic regression, SVM and decision tree classifiers, and its training time with PCA feature extraction is also better than the other classifiers. In future, this work can be extended with artificial intelligence techniques.
References
1.Ren, Z., Hu, Y. & Xu, L. Identifying tuberculous pleural effusion using artificial intelligence
machine learning algorithms. Respir Res 20, 220 (2019).
2.Ryan H, Yoo J, Darsini P. Corticosteroids for tuberculous pleurisy. Cochrane Database Syst
Rev. 2017;3:CD001876.
3.Porcel JM. Biomarkers in the diagnosis of pleural diseases: a 2018 update. Ther Adv Respir
Dis. 2018;12:1753466618808660d.
4.Choi H, Chon HR, Kim K, et al. Clinical and laboratory differences between lymphocyte- and
neutrophil-predominant pleural tuberculosis. PLoS One. 2016;11(10):e0165428.
5.Li D, Shen Y, Fu X, et al. Combined detections of interleukin-33 and adenosine deaminase for
diagnosis of tuberculous pleural effusion. Int J Clin Exp Pathol. 2015;8(1):888–93.
6. Sameh Alfere, Ashraf Yunis Maghari, Prediction of students' performance using modified KNN classifiers. 1st International Conference on Engineering and Future Technology, 2018.
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X IMPACT FACTOR: 7.056
IJCSMC, Vol. 10, Issue. 8, August 2021, pg.51 – 59
A STUDY OF TOMATO FRUIT DISEASE
DETECTION USING RGB COLOR
THRESHOLDING AND K-MEANS CLUSTERING
S.Lingeswari1; Dr. P.M.Gomathi2; S.Piramu Kailasam3
1Assistant Professor, P.K.R. Arts College for Women, Gobichettipalayam
2Head & Associate Professor, P.K.R. Arts College for Women, Gobichettipalayam
3Assistant Professor, Sadakathullah Appa College, Tirunelveli
1 [email protected]; 2 [email protected]; 3 [email protected]
DOI: 10.47760/ijcsmc.2021.v10i08.009
Abstract— The agriculture field plays a vital role in the development of a smart India. To raise the economic level, the production of fruits, crops and vegetables can use CAD techniques based on image processing tools. Identifying diseases in fruits is one of image processing's big challenges; it can be done by a continuous visual photo or video monitoring system. Automated image processing research helps to control the pesticides used on fruits and vegetables. In this paper we focus on detecting diseases of the tomato at an early stage. The proposed system shows how different algorithms, such as color thresholding segmentation techniques and K-means clustering, are used, and shows that K-means clustering is better than the RGB color-based colorthresholder method for detecting tomato diseases at the beginning stage.
Keywords— Tomato disease, Threshold, Colorthresholding segmentation, K-means Clustering segmentation
I. INTRODUCTION
Image segmentation[1] is an important part of image processing: it divides an image into many disjoint small areas based on certain principles, and it is one of the most basic areas of research in image processing. Image segmentation[1] is used in many applications such as film and photo analysis, geo-information systems, medical image analysis, satellite image analysis, etc. Segmentation of tomato images is the most important stage in detecting diseases; it leads to the next process, classification, which classifies the tomato images in detail. Many more segmentation methods and algorithms are available for object detection.
II. RELATED WORKS
Qimei Wang, Minghe Sun, Jianhua Qu and Jie Xue identify tomato diseases[2,14] and detect the infected areas using CNN and object detection techniques. Experimental results indicate that ResNet-101 has the highest detection rate but takes the longest time for training and detection.
Huiqun Hong, Jinfa Lin and Fenghua Huang detect and diagnose tomato diseases using deep learning. Their method reduces size, time and computational cost, and the paper supports the further development of tomato disease diagnosis systems based on smart phones.
Jun Liu and Xuewei Wang focus on the detection of tomato diseases based on the YOLO v3 convolutional neural network. This research provides a reference for intelligent recognition of fruit diseases[3] and pest detection.
Manisha A. Bhange and Prof. H.A. Hingoliwala detect pomegranate fruit disease based on three features: texture, color and morphology, using two datasets, one for training and one for testing.
III. MATERIALS AND METHODOLOGY
A. Image Acquisition
A sample of 10 diseased tomato fruits[3] was captured by camera and pre-processed in MATLAB. All input images are converted to the size of 60x60, normalizing them to equal size.
Table1 - Input images: i1, i2, i3, i4, i5, i6, i7, i8, i9, i10
B. Region of Interest (ROI)
ROI extraction is the process of filtering a particular portion from an input image. The defective portions of the input images from table 1 are cropped, as shown below.
Table 2.ROI images: f,g,h,i,j,k,l,m
Fig 1.Work Flow Diagram
C. Pre-processing
Filtering[11] is an important technique for modifying or enhancing the quality of an image; it gives smoothing, sharpening and clear edges. Image processing supports different types of filtering, such as the Gaussian filter, bilateral filter, linear filter, sigma filter[16], median filter[11], etc. In this study, a median filter is applied to remove noise from the images. It is a non-linear filter: the center pixel of an MxN neighbourhood is replaced by the median (middle) value of the corresponding window, calculated by sorting all the pixel values of the window into numerical order and replacing the pixel value with the median pixel value. In this paper a 3x3 sampling window is applied for filtering the images.
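A minimal Python equivalent of this 3x3 median filtering is sketched below; the study itself used MATLAB, and the random image is a stand-in for a pre-sized tomato image channel:

import numpy as np
from scipy.ndimage import median_filter

# Random 60x60 stand-in for a pre-sized tomato image channel.
img = np.random.randint(0, 256, size=(60, 60)).astype(np.uint8)

# 3x3 window: every pixel is replaced by the median of its neighbourhood,
# matching the sampling window used in this study.
denoised = median_filter(img, size=3)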
IV. TECHNIQUES USED IN SEGMENTATION PROCESS
Segmentation technique | Advantages | Disadvantages
Threshold[9,16] | A simple method; involves comparing each pixel value of an image. | Applicable only to grayscale images; not suitable for blurred, outlier and noisy images.
ColorThresholder | An advanced form of threshold[9]; uses a color image[10], comparing each pixel value for segmentation. | Not suitable for edge-based computation.
K-means Clustering[12] (k=3) | An easier way to segment an image; also eliminates noisy spots in an image. | More expensive computation.
Fig. 1 (work flow): Image acquisition -> pre-processing (median filtering) -> image segmentation (ColorThresholder method (RGB) / K-means clustering method (k = 3)) -> detection of tomato disease.
V. RESULTS AND DISCUSSION
A. Image Segmentation
ColorThresholding Method
To reduce noise, the median filter[11] is applied to the images. The colorthresholder enables creating a segmentation mask of a tomato[6] color image based on the exploration of different color spaces such as RGB[15], HSV and L*a*b*; the figure shows the segmented area of the tomato. Threshold[5] only works with grayscale images: the 8-bit image is converted to an RGB stack, and the color image is split into channels. The color threshold values, which range from 0 to 255, are then adjusted. Segmentation[1] is achieved when the red, green and blue channels are adjusted. Green channel values dominate over red and blue; red plays the second biggest role in displaying the segmented area, and the blue channel has the smallest role in colorthresholding segmentation.
Segmentation using colorthresholder algorithm
Input: Color images
Output: Detected infected areas in the tomato
Step 1: Read the color images
Step 2: Apply the red color range between 0 and 255 and find the threshold value. If the threshold value is reached, go to step 5
Step 3: Apply the green color range between 0 and 255 and find the threshold value. If the threshold value is reached, go to step 5
Step 4: Apply the blue color range between 0 and 255 and find the threshold value. If the threshold value is reached, go to step 5
Step 5: Compare the threshold values of each color and find which color is best for finding the infected area of the image
Step 6: Stop the process
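A minimal Python sketch of this kind of RGB-range thresholding follows; the file path and channel ranges are illustrative assumptions, and the study itself used MATLAB's colorthresholder:

import cv2  # OpenCV
import numpy as np

img = cv2.imread("tomato.jpg")        # placeholder path; OpenCV loads as BGR
lower = np.array([0, 0, 100])         # illustrative per-channel lower bounds
upper = np.array([80, 80, 255])       # illustrative per-channel upper bounds

# Pixels whose three channel values all fall inside [lower, upper] form the
# mask of the suspected infected region.
mask = cv2.inRange(img, lower, upper)
segmented = cv2.bitwise_and(img, img, mask=mask)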
Table 3.Colorthresholder Images and its color spaces
Input image Color space Output image
Table 4: Qualitative analysis of segmented area (colorthresholding method)

Image | Mean | Standard deviation
f | 58.7 | 48.49
g | 53.9 | 50.52
h | 56.73 | 42.30
i | 124.95 | 84.97
j | 80.57 | 70.41
k | 31.64 | 21.05
l | 147.61 | 107.59
m | 106.47 | 68.83
K-means Clustering
Data mining is the process of extracting information from raw data; it is an interdisciplinary field which includes statistics, image processing, databases, machine learning and artificial intelligence. The different data mining techniques are classification, clustering, regression, sequential pattern recognition and outlier detection. Clustering belongs to unsupervised learning; a cluster is a subset of similar objects. Some of the clustering algorithms are K-means, hierarchical, fuzzy c-means[17] and DBSCAN. The K-means clustering method is used here for image segmentation and disease detection: in this study the K-means clustering algorithm groups related pixels, which helps to detect the disease in the tomato fruit image.
Segmentation Using K-means Clustering Algorithm
Step 1: Read the image
Step 2: Convert the image from RGB color space to L*a*b* color space
Step 3: Classify the colors in a*b* space using K-means clustering
Step 4: Label every pixel in the image using the results from K-means
Step 5: Create images that segment the tomato image by color
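A minimal Python sketch of steps 2-5 follows (OpenCV and scikit-learn); the file path is a placeholder, and the study itself used MATLAB:

import cv2
import numpy as np
from sklearn.cluster import KMeans

img = cv2.imread("tomato.jpg")                 # placeholder path
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)     # step 2: RGB -> L*a*b*

# Step 3: cluster on the a* and b* chromaticity channels only.
ab = lab[:, :, 1:3].reshape(-1, 2).astype(float)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(ab)

# Step 4: one cluster label per pixel; step 5: one mask per cluster.
label_image = labels.reshape(lab.shape[:2])
masks = [(label_image == k).astype(np.uint8) * 255 for k in range(3)]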
Table 5.Segmented images using Kmeans Clustering technique
Input image Cluster1 Cluster2 Cluster3
The above images are the results of the K-means[1,9,12] clustering method: each resultant image is categorized into cluster 1, cluster 2 and cluster 3 (here K = 3). The cluster images are not identical; cluster 1 is different from cluster 2 and cluster 3, and vice versa. The advantage of the K-means[1,9,12,13] method is that one can choose the best segmentation from any of the clustered images. Hence, the diseased[2,14] portion of the tomato can easily be detected by a computer-aided design (CAD) system.
Qualitative Analysis
Table 6. Qualitative analysis of color-based segmented image using K-means clustering technique

Image (jpg) | Mean | Median | Mode | Variance | Standard deviation
f | 147.54 | 149.80 | 130.58 | 254.61 | 15.95
g | 136.41 | 135.97 | 135.55 | 1.3 | 1.14
h | 158.89 | 158.07 | 140.19 | 365.3 | 19.11
i | 140.88 | 141.17 | 136.09 | 21.66 | 4.654
j | 145.83 | 146.13 | 131.15 | 211.40 | 14.53
k | 135.47 | 136.61 | 131.11 | 15.37 | 3.92
l | 150 | 148.07 | 129.6 | 457.3 | 21.38
m | 154.25 | 151.82 | 145.57 | 102.30 | 10.11
Mean and Variance
The simple statistical method is used for calculating the mean and variance of the pixel levels:

Mean: µ = Σ zi · p(zi), zi ∈ S --------- (1)

Variance: σ² = Σ (zi − µ)² · p(zi), zi ∈ S --------- (2)
Here the zi are the gray-level values of the pixels in S and p(zi) are the corresponding normalized histogram values. The mean value of the K-means algorithm[2,12,13] varies from 135.47 to 158.07, while the variance values range from 1.3 to 365.3. The mean values from 135.47 to 158.07 express the individual pixel intensity contribution over the entire segmented tomato image, and the variance values help to find how each pixel varies from the center pixel, which helps to classify the pixels into three different clusters. The results of K-means clustering are far better than the FCM clustering method in visual as well as numerical measures[5].
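A minimal numpy sketch of equations (1) and (2), using the empirical normalized histogram as p(zi), is given below; the random region is a stand-in for a segmented image:

import numpy as np

region = np.random.randint(0, 256, size=(60, 60))   # stand-in for a segment S

# Empirical normalized histogram p(z) over the gray levels z in S.
values, counts = np.unique(region, return_counts=True)
p = counts / counts.sum()

mu = np.sum(values * p)                 # eq. (1)
var = np.sum((values - mu) ** 2 * p)    # eq. (2)
print(mu, var)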
Signal-to-noise ratio (SNR)
To assess image quality, the SNR evaluation method is applied; among image quality measurements, SNR is always good for analysis. SNR is given as the ratio of the mean value of the image to the standard deviation of the noise. Generally, a higher SNR value indicates better image quality than lower values; an SNR value less than 1 would not be good enough for images. In this study, the K-means clustering and color threshold methods both give values greater than 1, tabulated below.
Table 7. Comparison of SNR values

Image (jpg) | Colorthresholder segmentation (dB) | K-means clustering technique (dB)
f | 9.7 | 10.47
g | 11.39 | 13.22
h | 11.55 | 12.32
i | 10.12 | 10.32
j | 10.88 | 11.50
k | 10.28 | 10.58
l | 12.55 | 12.72
m | 11.34 | 11.99
In table 7, the SNR values of K-means clustering[2,12,13] are better than those of color threshold segmentation: the K-means method gains 0.77 dB, 1.83 dB, 0.77 dB, 0.2 dB, 0.62 dB, 0.3 dB, 0.17 dB and 0.65 dB more SNR than the other method. The results of K-means clustering help to detect the types[6,7,8] of tomato fruit disease, such as late blight, fungal, early blight, sunscald, rotted tomato, tomato blossom end rot, tomato puffy fruit and damaged tomato.
VI. CONCLUSION
Comparing the three factors mean, variance and SNR, the proposed system shows that the K-means clustering[1,12,13] technique for segmentation is better than the colorthresholder segmentation method for detecting tomato diseases. In future research, a new algorithm using data mining techniques will be framed for effective detection and accurate classification of citrus fruit diseases.
References
[1]. Hamirul Aini Hambali, Salwa Khalid, Massudi Mahmuddin, Nor Hazlyna Harun, "Segmentation of multi food images using integrated active contour and K-means", Journal of Engineering and Applied Sciences, special issue 3, 3146-3151, 2018.
[2]. Z. H. Huang, "Research on construction method of tomato disease image database", South China Agricultural University, 2016.
[3]. Jagdeesh D. Pujari, Rajesh Yakkundimath, Abdulmunaf S. Byadgi, "Statistical Methods for Quantitatively Detecting Fungal Disease from Fruit's Image", International Journal of Intelligent System and Application in Engineering, vol. 1(4), 60-67, 2013.
[4]. Manisha Bhange, H.A. Hingoliwala, "Review of Image Processing for Pomegranate Disease Detection", International Journal of Computer Science and Information Technologies, vol. 6(1), 92-94, 2015.
[5]. M. Mohamed Sathik, S. Piramu Kailasam, "Pulmonary CT image analysis for nodule detection using inspired FCM clustering", International Journal of Engineering Research & Technology (IJERT), vol. 8(7), 1013-1020, 2019.
[6]. Manya Afonso, Hubert Fonteijn, Felipe Schadeck Fiorentin, Dick Lensink, "Tomato fruit detection and counting in greenhouses using deep learning", Frontiers in Plant Science, vol. 11, 1-11, 2020.
[7]. Jiayue Zhao, Jianhua Qu, "A detection method for tomato fruit common physiological diseases based on YOLOv2", International Conference on Information Technology in Medicine and Education (ITME), 559-563, 2019.
[8]. Qimei Wang, Feng Qi, Minghe Sun, Jianhua Qu, Jie Xue, "Identification of Tomato Disease Types and Detection of Infected Areas Based on Deep Convolutional Neural Networks and Object Detection Techniques", Computational Intelligence and Neuroscience, Volume 2019, 1-15.
[9]. Swapan Samaddar, Dr. A. Rama Swamy Reddy, "Comparative study of image segmentation techniques on chronic kidney diseases", International Journal of Pure and Applied Mathematics, Volume 118, 235-239, 2018.
[10]. Tejas Deshpande, Sharmila Sengupta, K.S. Raghuvanshi, "Grading Identification of Disease in Pomegranate Leaf and Fruit", International Journal of Computer Science and Information Technologies, vol. 5(3), 4638-4645, 2014.
[11]. R. Gonzalez, R. Woods, Digital Image Processing, 3rd ed., Prentice-Hall, 2007.
[12]. Pham, V.H., Lee, B.R., "An image segmentation approach for fruit defect detection using k-means clustering and graph-based algorithm", Vietnam Journal of Computer Science 2, 25-33, 2015.
[13]. Md. Tarek Habib, Anup Majumder, A.Z.M. Jakaria, Morium Akter, Mohammad Shorif Uddin, Farruk Ahmed, "Machine vision based papaya disease recognition", Journal of King Saud University - Computer and Information Sciences, 300-309, 2020.
[14]. H. Hong, J. Lin and F. Huang, "Tomato Disease Detection and Classification by Deep Learning", 2020 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), 2020, pp. 25-29, doi: 10.1109/ICBAIE49996.2020.00012.
[15]. Horea Muresan, Mihai Oltean, "Fruit recognition from images using deep learning", Acta Universitatis Sapientiae, Informatica, 10, 1, 26-42, 2018.
[16]. S.P. Kailasam, M.M. Sathik, "A Reliable Computer Aided Lung Cancer Classification System Using Curvelet Features and Ensemble Classifier", International Journal of Pure and Applied Mathematics, 117(22), 275-278, 2017.
[17]. Ali, I., Ahmed, A., "Segmentation of different fruits using image processing based on fuzzy C-means method", 7th International Conference on Reliability (ICRITO), pp. 441-447, IEEE, 2018.
Journal of Analysis and Computation (JAC) (An International Peer Reviewed Journal), www.ijaconline.com, ISSN 0973-2861
Volume XIII, Issue I, January 2019
FOOD IMAGE PROCESSING TECHNIQUES: A SURVEY
S. Mohideen Pillai1, Dr. S. Kother Mohideen2
1Reg.No : 18221252161009, Research Scholar, Department of IT, Sri Ram Nallamani Yadava College
of Arts & Science, Tenkasi, Tamilnadu – 627804, India, Affiliation of Manonmaniam Sundaranar
University, Abishekapatti, Tirunelveli – 627012, Tamilnadu, India. 2Associate Professor & Head, Department of IT, Sri Ram Nallamani Yadava College of Arts & Science,
Tenkasi, Tamilnadu – 627804, India.
ABSTRACT
As people across the globe become more interested in watching their weight, eating more healthily and avoiding obesity, a system that can measure the calories and nutrition in everyday meals can be very useful. To identify food items accurately in such systems, image processing is used. Image processing techniques such as image segmentation, feature extraction, object recognition and classification are used for food recognition, nutrient identification and calorie calculation. Bag-of-Features models and artificial neural networks are used to identify the food items, and image classifiers are trained to identify and categorize individual food items in a single image. This survey addresses various food image processing techniques, evaluates them and presents the issues related to those techniques.
Keywords: Food recognition, Segmentation, Classification
[1] INTRODUCTION
One of the major goals of food image processing is to retrieve calorie and nutrient information from a given food image. Food recognition is worthy of more research effort owing to its practical significance and scientific challenges: it exposes new challenges to the current pattern recognition literature and stimulates the development of novel techniques for general object recognition. In addition, automatic food recognition is beneficial to health-care-related applications, such as obesity management.
Obesity has become a serious public health problem for the general population in many developed countries [1], [2]. The WHO defines obesity based on the Body Mass Index (BMI) of the individual: a person is considered obese when the BMI is greater than or equal to 30 kg/m2 [2]. It is well established that obesity and overweight are linked to a number of chronic diseases such as type II diabetes, breast and colon cancer, and heart disease. The main cause of obesity is the imbalance between the amount of food intake and the energy consumed by the individual [3]. Obesity treatment requires the patient to consume healthy food and decrease the amount of daily food intake, but in most obesity cases it is not easy for patients to measure or control their daily intake, due to a lack of nutrition education or self-control. Therefore, an assistive food monitoring system is much needed and effective for obesity elimination. Our objective is to develop an efficient food processing system to calculate the calories and nutrition of food. In this paper, the general architecture, a survey and an analysis of various food processing techniques are presented.
General Architecture of Food Processing System
Fig. 1 General Architecture
The general architecture of a food processing system is shown in Fig. 1. At the early stage, the quality of the input food image is improved by image preprocessing techniques: removing noise, normalizing the image, image format conversion, image resizing and removing unnecessary features are carried out on the given image. In the segmentation step, the image is analyzed to extract the various segments of the food portion. For the detected food portions, a feature extraction process is performed, in which various food features including size, shape, color and texture are extracted. The extracted features are sent to a classifier, where the food portion is identified. Finally, by estimating the area and volume of the food portion against nutritional tables, the calorie value of the food is measured.
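A schematic Python sketch of this pipeline is given below; every function is a trivial placeholder for the stage it names, not an API from any of the surveyed systems:

# Every function below is a hypothetical stub standing in for the stage it names.
def preprocess(image): return image                  # denoise, resize, convert
def segment(image): return [image]                   # crop out food portions
def extract_features(portion): return {"color": 1}   # size, shape, color, texture
def classify(features): return "apple"               # identify the food item
def estimate_volume(portion): return 100.0           # area and height -> volume
def lookup_calories(label, volume): return 0.5 * volume  # nutrition-table lookup

def estimate_calories(image):
    portions = segment(preprocess(image))
    labels = [classify(extract_features(p)) for p in portions]
    volumes = [estimate_volume(p) for p in portions]
    return sum(lookup_calories(l, v) for l, v in zip(labels, volumes))

print(estimate_calories(object()))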
[2] RELATED WORK
Measuring Calorie and Nutrition from Food Image (MCNFI)
In this method [4], the authors discuss not only measuring calories and nutrition from a food image but also the uncertainty in image-based food processing systems. An image-based food processing method goes through two phases: image segmentation and image classification. In this paper, a Gabor filter [11] is used to segment the images; from each segment the size, color and texture are extracted. The extracted features are supplied to a classification algorithm, in this case a support vector machine [12], [13]. After the food portion is identified, the height and area of the food are calculated to find its volume, and finally the calories and nutrition are calculated using a calorie chart. The main advantage of this system is its calibration technique: the thumb of the user is used to eliminate calibration error instead of special trays and cards. The disadvantage is that more work is needed for calculating the calories of mixed and liquid food.
DietCam: Multi-View Food Recognition Using a Multi-Kernel SVM (MVMK)
In this method [5], an automatic food classification system called DietCam is proposed. DietCam
mainly addresses appearance variation of the same food and consists of two major components:
ingredient detection and food classification. To detect the ingredients of a food image, an improved
part-based model is used. Part-based models [14], [15], [16] use the shape of objects as a key
property; here the model is enriched with both shape and texture to detect food ingredients. Next, a
multi-view multi-kernel support vector machine classifies the various combinations of food segments.
The authors collected multi-view images at restaurants and merged them with the PFID database
[17], an image dataset containing 4545 fast-food images; the final dataset contains 15262 images of
55 different food types. The dataset covers only American foods.
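The following is a rough sketch of a two-kernel SVM in the spirit of DietCam's multi-kernel classifier, assuming scikit-learn. The fixed 0.5/0.5 kernel weighting, the shape/texture feature split, and the toy data are illustrative assumptions, not the paper's actual formulation.

    # Combine a shape kernel and a texture kernel into one precomputed SVM kernel.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.metrics.pairwise import rbf_kernel, chi2_kernel

    def combined_kernel(A, B, split, w=0.5):
        # Shape features use an RBF kernel, texture features a chi-squared kernel.
        k_shape = rbf_kernel(A[:, :split], B[:, :split])
        k_texture = chi2_kernel(A[:, split:], B[:, split:])
        return w * k_shape + (1.0 - w) * k_texture

    rng = np.random.default_rng(0)
    X_train = rng.random((100, 20))        # 10 shape + 10 texture dims (toy data)
    y_train = rng.integers(0, 3, 100)      # 3 hypothetical food classes
    X_test = rng.random((10, 20))

    svm = SVC(kernel="precomputed")
    svm.fit(combined_kernel(X_train, X_train, split=10), y_train)
    pred = svm.predict(combined_kernel(X_test, X_train, split=10))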
A Food Recognition System for Diabetic Patients Based on an Optimized Bag-of-Features Model (BoF)
This method [6] estimates the carbohydrate content of meals for diabetic patients. A dataset of 5000
home-made images reflecting the nutritional habits of central Europe was created, containing 11
different types of food items. In general, a food recognition system contains two main components,
segmentation and classification. In this system, the food image is segmented using the optimized
bag-of-features model [18], and the Random Forest method [19] is used to classify the segments. The
main limitation of this method is the use of the k-means algorithm to construct the visual dictionary,
which is an essential part of the bag-of-features model; the system achieved only 78% classification
accuracy. Its main advantage is that it was tested with different classification algorithms, such as
ANN, SVM, and Random Forest, with Random Forest finally chosen as the classifier.
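A condensed sketch of such a bag-of-features pipeline (k-means visual dictionary plus Random Forest) is shown below, assuming OpenCV's SIFT and scikit-learn are available; the dictionary size and descriptor choice are illustrative, not the paper's tuned settings.

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.ensemble import RandomForestClassifier

    def sift_descriptors(image):
        sift = cv2.SIFT_create()
        _, desc = sift.detectAndCompute(image, None)
        return desc if desc is not None else np.empty((0, 128), np.float32)

    def bof_histogram(desc, kmeans):
        # Quantize descriptors against the visual dictionary, then histogram them.
        hist = np.zeros(kmeans.n_clusters)
        if len(desc):
            words, counts = np.unique(kmeans.predict(desc), return_counts=True)
            hist[words] = counts
        return hist / max(hist.sum(), 1)

    def train_bof(images, labels, n_words=100):
        # Build the visual dictionary with k-means, then fit a Random Forest.
        all_desc = np.vstack([sift_descriptors(im) for im in images])
        kmeans = KMeans(n_clusters=n_words, n_init=4).fit(all_desc)
        X = np.array([bof_histogram(sift_descriptors(im), kmeans) for im in images])
        return kmeans, RandomForestClassifier(n_estimators=200).fit(X, labels)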
Accelerometer Based Detection of Food Intake in Free living Individuals (ABDF)
In this method [7], food intake is measured using a 3-axis accelerometer attached to the frame of
eyeglasses. Many earlier methods required sensors attached to the body at various locations on the
head: on the larynx, in the ear, or over the temporal muscle. This is the first method to attach the
sensor to the temple of regular eyeglasses. The working principle relies on detecting temporal muscle
activity during chewing and other facial activities. A two-stage procedure extracts the feature set best
suited for classification: first, the Minimum Redundancy Maximum Relevance method selects the 30
top-ranked features; then the selected features are forwarded to a KNN classifier, which recognizes
food intake. The main advantage of this method is that it avoids direct sensor attachment to the body
and the issues associated with incorrect sensor placement and poor body contact. The method was
tested in both a controlled laboratory and a free-living environment; the combined F1-score across the
laboratory and free-living experiments was 87.9 +/- 13.8%.
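A sketch of this two-stage classification is given below, with scikit-learn's mutual-information ranking standing in for mRMR (an assumption: true mRMR also penalizes redundancy between features) and k-NN as the classifier; the toy feature vectors are fabricated for illustration only.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X = rng.random((500, 120))            # toy accelerometer feature vectors
    y = rng.integers(0, 2, 500)           # 1 = food intake, 0 = other activity

    model = make_pipeline(
        SelectKBest(mutual_info_classif, k=30),   # keep the 30 top-ranked features
        KNeighborsClassifier(n_neighbors=5),      # KNN recognizes food intake
    )
    model.fit(X, y)
    print(model.predict(X[:5]))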
Mobile Cloud Based System Recognizing Nutrition and Freshness of Food Image (MCBNF)
A mobile cloud based food calorie measurement system [8] is developed to find the calorie, nutrition,
and freshness of a food image. Segmentation and classification again play the major roles in the
system: segmentation is done with the Graph Cut technique [20], whose goal is to separate the main
objects from the image, and classification is done with the Naive Bayes algorithm. The system also
estimates the freshness of the food using texture extraction. The system handles only solid food, and
its main disadvantage is the low accuracy of the classification algorithm.
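Below is a minimal sketch of graph-cut food segmentation using OpenCV's GrabCut, one common graph-cut formulation; the rectangle prior and the input filename are assumptions for illustration, not details of the surveyed system.

    import cv2
    import numpy as np

    def segment_food(image, rect):
        # GrabCut iteratively labels pixels inside the rectangle as fg/bg.
        mask = np.zeros(image.shape[:2], np.uint8)
        bgd = np.zeros((1, 65), np.float64)
        fgd = np.zeros((1, 65), np.float64)
        cv2.grabCut(image, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
        # Pixels marked sure/probable foreground form the food region.
        fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
        return image * fg[:, :, np.newaxis].astype(image.dtype)

    img = cv2.imread("meal.jpg")                        # hypothetical input image
    food = segment_food(img, rect=(50, 50, 400, 300))   # assumed bounding box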
Food Calorie Measurement Using Deep Learning Neural Network (FCMNN)
Here the authors propose a deep learning neural network method [9] that handles training and testing
requests at the top layers without affecting the central layers. Segmentation is first done using Graph
Cut, followed by the deep learning method; the user's thumb is again used for size calibration. The
first step is to generate a pre-trained model file and train the system with a positive set of images; in
the second step, the system is re-trained with a set of negative images (images that do not contain the
relevant object).
Once the model file is generated from training, it is loaded into the application and tested against
images captured and submitted by the user. The system performs image recognition and generates a
list of probabilities against the label names. The label with the highest probability is shown to the
user in a dialog box to confirm the object name. Once the object name is confirmed, the system
performs the calorie computation by calculating the size of the food item with respect to the thumb in
the frame, and finally reports the required calorie value to the user.
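A hedged sketch of this retrain-the-top-layers idea is shown below, using Keras transfer learning as a stand-in; the paper does not specify its exact network, so the base model, the frozen layers, and the 30-class output head are assumptions.

    import tensorflow as tf

    # Load a pretrained backbone; its central layers stay frozen.
    base = tf.keras.applications.MobileNetV2(weights="imagenet", include_top=False,
                                             input_shape=(224, 224, 3))
    base.trainable = False

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(30, activation="softmax"),  # assumed 30 food classes
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(positive_and_negative_ds, epochs=5)  # user-supplied image dataset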
[3] ANALYSIS
In this section, the classification accuracy and calorie calculation are analyzed for all the
aforementioned calorie measurement techniques. Calorie efficiency is calculated by comparing the
calorie value computed by each system with the real calorie content of the food in the image.
Classification accuracy is measured by the recognition rate of the classification algorithm, which is
calculated from its precision and recall values.
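For reference, the recognition-rate quantities used here are the standard precision, recall, and F1 definitions, restated below in plain Python from true positives (tp), false positives (fp), and false negatives (fn).

    def precision(tp, fp):
        return tp / (tp + fp)

    def recall(tp, fn):
        return tp / (tp + fn)

    def f1(tp, fp, fn):
        p, r = precision(tp, fp), recall(tp, fn)
        return 2 * p * r / (p + r)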
Fig. 2 Comparison of Classification Algorithms; Fig. 3 Comparison of Calorie Calculation Techniques
Fig. 2 compares the classification algorithms of the aforementioned techniques, and Fig. 3 shows
their calorie calculation efficiency. According to Fig. 2, the recognition rate of the DietCam technique
is the highest; DietCam uses the multi-kernel SVM algorithm for classification, so we plan to use a
multi-kernel SVM in our proposed system. Fig. 3 shows the volume calculation inefficiency of the
existing techniques; hence, an improved volume calculation technique is required to build an efficient
calorie measurement system.
[4] CONCLUSION
In this paper, methods for measuring the calorie and nutrition content of food images were analyzed.
In general, food processing systems help people control their daily food intake. The surveyed systems
focus on identifying food items in an image using image processing and classification techniques,
and the proposed work plans to identify the calorie and nutrition content of food images. According
to our analysis, the multi-view multi-kernel SVM (MVMK) classification algorithm identifies food
items most effectively, so the MVMK algorithm will be used in our proposed system. The proposed
system also plans to improve the accuracy of identifying mixed foods.
[Figs. 2 and 3: bar charts comparing MCNFI, MVMK, BoF, MCBNF, and ABDF on a 70-100% scale.]
REFERENCES
[1] E. Finkelstein, I. Fiebelkorn, and G. Wang, "National medical spending attributable to overweight and obesity: How much, and who's paying?" Health Affairs Web Exclusive, vol. 5, no. 14, 2003.
[2] World Health Organization, Geneva, Switzerland. (2011, Oct.). Obesity Study [Online]. Available: http://www.who.int/mediacentre/factsheets/fs311/en/index.html.
[3] G. A. Bray and C. Bouchard, Handbook of Obesity, 2nd ed. Baton Rouge, LA, USA: Pennington Biomedical Research Center, 2004.
[4] P. Pouladzadeh, S. Shirmohammadi, and R. Al-Maghrabi, "Measuring calorie and nutrition from food image," IEEE Transactions on Instrumentation and Measurement, vol. 63, no. 8, pp. 1947-1956, 2014.
[5] H. He, F. Kong, and J. Tan, "DietCam: Multi-view food recognition using a multi-kernel SVM," IEEE Journal of Biomedical and Health Informatics, vol. 20, no. 3, pp. 848-855, 2015.
[6] M. M. Anthimopoulos, L. Gianola, L. Scarnato, P. Diem, and S. G. Mougiakakou, "A food recognition system for diabetic patients based on an optimized bag-of-features model," IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 4, pp. 1261-1271, 2014.
[7] M. Farooq and E. Sazonov, "Accelerometer-based detection of food intake in free-living individuals," IEEE Sensors Journal, vol. 18, no. 9, pp. 3752-3758, 2018.
[8] D. Kumbhar and S. Patil, "Mobile cloud based system recognizing nutrition and freshness of food image," in Proc. IEEE Int. Conf. on Energy, Communication, Data Analytics and Soft Computing, 2017, pp. 709-714.
[9] P. Pouladzadeh, P. Kuhad, S. V. B. Peddi, A. Yassine, and S. Shirmohammadi, "Food calorie measurement using deep learning neural network," in Proc. IEEE Int. Instrumentation and Measurement Technology Conf., 2016, pp. 1-6.
[10] V. Gamit and H. Somani, "A survey on food recognition and nutrients identification for classification of healthy food," IJARIIE, vol. 2, no. 6, 2016.
[11] A. K. Jain and F. Farrokhnia, "Unsupervised texture segmentation using Gabor filters," Pattern Recognition, vol. 24, no. 12, pp. 1167-1186, 1991.
[12] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
[13] K. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning," IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 181-201, Mar. 2001.
[14] M. Weber, M. Welling, and P. Perona, "Unsupervised learning of models for recognition," in Proc. European Conf. on Computer Vision, Jan. 2000, pp. 18-32.
[15] P. Viola and M. Jones, "Robust real-time face detection," in Proc. IEEE Int. Conf. on Computer Vision, vol. 2, Jul. 2001, p. 747.
[16] H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Jun. 2000, pp. 746-751.
[17] M. Chen, K. Dhingra, W. Wu, and L. Yang, "PFID: Pittsburgh fast-food image dataset," in Proc. IEEE Int. Conf. on Image Processing, 2009, pp. 289-292.
[18] L. Fei-Fei and P. Perona, "A Bayesian hierarchical model for learning natural scene categories," in Proc. IEEE Comput. Soc. Conf. on Computer Vision and Pattern Recognition, 2005, vol. 2, pp. 524-531.
[19] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[20] P. Pouladzadeh, S. Shirmohammadi, and A. Yassine, "Using graph cut segmentation for food calorie measurement," in Proc. IEEE Int. Symp. on Medical Measurements and Applications (MeMeA), 2014, pp. 1-6, DOI: 10.1109/memea.2014.6860137.
[21] World Health Organization, Geneva, Switzerland. (2012). World Health Statistics [Online]. Available: http://www.who.int/gho/publications/world_health_statistics/2012/en/index.html.
[22] G. A. Bray and C. Bouchard, Handbook of Obesity, 2nd ed. Baton Rouge, LA, USA: Pennington Biomedical Research Center, 2004.
[23] American Diabetes Association, "Standards of medical care in diabetes-2010," Diabetes Care, vol. 33, no. 1, pp. S11-S61, 2010.
[24] C. E. Smart, K. Ross, J. A. Edge, C. E. Collins, K. Colyvas, and B. R. King, "Children and adolescents on intensive insulin therapy maintain postprandial glycaemic control without precise carbohydrate counting," Diabetic Medicine, vol. 26, no. 3, pp. 279-285, 2009.
[25] C. E. Smart, B. R. King, P. McElduff, and C. E. Collins, "In children using intensive insulin therapy, a 20-g variation in carbohydrate amount significantly impacts on postprandial glycaemia," Diabetic Medicine, vol. 29, no. 7, pp. e21-e24, Jul. 2012.
[26] M. Graff, T. Gross, S. Juth, and J. Charlson, "How well are individuals on intensive insulin therapy counting carbohydrates?" Diabetes Research and Clinical Practice, vol. 50, suppl. 1, pp. 238-239, 2000.
Special Issue Published in International Journal of Trend in Research and Development (IJTRD), ISSN: 2394-9333, www.ijtrd.com
International Conference on Intelligent Computing Techniques (ICICT-17), Department of Computer Applications, Sarah Tucker College (Autonomous), Tirunelveli, TamilNadu, 1st September 2017
Contourlet Transform Based Denoising on Normal Images Using ABC Optimization with Hard Thresholding
1Syed Ali Fathima KMN, 2W. Fathima Farsana and 3R. Fathima Syreen
1Research Dept. of CS, Sadakathullah Appa College, Tirunelveli, India
2Asst. Professor & Head, Dept. of BCA, Sadakathullah Appa College, Tirunelveli, India
3Asst. Professor, Dept. of BCA, Sadakathullah Appa College, Tirunelveli, India
Abstract— Digital images are generally corrupted by noise, the addition of unwanted information to
the original image. Noise removal is necessary to limit damage to the image and to improve image
detail. The contourlet transform employs a directional filter bank to capture line discontinuities and
provides decompositions in various directions. The ABC algorithm is a bio-inspired algorithm
derived from the intelligent food-searching behavior of honey bees: as an optimization tool, it
provides a population-based search procedure in which individuals, called food positions, are
modified by the artificial bees over time, the aim being to discover the places of food sources with
high nectar amounts and, finally, the one with the highest nectar. Hard thresholding sets to zero all
coefficients whose magnitudes are below the threshold while keeping the remaining coefficients
unchanged. Contourlet denoising with ABC-optimized hard thresholding improves image quality.
Keywords— Contourlet Transform; Laplacian Filter Bank; ABC Optimization; Hard Thresholding
I. INTRODUCTION
Image denoising has remained a fundamental problem in the field of image processing. It involves
manipulating the image data to produce a visually high-quality image: the restored image should
contain less noise than the observation while still keeping sharp transitions (i.e., edges). The goal is to
remove unwanted noise while conserving the details of the image, preserving the main characteristics
of the information, and enhancing image clarity. Digital images play a notable role in daily-life
applications; standard natural test images such as Lena and Cameraman are commonly used.
Corruption may come in many forms, such as motion blur, noise, and camera mis-focus.
Noise can be introduced by different intrinsic and extrinsic conditions, and in practical situations it is
not possible to avoid it entirely. Noise may be additive or multiplicative; additive noise commonly
affects natural images. Removing the noise from the image increases the overall quality of the
processed image.
The contourlet transform is well suited to preserving edges. Contourlets form a multiresolution
directional tight frame designed to efficiently approximate images made of smooth regions separated
by smooth boundaries. The transform has a fast implementation based on a Laplacian pyramid
decomposition followed by directional filter banks applied to each bandpass subband, and it is more
effective in recovering smooth contours, both visually and in PSNR. Applied to a noisy image, it
produces decomposed image coefficients. Structurally, the contourlet transform is a double filter
bank: first the Laplacian pyramid (LP) captures point discontinuities, then the directional filter bank
(DFB) links point discontinuities into linear structures.
The hard-thresholding function keeps all contourlet coefficients that are greater than the given
threshold λ and sets the others to zero; λ is chosen according to the signal energy. ABC optimization,
the population-based search described above, is then used to improve the accuracy of the denoised
image.
II. DENOISING PROCEDURE
The procedure to denoise an image is given as follows:
De-noised image = W⁻¹[T{W(Original Image + Noise)}]
Step 1: Apply the forward contourlet transform to the noisy image to obtain the decomposed image.
Step 2: Apply hard thresholding to the decomposed image to remove noise.
Step 3: Apply the inverse contourlet transform to the thresholded image to obtain the denoised image.
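The three-step structure above can be sketched in a few lines of Python. Standard contourlet implementations are not widely packaged, so the sketch below uses the wavelet transform from PyWavelets as a stand-in; the transform/threshold/inverse structure, not the choice of basis, is the point here.

    import pywt

    def denoise(noisy, threshold, wavelet="db4", level=3):
        coeffs = pywt.wavedec2(noisy, wavelet, level=level)   # step 1: decompose
        thresholded = [coeffs[0]] + [
            tuple(pywt.threshold(c, threshold, mode="hard") for c in band)
            for band in coeffs[1:]                            # step 2: hard threshold
        ]
        return pywt.waverec2(thresholded, wavelet)            # step 3: reconstruct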
The contourlet transform gives a high degree of directionality: it can represent curves and lines
without discontinuity, and it preserves edges and contours. It was proposed as a directional
multiresolution image representation that can efficiently capture and represent singularities along
smooth object boundaries in normal images. After the hard-thresholding function is applied, the
inverse transform reconstructs the image, and artificial bee colony (ABC) optimization is used to
improve the accuracy of the denoised image.
A. Related Work
Image denoising is a well-known model for noise removal and edge preservation [3]. Its main aim is
to remove unwanted noise or signals without losing any information [7]; it is a central pre-processing
step used to eliminate noise in order to strengthen and recover small details that may be hidden in the
data [9].
Noise can be categorized depending on its source. The contourlet transform introduces basis
functions that are local, directional, and support multiresolution expansion [13]. This representation
has two basic building blocks, the Laplacian pyramid (LP) and the directional filter bank (DFB): the
Laplacian pyramid captures point discontinuities, and the directional filter bank connects them into
linear structures [9].
A hard-thresholding function leaves the retained coefficients unchanged, whereas soft thresholding
reduces their values by the absolute threshold value [10]. Small coefficients are dominated by noise,
while large coefficients carry more signal information than noise, so thresholding functions [15] set
the coefficients smaller than the threshold to zero.
Artificial Bee Colony (ABC) is one of the most recently introduced optimization algorithms [12], and
it is based on the foraging behavior of honey bees. In this population-based algorithm [8], the position
of a food source represents a possible solution to the optimization problem and the nectar amount of
a food source corresponds to the quality (fitness) of the associated solution. The number of employed
bees equals the number of solutions in the population [15].
B. Motivation and Justification
Image denoising is a well-known model for noise removal and edge preservation. The contourlet
transform, one such transformation, is built from a Laplacian pyramid and a directional filter bank:
the Laplacian pyramid perceives point discontinuities and the filter bank ties the discontinuities into
lines. Edges are clearly discerned in the contourlet transform; it speeds up the process and recovers
edges and shapes, giving a high degree of directionality and representing curves and lines without
discontinuity.
We propose a new denoising approach using Otsu thresholding optimized by the ABC algorithm in
the contourlet transform domain. Contourlets have high directional sensitivity and are optimally
sparse in representing images containing edges.
C. Organization of the Paper
The rest of the paper is organized as follows. The denoising procedure was presented in Section II.
Section III presents the methodology, including the outline of the proposed work and the contourlet
transform. Experimental results and the performance evaluation are discussed in Section IV, and
finally the conclusion is given in Section V.
III. METHODOLOGY
A. Outline of the Proposed Work
Speckle and salt & pepper noise are added to the input image, and the contourlet transform is applied
to decompose the image. Hard thresholding with ABC optimization is then applied in the transform
domain, and the noise is removed to obtain a noise-free image.
Figure 1. Block diagram for contourlet-based ABC with hard threshold
B. Contourlet Transform
ABC mostly depends on random search, which is sufficient for exploration but insufficient for
exploitation. The contourlet transform employs a directional filter bank to capture line
discontinuities, provides decompositions in various directions, and identifies anisotropic components.
The value of a food source in ABC depends on many factors, such as its proximity to the nest, its
richness or the concentration of its energy, and the ease of extracting that energy; as described earlier,
the bees' aim is to discover the food source with the highest nectar amount.
Figure 2. The contourlet transform framework
C. Hard Thresholding
Thresholding is a procedure used for signal and image denoising. The hard-thresholding function
keeps all contourlet coefficients that are greater than the given threshold λ and sets the others to zero,
with λ chosen according to the signal energy. These methods make the image noise-free:

D(x, y) = S(x, y) if |S(x, y)| > T, else 0   (1)

where T is the hard threshold, S(x, y) is the original contourlet coefficient at the point (x, y), and
D(x, y) is the denoised coefficient obtained at position (x, y) by adjusting the coefficient values.
D. ABC Optimization
In the ABC model, the colony consists of three groups of bees: employed bees, onlookers, and scouts.
It is assumed that there is only one artificial employed bee for each food source; in other words, the
number of employed bees in the colony is equal to the number of food sources around the hive.
Employed bees go to their food source, come back to the hive, and dance on the dancing area. An
employed bee whose food source has been abandoned becomes a scout and starts to search for a new
food source. Onlookers watch the dances of the employed bees and choose food sources depending
on those dances.
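A compact ABC loop for selecting the hard threshold T is sketched below. It assumes the fitness is a positive score (for instance, PSNR of the denoised image against a reference), and the colony size, trial limit, and search range are illustrative defaults rather than the paper's tuned settings.

    import numpy as np

    def abc_optimize(fitness, lo, hi, n_sources=10, limit=5, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        sources = rng.uniform(lo, hi, n_sources)     # food sources = candidate T
        fits = np.array([fitness(s) for s in sources])
        trials = np.zeros(n_sources, dtype=int)
        for _ in range(iters):
            for phase in ("employed", "onlooker"):
                probs = fits / fits.sum()            # onlookers prefer good dances
                for i in range(n_sources):
                    if phase == "onlooker" and rng.random() > probs[i]:
                        continue
                    k = int(rng.integers(n_sources))  # random neighbour source
                    cand = np.clip(sources[i] + rng.uniform(-1, 1)
                                   * (sources[i] - sources[k]), lo, hi)
                    f = fitness(cand)
                    if f > fits[i]:                   # greedy replacement
                        sources[i], fits[i], trials[i] = cand, f, 0
                    else:
                        trials[i] += 1
            scouts = trials > limit                   # abandoned sources -> scouts
            if scouts.any():
                sources[scouts] = rng.uniform(lo, hi, int(scouts.sum()))
                fits[scouts] = [fitness(s) for s in sources[scouts]]
                trials[scouts] = 0
        return sources[int(fits.argmax())]

With the earlier denoise() sketch, one might call abc_optimize(lambda T: psnr(reference, denoise(noisy, T)), lo=0.0, hi=100.0) to pick the threshold.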
E. Noise Categories
Image noise is an undesirable by-product of image capture that adds spurious and extraneous
information. A noise-removal algorithm reduces or removes the visibility of noise by smoothing the
image while leaving areas near contrast boundaries intact.
i) Gaussian Noise
With Gaussian noise, each pixel in the noisy image is the sum of the true pixel value and a random
Gaussian-distributed noise value, which has the bell-shaped probability density function

p(g) = (1 / (σ√(2π))) e^(−(g − m)² / (2σ²))

where g represents the gray level, m is the mean or average of the function, and σ is the standard
deviation of the noise.
ii) Poisson Noise
Poisson (shot or photon) noise arises when the number of photons sensed by the detector is not
sufficient to provide reliable numerical information. Its root-mean-square value is proportional to the
square root of the image intensity, and its probability mass function is

p(x) = e^(−λ) λ^x / x!   for λ > 0 and x = 0, 1, 2, …
IV. EXPERIMENTAL RESULTS
Experiments were conducted to denoise the Lena and Cameraman images shown in Fig. 3, with
Gaussian and Poisson noise considered. The contourlet transform is applied at the chosen
decomposition level, and hard thresholding is used.
Figure 3. Original images: Lena and Cameraman
Figure 4. Lena image: contourlet with ABC using hard threshold
Figure 5. Cameraman image: contourlet with ABC using hard threshold
A. Performance Metrics
i) PSNR
PSNR is used to appraise restoration results; it measures how close the restored image is to the
original image:

PSNR = 20 log₁₀(MAX_I / √MSE)

where MAX_I is the maximum possible pixel value.
ii) MSE
The smaller the MSE, the closer the estimator is to the actual data: a small mean squared error means
the estimate reflects the data more precisely than a larger one.

MSE = (1 / mn) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [I(i, j) − K(i, j)]²
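The two formulas above translate directly into NumPy for 8-bit images:

    import numpy as np

    def mse(i, k):
        return np.mean((i.astype(float) - k.astype(float)) ** 2)

    def psnr(i, k, max_i=255.0):
        return 20.0 * np.log10(max_i / np.sqrt(mse(i, k)))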
B. Performance Evaluation
The performance of contourlet denoising with ABC was evaluated using MSE and PSNR. The
contourlet-with-ABC hard-thresholding experiments were run, and the denoised image results are
shown in Table I and Table II. Considering all the metrics, the method performs well for the two test
images, Lena and Cameraman.
Table I: Normal image, contourlet with hard threshold, Gaussian noise
Table II: Normal image, contourlet with hard threshold, Poisson noise
CONCLUSION
This paper presented a normal-image denoising method based on the contourlet transform with hard
thresholding to remove noise from images corrupted by Gaussian and Poisson noise. The contourlet
decomposition with ABC optimization is well suited to this task: the results demonstrate that
contourlet denoising with ABC optimization offers better results than hard thresholding alone and
improves image quality.
References
[1]. P. Jannath Firthouse, S. Shajun Nisha, and M. Mohammed Sathik, "Noise reduction in MRI images using contourlet transform and threshold shrinkage techniques," International Journal of Computer Science and Information Technology (IJCSIT), 2016.
[2]. K. Sakthivel, "Contourlet based image denoising using new threshold function," International Journal of Innovative Research in Computer and Communication Engineering, March 2014.
[3]. Abbas H. Hassin AlAsadi, "Contourlet transform based method for medical image denoising," International Journal of Image Processing (IJIP), vol. 9, no. 1, 2015.
[4]. Syed Ali Fathima KMN and S. Shajun Nisha, "Shearlet transform based denoising on normal images using hard threshold," International Journal of Innovative Research in Computer and Communication Engineering (IJIRCCE), Jan. 2017.
[5]. Asaju La'aro Bolaji, Ahamad Tajudin Khader, Mohammed Azmi Al-Betar, and Mohammed A. Awadallah, "Artificial bee colony algorithm, its variants and applications: a survey," Journal of Theoretical and Applied Information Technology, vol. 47, no. 2, 20 Jan. 2013.
[6]. Junping Wang and Dapeng Zhang, "Image denoising based on artificial bee colony and BP neural network," Information Engineering Department, Henan Vocational and Technical Institute, ISSN 1693-6930, Apr. 2015.
[7]. Miao Ma, Jianhui Liang, Min Guo, Yi Fan, and Yilong Yin, "Segmentation based on artificial bee colony algorithm," Applied Soft Computing, www.elsevier.com/locate/asoc.
[8]. Sandeep Kumar, Vivek Kumar Sharma, and Rajani Kumari, "Improved onlooker bee phase in artificial bee colony algorithm," International Journal of Computer Applications (0975-8887), vol. 90, no. 6, Mar. 2014.
[9]. T. S. Anju and N. R. Nelwin Raj, "Shearlet transform based satellite image denoising using optimized Otsu threshold," International Journal of Latest Trends in Engineering and Technology (IJLTET), ISSN 2278-621X.
[10]. Minh N. Do and Martin Vetterli, "The contourlet transform: an efficient directional multiresolution image representation."
[11]. Yoonsuk Choi, Ershad Sharifahmadian, and Shahram Latifi, "Remote sensing image fusion using contourlet transform with sharp frequency localization," International Journal of Information Technology, Modeling and Computing (IJITMC), vol. 2, no. 1, Feb. 2014.
[12]. Dervis Karaboga and Bahriye Basturk, "Artificial bee colony (ABC) optimization algorithm for solving constrained optimization problems."
[13]. Jyotsna Patil and Sunita Jadhav, "A comparative study of image denoising techniques," International Journal of Innovative Research in Science, Engineering and Technology, vol. 2, no. 3, Mar. 2013.
[14]. Gang Liu, Jing Liu, Quan Wang, and Wenjuan He, "The translation invariant wavelet-based contourlet transform for image denoising," Journal of Multimedia, vol. 7, no. 3, Jun. 2012.
[15]. Mingwei Chui, Youqian Feng, Wei Wang, Zhengchao Li, and Xiaodong Xu, "Image denoising method with adaptive Bayes threshold in nonsubsampled contourlet domain," ScienceDirect, Elsevier, 2012.
[16]. C. J. Li, H. X. Yang, Y. Y. Cai, and B. Song, "Image denoising algorithm based on non-subsampled contourlet transform and bilateral filtering," International Conference on Computer Information Systems and Industrial Applications (CISIA), 2015.
[17]. S. Malini, V. Divya, and R. S. Moni, "Improved methods of image denoising using non-subsampled contourlet transform," Advances in Computational Sciences and Technology, ISSN 0973-6107, vol. 10, no. 4, pp. 549-560, 2017.
[18]. Gangyi Jiang, Mei Yu, Wenjuan Yi, Fucui Li, and Yong-Deak Kim, "A new denoising method with contourlet transform," LNCIS, vol. 345.
[19]. P. S. Hiremath, Prema T. Akkasaligar, and Sharan Badiger, "Speckle reducing contourlet transform for medical ultrasound images."
[20]. Manas Saha, Mrinal Kanti Naskar, and B. N. Chatterji, "Soft, hard and block thresholding techniques for denoising of mammogram images," IETE Journal of Research, vol. 61, no. 2, 2015.
International Journal of Scientific Research in Computer Science Applications and Management Studies (IJSRCSAMS), ISSN 2319-1953, Volume 7, Issue 4 (July 2018), www.ijsrcsams.com
Review on Fetal Head Detection Techniques in 2D Ultrasound Images
1W. Fathima Farsana, 2Dr. N. Kowsalya
1Research Scholar, Assistant Professor & Head, Department of Computer Application, Sadakathullah Appa College, Tirunelveli
2Assistant Professor, Department of Computer Science, Sri Vijay Vidyalaya College of Arts and Science, Nalampalli, Dharmapuri
Abstract— Accurate detection of the fetal head is one of the most important factors in assessing fetal
growth during pregnancy. Many techniques have been proposed for automatic measurement of the
fetal head; this paper reviews techniques proposed in past years for automatic detection of the fetal
head in 2D ultrasound images.
Keywords—fetal head circumference, fetal measurement, bi-
parietal diameter (BPD), abdominal circumference (AC), Region
of Interest (ROI), Ultrasonography (USG).
I. INTRODUCTION
Ultrasound (US) is widely used for pregnancy diagnosis because it is radiation-free, real-time, and
inexpensive. The general purpose of the ultrasound examination is to determine the location of the
fetus and the placenta, the number of fetuses, the gestational age (GA) and estimated day of delivery
(EDD), and to detect anomalies as a basis for further fetal medical management. Common fetal
ultrasound measurements include bi-parietal diameter (BPD), head circumference (HC), abdominal
circumference (AC), and femur length (FL). Manual measurement has three main drawbacks: 1) the
quality of the measurements is user-dependent, 2) the exam can take more than 30 minutes, and 3)
specialists can suffer from repetitive stress injury (RSI) due to these lengthy exams. Head
circumference (HC) is one of the most important biometrics measured in the second and third
trimesters and in assessing fetal growth during prenatal ultrasound examinations; however, its manual
measurement by doctors often requires substantial experience. The skull is often only partially visible
in US images, and these missing boundaries make it difficult to accurately extract and fit the whole
skull. Because of the complex fetal posture within the uterus, fetal head US images may also contain
other anatomical structures with intensities similar to those of the skull; most existing intensity-based
methods are therefore not really effective, failing on the missing boundaries of the skull. Several
techniques have been proposed for automatic detection of the fetal head circumference (HC); in this
paper we discuss some of the techniques that help detect the fetal head.
II.FETAL IMAGE PREPROCESSING
The main problems with fetal ultrasound images are heavy speckle noise, artificial borders, and very
poor quality, all of which make diagnosis more difficult. For fetal head detection on ultrasound
images, the skeleton of the fetal skull should be extracted as the bright object during pre-processing.
In paper [7], two filters are used: an anisotropic diffusion filter (ADF) to reduce the speckle noise and
a discrete wavelet transform (DWT) filter for decomposition and reconstruction.
Fig 1 Ultrasound Fetal Head Images
These filters address discontinuous edges and non-stationary signals. Comparing the results of the
two filters, ADF does not improve the image quality, while the DWT method gives high performance.
EVALUATION OF ULTRASOUND IMAGE SEQUENCE QUALITY IN TERMS OF FETAL HEAD DEFECTS
Metric    PSNR         MSE
DWT       82.4±0.1     0.0024±10^-4
ADF       80.4±0.1     0.0030±10^-4
In paper [4], the pre-processing is based on the gray-level features of the ultrasound images, from
which the basic skeletons of the fetal skull are extracted. A bilateral filter reduces the speckle noise
and a white top-hat transform increases the contrast, after which a k-means clustering algorithm is
applied. To reduce the noise impact in the k-means step, global thresholding converts the intensity
image into a binary image, from which the bright object is extracted from the background. A binary
morphological opening then removes small bright objects, and a morphological dilation smooths the
boundaries of the large bright objects. In paper [11], the USG image is converted into digital form;
the region of interest is then extracted using various cropping shapes (rectangle, circle, and ellipse) to
compute the
Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR). From the comparison they
concluded that ROI detection achieves the best results with ellipse-shaped cropping.
COMPARISON OF MSE & PSNR FOR ROI DETECTION ON THE HEAD
Number    ROI Shape      MSE           PSNR
1         Rectangular    1.5216e+003   16.3077
2         Circle         1.0407e+003   17.9573
3         Ellipse        1.1182e+003   17.6454
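The pre-processing chain from [4] described above (bilateral filter, white top-hat, global threshold, then morphological opening and dilation) can be sketched with OpenCV as follows; the kernel sizes are guesses, and Otsu's method stands in for the unspecified global threshold.

    import cv2

    def extract_bright_skull(us_image):
        # us_image: single-channel 8-bit ultrasound frame.
        smooth = cv2.bilateralFilter(us_image, 9, 75, 75)        # speckle reduction
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
        tophat = cv2.morphologyEx(smooth, cv2.MORPH_TOPHAT, kernel)  # contrast boost
        _, binary = cv2.threshold(tophat, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        small = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, small)  # drop small blobs
        return cv2.dilate(opened, small)                          # smooth boundaries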
In paper [17], two pre-processing operations are performed: first, the background is filled by
extrapolating the image inside the scanned area using iterative low-pass filtering in the discrete
cosine transform (DCT) domain; second, the contrast and intensity of the extrapolated image are
regularized. In paper [3], phase-based edge detection methods detect the head contour in the ROI;
these perform better in ultrasound images with strong intensity inhomogeneity caused by speckle,
attenuation, or signal dropout. In paper [2], a phase-symmetry method detects symmetric objects in
low-contrast, noisy US images based on frequency-domain analysis; this measure is applied to the
HC ROI images to extract the HC edge. In paper [6], a contourlet approach reduces speckle noise in
fetal US images; compared to wavelets it gives enhanced results and the highest PSNR value. In
paper [12], a Wiener filter reduces the speckle noise; it is a low-pass filter applied in the frequency
domain that smooths both the noise and the image signal. In paper [14], a high-pass filter makes the
image appear sharper, and thresholds are applied at multiple intensity levels. In paper [20], a
combination of a bilateral filter and the stationary wavelet transform is used; this method exploits a
neighborhood relationship based on the parent-child relationship.
III.FETAL HEAD DETECTION
A. Paper [1] presents a fully automatic approach based on texture maps and optimal ellipse detection
to segment and measure the fetal head. First, all images are standardized to 0-255 grey levels and
three texture maps are obtained. The image obtained from the infinity norm is used to segment the
bright objects corresponding to the fetal skull; morphological operators then extract lines containing
the most representative pixels of the fetal head; finally, an ellipse is found and used to initialize an
active contour. The initial contour is given by the parameterized ellipse, and the goal is to adjust the
snake to the outer region of the skull while preserving its elliptical shape. BPD and HC are then
obtained. The quality of the results was evaluated by two expert obstetricians. A set of 10 studies was
used to validate the segmentation process, obtaining a precision of 94.61%, a Dice similarity index of
97.19%, and a maximum distance of 2.64 mm. To evaluate the head measures, a set of 23 images was
used; the differences between the experts and the proposed method are 2.72 mm for BPD and 2.73
mm for HC, with a correlation of 99.8% in both cases. These results suggest that the proposed method
is suitable for clinical applications in maternal medicine.
B. Paper [2] develops a learning-based framework that considers prior knowledge of gestational age
and scanning depth, with gestational ages ranging from 18 to 33 weeks. The fetal ground truth was
determined by three experienced radiologists. A trained random forest (RF) classifier detects the HC
ROI in US images; it was trained with 1048 positive and 1996 negative samples, all normalized to
90x80 pixels, using Haar-like features extracted from the training samples and binary decision trees.
The trained classifiers detect the HC ROI with a sliding-window method, and the phase-symmetry
method then detects the HC edge in the ROI image: it finds symmetric objects in low-contrast, noisy
US images based on a frequency-domain analysis. Finally, the ElliFit method fits the HC ellipse in
the US image. The fitted ellipse corresponds to the skull center line (SCL), whereas the clinically
defined HC is the outer skull ellipse, so the average skull thickness is calculated and integrated with
the SCL to obtain the final HC ellipse. For this, a top-hat transformation enhances the contrast of the
HC ROI image, a morphological opening removes the small connected skull components and retains
the large ones, the average skull thickness is calculated, and the outer skull ellipse HC is finally
measured.
C. Paper [4] proposes a fast and robust framework to automatically detect and measure HC from US
images. An AdaBoost learning algorithm selects a set of Haar-like features from the training samples
and trains the classifier. At the detection stage, a sliding-window object detection algorithm identifies
and localizes the fetal head region in the US images; this approach increases both detection rate and
speed. The boundary of the head in the localized region is then detected using a local phase-based
method that is insensitive to speckle noise and intensity changes in US images. An improved iterative
randomized Hough transform (IRHT) finally detects the ellipse from the skeleton image, increasing
the detection rate to 92.1%, versus 87.67% for the traditional detector.
D. Paper [5] proposes a system for fast automatic segmentation of fetal anatomical structures. It uses
a well-trained classifier to perform the segmentation and reduce the probability of misclassification,
and a constrained probabilistic boosting tree (CPBT) algorithm to reduce the training and detection
time. Training involves two stages: a coarse stage robust to false negatives and a fine stage robust to
false positives. The detection algorithm also runs in two stages: the coarse detection samples the
search space at a given sampling interval, and the fine detection searches the hypotheses selected by
the coarse search at smaller intervals. The system automatically measures BPD and HC from US
images of the fetal head, AC from images of the fetal abdomen, and FL from images of the fetal
femur.
E. Paper [6] focuses on an analytic vision approach for fetal head detection using a discriminative set
of log-Gabor features. First, the US fetal image is passed through a contourlet pre-processing
algorithm to reduce the speckle noise, which converts the noisy image into a transform domain and
compares the transform coefficients to a threshold value. Log-Gabor features are then extracted from
the US image to determine the biometric head measurements, which are compared to expected results
using normal and abnormal data sets. This method automatically measures HC, BPD, and OFD.
F. Paper [12] uses a least-squares fitting (LSF) algorithm that minimizes the sum of squared algebraic
distances between the data points and the ellipse, subject to a predetermined constraint. Compared
with the Hough transform method, the proposed method proves to be faster and more accurate.
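In the spirit of this least-squares approach, the sketch below fits an ellipse to the segmented skull with OpenCV's fitEllipse (an algebraic least-squares fit to the contour points; the exact constraint of [12] is not reproduced here) and approximates HC with Ramanujan's circumference formula.

    import cv2
    import numpy as np

    def head_circumference(binary_skull):
        contours, _ = cv2.findContours(binary_skull, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
        points = max(contours, key=cv2.contourArea)    # largest bright object
        (cx, cy), (d1, d2), angle = cv2.fitEllipse(points)
        a, b = d1 / 2.0, d2 / 2.0                      # semi-axes in pixels
        # Ramanujan's approximation to the ellipse perimeter.
        hc = np.pi * (3 * (a + b) - np.sqrt((3 * a + b) * (a + 3 * b)))
        return (cx, cy), (a, b), angle, hc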
G. Paper [14] applies high-pass filtering to the original image and then thresholds the ultrasound
image at various intensity levels. Two methods are used to locate the head skull: the first uses
chamfer matching to find the elliptical structure, and the second uses the Hough transform to find the
best ellipse representing the head. From the segmented head object, the head circumference is
calculated using Eqs. (1) and (2). The final results show that chamfer matching performs better than
the Hough transform.
ACCURATE SEGMENTATION RATES
Head segmentation method    Accuracy (%)
Chamfer Matching            95.55
Hough Transform             91.11
IV.DISCUSSION
Papers [3, 4] use an improved iterative randomized Hough transform, which performs better than the
iterative Hough transform. Paper [2] compares its proposed method with four other methods; the
comparison shows that it outperforms the morphology + ElliFit method, thanks to the use of phase
symmetry to detect the center line of the skull and a fast ellipse-fitting method to accurately fit the
HC ellipse. Paper [14] compares Hough-transform-based approaches, with chamfer matching giving
the more accurate results. Paper [7] uses the Hough transform technique to achieve fetal head
structure detection with 97% segmentation accuracy. Paper [16] uses a circle Hough transform to
detect the bright circular fetal skull.
V.CONCLUSION AND FUTURE SCOPE
This paper discussed the various types of filters used for fetal image pre-processing and the various
techniques used for fetal head detection. Each technique has its own advantages and limitations.
Given the increasing applications of US images in the medical field, the features required for further
processing need to be enhanced.
REFERENCES
[1] J. L. Perez-Gonzalez, J. C. Bello Munoz, M. C. Rolon Porras, Fernando Arambula-Cosio, and V. Medina-Banuelos, "Automatic fetal head measurements from ultrasound images using optimal ellipse detection and texture maps."
[2] "Automatic fetal head circumference measurement in ultrasound using random forest and fast ellipse fitting."
[3] Dong Ni, Yong Yang, Shengli Li, Jing Qin, Shuyuan Ouyang, Tianfu Wang, and Pheng Ann Heng, "Learning based automatic head detection and measurement from fetal ultrasound images via prior knowledge and imaging parameters."
[4] Rong Xu, Bo Zhang, Yoshinobu Sato, and Masakatsu G. Fujie, "Automatic fetal head detection on ultrasound images by an improved iterative randomized Hough transform."
[5] Gustavo Carneiro, Bogdan Georgescu, Sara Good, and Dorin Comaniciu, "Detection of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree."
[6] Hanene Sahli, Ahmed Zaafouri, Amine Ben Slama, Radhouane Rachdi, and Mounir Sayadi, "Analytic approach for fetal head biometric measurements based on log Gabor features."
[7] Hanene Sahli, Amine Ben Slama, Ahmed Zaafouri, Mounir Sayadi, and Radhouane Rachdi, "Automated detection of current fetal head in ultrasound sequences."
[8] Lei Zhang, Nicholas J. Dudley, Tryphon Lambrou, Nigel Allinson, and Xujiong Ye, "Automatic image quality assessment and measurement of fetal head in two-dimensional ultrasound image."
[9] Gustavo Carneiro, Bogdan Georgescu, Sara Good, and Dorin Comaniciu, "Detection of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree."
[10] Matthew Sinclair, Christian F. Baumgartner, Jacqueline Matthew, Wenjia Bai, Juan Cerrolaza Martinez, Yuanwei Li, Sandra Smith, Caroline L. Knight, Bernhard Kainz, Jo Hajnal, Andrew P. King, and Daniel Rueckert, "Human-level performance on automatic head biometrics in fetal ultrasound using fully convolutional neural networks."
[11] M. Khairudin, Joko Laras B T, and Dessy Irmawati, "Region of interest detection for pregnancy image processing," Universitas Negeri Yogyakarta.
[12] Mariam Saii and Zaid Kraitem, "Determining the gestation age through the automated measurement of the bi-parietal distance in fetal ultrasound images."
[13] Vidhi Rawat, Alok Jain, and Vibhakar Shrimali, "Automated techniques for the interpretation of fetal abnormalities: a review."
[14] Mathews M, Deepa J, Tonu James, and Shari Thomas, "Segmentation of head from ultrasound fetal image using chamfer matching and Hough transform based approaches."
[15] K. Priya Dharshini, K. Srinivasan, and R. Ramya, "Speckle noise reduction and segmentation techniques in ultrasound for detection of fetal abnormality - a survey."
[16] Naiad Hossain Khan, Eva Tegnander, Johan Morten Dreier, Sturla Eik-Nes, Hans Torp, and Gabriel Kiss, "Automatic detection and measurement of fetal biparietal diameter and femur length - feasibility on a portable ultrasound device."
[17] Alessandro Foi, Matteo Maggioni, Antonietta Pepe, and Jussi Tohka, "Head contour extraction from fetal ultrasound images by difference of Gaussians revolved along elliptical paths."
[18] Pooja Rani and Rakesh Verma, "A review on ultrasound image segmentation techniques."
[19] Shibin Wu, Qingsong Zhu, and Yaoqin Xie, "Evaluation of various speckle reduction filters on medical ultrasound images."
[20] Moussa Olfa and Khlifa Nawres, "Ultrasound image denoising using a combination of bilateral filtering and stationary wavelet transform," Université de Tunis El Manar, École nationale d'ingénieurs de Tunis (ENIT), Laboratoire de Biophysique et de Technologies Médicales, Tunis, Tunisie.
SPECKLE NOISE REDUCTION AND PERFORMANCE EVALUATION OF VARIOUS FILTERS IN 2D ULTRASOUND FETAL IMAGES
1W. Fathima Farsana, 2Dr. N. Kowsalya
1Research Scholar, Research Department of Computer Science and Applications, Vivekanandha College of Arts and Sciences for Women, Elayampalayam
2Assistant Professor, Department of Computer Science, Sri Vijay Vidyalaya College of Arts and Science, Nalampalli, Dharmapuri
Mukt Shabd Journal, Volume IX, Issue XII, December 2020, ISSN 2347-3150
Abstract— Ultrasound is an imaging technique which is noninvasive, nonradioactive, and
inexpensive, and is used for diagnosis and treatment. The presence of noise in ultrasound fetal images
is a major issue, as it can lead to improper diagnosis, so it is important to preserve the features of the
fetal image while removing the noise. The main objective of image enhancement is to obtain a highly
detailed image, and traditional techniques are not very effective, especially for speckle noise
reduction. This paper describes various filters used for speckle noise reduction. Spatial filtering
techniques, namely the Gaussian filter, median filter, bilateral filter, Wiener filter, and speckle
reducing anisotropic diffusion (SRAD) filter, are used to denoise fetal ultrasound images, and their
performance is compared on the basis of Peak Signal to Noise Ratio (PSNR), Root Mean Square
Error (RMSE), and Structural Similarity Index (SSIM). The results obtained are presented in the form
of statistical tables and diagrams. Based on the statistical measures and the visual quality of the US
images, the SRAD filter performed best among the compared filter techniques.
Keywords: ultrasound images, speckle noise, PSNR, RMSE, SSIM.
I. INTRODUCTION
Medical ultrasound images are mostly corrupted by speckle noise during acquisition and
transmission, and the resulting poor image quality is a major drawback in the medical field. The main
challenge in image denoising is to remove such noise while preserving the important features of the
image. Many techniques have been proposed to reduce speckle noise using various types of filters
[1], and their performance is assessed with quality metrics such as Peak Signal to Noise Ratio
(PSNR), Mean Square Error (MSE), and Mean Absolute Error (MAE). This paper focuses on speckle
noise reduction in fetal ultrasound images using different filtering techniques.
Today, US imaging is one of the most common diagnostic tools, preferred over other imaging
modalities such as positron emission tomography (PET), magnetic resonance imaging (MRI), and
computed tomography (CT) due to its low cost and availability. Speckle noise [4] is a characteristic
effect in US images that degrades visual quality, as these low-resolution images are constructed from
reflected ultrasound waves. As shown in Figure 1, speckle noise is significant in ultrasound images
and can negatively affect post-processing steps such as image segmentation and image compression.
Figure 1. Comparison between Fetal Ultrasound Image with a Fetal Ultrasound Image Degraded by Speckle
Noise
II. RELATED RESEARCH
Various techniques for speckle noise reduction in ultrasound images are available in the literature.
Speckle noise produces bright and dark dots in an image, which can mislead the diagnostic process in
medical images. In paper [2], various ultrasound images containing speckle noise were tested:
different spatial filters were applied to the noisy images and the denoised images were observed, and
among all the filtering techniques the max filter gave the best results both subjectively and
objectively. Paper [3] focuses on hybrid filters, such as the modified median (N4 max) and modified
median (ND max), for speckle reduction; while reducing speckle noise, useful data should be
retained. In paper [4], a simple pre-processing step was added to a Gabor filter to establish a new
technique. The general approach to image denoising is to apply a function that processes each pixel
of the image; a basic approach is based on masks [10], also referred to as filters, kernels, templates,
or windows. Numerous studies of speckle noise reduction have thus been carried out worldwide, and
various methods have been developed to eliminate speckle noise while preserving the critical features
of ultrasound images. The main question is how to define a measure that can efficiently differentiate
image features from speckle noise, since both often appear at high frequencies. This paper focuses on
speckle reduction in ultrasound fetal images using various filtering techniques.
III. SPECKLE NOISE MODEL
Speckle noise is granular, multiplicative noise. It is present in active and synthetic aperture radar
(SAR) images and especially in medical ultrasound images, where it is caused by interference of the
returning wave. Speckle noise is a serious problem in ultrasound images and complicates image
interpretation, so an accurate and reliable model of speckle formation is a prerequisite for developing
a valuable despeckling algorithm. A possible generalized model of speckle imaging is

S(x, y) = f(x, y) η_m(x, y) + η_a(x, y)   (1)

where S(x, y) is the noisy image, η_m(x, y) and η_a(x, y) are multiplicative and additive noise
respectively, and f(x, y) is the noise-free image to be recovered. For any speckle, the contrast ratio ξ
is defined as

ξ = (standard deviation of I) / (mean of I)   (2)

where I is the intensity field.
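Eq. (1) can be simulated directly in NumPy, as below; modeling the multiplicative term as 1 plus zero-mean Gaussian noise and the chosen variances are illustrative assumptions, not part of the model itself.

    import numpy as np

    def add_speckle(f, mult_sigma=0.2, add_sigma=2.0, seed=0):
        rng = np.random.default_rng(seed)
        eta_m = 1.0 + mult_sigma * rng.standard_normal(f.shape)  # multiplicative
        eta_a = add_sigma * rng.standard_normal(f.shape)         # additive
        s = f.astype(float) * eta_m + eta_a                      # Eq. (1)
        return np.clip(s, 0, 255)

    def contrast_ratio(i):
        return i.std() / i.mean()                                # Eq. (2)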
IV. SPECKLE REDUCTION FILTERS
Speckle is a particular type of noise that degrades fine details and edge definition and limits contrast
resolution, making it difficult to detect small, low-contrast lesions in the body. The challenge is to
design methods that selectively reduce noise without altering edges or losing significant features.
Speckle filtering consists of moving a window over each pixel in the image, applying a mathematical
calculation, and substituting the result for the value of the central pixel under the window; the
window moves across the image one pixel at a time until it covers the entire image. Applying the
various filters achieves a smoothing effect and reduces speckle noise to a certain extent. The analysis
below covers several filtering methods applied to US images, each preserving the image and visual
quality of the input to a different degree; the outcomes are quantified with measures such as Peak
Signal to Noise Ratio (PSNR), Mean Square Error (MSE), and Mean Absolute Error (MAE).
A. Gaussian Filter
The Gaussian filter [8] is a local linear filter that smooths the noisy image. It suppresses noise while
preserving the features of the image, but blurs the edges. The Gaussian filter is given by

g(x) = (1 / √(2πσ²)) e^(−x² / (2σ²))   (3)

where σ is the standard deviation of the Gaussian distribution. Pixels in the centre of the kernel have
higher weights than those on the periphery. Larger values of σ produce more blur, and the kernel size
must be increased with increasing σ to maintain the nature of the Gaussian filter; the kernel
coefficients depend on the value of σ. In two dimensions, the filter is the product of two such
Gaussians, one in each dimension:

g(x, y) = (1 / (2πσ²)) e^(−(x² + y²) / (2σ²))   (4)
B. Median Filter
The median filter, a nonlinear filter, replaces the gray level of each pixel with the median of the gray values of the pixels in a specific neighbourhood around it [19]. The median filter can help reduce speckle noise as well as salt-and-pepper noise. Its noise-reducing effect depends on the spatial extent of the neighbourhood and the number of pixels involved in the median calculation:
f̂(x, y) = median{ g(s, t) : (s, t) ∈ S_xy }    (5)
where S_xy is the window centred at (x, y).
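A corresponding sketch for the median filter of Eq. (5); the 3×3 window size is an illustrative choice.

```python
import numpy as np
from scipy.ndimage import median_filter

noisy = np.random.rand(128, 128)              # stand-in for a noisy US frame

# Each output pixel becomes the median of its 3x3 neighbourhood, Eq. (5);
# a larger `size` despeckles more strongly but erodes fine detail.
despeckled = median_filter(noisy, size=3)
```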
C. Bilateral filter
The bilateral filter is a non-linear, edge-preserving, noise-reducing smoothing filter. It replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels [14], so it smoothes the image while preserving edges, operating in both the spatial domain and the intensity range of the image. It performs better on log-transformed data, since the filter is designed for additive noise and the logarithm converts multiplicative speckle into an additive component. The bilateral filter is defined as
BF[I]_p = (1/W_p) Σ_{q∈S} G_{σs}(‖p − q‖) G_{σr}(|I_p − I_q|) I_q    (6)
where W_p is a normalization factor, G_{σs} weights spatial distance, and G_{σr} weights intensity difference.
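A sketch of bilateral filtering with OpenCV; the diameter and the two sigmas (range σ_r and spatial σ_s of Eq. (6)) are illustrative values.

```python
import numpy as np
import cv2

noisy = (np.random.rand(128, 128) * 255).astype(np.uint8)  # stand-in US frame

# d: neighbourhood diameter; sigmaColor ~ sigma_r (intensity range);
# sigmaSpace ~ sigma_s (spatial domain) in Eq. (6).
edge_preserving = cv2.bilateralFilter(noisy, d=9, sigmaColor=75, sigmaSpace=75)
```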
D. Wiener Filter
The Wiener filter is a linear spatial-domain filter that adapts itself to the target image based on the local image variance [7]. It can restore images even if they are corrupted or blurred, reducing noise by estimating the desired noiseless image. Because it is based on the local image variance, it smooths less where the local variance is large and more where the local variance is small. The Wiener filtering method requires more computation time than the simpler filters. The adaptive Wiener filter is expressed by
f̂(x, y) = μ + ((σ² − ν²)/σ²) (g(x, y) − μ)    (7)
where μ and σ² are the local mean and variance around each pixel and ν² is the noise variance. The Wiener filter decreases the MSE and increases the PSNR, but it is unable to enhance the brightness of the image, which is considered a disadvantage.
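A sketch of the locally adaptive Wiener filter of Eq. (7) using SciPy, which estimates μ, σ² and the noise power from local windows; the 5×5 window is an assumption.

```python
import numpy as np
from scipy.signal import wiener

noisy = np.random.rand(128, 128)              # stand-in for a noisy US frame

# Smoothing strength follows the local variance within each 5x5 window:
# low variance -> strong smoothing; high variance (edges) -> little smoothing.
restored = wiener(noisy, mysize=(5, 5))
```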
E. Speckle Reducing Anisotropic Diffusion (SRAD) Filter
In diffusion filtering, the direction and strength of the diffusion are controlled by an edge-detection function, so speckle is removed while edges are enhanced. The filter modifies the image by solving a partial differential equation. SRAD was proposed for speckle-noise images without logarithmic compression. Whereas the Lee and Frost filters utilize the coefficient of variation in adaptive filtering, SRAD exploits the instantaneous coefficient of variation, which serves as an edge detector in speckled images: the function exhibits high values at edges and low values in homogeneous regions. It thus ensures mean-preserving behavior in homogeneous regions and edge preservation [11].
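A simplified SRAD sketch after Yu and Acton's formulation: the instantaneous coefficient of variation q acts as the edge detector, and the diffusion coefficient c(q) approaches 1 in homogeneous regions and 0 at edges. Keeping the speckle scale q0 fixed (the original lets it decay over time), the step size, iteration count, and q0 value are all simplifying assumptions.

```python
import numpy as np

def srad(img, n_iter=100, dt=0.05, q0=0.15):
    """Simplified speckle reducing anisotropic diffusion (sketch)."""
    I = img.astype(np.float64) + 1e-8                # keep intensities positive
    for _ in range(n_iter):
        # 4-neighbour differences (np.roll gives periodic borders - fine for a sketch)
        dN = np.roll(I, 1, 0) - I
        dS = np.roll(I, -1, 0) - I
        dW = np.roll(I, 1, 1) - I
        dE = np.roll(I, -1, 1) - I

        g2 = (dN**2 + dS**2 + dW**2 + dE**2) / I**2  # gradient-magnitude term
        lap = (dN + dS + dW + dE) / I                # normalized Laplacian

        # instantaneous coefficient of variation (the edge detector)
        q2 = np.maximum((0.5 * g2 - lap**2 / 16.0) / (1.0 + 0.25 * lap) ** 2, 0.0)

        # diffusion coefficient: ~1 in homogeneous regions, ~0 at edges
        c = 1.0 / (1.0 + (q2 - q0**2) / (q0**2 * (1.0 + q0**2)))
        c = np.clip(c, 0.0, 1.0)                     # numerical guard

        # divergence of c * grad(I), then one explicit time step
        div = np.roll(c, -1, 0) * dS + c * dN + np.roll(c, -1, 1) * dE + c * dW
        I += (dt / 4.0) * div
    return I
```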
V. PERFORMANCE ANALYSIS PARAMETERS
To analyze the speckle reduction methods, various performance parameters are used and compared: Peak Signal-to-Noise Ratio (PSNR), Root Mean Square Error (RMSE), and Structural Similarity Index (SSIM).
A. Peak Signal to Noise Ratio (PSNR)
PSNR is defined from the MSE between the original and the denoised image; the higher the PSNR, the better the quality of the compressed or reconstructed image. For 256 gray levels, PSNR is defined as [13]
PSNR = 10 log₁₀(255² / MSE)    (8)
B. Root Mean Square Error (RMSE)
RMSE is the square root of the averaged squared difference between the original and the denoised image intensity values over the whole image:
RMSE = √( (1/MN) Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} [f(x, y) − f′(x, y)]² )    (9)
C. Structural Similarity Index (SSIM)
This index is another metric for measuring the similarity between two images, and it is much more consistent with the qualitative appearance of the image. SSIM is defined as
SSIM = (1/M) Σ (2μ₁μ₂ + C₁)(2σ₁,₂ + C₂) / ((μ₁² + μ₂² + C₁)(σ₁² + σ₂² + C₂))    (10)
where μ₁ and μ₂ are the means and σ₁ and σ₂ the standard deviations of the images being compared, and σ₁,₂ is the covariance between them. SSIM lies between 0 and 1; when it equals 1 the images are structurally identical.
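The three metrics can be computed as below for 8-bit images; scikit-image supplies PSNR and SSIM directly, and RMSE follows Eq. (9).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(reference, denoised):
    """PSNR (Eq. 8), RMSE (Eq. 9) and SSIM (Eq. 10) for uint8 images."""
    diff = reference.astype(np.float64) - denoised.astype(np.float64)
    rmse = np.sqrt(np.mean(diff ** 2))
    psnr = peak_signal_noise_ratio(reference, denoised, data_range=255)
    ssim = structural_similarity(reference, denoised, data_range=255)
    return psnr, rmse, ssim
```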
VI. RESULTS AND DISCUSSION
Ultrasound fetal images contain high levels of speckle noise, which creates problems in diagnosis. In this paper, we have denoised an ultrasound fetal image using different filters and analyzed their performance in terms of PSNR, RMSE and SSIM. Since a higher PSNR indicates better filtering, calculating the PSNR values of the different filters shows which filter is best suited for noise removal from a medical ultrasound image. An original noise-free image was first selected; an image contaminated with speckle noise (noise factor 0.02) was then generated, and output images were obtained by applying the various filtering techniques. Table 1 presents the performance of the different speckle reduction filters in terms of PSNR, RMSE and SSIM.
Filters           PSNR      RMSE     SSIM
Gaussian          32.2097   3.7479   0.8549
Median            32.4412   3.4392   0.8360
Bilateral filter  32.7522   5.208    0.85
Wiener filter     32.3317   4.0035   0.8448
SRAD              36.807    3.2989   0.8889
Table 1. Comparison of performance of different filters
The table above shows the performance comparison of the different speckle reduction filters. The Speckle Reducing Anisotropic Diffusion (SRAD) filter has a markedly higher PSNR value and the lowest RMSE compared to the other despeckling filters: SRAD removes a substantial amount of noise while also preserving edges.
Figure 1. Filters' performance in terms of PSNR, RMSE and SSIM.
(a) Original image; (b) noisy fetal image; (c) Gaussian; (d) median; (e) bilateral; (f) Wiener; (g) Speckle Reducing Anisotropic Diffusion filter outputs.
CONCLUSIONS
This paper detailed speckle noise removal from fetal ultrasound images by various filters: the Gaussian filter, Gabor filter, median filter, bilateral filter, Wiener filter, and SRAD filter, and compared the visual quality of the denoised images. The images denoised with the SRAD filter had better quality than the others: the SRAD filter shows a higher peak signal-to-noise ratio, a structural similarity index closer to one, and a lower root mean square error (RMSE).
REFERENCES
[1] Milindkumar V. Sarode, Prashant R. Deshmukh, "Reduction of Speckle Noise and Image Enhancement of Images Using Filtering Technique," International Journal of Advancements in Technology, http://ijict.org/, ISSN 0976-4860, Vol. 2, No. 1, January 2011.
[2] P. Kalavathi, M. Abinaya, S. Boopathiraja, "Removal of Speckle Noise in Ultrasound Images using Spatial Filters," Computational Methods, Communication Techniques and Informatics, ISBN: 978-81-933316-1-3.
[3] Chetan Prakash Khalane, V. D. Shinde, "An Efficient Hybrid Spatial Filter for Removal of Speckle Noise in Ultrasound Image," IJIREEICE, Vol. 6, Issue 11, November 2018, ISSN (Online) 2321-2004, ISSN (Print) 2321-5526.
[4] Mehedi Hasan Talukder, Mitsuhara Ogiya, Masato Takanokura, "New Noise Reduction Technique for Medical Ultrasound Imaging using Gabor Filtering," International Journal of Image Processing (IJIP), Vol. 12, Issue 1, 2018.
[5] Ahlam A. Hussein, Ebtesam F. Kanger, Amel H. Abbas, "Speckle Noise Reduction from Ultrasound Fetus Images," Iraqi Journal of Science, Special Issue, Part A, pp. 225-231, 2016.
[6] J. S. Lee, "Refined filtering of image noise using local statistics," Computer Graphics and Image Processing, 15: 380-389, 1981.
[7] N. Attlas, S. Gupta, "Reduction of Speckle Noise in Ultrasound Images using Various Filtering Techniques and Discrete Wavelet Transform: Comparative Analysis," no. 6, pp. 112-117, 2014.
[8] R. Terebes, M. Borda, C. Germain, O. Lavialle, "A Novel Shock Filter for Image Restoration and Enhancement," IEEE Int. Conf. Signal Processing, pp. 255-259, 2012.
[9] O. Lopera, R. Heremans, A. Pizurica, Y. Dupont, "Filtering Speckle Noise in SAS Images to Improve Detection and Identification of Seafloor Targets," International Waterside Security Conference, 2010.
[10] Zaid Kraitem, Mariam Saii, Issa Ibraheem, "Speckle Noise Reduction in Fetal Ultrasound Images," International Journal of Biomedical Engineering and Clinical Science, 2015.
[11] Hyunho Choi, Jechang Jeong, "Speckle Noise Reduction Technique for SAR Images Using Statistical Characteristics of Speckle Noise and Discrete Wavelet Transform," Special Issue: Image Optimization in Remote Sensing, May 2019.
[12] R. Czerwinski, D. L. Jones, W. D. O'Brien, "Detection of Lines and Boundaries in Speckle Images: Application to Medical Ultrasound," IEEE Transactions on Medical Imaging, 18(2), 1999.
[13] G. Vasavi, S. Jyothi, "A Comparative Study of Speckle Reduction Filters for Ultrasound Images of Poly Cystic Ovary," International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Vol. 7, Issue 6S5, April 2019.
[14] Deepak Sharma, Isha Gulati, "Speckle Noise Reduction Using Bilateral Filter," International Journal of Advanced Technology in Engineering and Science, Vol. 2, Issue 10, October 2014.
[15] K. Bala Prakash, R. Venu Babu, B. Venu Gopal, "Image Independent Filter for Removal of Speckle Noise," International Journal of Computer Science Issues, 8(3), 2011.
[16] C. Li, "Two Adaptive Filters for Speckle Reduction in SAR Imagery by Using the Variance Ratio," International Journal of Remote Sensing, Vol. 9, No. 4, pp. 641-653, 1988.
[17] R. Eveline Pregitha, V. Jegathesan, C. Ebbie Selvakumar, "Speckle Noise Reduction in Ultrasound Fetal Images Using Edge Preserving Adaptive Shock Filters," International Journal of Scientific and Research Publications, Vol. 2, Issue 3, March 2012.
[18] Ricardo G. Dantas, Eduardo T. Costa, "Ultrasound Speckle Reduction Using Modified Gabor Filters," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, Vol. 54, No. 3, March 2007.
[19] Jyoti Jaybhay, Rajveer Shastri, "Speckle Noise Reduction Filters Analysis," International Journal of Advanced Computing and Electronics Technology, ISSN (Print): 2394-3408, (Online): 2394-3416, Vol. 2, Issue 4, 2015.
[20] R. Sivakumar, D. Nedumaran, "Comparative Study of Speckle Noise Reduction of Ultrasound B-scan Images in Matrix Laboratory Environment," International Journal of Computer Applications, 10(9), pp. 46-50, November 2010.
International Journal of Advanced Research in Engineering and Technology (IJARET), Volume 11, Issue 12, December 2020, pp. 484-491, Article ID: IJARET_11_12_052
Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=12
ISSN Print: 0976-6480; ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.12.2020.052
© IAEME Publication, Scopus Indexed
AUTOMATIC FETAL HEAD CIRCUMFERENCE MEASUREMENT IN 2D ULTRASOUND FETAL IMAGES USING HISTOGRAM OF ORIENTED GRADIENT
W. Fathima Farsana
Research Scholar, Research Department of Computer Science and Application, Vivekandha College of Arts and Sciences for Women, Elayampalayam, Tiruchengode, Periyar University, Salem, India.
Dr. N. Kowsalya
Assistant Professor, Department of Computer Science, Sri Vijay Vidyalaya College of Arts and Science, Nalampalli, Dharmapuri, India.
ABSTRACT
Ultrasound images are used to provide information about fetal development in the womb, but a two-dimensional ultrasound image alone cannot provide complete information. To identify the form of the fetus clearly in an ultrasound image, image analysis is needed that can detect the boundaries of the region of interest (ROI), so that one object can be differentiated from another in the image. In this paper, we explore the pyramid histogram of oriented gradients (PHOG), a local feature extraction technique that is simple, fast, and high-performing. PHOG describes the local shape of the image and its spatial relationships, and PHOG features have achieved good results in image recognition, retrieval, and related tasks.
Keywords: Region of Interest (ROI), Ultrasound (US), Histogram of Oriented Gradients (HOG).
Cite this Article: W. Fathima Farsana and N. Kowsalya, Automatic Fetal Head Circumference Measurement in 2D Ultrasound Fetal Images using Histogram of Oriented Gradient, International Journal of Advanced Research in Engineering and Technology, 11(12), 2020, pp. 484-491.
http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=12
1. INTRODUCTION
Ultrasound (US) is commonly used for pregnancy diagnosis due to its noninvasive nature,
low-cost and real-time acquisition. The main goal of the ultrasound examination is to
determine the location of the fetus and the placenta, the number of fetuses, the gestational age
(GA) and estimated day of delivery (EDD), and to detect anomalies as a basis for further fetal medical management. Head circumference (HC) is one of the basic biometric parameters used to assess fetal size. HC together with biparietal diameter (BPD), abdominal circumference (AC), and femur length (FL) are computed to produce an estimate of fetal weight. The HC is defined as the circumference of the outer skull ellipse (see Fig. 1). The skull is often only partially visible in US images, and these missing boundaries make it difficult to accurately extract and fit the whole skull. Because of the complex fetal posture within the uterus, fetal head US images may also contain other anatomical structures with intensities similar to those of the skull, which can make automatic methods fail. An automated system could assist inexperienced human observers in obtaining an accurate measurement. In this work, we focus on measuring the HC using the robust local feature extraction technique PHOG to represent the HC ROI, because this measurement can be used to determine the GA and monitor the growth of the fetus.
Figure 1. Ultrasound fetal head images
2. RELATED WORKS
In this literature review we survey papers that aim at finding the HC ROI. In [4], ultrasound data are converted to digital images; the region of interest (ROI) is then cropped using various shapes (rectangle, circle, and ellipse), and the mean square error (MSE) and peak signal-to-noise ratio (PSNR) are computed. The analysis concluded that ROI detection achieves the best results with an elliptical crop. In [5], an elliptical ROI is obtained automatically from the localized head region by defining outer and inner ellipses, after which phase-based edge detection is applied to detect the HC edge in the ROI image. In [2], a phase-symmetry method is used to detect the HC edge in the ROI image; phase symmetry can detect a symmetric object in low-contrast, noisy US images based on frequency-domain analysis. In [6], a fully automatic approach based on texture maps and optimal ellipse detection is proposed to segment and measure the fetal head. In [7], a system automatically detects fetal anatomies by directly exploiting a large database of expert-annotated fetal anatomical structures in US images; however, 20% of the automatic measurements in [7] showed relatively large errors because of a great difference in appearance between the testing and training images.
3. METHODOLOGY
In this work first we find the HC ROI using a random forest classifier and prior knowledge.
The HC edge extraction using Histogram of Oriented Gradients, section 3 describes HC
ellipse fitting using a geometric ellipse fitting method and finally HC measurement.
Figure 2. The proposed framework. HOG = histogram of oriented gradients; ROI = region of interest; PS = phase symmetry; ElliFit = geometric ellipse fitting method; HC = head circumference.
3.1 HC ROI Detection
We extracted PHOG features from the training samples to represent the appearance of the anatomical structures. The method is based on analyzing local histograms of image gradient orientations. The image is divided into small regions called HOG cells of size [s×s] pixels, with the applied HOG scales s = [128, 64, 32, 16, 8, 4]. Over the pixels of each cell, a local histogram of gradient directions (edge orientations) is calculated, where the number of analyzed directions is set by the user of the system or selected experimentally. The HOG feature extraction process consists of five main steps:
Step 1: Gamma/color normalization. This preconditioning step reduces the effect of illumination changes on the image.
Step 2: Gradient computation. The common method uses one-dimensional templates: the original image is convolved with the [-1, 0, 1] gradient operator to obtain the gradient component Gx in the x direction, and with its transpose to obtain the gradient component Gy in the y direction. The gradient magnitude and direction are then
G(x, y) = √(Gx(x, y)² + Gy(x, y)²)
α(x, y) = arctan(Gy(x, y) / Gx(x, y))
where Gx(x, y) and Gy(x, y) are the horizontal and vertical gradients of the pixel gray values at (x, y).
Step 3: Orientation binning. The image is divided into cells of, for example, 6 × 6 pixels. Assuming a 9-bin histogram per cell, the 360 degrees of gradient direction are divided into nine orientation ranges; the gradient histogram is then accumulated over the corresponding range, with each pixel's vote weighted by its gradient magnitude.
Step 4: Block construction and gradient vector normalization. Adjacent cells are grouped into larger regions called blocks; to avoid missing information, blocks may overlap, and their shape may be rectangular or circular. Dalal et al. tested different block and cell settings and found 6 × 6 pixel cells and 3 × 3 cell blocks suitable for human detection; with a 9-bin histogram, each block contributes 3 × 3 × 9 features.
Step 5: Vectorization. The block feature vectors extracted above are concatenated to form the feature vector of the whole image, i.e., the gradient histogram of the whole image, known as the Histogram of Oriented Gradients (HOG).
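The five steps correspond to what library HOG implementations perform internally; a minimal sketch with scikit-image, using the 6×6-pixel cells, 3×3-cell blocks and 9 orientation bins quoted above (the patch itself is a placeholder):

```python
import numpy as np
from skimage.feature import hog

patch = np.random.rand(128, 128)   # placeholder for a candidate ROI patch

# 9 orientation bins per 6x6 cell, normalized over 3x3-cell blocks;
# the result concatenates all block histograms into one feature vector.
features = hog(patch, orientations=9, pixels_per_cell=(6, 6),
               cells_per_block=(3, 3), block_norm='L2-Hys')
```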
3.2 ROI Classification
3.2.1 RF Classifier
A classifier was trained using the random forest (RF) technique. An RF classifier is a set of multiple random decision trees; here we trained one to detect the HC ROI (the skull boundary) in ultrasound fetal images. Each node in a tree holds a set of training examples and a predictor. Splitting starts at the root node and continues at every node, based on the feature representation, allocating the partitions to the left and right child nodes; each tree grows until a specified depth is reached. During the bagging process, a random subset of features is used at each attribute split. By generating a large number of trees, the RF votes for the most popular class. Training requires a set of fetal head images: a set of features based on prior knowledge of the fetal skull is constructed for training the RF. As each tree grows from root to leaf, the training samples are divided into different nodes and finally split into different classes. Training stops when a maximal tree depth is reached or the number of training samples at a node falls below a minimum.
Figure 3. RF Classifier
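A minimal sketch of the RF stage with scikit-learn; the descriptors, labels, tree count, and depth are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(300, 324)            # placeholder PHOG descriptors per patch
y = np.random.randint(0, 2, 300)        # 1 = head ROI, 0 = background (placeholder)

# Bagged trees with random feature subsets at each split, grown to max_depth;
# the forest votes on the most popular class, as described above.
rf = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)
rf.fit(X, y)
roi_probability = rf.predict_proba(X[:1])[0, 1]
```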
3.2.2 SVM Classifier
We use support vector machines (SVM) for classification. SVM is a classification technique for two-group categorization problems proposed by Cortes and Vapnik. The SVM model separates the positive class (+1) from the negative class (−1) by an optimal hyperplane, found by maximizing the separation (margin) between the two classes. Given M training data points {(p₁, q₁), (p₂, q₂), …, (p_M, q_M)}, where each pᵢ is a real feature vector and qᵢ ∈ {+1, −1}, the hyperplane is described by (w, b), where w is the weight vector and b is the bias. When the data are not linearly separable, the data points are mapped into a higher-dimensional space in which they become linearly separable.
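A corresponding SVM sketch; an RBF kernel stands in for the mapping to a higher-dimensional space mentioned above, and the data are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

X = np.random.rand(200, 324)            # placeholder descriptors p_i
y = np.random.choice([-1, 1], 200)      # labels q_i in {+1, -1}

# The RBF kernel implicitly maps the p_i to a higher-dimensional space
# where the optimal separating hyperplane (w, b) is sought.
clf = SVC(kernel='rbf', C=1.0, gamma='scale')
clf.fit(X, y)
predictions = clf.predict(X[:5])
```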
4. PHASE SYMMETRY FOR HC EDGE EXTRACTION
Once the classifier was trained, we employed phase symmetry (PS) for HC edge extraction. Phase-based edge detection methods perform better in ultrasound images with great intensity inhomogeneity caused by speckle, attenuation, or signal dropout. In this paper we employed phase symmetry to detect a symmetric object from low-contrast and noisy US images based on frequency-domain analysis. The 2D phase symmetry measure in image I is defined as:
PS(x, y) = Σₘ Σᵣ ⌊ |e_rm(x, y)| − |o_rm(x, y)| − τᵣ ⌋ / ( Σₘ Σᵣ √(e_rm(x, y)² + o_rm(x, y)²) + ε )
where ⌊·⌋ = max(·, 0); e_rm(x, y) = I(x, y) ∗ real(F⁻¹(LG)) and o_rm(x, y) = I(x, y) ∗ imag(F⁻¹(LG)) are the responses of the even and odd filters, real(F⁻¹(LG)) and imag(F⁻¹(LG)) denote the even and odd 2D Log-Gabor filters LG at scale m and orientation r, and F⁻¹ denotes the inverse Fourier transform operation. ε is a small constant included to prevent division by zero, and τᵣ is the orientation-dependent noise threshold, calculated according to [9]. We applied the phase symmetry measure to the HC ROI images to extract the HC edge (Fig. 7 (b)), and further used non-maxima suppression to thin the extracted HC edge and obtain the binary skull boundary.
5. HC ESTIMATION
The fitted HC ellipse is the skull center line (SCL), whereas the clinically defined HC is the outer skull ellipse. Therefore, we needed to calculate the average thickness of the skull and integrate it with the SCL to obtain the final HC ellipse. We first applied a top-hat transformation to enhance the contrast of the HC ROI image. Next, we employed a morphological open operation to remove the small connected skull components while retaining the large ones (area Sᵢ > 80 pixels) for use in the skull thickness estimation. Then, the average skull thickness t̄ was calculated as
t̄ = (1/M) Σᵢ Sᵢ / Lᵢ
where M is the number of retained components and Lᵢ represents the length of each component. Finally, the outer skull ellipse (HC) was measured from the SCL ellipse enlarged by half the average skull thickness.
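A sketch of the final fitting and measurement step. OpenCV's least-squares ellipse fit stands in for ElliFit, and the Ramanujan perimeter approximation stands in for the paper's HC formula (not reproduced above), so both are assumptions.

```python
import numpy as np
import cv2

def head_circumference(scl_edge, avg_thickness):
    """Fit an ellipse to the skull centre-line (SCL) edge map and return HC.

    scl_edge      : binary image of the thinned skull centre line
    avg_thickness : average skull thickness t-bar estimated as above
    """
    ys, xs = np.nonzero(scl_edge)
    pts = np.column_stack([xs, ys]).astype(np.float32)   # needs >= 5 points
    (cx, cy), (w, h), angle = cv2.fitEllipse(pts)
    a = w / 2.0 + avg_thickness / 2.0    # enlarge SCL semi-axes by t-bar/2
    b = h / 2.0 + avg_thickness / 2.0    # to reach the outer skull ellipse
    # Ramanujan's ellipse-perimeter approximation (an assumption here)
    return np.pi * (3 * (a + b) - np.sqrt((3 * a + b) * (a + 3 * b)))
```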
6. EXPERIMENTAL RESULTS AND ANALYSIS
Figure 4 visualizes the HC results obtained by four different methods; as shown there, our method achieved the HC results closest to the ground truth. Table 1 lists the region-based and distance-based metrics for each method. Our method performed best in terms of precision, Dice, accuracy, specificity, and sensitivity; specifically, it outperformed the HAAR-SVM, HAAR-RF, and HOG-SVM methods on all evaluation metrics, and overall it had better performance than the method of [2] on both region-based and distance-based metrics. Our method also outperformed the other three methods in terms of mean value d̄, standard deviation s_d, and root mean squared error (RMSE). Compared to the methods of [2] and [6], our method integrated the prior knowledge into the RF classifier to efficiently assist the
localization of the HC ROI. The localized ROI contains the fetal head while discarding irrelevant anatomical structures, thus facilitating the HC measurement by PS and ElliFit. Our method performed better than the HAAR-SVM, HAAR-RF, and HOG-SVM methods due to the use of PHOG, which extracts the shape information of the image and also describes its spatial information. This shows that the PS method was able to preserve a clear head outline more effectively and to remove more noise than the morphological method. Additionally, the HOG-RF method provided more accurate results than HAAR-RF.
Table 1. Performance metrics for various techniques
Method     Accuracy   Sensitivity  Specificity  Precision  Recall
HOG-RF     98.70126   84.10847     99.84171     84.09353   84.6799
HOG-SVM    98.57343   83.46723     99.9805      83.22246   83.46723
HAAR-RF    98.63537   83.46644     97.38029     83.07836   83.46644
HAAR-SVM   98.33203   82.99694     94.97191     82.6998    82.99694
Figure 4. Comparison graph of HC results for the four different methods
6.1 Dataset Description
We first acquired 300 fetal head US images for training the RF classifier and, subsequently, 200 images for testing and evaluation. The image size was 768 × 576 pixels, and the fetal gestational age ranged from 18 to 33 weeks. All images were acquired from various scan centers in Tirunelveli, and the fetal head ground truth was determined by a radiologist.
Sample images
7. CONCLUSION AND FUTURE SCOPE
This paper describes an automatic solution for measuring head circumference from ultrasound fetal images. We presented a learning-based method for fetal head detection and measurement in which prior knowledge of gestational age and scanning depth assists the localization of the ROI by RF classifiers. We employed phase symmetry to detect the center line of the skull and a fast ellipse fitting method to accurately and efficiently fit the HC ellipse for measurement. The experimental results on 145 fetal head images indicate that our proposed method outperforms traditional methods in the accuracy, efficiency, and speed of the HC measurement. The developed method is robust and has potential for use in clinical practice. Our future goal is to extend the framework to the detection of other anatomical structures, especially those in fetal ultrasound images.
REFERENCES
[1] A. Bosch, A. Zisserman, X. Munoz, "Representing Shape with a Spatial Pyramid Kernel," Proc. of the 6th ACM International Conference on Image and Video Retrieval, Amsterdam, Netherlands, 2007, pp. 401-408.
[2] "Automatic Fetal Head Circumference Measurement in Ultrasound using Random Forest and Fast Ellipse Fitting."
[3] A. Bosch, A. Zisserman, X. Munoz, "Scene Classification Using a Hybrid Generative/Discriminative Approach," IEEE Trans. on Pattern Analysis and Machine Intelligence, 30(4): 712-727, 2008.
[4] M. Khairudin, Joko Laras B T, Dessy Irmawati, "Region of Interest Detection for Pregnancy Image Processing," Universitas Negeri Yogyakarta.
[5] Dong Ni, Yong Yang, Shengli Li, Jing Qin, Shuyuan Ouyang, Tianfu Wang, Pheng Ann Heng, "Learning based automatic head detection and measurement from fetal ultrasound images via prior knowledge and imaging parameters."
[6] J. L. Perez-Gonzalez, J. C. Bello Munoz, M. C. Rolon Porras, Fernando Arambula-Cosio, V. Medina-Banuelos, "Automatic fetal head measurements from ultrasound images using optimal ellipse detection and texture maps."
[7] G. Carneiro, B. Georgescu, S. Good, D. Comaniciu, "Automatic fetal measurements in ultrasound using constrained probabilistic boosting tree," International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2007, pp. 571-579.
[8] Rong Xu, Bo Zhang, Yoshinobu Sato, Masakatsu G. Fujie, "Automatic Fetal Head Detection on Ultrasound Images by an Improved Iterative Randomized Hough Transform."
[9] Gustavo Carneiro, Bogdan Georgescu, Sara Good, Dorin Comaniciu, "Detection of Fetal Anatomies from Ultrasound Images using a Constrained Probabilistic Boosting Tree."
[10] Hanene Sahli, Ahmed Zaafouri, Amine Ben Slama, Radhouane Rachdi, Mounir Sayadi, "Analytic Approach for Fetal Head Biometric Measurements Based on Log Gabor Features."
[11] Hanene Sahli, Amine Ben Slama, Ahmed Zaafouri, Mounir Sayadi, Rathwen Rachdi, "Automated Detection of Current Fetal Head in Ultrasound Sequences."
[12] Lei Zhang, Nicholas J. Dudley, Tryphon Lambrou, Nigel Allinson, Xujiong Ye, "Automatic image quality assessment and measurement of fetal head in two-dimensional ultrasound image."
[13] Gustavo Carneiro, Bogdan Georgescu, Sara Good, Dorin Comaniciu, "Detection of fetal anatomies from ultrasound images using a constrained probabilistic boosting tree."
[14] Matthew Sinclair, Christian F. Baumgartner, Jacqueline Matthew, Wenjia Bai, Juan Cerrolaza Martinez, Yuanwei Li, Sandra Smith, Caroline L. Knight, Bernhard Kainz, Jo Hajnal, Andrew P. King, Daniel Rueckert, "Human-level Performance on Automatic Head Biometrics in Fetal Ultrasound Using Fully Convolutional Neural Networks."
[15] Mariam Saii, Zaid Kraitem, "Determining the Gestation Age through the automated measurement of the bi-parietal distance in fetal ultrasound images."
[16] Vidhi Rawat, Alok Jain, Vibhakar Shrimali, "Automated Techniques for the Interpretation of Fetal Abnormalities: A Review."
[17] M. Mathews, Deepa J, Tonu James, Shari Thomas, "Segmentation of head from ultrasound fetal image using chamfer matching and Hough transform based approaches."
[18] K. Priya Dharshini, K. Srinivasan, R. Ramya, "Speckle Noise Reduction and Segmentation Techniques in Ultrasound for Detection of Fetal Abnormality - A Survey."
[19] Naiad Hossain Khan, Eva Tegnander, Johan Morten Dreier, Sturla Eik-Nes, Hans Torp, Gabriel Kiss, "Automatic Detection and Measurement of Fetal Biparietal Diameter and Femur Length - Feasibility on a Portable Ultrasound Device."
[20] Alessandro Foi, Matteo Maggioni, Antonietta Pepe, Jussi Tohka, "Head Contour Extraction from Fetal Ultrasound Images by Difference of Gaussians Revolved along Elliptical Paths."
[21] Pooja Rani, Rakesh Verma, "A review on ultrasound image segmentation techniques."
[22] Shibin Wu, Qingsong Zhu, Yaoqin Xie, "Evaluation of various speckle reduction filters on medical ultrasound images."
[23] Moussa Olfa, Khlifa Nawres, "Ultrasound Image Denoising using a Combination of Bilateral Filtering and Stationary Wavelet Transform," Université de Tunis El Manar, École nationale d'ingénieurs de Tunis (ENIT), Laboratoire de Biophysique et de Technologies Médicales, 1006 Tunis.
Special Issue Published in International Journal of Trend in Research and Development (IJTRD), ISSN: 2394-9333, www.ijtrd.com
International Conference on Intelligent Computing Techniques (ICICT-17) organized by Department of Computer Applications, Sarah Tucker College (Autonomous), Tirunelveli, TamilNadu on 1st September 2017
Contourlet Transform Based Denoising on Normal Images Using ABC Optimization with Hard Thresholding
1Syed Ali Fathima KMN, 2W. Fathima Farsana and 3R. Fathima Syreen
1Research Dept. of CS, Sadakathullah Appa College, Tirunelveli, India
2Asst. Professor & Head, Dept. of BCA, Sadakathullah Appa College, Tirunelveli, India
3Asst. Professor, Dept. of BCA, Sadakathullah Appa College, Tirunelveli, India
Abstract— Digital images are generally corrupted by noise, i.e., unwanted information added to the original image. Noise removal is necessary to minimize damage to the image and improve image details. The contourlet transform employs a directional filter bank to capture line discontinuities and provides decompositions in various directions. The ABC algorithm is a bio-inspired algorithm derived from the intelligent food-search behavior of the honey bee: as an optimization tool, ABC provides a population-based search procedure in which individuals, called food positions, are modified by the artificial bees over time, the bees' aim being to discover the places of food sources with high nectar amounts and, finally, the one with the highest nectar. Hard thresholding sets to zero all coefficients whose magnitudes are less than the threshold while keeping the remaining coefficients unchanged. Contourlet-based image denoising with ABC-optimized hard thresholding improves image quality.
Keywords— Contourlet Transform; Laplacian Filter Bank; ABC Optimization; Hard Thresholding
I. INTRODUCTION
Image denoising has remained a fundamental problem in the field of image processing. It involves manipulating the image data to produce a visually high-quality image: the restored image should contain less noise than the observation while still keeping sharp transitions (i.e., edges). The aim of image denoising is to eliminate noise while conserving the details of the image, removing unnecessary noise while preserving the main characteristics of the information and enhancing image clarity. Digital images play a notable role in daily-life applications; standard natural test images such as Lena and Cameraman are used here. Corruption may come in many forms, such as motion blur, noise, and camera mis-focus.
Noise can be introduced by different intrinsic and extrinsic conditions, and in practical situations it is not possible to avoid it entirely. Noise may be additive or multiplicative; additive noise commonly affects natural images. Removing the noise from the image increases the overall quality of the processed image.
The contourlet transform is well suited to preserving edges. Contourlets form a multiresolution directional tight frame designed to efficiently approximate images made of smooth regions separated by smooth boundaries. The transform has a fast implementation based on a Laplacian pyramid decomposition followed by directional filter banks applied to each bandpass subband, and it has been shown to be more effective in recovering smooth contours, both visually and in terms of PSNR. The contourlet transform is applied to the noisy image to produce decomposed image coefficients. Structurally, it is a double filter bank: first the Laplacian pyramid (LP) captures point discontinuities, then a directional filter bank (DFB) links the point discontinuities into linear structures.
The hard-thresholding function keeps all contourlet coefficients that are greater than the given threshold λ and sets the others to zero; the threshold λ is chosen according to the signal energy.
ABC, as an optimization tool, provides a population-based search procedure in which individuals called food positions are modified by the artificial bees over time; the bees' aim is to discover the places of food sources with high nectar amounts and, finally, the one with the highest nectar. The artificial bee colony (ABC) is used to improve the accuracy of the denoised image.
II. DENOISING PROCEDURE
The procedure to denoise an image is as follows:
Denoised image = W⁻¹[T{W(Original Image + Noise)}]
Step 1: Apply the forward contourlet transform to the noisy image to obtain the decomposed image.
Step 2: Apply hard thresholding to the decomposed coefficients to remove noise.
Step 3: Apply the inverse contourlet transform to the thresholded coefficients to obtain the denoised image.
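The three steps form a generic transform-threshold-reconstruct pipeline. No contourlet implementation ships with the common scientific Python stack, so the sketch below uses PyWavelets as a structural stand-in for the contourlet transform; it illustrates the pipeline, not the paper's exact transform.

```python
import numpy as np
import pywt

def denoise_hard(noisy, wavelet='db4', level=3, thr=0.1):
    """Step 1: forward transform; Step 2: hard threshold; Step 3: inverse."""
    coeffs = pywt.wavedec2(noisy, wavelet, level=level)
    thresholded = [coeffs[0]] + [
        tuple(pywt.threshold(c, thr, mode='hard') for c in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(thresholded, wavelet)
```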
The contourlet transform offers a high degree of directionality and can represent curves and lines without discontinuity; the aim of image denoising is to eliminate the noise while conserving the details of the image, and the contourlet transform preserves edges and contours. The contourlet transform was proposed as a directional multiresolution image representation that can efficiently capture and represent singularities along smooth object boundaries in natural images. After the hard-thresholding function is applied, the inverse transform reconstructs the image, and artificial bee colony (ABC) optimization is used to improve the accuracy of the denoised image.
A. Related Work
Image denoising is a well-known model for noise removal and edge preservation [3]. The main aim of denoising is to remove unwanted noise or signals without losing any information [7]. Image denoising is a central pre-processing step in image processing, eliminating noise in order to strengthen and recover small details that may be hidden in the data [9].
Noise can be categorized depending on its source. The contourlet transform introduces basis functions that are local, directional, and admit a multiresolution expansion [13]. This representation has two basic building blocks, the Laplacian pyramid (LP) and the Directional Filter Bank (DFB): the Laplacian pyramid captures the point discontinuities, and the directional filter bank then connects them into linear structures [9].
A hard-thresholding function does not change the remaining coefficients, whereas soft thresholding reduces their values by the absolute threshold value [10]. Small coefficients are dominated by noise, while large coefficients contain more signal information than noise, so thresholding functions [15] set the coefficients below the threshold to zero.
The Artificial Bee Colony is one of the most recently introduced optimization algorithms [12]. The ABC algorithm is based on the foraging behavior of honey bees. In this population-based algorithm [8], the position of a food source represents a possible solution to the optimization problem and the nectar amount of a food source corresponds to the quality (fitness) of the associated solution; the number of employed bees is equal to the number of solutions in the population [15].
B. Motivation and Justification
Image denoising is a well-known model for noise removal and edge preservation. The contourlet transform is one such transformation: it is built from a Laplacian pyramid and a directional filter bank, where the Laplacian pyramid perceives the point discontinuities and the filter bank ties the discontinuities into line structures. Edges are clearly discerned in the contourlet transform, which speeds up processing and recovers edges and shapes; the contourlet gives a high degree of directionality and can easily represent curves and lines without discontinuity.
We propose a new denoising approach in the contourlet transform domain in which the threshold is optimized using the ABC algorithm. The contourlet has high directional sensitivity and is optimally sparse in representing images containing edges.
C. Organization of the Paper
The rest of the paper is organized as follows. The denoising procedure for the proposed contourlet-based work was outlined in Section II. The methodology is presented in Section III, experimental results and performance evaluations are discussed in Section IV, and finally the conclusion is given in Section V.
III. METHODOLOGY
A. Outline of the Proposed Work
The input image is corrupted with speckle and salt-and-pepper noise, and the contourlet transform is applied to decompose the image. Hard thresholding with ABC optimization is then applied in the transform domain, and the noise is removed to obtain the noise-free image.
Figure 1. Block diagram for contourlet-based ABC with hard threshold
B. Contourlet Transform
The contourlet transform is employed with the directional filter bank to capture line discontinuities, providing decompositions in various directions and identifying anisotropic components. The contourlet combined with ABC mostly depends on random search, which is sufficient for exploration but insufficient for exploitation. The value of a food source depends on many factors, such as its proximity to the nest, its richness or the concentration of its energy, and the ease of extracting this energy; as an optimization tool, ABC provides a population-based search procedure in which individuals called food positions are modified by the artificial bees over time, the bees' aim being to discover the places of food sources with high nectar amounts and, finally, the one with the highest nectar.
Figure 2. The contourlet transform framework
C. Hard Thresholding
Thresholding is a procedure used for signal and image denoising. The hard-thresholding function keeps all contourlet coefficients that are greater than the given threshold λ and sets the others to zero; the threshold λ is chosen according to the signal energy. These methods are used to make an image noise free:
D(x, y) = S(x, y) if |S(x, y)| > T, else 0    (1)
where T is the hard threshold, S(x, y) represents the original contourlet coefficient at position (x, y), and D(x, y) is the denoised coefficient obtained by adjusting the coefficient values.
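Equation (1) amounts to a one-line operation on the coefficient array, sketched below:

```python
import numpy as np

def hard_threshold(S, T):
    """Eq. (1): keep coefficients with |S(x, y)| > T, zero the rest."""
    return np.where(np.abs(S) > T, S, 0.0)
```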
D. ABC Optimization
In the ABC model, the colony consists of three groups of bees: employed bees, onlookers, and scouts. It is assumed that there is only one artificial employed bee for each food source; in other words, the number of employed bees in the colony is equal to the number of food sources around the hive. Employed bees go to their food source, come back to the hive, and dance in this area. An employed bee whose food source has been abandoned becomes a scout and starts to search for a new food source. Onlookers watch the dances of the employed bees and choose food sources depending on the dances.
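A minimal ABC sketch with the three bee phases; in this paper the objective would score a candidate threshold (e.g., by the PSNR of the resulting denoised image), but here it is abstracted as any scalar function f to minimize. Colony size, iteration count, and abandonment limit are illustrative assumptions.

```python
import numpy as np

def abc_minimize(f, lo, hi, n_food=10, n_iter=50, limit=10, seed=0):
    """Minimal artificial bee colony sketch (employed/onlooker/scout phases).

    f      : scalar objective to minimize, e.g. -PSNR of an image denoised
             with a candidate threshold (hypothetical usage)
    lo, hi : search bounds for a single scalar parameter
    """
    rng = np.random.default_rng(seed)
    foods = rng.uniform(lo, hi, n_food)       # one food source per employed bee
    costs = np.array([f(x) for x in foods])
    trials = np.zeros(n_food, dtype=int)

    def local_search(i):
        k = int(rng.integers(n_food))         # random partner source
        cand = np.clip(foods[i] + rng.uniform(-1, 1) * (foods[i] - foods[k]), lo, hi)
        c = f(cand)
        if c < costs[i]:                      # greedy replacement
            foods[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_food):               # employed bee phase
            local_search(i)
        quality = costs.max() - costs + 1e-12 # onlookers favour better sources
        probs = quality / quality.sum()
        for i in rng.choice(n_food, size=n_food, p=probs):
            local_search(int(i))
        for i in np.where(trials > limit)[0]: # scout phase: abandon & re-seed
            foods[i] = rng.uniform(lo, hi)
            costs[i] = f(foods[i])
            trials[i] = 0
    best = costs.argmin()
    return foods[best], costs[best]
```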
E. Noise Categories
Image noise is an undesirable by-product of image capture that adds spurious and extraneous information. A noise removal algorithm reduces or removes the visibility of noise by smoothing the image while leaving areas near contrast boundaries.
i) Gaussian Noise
Gaussian noise means that each pixel in the noisy image is the sum of the true pixel value and a random, Gaussian-distributed noise value, which has the bell-shaped probability distribution function
p(g) = (1/√(2πσ²)) e^(−(g−m)²/(2σ²))
where g represents the gray level, m is the mean or average of the function, and σ is the standard deviation of the noise.
ii) Poisson Noise
Poisson (shot/photon) noise arises when the number of photons sensed by the sensor is not sufficient to provide reliable statistical information. This noise has a root-mean-square value proportional to the square root of the image intensity:
p(x) = e^(−λ) λˣ / x!   for λ > 0 and x = 0, 1, 2, …
IV. EXPERIMENTAL RESULTS
Experiments were conducted to denoise the Lena and Cameraman images shown in Fig. 3. Gaussian and Poisson noise are considered. The images are decomposed with the contourlet transform, and hard thresholding is applied.
Figure 3. Original Lena and Cameraman images
Figure 4. Lena image: contourlet with ABC using hard threshold
Figure 5. Cameraman image: contourlet with ABC using hard threshold
A. Performance Metrics
i) PSNR
PSNR is used to appraise the restoration results; it determines how close the restored image is to the original image:
PSNR = 20 log₁₀(MAX_I / √MSE)
ii) MSE
The smaller the MSE, the closer the estimate is to the actual data; a small mean squared error means the estimate reflects the data more precisely than a larger mean squared error:
MSE = (1/mn) Σ_{i=0}^{m−1} Σ_{j=0}^{n−1} [I(i, j) − K(i, j)]²
B. Performance Evaluation
The performance of the contourlet with ABC was evaluated using MSE and PSNR. Contourlet with ABC hard thresholding was tested, and the denoised image results are shown in Table 1 and Table 2. Considering all the metrics, the method performs well for both images, Lena and Cameraman.
Table 1: Normal image, contourlet with hard threshold, Gaussian noise
Table 2: Normal image, contourlet with hard threshold, Poisson noise
CONCLUSION
This paper presents normal-image denoising based on the contourlet transform with a hard threshold to remove noise from the image. The normal images are corrupted by Gaussian and Poisson noise. The results demonstrated that contourlet decomposition with ABC optimization offers better results than plain hard thresholding and improves the quality of the image.
References
[1] Jannath Firthouse P., Shajun Nisha S., M. Mohammed Sathik, "Noise Reduction in MRI Images using Contourlet Transform and Threshold Shrinkages Techniques," International Journal of Computer Science and Information Technology (IJCSIT), 2016.
[2] Sakthivel K., "Contourlet Based Image Denoising Using New-Threshold Function," International Journal of Innovative Research in Computer and Communication Engineering, March 2014.
[3] Abbas H. Hassin AlAsadi, "Contourlet Transform Based Method for Medical Image Denoising," International Journal of Image Processing (IJIP), Vol. 9, Issue 1, 2015.
[4] Syed Ali Fathima KMN, Shajun Nisha S., "Shearlet Transform Based Denoising on Normal Images Using Hard Threshold," International Journal of Innovative Research in Computer and Communication Engineering (IJIRCCE), January 2017.
[5] Asaju La'aro Bolaji, Ahamad Tajudin Khader, Mohammed Azmi Al-Betar, Mohammed A. Awadallah, "Artificial Bee Colony Algorithm, its Variants and Applications: A Survey," Journal of Theoretical and Applied Information Technology, Vol. 47, No. 2, 20 January 2013.
[6] Junping Wang, Dapeng Zhang, "Image Denoising Based on Artificial Bee Colony and BP Neural Network," Information Engineering Department, Henan Vocational and Technical Institute, ISSN: 1693-6930, April 2015.
[7] Miao Ma, Jianhui Liang, Min Guo, Yi Fan, Yilong Yin, "Segmentation based on Artificial Bee Colony algorithm," www.elsevier.com/locate/asoc.
[8] Sandeep Kumar, Vivek Kumar Sharma, Rajani Kumari, "Improved Onlooker Bee Phase in Artificial Bee Colony Algorithm," International Journal of Computer Applications (0975-8887), Vol. 90, No. 6, March 2014.
[9] Anju T. S., Nelwin Raj N. R., "Shearlet Transform Based Satellite Image Denoising Using Optimized Otsu Threshold," International Journal of Latest Trends in Engineering and Technology (IJLTET), ISSN: 2278-621X.
[10] Minh N. Do, Martin Vetterli, "The Contourlet Transform: An Efficient Directional Multiresolution Image Representation."
[11] Yoonsuk Choi, Ershad Sharifahmadian, Shahram Latifi, "Remote Sensing Image Fusion Using Contourlet Transform with Sharp Frequency Localization," International Journal of Information Technology, Modeling and Computing (IJITMC), Vol. 2, No. 1, February 2014.
[12] Dervis Karaboga, Bahriye Basturk, "Artificial Bee Colony (ABC) Optimization Algorithm for Solving Constrained Optimization Problems."
[13] Jyotsna Patil, Sunita Jadhav, "A Comparative Study of Image Denoising Techniques," International Journal of Innovative Research in Science, Engineering and Technology, Vol. 2, Issue 3, March 2013.
[14] Gang Liu, Jing Liu, Quan Wang, Wenjuan He, "The Translation Invariant Wavelet-based Contourlet Transform for Image Denoising," Journal of Multimedia, Vol. 7, No. 3, June 2012.
[15] Mingwei Chui, Youqian Feng, Wei Wang, Zhengchao Li, Xiaodong Xu, "Image Denoising Method with Adaptive Bayes Threshold in Nonsubsampled Contourlet Domain," ScienceDirect, Elsevier, 2012.
[16] C. J. Li, H. X. Yang, Y. Y. Cai, B. Song, "Image Denoising Algorithm Based on Non-Subsampled Contourlet Transform and Bilateral Filtering," International Conference on Computer Information Systems and Industrial Applications (CISIA 2015).
[17] Malini S., Divya V., Moni R. S., "Improved Methods of Image Denoising Using Non-Subsampled Contourlet Transform," Advances in Computational Sciences and Technology, ISSN 0973-6107, Vol. 10, No. 4, pp. 549-560, 2017.
[18] Gangyi Jiang, Mei Yu, Wenjuan Yi, Fucui Li, Yong-Deak Kim, "A New Denoising Method with Contourlet Transform," LNCIS, Vol. 345.
[19] P. S. Hiremath, Prema T. Akkasaligar, Sharan Badiger, "Speckle Reducing Contourlet Transform for Medical Ultrasound Images."
[20] Manas Saha, Mrinal Kanti Naskar, B. N. Chatterji, "Soft, Hard and Block Thresholding Techniques for Denoising of Mammogram Images," IETE Journal of Research, Vol. 61, Issue 2, 2015.
International Journal of Computer Sciences and Engineering (Open Access)
Survey Paper, Vol. 7, Special Issue 8, April 2019, E-ISSN: 2347-2693
A Survey on Underwater Fish Species Detection and Classification
R. Fathima Syreen1*, K. Merriliance2
1Dept. of BCA, Sadakathullah Appa College, Tirunelveli, India
2Dept. of MCA, Sarah Tucker College, Tirunelveli, India
DOI: https://doi.org/10.26438/ijcse/v7si8.9598 | Available online at: www.ijcseonline.org
Abstract— Fish species recognition is a challenging research task: the special properties of underwater videos and images pose great challenges for fish recognition. Due to the great demand for underwater object recognition, many machine learning and image processing algorithms have been proposed, and deep learning has achieved significant results and a huge improvement in visual detection and recognition. This paper reviews techniques proposed in past years for automatic fish species detection and classification.
Keywords— Fish Recognition; Fish Classification; Feature Extraction; Image Processing; Neural Network; Deep Learning
I. INTRODUCTION
Live fish recognition in the open sea is a challenging task. It has been investigated for commercial and environmental applications such as fish farming and meteorological monitoring. Traditionally, aquatic experts have employed many tools to examine the appearance and quantities of different types of fish, using methods such as casting nets to catch and recognize fish in the ocean, diving to observe underwater, using photography [1], or combining net casting with acoustic (sonar) sensing [2]. The quantity of data collected with these methods is not sufficient, and they are not well equipped to capture normal fish behaviors. Nowadays, far more convenient tools such as hand-held video filming devices and embedded video cameras are used to record underwater animals, so fish presence and habits at different times can also be observed. This equipment produces large amounts of data, and informatics technologies such as computer vision and pattern recognition are required to analyze and query such large volumes of video. Statistical details about the distribution of specific oceanic fish species, besides an aggregate count of aquatic animals, can assist biologists in resolving issues ranging from food availability to predator-prey relationships.
Since fish can move freely and illumination levels change frequently in such environments, the recognition task is challenging. The challenges faced during classification of underwater fish species include noise, distortion, overlap, segmentation error, and occlusion; as a result, this task remains an eminent research problem. Prior research is mainly restricted to constrained environments, and the datasets used were rather small; accuracy is also unsatisfying under both constrained and unconstrained conditions. Many scientists in the field of ecology collect large amounts of video data to monitor biodiversity, but manual analysis of this data is time consuming. This large-scale analysis is nonetheless important for obtaining the knowledge needed to save ecosystems that have a large impact on the human population, so tools for automatic video analysis need to be developed. In this paper we first discuss techniques based on shape, color, texture, and hierarchical classification approaches used for fish detection and classification; the last section reviews deep learning based techniques.
II. LOW-LEVEL FEATURE BASED APPROACH
Many fish species have similar size, color, and shape, which makes the identification process very difficult. This section reviews species recognition based on geometric features such as size and shape, and appearance features such as color and surface texture.
Figure: fish contour (from [5]).
Paper [3] presents the design of an automated fish species recognition and monitoring system. Several shape-based recognition methods were implemented on the prototype system for testing. Curvature function analysis is used to find critical landmark points on the fish contour, and Fourier descriptors of a bend-angle function provide a shape description that meets all invariance requirements. For the recognition process, the power spectrum and phase angle information are calculated as shape descriptors. Since the performance of these shape methods did not give satisfactory results, a new method called Turn Angle Distribution Analysis (TADA) was developed and tested on a larger database; it matches the contour of the current image against species-specific contours. 300 images of 6 species were tested for recognition with impressive results. The TADA algorithm is easy to set up and does not require extensive training; computational overhead is low and a small number of parameters is used.
FIRS [4] proposes five subsystems for image recognition: image acquisition, image preprocessing, feature extraction, image recognition, and result presentation. To recognize fish images it employs an Artificial Neural Network (ANN) and a Euclidean distance method (EDM), extracting eight main fish features. The dataset consists of 900 fish images of 30 fish species, with 600 images used for training and 300 for testing. FIRS recognizes all 30 species of the training fish images with a precision of 99.00% for the ANN and 81.67% for the EDM; the ANN gives better precision, but the EDM uses less processing time. For a better precision rate, the FIRS system would need access to more fish features and a larger database of fish images.
In this paper [5], an automatic fish classification system that operates in the natural underwater environment is proposed. Texture features are combined with shape descriptors preserved under affine transformation. Texture features are extracted using statistical moments of the grey-level histogram, spatial Gabor filtering and properties of the co-occurrence matrix. Shape features are extracted using the Curvature Scale Space transform and the histogram of Fourier descriptors of boundaries. Affine transformation is applied to the acquired images to represent the 3D fish shape, describing the different views as fish move in the water. Two types of affine-invariant features are used. Principal Component Analysis is used to reduce the dimensionality of the vector produced by the texture and boundary features of the affine transformation. K-fold cross validation was used for the large amount of data; 320 images of 10 different fish species are used. The use of color features for a better fish description is considered to be future work.
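As an illustration of the texture cues listed above, here is a minimal sketch assuming `gray` is an 8-bit grey-level fish patch; the Gabor frequencies, GLCM distances and angles are assumptions, not the configuration used in [5].

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from skimage.filters import gabor

def texture_features(gray):
    feats = [gray.mean(), gray.std()]                      # grey-level histogram moments
    for theta in (0, np.pi / 4, np.pi / 2):                # spatial Gabor filtering
        real, _ = gabor(gray, frequency=0.3, theta=theta)
        feats += [real.mean(), real.var()]
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)       # co-occurrence matrix
    for prop in ("contrast", "homogeneity", "energy", "correlation"):
        feats.append(graycoprops(glcm, prop)[0, 0])
    return np.array(feats)
```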
This paper [6] recognizes isolated fish patterns: the system acquires an image containing the pattern of a fish, which passes through several phases such as image preprocessing and feature extraction before the pattern is recognized. The processed image is assigned to a cluster; the clustered fish are categorized as poisonous or non-poisonous, and the non-poisonous fish are further categorized into families. Image segmentation is based on color texture measurements. The Gray Level Co-occurrence Matrix (GLCM) is used to extract feature values from the color texture. The GLCM features are extracted from the whole pattern of interest (the fish), which differs from previous studies that considered only small parts of the pattern of interest. Shape characteristics of fish play an important role in recognition based on color texture measurements, and the different feature values obtained from these shape characteristics help to distinguish between families.
In this paper [7], shape and color descriptors are generated from fish images; fish bending and deformations do not affect the descriptors, which crudely sort the fish. A shape grid is derived from the shape of the fish, and a set of descriptors is obtained from the grid. These shape descriptors are made size-invariant by dividing them by the square root of the fish area, and they can be used as a set of variables to sort the fish by species. Average R, G, B values obtained from the tail and nose elements of the shape grid are also used to sort the fish by species. Discriminant analysis processes the shape descriptors and the color data, employing linear combinations of variables to distinguish between the different species of fish.
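A minimal sketch of the two steps just described, size normalization and discriminant analysis; `descriptors`, `areas` and `labels` are hypothetical arrays standing in for the shape-grid measurements, and the classifier choice is illustrative.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def size_invariant(descriptors, areas):
    # divide each descriptor vector by sqrt(fish area) -> size invariance
    return descriptors / np.sqrt(areas)[:, None]

X = size_invariant(descriptors, areas)            # hypothetical (N, D) data
lda = LinearDiscriminantAnalysis().fit(X, labels) # linear combinations separating species
predicted_species = lda.predict(X)
```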
III. HIERARCHICAL IMAGE CLASSIFICATION
Hierarchical classification follows the divide-and-conquer tactic: a customized classifier is trained with special features at each level. Hierarchical classification has several advantages. It divides all classes into subsets and leaves similar classes for a later stage, which balances the load of any single node. It applies a customized set of features to classify specific classes and so achieves better performance on similar classes. It also exploits the correlations between classes to find similar groups.
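The divide-and-conquer idea can be sketched as a small tree of per-node classifiers. This is an illustrative skeleton, not any surveyed paper's implementation; the node structure and names are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

class HierNode:
    """One level of a hierarchical classifier: its own features, its own classifier."""
    def __init__(self, feature_idx, children=None):
        self.clf = SVC()                  # customized classifier for this level
        self.feature_idx = feature_idx    # customized feature subset for this level
        self.children = children or {}    # maps predicted subset label -> child node

    def fit(self, X, y_subset):
        self.clf.fit(X[:, self.feature_idx], y_subset)
        return self

    def predict_one(self, x):
        # x is a 1-D numpy feature vector; descend until a leaf decision is reached
        label = self.clf.predict(x[self.feature_idx][None, :])[0]
        child = self.children.get(label)
        return label if child is None else child.predict_one(x)
```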
This paper [8] proposed a hierarchical classification approach for live fish recognition. A heuristic method is used to construct an automatically generated Balance-Guaranteed Optimized Tree (BGOT). 66 types of features are extracted from the color, shape and texture properties of different parts of the fish. Forward Sequential Feature Selection (FSFS) is used to reduce the feature dimensions, and the features selected by FSFS are used by a Support Vector Machine (SVM). The BGOT is used to control error accumulation in hierarchical classification. Data is acquired from a live fish dataset with 3178 fish images of 10 different species. The automatically generated hierarchical tree achieves about 4% better accuracy compared to state-of-the-art techniques.
In this paper [9], a novel rejection system in a hierarchical classification method for fish species recognition is proposed. A Gaussian Mixture Model (GMM) is applied at the leaves of the hierarchical tree as a reject option; this model evaluates the posterior probability of testing samples. 2626 dimensions of features such as color, shape and texture properties from different parts of the fish are computed and normalized. FSFS, which utilizes an SVM as its wrapped classifier, is used to select a subset of effective features that distinguishes samples of a given class from others. The reject function is combined with a Balance-Guaranteed Optimized Tree (BGOT) hierarchical method.
Framework of GMM for fish recognition. This figure is from the original paper [9].
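The reject option described in [9] can be sketched as follows: a per-class GMM fitted at a leaf scores the classifier's decision and rejects low-likelihood samples. Model sizes and the threshold here are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_leaf_gmms(X, y, n_components=2):
    # one Gaussian mixture per class at this leaf of the hierarchy
    return {c: GaussianMixture(n_components).fit(X[y == c]) for c in np.unique(y)}

def predict_with_reject(gmms, x, predicted_class, threshold=-50.0):
    # average log-likelihood of the sample under the predicted class's model
    score = gmms[predicted_class].score(x[None, :])
    return predicted_class if score > threshold else "reject"
```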
This paper [10] proposes a novel Balance-Enforced Optimized Tree with Reject option (BEOTR) classifier for live fish recognition. The paper contributes a hierarchical classification method suited for greatly unbalanced classes, a classification-rejection method to clear up decisions and reject unknown classes, and the application of the classification method to free-swimming fish. The Grabcut algorithm [11] is employed to segment fish from the background, and a streamline hypothesis is proposed to align the fish images in the same direction. BEOTR processes fish samples from an imbalanced dataset. A GMM is applied after the BEOTR method to evaluate the posterior probability of testing samples and provide the reject option, which filters out less confident results such as noise, classification errors or unknown classes. BEOTR achieves 6% better accuracy compared to SVM and other hierarchical classifiers.
IV. DEEP LEARNING BASED APPROACH
Since 2012, neural networks have re-emerged as a strong option for classification tasks. By integrating convolutional layers, Deep Neural Networks (DNNs) are able to create feature vectors and classify them. A neural network is a mathematical model which tries to mimic the human brain [12]. Like SVMs, neural networks can classify feature vectors after a training phase. A neural network is composed of interconnected nodes called neurons, and each neuron of each layer receives a signal from the neurons of the previous layer; this signal is altered according to an activation function and transferred to the neurons of the next layer. Every layer of a neural network except the first (which receives the feature vector) and the last is called a hidden layer. To let the network build its own features, the simple network is extended to a Convolutional Neural Network (CNN). Each convolutional layer transforms the signal sent from the previous layer using convolutional kernels, an activation function that breaks linearity, and a pooling phase that reduces the image and strengthens learning by selecting significant pixels. The last convolutional layer concatenates all the information into one feature vector and sends it to another layer or to a classifier.
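The generic pipeline just described (convolution, non-linearity, pooling, repeated, then a classifier) can be sketched in a few lines of PyTorch. The layer sizes below are illustrative assumptions, not taken from any paper in this survey.

```python
import torch.nn as nn

class SmallFishCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # assumes 64x64 inputs

    def forward(self, x):
        x = self.features(x)      # convolution -> activation -> pooling, twice
        x = x.flatten(1)          # concatenate into one feature vector
        return self.classifier(x) # final classification layer
```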
This paper [13] proposes two supervised machine learning methods to automatically detect and recognize coral reef fishes in underwater HD videos. The first method relies on the extraction of HOG features and the use of an SVM classifier; the second method is based on deep learning. The Histogram of Oriented Gradients characterizes an object in an image based on its contours, and HOG features give better results in complex classification settings where a fish can be hidden in coral reefs or occluded by another fish. SVM is a supervised method to classify feature vectors. The deep learning approach efficiently recognizes fishes at different resolutions; the architecture used in this method follows GoogLeNet, with 27 layers, 9 inception layers and a soft-max classifier.
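A minimal sketch of the first method in [13] (HOG features plus SVM); the patch size, HOG parameters and the `train_patches`/`train_labels` arrays are assumptions rather than the authors' settings.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(patches):
    # patches: list of grayscale arrays, e.g. 64x64 crops around fish candidates
    return np.array([hog(p, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for p in patches])

clf = LinearSVC().fit(hog_features(train_patches), train_labels)  # hypothetical data
pred = clf.predict(hog_features(test_patches))
```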
This paper [14] proposes a system for aquarium family fish species identification. It identifies and classifies eight fish families comprising 191 species. The system is built using a deep Convolutional Neural Network (CNN) made up of two convolutional layers for feature extraction and two fully connected layers for classification. The proposed system is a simplified version of AlexNet [18], evaluated on the QUT Robotics fish dataset consisting of 3,960 images. The proposed system achieves 85.59% testing accuracy compared to AlexNet, which achieves 85.41%, over the untrained benchmark dataset.
This paper [15] proposes the use of object proposal classification for fish detection. A Convolutional Neural Network (CNN) is applied to object proposals for detection and species classification. The detection approach consists of bounding box generation, extraction of CNN features for each proposal, and classification of each bounding box proposal as fish or background. The background subtraction algorithm [16] uses a probabilistic background model that represents each pixel as a mixture of Gaussians; its result is a binary mask distinguishing foreground fish pixels from the background. An erosion filter is applied to this mask to separate nearby fishes. The blob detection method [17] is then used on both masks to obtain bounding box proposals. A binary SVM is used for classifying each bounding box proposal as fish or background. The generated proposals are used to extract CNN features from AlexNet; the activations of the 7th hidden layer (relu7) in the convolutional network are chosen as features.
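The proposal-generation stage can be sketched with OpenCV: a mixture-of-Gaussians background model, erosion, then blob/contour bounding boxes. The kernel size and area threshold are assumptions, not the values used in [15]-[17].

```python
import cv2
import numpy as np

bg = cv2.createBackgroundSubtractorMOG2()

def fish_proposals(frame, min_area=100):
    mask = bg.apply(frame)                                   # binary foreground mask
    mask = cv2.erode(mask, np.ones((3, 3), np.uint8))        # split touching fishes
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # blob detection
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]               # bounding box proposals
```

Each returned box would then be cropped, passed through the CNN for features, and scored by the binary fish/background SVM.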
Abstract view of the CNN architecture. This figure is taken from paper [14].
A novel technique based on convolutional neural networks is proposed in this paper [19] for fish classification. The noise in the dataset is first removed. Image pre-processing is done using Gaussian blurring, morphological operations, Otsu's thresholding and pyramid mean shifting. A grey-level histogram is created from the grayscale image for noise removal, and the outline of the fish object is obtained using pyramid mean shifting with a kernel function k. The enhanced image is fed to a CNN for the classification of fish species. This method gives an accuracy of 96.29%.
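A minimal OpenCV sketch of the pre-processing chain in [19] (Gaussian blur, pyramid mean-shift filtering, Otsu thresholding, morphology); the parameter values are assumptions, not the paper's.

```python
import cv2

def preprocess(bgr):
    blur = cv2.GaussianBlur(bgr, (5, 5), 0)              # noise removal
    shifted = cv2.pyrMeanShiftFiltering(blur, 21, 51)    # flatten similar regions
    gray = cv2.cvtColor(shifted, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu's threshold
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)         # clean the outline
```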
V. CONCLUSION
This survey paper discusses various types of approaches for fish recognition. Various techniques for image pre-processing are discussed, along with fish recognition using low-level features, hierarchical classification and deep learning. Features have to be enhanced for larger datasets and to approach 100% accuracy.
REFERENCES
[1] W. Rouse, "Population dynamics of barnacles in the intertidal zone," Marine Biology Research Experiment, 2007.
[2] P. Brehmer, T. Do Chi, and D. Mouillot, "Amphidromous fish school migration revealed by combining fixed sonar monitoring (horizontal beaming) with fishing data," Journal of Experimental Marine Biology and Ecology, 334:139-150, 2006.
[3] D.-J. Lee, R. B. Schoenberger, D. Shiozawa, X. Xu, P. Zhan, "Contour matching for a fish recognition and migration-monitoring system," in: Optics East, International Society for Optics and Photonics, 2004, pp. 37-48.
[4] C. Pornpanomchai, B. Lurstwut, P. Leerasakultham, W. Kitiyanan, "Shape- and texture-based fish image recognition system," Kasetsart J. (Nat. Sci.) 47:624-634, 2013.
[5] C. Spampinato, D. Giordano, R. Di Salvo, Y.-H. J. Chen-Burger, R. B. Fisher, G. Nadarajan, "Automatic fish classification for underwater species behavior understanding," in: Proceedings of the First ACM International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams, ACM, Firenze, Italy, 2010, pp. 45-50.
[6] K. A. Mutasem, B. O. Khairuddin, N. Shahrulazman and A. Ibrahim, "Fish recognition based on robust features extraction from color texture measurements using back-propagation classifier."
[7] N. J. C. Strachan, "Recognition of fish species by colour and shape," Image Vis. Comput. 11(1):2-10, 1993.
[8] P. X. Huang, B. J. Boom, R. B. Fisher, "Underwater live fish recognition using a balance-guaranteed optimized tree," in: Computer Vision - ACCV 2012, Springer, Daejeon, Korea, 2013, pp. 422-433.
[9] P. X. Huang, B. J. Boom, R. B. Fisher, "GMM improves the reject option in hierarchical classification for fish recognition," in: 2014 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, Steamboat Springs, CO, USA, 2014, pp. 371-376.
[10] P. X. Huang, X. Huang, B. J. Boom, "Hierarchical classification with reject option for live fish recognition."
[11] C. Rother, V. Kolmogorov, and A. Blake, "GrabCut: interactive foreground extraction using iterated graph cuts," ACM Transactions on Graphics (SIGGRAPH), August 2004.
[12] P. M. Atkinson and A. R. L. Tatnall, "Introduction: neural networks in remote sensing," International Journal of Remote Sensing, vol. 18, no. 4, pp. 699-709, 1997.
[13] S. Villon, M. Chaumont, G. Subsol, S. Villéger, T. Claverie and D. Mouillot, "Coral reef fish detection and recognition in underwater videos by supervised machine learning: comparison between Deep Learning and HOG+SVM methods."
[14] N. E. M. Khalifa, M. H. N. Taha and A. E. Hassanien, "Aquarium family fish species identification system using deep neural networks."
[15] J. Jager, E. Rodner, J. Denzler, V. Wo and K. Fricke-Neuderth, "SeaCLEF 2016: Object proposal classification for fish detection in underwater videos."
[16] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in: 1999 Conference on Computer Vision and Pattern Recognition (CVPR '99), Ft. Collins, CO, USA, pp. 2246-2252, 1999.
[17] S. Suzuki and K. Abe, "Topological structural analysis of digitized binary images by border following," Computer Vision, Graphics, and Image Processing, 30(1):32-46, 1985.
[18] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in: Advances in Neural Information Processing Systems 25, Curran Associates, Inc., 2012, pp. 1097-1105.
[19] D. Rathi, S. Jain and S. Indu, "Underwater fish species classification using convolutional neural network and deep learning."
AUTHORS PROFILE
R. Fathima Syreen, Research Scholar, Sarah Tucker College, M.S. University, Tirunelveli; also working as Assistant Professor in the Department of Computer Applications, Sadakathullah Appa College.
A Review on Deep Learning Based Object Detection and Part Localization
R. Fathima Syreen 1, Dr. K. Merriliance 2
1 Research Scholar, Assistant Professor, Dept. of Computer Applications, Sadakathullah Appa College, Tirunelveli
2 Assistant Professor, Dept. of MCA, Sarah Tucker College, Tirunelveli
Abstract—The rapid growth of image data leads to the need for research and development of image retrieval. Compared to normal images, fine-grained images are difficult to classify. Recognizing objects in fine-grained domains can be an extremely challenging task due to subtle differences and variations in pose, scale and rotation between same species. Deep learning has recently achieved superior performance on many tasks such as image classification, object detection and natural language processing. In this survey we mainly focus on object detection, semantic part localization and feature extraction, which can facilitate fine-grained categorization.
Keywords—fine grained images; deep learning; object detection; part localization.
I. INTRODUCTION
Fine-grained object classification aims to distinguish objects from different subordinate-level categories within a general category. Fine-grained classification is a very challenging task due to large intra-class variance and subtle inter-class distinctions; variances in pose, scale or rotation make the problem more difficult. In recent years fine-grained image classification has received considerable attention due to the advancement of deep learning based approaches. Layers of features in deep learning techniques are not human-designed but learned from data using a general-purpose learning procedure. There are a large number of variants of deep learning architectures; in this survey we mainly focus on convolutional neural network (CNN) based approaches. CNNs are surpassing other approaches in accuracy and efficiency by a large margin. A CNN is a type of feed-forward artificial neural network. It consists of one or more convolutional layers, which are the building blocks of a CNN, followed by one or more fully connected layers as in a standard multilayer perceptron (MLP). Most deep learning networks can be trained end to end efficiently using backpropagation, a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. In this survey, we first introduce several convolutional neural networks which are mostly used for fine-grained image categorization, then part localization and object detection based approaches. The last section reviews feature extraction based approaches.
II. GENERAL DEEP NETWORK ARCHITECTURES
CNNs are able to yield a more discriminative representation of the image, which is essential for fine-grained image classification. AlexNet [1] is a deep convolutional neural network which won the ILSVRC-2012 competition with a top-5 test accuracy of 84.6%, compared to 73.8% achieved by the closest competitor. It consists of five convolutional layers, max pooling layers, Rectified Linear Units (ReLUs) as non-linearities, and three fully connected layers.
The Visual Geometry Group (VGG) [2] model was introduced by the Visual Geometry Group at the University of Oxford. VGG-16 has 13 convolutional layers and 3 fully connected layers; VGG-19 has 3 more convolutional layers than VGG-16. Filters with a very small receptive field are used, and all hidden layers are equipped with the rectification non-linearity.
GoogLeNet [3] won the ILSVRC-2014 challenge with a top-5 accuracy of 93.3%. This CNN is composed of 22 layers and a newly introduced building block called the inception module. The inception module allows increasing the depth and width of the network while keeping the computational budget constant: 1x1 convolutions are used to compute reductions before the expensive 3x3 and 5x5 convolutions.
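The inception idea can be sketched in PyTorch as parallel 1x1, 3x3 and 5x5 branches, with 1x1 convolutions as cheap dimension reductions before the expensive filters. The channel counts below are illustrative, not GoogLeNet's exact configuration.

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    def __init__(self, c_in):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, 64, 1)                     # plain 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(c_in, 96, 1),      # 1x1 reduction
                                nn.Conv2d(96, 128, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(c_in, 16, 1),      # 1x1 reduction
                                nn.Conv2d(16, 32, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, 1, padding=1),
                                nn.Conv2d(c_in, 32, 1))      # pooling branch

    def forward(self, x):
        # widen the network without blowing up compute: concatenate all branches
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)
```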
III. PART DETECTION BASED APPROACHES
A. SPDA-CNN For Part Detection
The semantic part detection and abstraction CNN architecture [4] uses two sub-networks, one for detection and one for recognition. The detection sub-network uses a novel top-down method to generate small semantic part candidates for detection. The classification sub-network uses novel part layers that extract features from parts found by the detection sub-network and combine them for recognition. In the detection sub-network, a k-nearest-neighbour (k-NN) method is used to generate proposals for semantic parts. Using k-NN, the detection network applies Fast R-CNN [] to regress and obtain much more accurate part bounding boxes. The final part detections are sent to the abstraction and classification sub-network.
Based on the results from detection, the part RoI pooling layer does semantic pooling. This layer conducts feature selection and reordering, which are useful for classification. By sharing the computation of convolutional filters, SPDA provides an end-to-end network that performs detection, localization of multiple semantic parts and whole-object recognition within one framework.
B. Deep LAC
To recognize fine-grained classes, Deep LAC [5] incorporates part localization, alignment and classification in one deep neural network. It proposes a Valve Linkage Function (VLF) for back-propagation chaining. The main network consists of three sub-networks for localization, alignment and classification; the VLF connects all the sub-networks and also functions as an information valve to compromise between alignment and classification errors. For a given input image, the part localization sub-network outputs the commonly used coordinates of the top-left and bottom-right bounding box corners. Ground-truth bounding boxes are generated from part annotations. This sub-network consists of 5 convolutional layers and 3 fully connected layers. The alignment sub-network receives the part localization from the localization sub-network and performs template alignment, applying translation, scaling and rotation to generate pose-aligned part regions, which is important for accurate classification. Apart from pose aligning, the VLF in the alignment sub-network plays an essential part in connecting the localization and classification sub-networks: if the alignment is good enough in forward propagation, the VLF guarantees accurate classification.
C. Non-parametric transfer method
The non-parametric part transfer method [6] introduces a non-parametric approach for part detection based on transferring part annotations from related training images to an unseen test image. The possibility of transferring part annotations to unseen images allows coping with a high degree of pose and view variation. This method is valuable where intra-class variances are extremely high and precisely localized features need to be extracted. For training, a suitable feature representation is computed which mainly focuses on the global shape of the object; the overall layout is represented with histograms of oriented gradients (HOG). During testing, a global HOG feature is again computed from the image region defined by the bounding box. Training images with an object shape similar to the current test image are searched, and the part annotations are transferred from them directly. Final results are attained by pooling individual classification scores for every transferred part configuration. An SVM classifier is used for classification.
D. Part based R-CNN
Part-based R-CNN [7] learns part detectors by leveraging deep convolutional features computed on bottom-up region proposals. R-CNN [8] is used to detect objects and localize their parts under a geometric prior. Starting from bottom-up region proposals such as selective search, both object and part detectors are trained on deep convolutional features. During testing, all detectors are used to score the proposals, and non-parametric geometric constraints are applied to rescore them and choose the best object and part detections. Both the full-object bounding box annotations and a fixed set of semantic part annotations are used to train the multiple detectors. Geometric constraints over the layout of the parts relative to the object localization are used to filter incorrect detections. At the final step, features for the whole object and part regions are extracted, and an SVM classifier is trained for the final categorization.
E. Pose Normalized Convolutional Nets
The pose-normalized nets [9] first compute an estimate of the object's pose and use it to compute local image features, which are then used for classification. The features are computed by applying deep convolutional nets to image patches which are located and normalized by the pose. Lower-level feature layers are integrated with pose-normalized extraction routines, and higher-level feature layers are integrated with unaligned image features.
Pose Normalized Convolutional Nets. This figure is from the original paper [9].
In training, the pose-normalized nets use a Deformable Part Model (DPM) [10] to predict the 2D locations and visibility of 13 semantic part keypoints. The classifier is trained with concatenated features extracted from each prototype region and the entire image. In testing, groups of keypoints are used to compute multiple warped image regions which are aligned using prototypical models. Each region is fed through a deep convolutional network, and the extracted features are concatenated and fed to the classifier.
IV. OBJECT DETECTION BASED APPROACH
In this paper [11], two types of attention models using deep neural networks are applied to the fine-grained classification task. Raw candidate patches, i.e. image regions with high objectness, are generated in a bottom-up process: selective search, an unsupervised process, is used to extract patches from the input image. The bottom-up process provides multi-view and multi-scale versions of the original image. Two top-down attention models are proposed in this paper. The object-level top-down attention model filters the bottom-up raw patches, removing noisy patches that are not relevant to the object. This is done by converting a CNN pre-trained on the ILSVRC2012 1K dataset into a FilterNet; a threshold score is used to decide whether a given patch should be selected.
The DomainNet, the second CNN, is trained using the patches selected by the FilterNet. The DomainNet extracts features relevant to the categories belonging to the specific domain and is a good fine-grained classifier. Spectral clustering is used to find groups of filters, and filters in a group serve as part detectors. Finally, both levels of the attention model are merged to bring significant gains. The advantage of this method is that the attention models are derived from a CNN trained on the classification task under the weakest supervision setting, where only class labels are provided.
The Object-Part Attention Model [12] has been proposed for weakly supervised fine-grained image classification. Its main novelties are object-level attention and part-level attention: the object-level attention model localizes objects in images for learning object features, and the part-level model selects discriminative parts of the object. The object-level attention model, based on saliency extraction, localizes the objects of images automatically with only image-level subcategory labels and without any object or part annotation, during both testing and training. Patch filtering is used to filter out noisy image patches. The saliency map is extracted via global average pooling in the CNN to localize the objects in images.
V. FEATURE EXTRACTION
A. Fused with Semantic Alignment
Fused One-Vs-All Features (FOAF) [13] proposes a new framework for fine-grained visual categorization. A semantic prior is combined with geometric information for efficient part localization: using a template-based model, it detects less deformable parts and localizes other highly deformable parts with simple geometric alignment. No bounding boxes are provided during training and testing. The less deformable part as well as the object is first detected using R-CNN [8]. A geometric refinement then restricts the detected head within the region of the detected object. In order to segment foreground from background without object bounding boxes, an object confidence map denoting the possible location of the foreground object is computed. The final alignment is based on the detected head and the segmented foreground mask, which guarantees accurate part alignment.
For Fused One-Vs-All Feature learning, features are extracted from the aligned parts and a One-vs-All SVM classifier is trained to obtain mid-level features. Similar subcategories are fused iteratively and treated as micro-classes. Another SVM classifier is trained on these mid-level features for the final classification. FOAF achieves superior performance over traditional features for classification.
B. Subset Feature learning
Subset Feature Learning [14] consists of two main parts. It proposes a subset learning system which first clusters visually similar classes and then learns a deep convolutional neural network for each subset. A progressive transfer learning system has also been proposed to learn a domain-generic convolutional feature extractor. By applying Linear Discriminant Analysis (LDA) to fc6 features to reduce their dimensionality, visually similar species are clustered into K subsets, and a separate CNN is trained for each of the K pre-clustered subsets.
To select the most relevant subset, a separate CNN-based subset selector (SCNN) is trained. Using the output of the pre-clustering as class labels, the SCNN is trained by changing the softmax layer fc8 to have K outputs. As with the previously trained CNNs, the weights of the SCNN are trained via backpropagation.
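The pre-clustering step in [14] can be sketched as LDA followed by k-means; `fc6` and `labels` are hypothetical arrays standing in for the extracted CNN features and species labels, and K is illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# n_components must be < number of species classes; 20 is an assumption
lda = LinearDiscriminantAnalysis(n_components=20)
reduced = lda.fit_transform(fc6, labels)                 # dimensionality reduction
subset_of = KMeans(n_clusters=5).fit_predict(reduced)    # K pre-clustered subsets

# subset_of now plays the role of the SCNN's class labels; one CNN per subset
# would then be fine-tuned on the images falling into that subset.
```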
C. Selective Convolutional Descriptor
Selective Convolutional Descriptor Aggregation (SCDA) [15] is an unsupervised feature extraction methodology that meets the challenge of automatically localizing the main object in fine-grained images. A pretrained CNN model first extracts convolutional activations for an input image. A Multi-scale Orderless Pooling (MOP) CNN first extracts CNN activations and performs VLAD pooling to determine the useful part of the activations. The useful descriptors are then aggregated using sum-pooled convolutional features on the last convolutional layer, which outperforms traditional methods like PCA (Principal Component Analysis). In SCDA, the preliminary step is obtaining pool5, the activations of the max-pooled last convolutional layer. To obtain useful deep descriptors and remove background noise, an aggregation map (A) is used; a mask map (M) of the same size as A is then obtained. Bicubic interpolation is used to resize the mask to the size of the input image.
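A minimal numpy sketch of SCDA's descriptor selection as described above: sum the pool5 tensor over channels to get the aggregation map A, threshold at its mean to get the mask map M, and aggregate only the kept descriptors. `pool5` is a hypothetical (C, H, W) activation array.

```python
import numpy as np

def scda(pool5):
    A = pool5.sum(axis=0)          # aggregation map over channels
    M = A > A.mean()               # mask map roughly covering the main object
    kept = pool5[:, M]             # selected deep descriptors, shape (C, n_kept)
    # aggregate kept descriptors with average and max pooling
    return np.concatenate([kept.mean(axis=1), kept.max(axis=1)])
```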
D. Bilinear CNN
The Bilinear Model [16] consists of two independent feature extractors combined with an outer product. A feature function takes as input an image I and a location L and outputs a feature of size C×D; locations generally include position and scale. The feature outputs are combined at each location using the matrix outer product. To obtain an image descriptor, the pooling function aggregates the bilinear features across all locations in the image.
Bilinear CNN model. This figure is from the original paper [16].
A natural candidate for the feature function f is a convolutional neural network consisting of a hierarchy of convolutional and pooling layers. Two different CNNs pretrained on the ImageNet dataset are used. By pretraining, the bilinear deep network benefits from additional training data in cases of domain-specific data scarcity. The variants of outer product representations are very effective at various texture, fine-grained and scene recognition tasks.
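Bilinear pooling itself is compact enough to sketch: two feature maps for the same image are combined by an outer product at each location and sum-pooled into one descriptor. The normalization steps follow common practice for bilinear CNNs; the two maps would normally come from two pretrained CNNs.

```python
import torch

def bilinear_pool(fa, fb):
    # fa: (C1, H, W) and fb: (C2, H, W) feature maps for the same image
    c1, h, w = fa.shape
    c2 = fb.shape[0]
    fa = fa.reshape(c1, h * w)
    fb = fb.reshape(c2, h * w)
    desc = (fa @ fb.T).flatten()                      # sum of outer products over locations
    desc = torch.sign(desc) * torch.sqrt(desc.abs())  # signed square-root normalization
    return desc / desc.norm()                         # l2 normalization
```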
E. Refining Deep Convolutional Features
Refining Deep Convolutional Features [17] proposes incorporating multi-scale image representation using a Fisher Vector (FV) CNN. Features are extracted in an unsupervised manner, and a non-parametric feature weighting method based on refining convolutional descriptors is proposed to boost performance. To generate multi-scale information, pooling of the feature tensors of the last convolutional layers (with ReLU activation) in a CNN is performed using different pooling window sizes. No object bounding box or part annotations are used during training and testing; only image labels are used. Networks like VGG-16 and AlexNet are used.
Table 1. Architecture and accuracy of different methods

Method                      Architecture   Accuracy (%)
Part-based R-CNN            AlexNet        76.4
Pose Normalized Nets        AlexNet        75.7
Deep LAC                    AlexNet        84.10
SPDA-CNN                    VGGNet         85.14
Two-level Attention Model   AlexNet        69.7
Bilinear CNN                VGGNet         77.2
Subset Feature Learning     AlexNet        77.5
Fused One-Vs-All Features   VGGNet         84.63
VI. CONCLUSION
This paper surveys recent progress in deep learning based fine-grained object detection and part localization. Several common convolutional neural networks are introduced first, including AlexNet, VGGNet and GoogLeNet; they can be directly adapted to fine-grained image classification. Since there is large intra-class variance and small inter-class variance in fine-grained image classification, many common approaches resort to deep learning technology for object detection. Some approaches integrate part localization into the deep learning framework, since the subtle differences between visually similar fine-grained objects lie in local parts. Some fine-grained categorization approaches also incorporate feature extraction to gain more classification capability and accuracy for fine-grained images.
REFERENCES
[1]. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation.
[2]. K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv:1409.1556, 2014.
[3]. C. Szegedy, W. Liu, Y. Q. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 1-9, 2014.
[4]. H. Zhang et al., "SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 1143-1152.
[5]. D. Lin, X. Shen, C. Lu, and J. Jia, "Deep LAC: Deep localization, alignment and classification for fine-grained recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 1666-1674.
[6]. C. Göering, E. Rodner, A. Freytag, and J. Denzler, "Nonparametric part transfer for fine-grained recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2014, pp. 2489-2496.
[7]. N. Zhang, J. Donahue, R. Girshick, and T. Darrell, "Part-based R-CNNs for fine-grained category detection," in Computer Vision - ECCV. Cham, Switzerland: Springer, 2014, pp. 834-849.
[8]. W. Zaremba, I. Sutskever, and O. Vinyals, "Recurrent neural network regularization," arXiv:abs/1409.2329, 2014.
[9]. S. Branson, G. Van Horn, S. Belongie, and P. Perona, "Bird species categorization using pose normalized deep convolutional nets," in Proc. BMVC, Jun. 2014.
[10]. S. Branson, O. Beijbom and S. Belongie, "Efficient large-scale structured learning," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Portland, USA, pp. 1806-1813, 2013.
[11]. T. Xiao, Y. Xu, K. Yang, J. Zhang, Y. Peng, and Z. Zhang, "The application of two-level attention models in deep convolutional neural network for fine-grained image classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 842-850.
[12]. Y. Peng, X. He, and J. Zhao, "Object-part attention model for fine-grained image classification," IEEE Trans. Image Process., vol. 27, no. 3, Mar. 2018.
[13]. X. Zhang, H. Xiong, W. Zhou, and Q. Tian, "Fused one-vs-all features with semantic alignments for fine-grained visual categorization," IEEE Trans. Image Process., vol. 25, no. 2, pp. 878-892, Feb. 2016.
[14]. Z. Ge, C. McCool, C. Sanderson, and P. Corke, "Subset feature learning for fine-grained category classification," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, Jun. 2015, pp. 46-52.
[15]. X.-S. Wei, J.-H. Luo and J.-X. Wu, "Selective convolutional descriptor aggregation for fine-grained image retrieval," arXiv:1604.04994, 2016.
[16]. T.-Y. Lin, A. RoyChowdhury, and S. Maji, "Bilinear CNN models for fine-grained visual recognition," in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2015, pp. 1449-1457.
[17]. W. Zhang, J. Yan, W. Shi, T. Feng, D. Deng, "Refining deep convolutional features for improving fine-grained image recognition."
International Journal of Advanced Research in Engineering and Technology (IJARET), Volume 11, Issue 11, November 2020, pp. 91-100, Article ID: IJARET_11_11_009. Available online at http://iaeme.com/Home/issue/IJARET?Volume=11&Issue=11. ISSN Print: 0976-6480, ISSN Online: 0976-6499. DOI: 10.34218/IJARET.11.11.2020.009
© IAEME Publication, Scopus Indexed
DEEP CONVOLUTIONAL NETWORKS FOR UNDERWATER FISH LOCALIZATION AND SPECIES CLASSIFICATION
R. Fathima Syreen*
Research Scholar, Sarah Tucker College, Manonmaniam Sundaranar University, Tirunelveli, India
Dr. K. Merriliance
Dept. of MCA, Sarah Tucker College, Tirunelveli, India
*Corresponding Author
ABSTRACT
Live fish recognition is a difficult multi-class classification task in the open sea. We propose a technique to recognize fish in an unconstrained natural habitat. In the proposed technique, VGG-16 with a DeepFish architecture is used to enhance feature extraction and to improve the accuracy of the result. The proposed approach comprises two main stages, namely a fish localization phase and a fish classification phase. The technique first detects the fish in the image by extracting a feature map using the VGG-16 network; the DeepFish architecture is then used to categorize the fish. Finally, the proposed approach uses support vector machine and random forest classifiers to differentiate between fish species. Experimental results show that VGG-16 with the DeepFish architecture using a support vector machine attains a better accuracy of 99.47%.
Keywords: VGG-16, Random Forest, Support Vector Machine, DeepFish Architecture
Cite this Article: R. Fathima Syreen and Dr. K. Merriliance, Deep Convolutional Networks for Underwater Fish Localization and Species Classification, International Journal of Advanced Research in Engineering and Technology, 11(11), 2020, pp. 91-100. http://iaeme.com/Home/issue/IJARET?Volume=11&Issue=11
1. INTRODUCTION
Recognition research has mostly centered on objects on the ground, yet recognition of underwater objects is in great demand. Fish recognition is a difficult research problem, and we attempt to find a solution for underwater object recognition. We propose a framework for underwater live fish recognition in an unconstrained natural environment. Since fish can move freely and illumination levels change frequently in such environments, the recognition task is very challenging, and fish recognition remains an open research problem. Object detection, one of the significant applications in the field of computer vision, has been the focus of much research, and convolutional neural networks have brought great improvement to it. Object detection is advancing from single-object recognition to multi-object recognition: the former recognizes a single object in an image and can be regarded as a classification problem, while the latter not only identifies all the objects in an image but also finds their exact locations. Complexity arises because detection requires the precise localization and classification of objects.
Over the last few years, object classification and detection capabilities have significantly improved due to advances in deep learning and convolutional networks. This improvement is not only the result of more powerful hardware, larger datasets and bigger models, but mostly of new ideas, algorithms and improved network structures. Modern approaches to object recognition use deep learning techniques. Convolutional neural networks (CNNs) make powerful and mostly accurate assumptions about the nature of images. CNNs have many fewer connections and parameters compared to standard feed-forward neural networks with similarly sized layers, so they are easier to train. The most basic technique for improving the performance of deep neural networks is to increase their size, both in depth and in width. In this paper, we propose two CNN networks, one to localize the object and one to classify it. The framework comprises two primary modules: 1) a localization net and 2) a classification net. For the localization net, we fine-tune a 16-layer VGG network, and for the classification net we use a DeepFish architecture. We train a deep VGG-16 recognition network and extract feature maps using the pooling layer 'pool2'. The features obtained from the DeepFish architecture are then given to SVM and random forest classifiers for the classification task.
2. RELATED WORKS
The problem of fine-grained recognition, classifying objects which look fundamentally alike and differ only in subtle details, has recently been addressed by the computer vision, machine learning and multimedia research communities. The work in [6] uses feature extraction with the SIFT and SURF algorithms followed by automatic classification using an SVM classifier; it reaches an accuracy of 94.44% using SIFT but only 69.57% using SURF, while SVM outperforms KNN and ANN. The authors in [16] proposed a method to recognize fish in an unconstrained natural habitat recorded by underwater cameras. This method extracts 66 types of features, a combination of color, shape and texture properties from different parts of the fish, and reduces the feature dimensions with the Forward Sequential Feature Selection (FSFS) methodology. The features selected by FSFS are used by an SVM classifier. A Balance-Guaranteed Optimized Tree (BGOT) is applied in order to control error accumulation in hierarchical classification and thereby achieve better performance. A BGOT of 10 fish species is automatically built using inter-class similarities and a heuristic method. The proposed BGOT-based hierarchical classification method achieved about 4% better accuracy compared to state-of-the-art techniques on a live fish image dataset.
Earlier fish recognition research is mostly limited to constrained environments. Lee et al. [27] used contour matching to recognize fish in fish tanks. Strachan et al. [21-23] used color and shape descriptors to recognize fish transported along a conveyor belt beneath a digital camera. Larsen et al. [28] derived shape and texture features from an appearance model, classified them with LDA, and tested on a dataset containing 108 images of three fish species under constrained conditions (caught from the ocean), achieving an accuracy of 76%. There is also research on underwater live fish. Spampinato et al. [ ] combined texture features and shape features and tested on a database containing 360 images of ten fish species, achieving an average accuracy of 92%. Huang et al. [17] introduced a Balance-Enforced Optimized Tree with Reject Option (BEOTR) algorithm for live fish recognition with a rejection option; they experimented on a dataset containing 24,150 manually labeled fish images of 15 fish species collected from the south Taiwan Sea. In [29], they further used a GMM to improve the reject option in hierarchical classification; with BGOT and GMM they obtained an average accuracy of 65% on the larger dataset containing 24,150 images of 15 species.
3. PROPOSED FRAMEWORK
Object regions of images are obtained through the localization net, and the DeepFish architecture is then used to classify the object, as shown in Figure 1. Recently, convolutional neural networks (CNNs) have been incorporated into a number of visual recognition frameworks in a wide variety of domains. In contrast to more conventional object recognition methods, which compute hand-crafted features on images as an initial pre-processing step, the strength of the CNN lies in its ability to learn discriminative features from raw data inputs (e.g., image pixels). CNNs were popularized by LeCun and colleagues, who first applied such models to general object recognition and to digit recognition tasks. In object recognition, feature extraction is usually composed of a filter bank layer, a non-linear transformation, and a feature pooling layer. A typical convolutional neural network (CNN) is such a framework consisting of several layers: a convolutional filter bank layer aims to extract local patterns, the non-linear processing layer forms a non-linear complex model, and the feature pooling layer reduces the feature map resolution. In recent times, conventional object detection techniques have begun to use deep CNNs and outperform any competing approaches based on traditional features.
Figure 1 Architecture of Proposed Framework
3.1. Fine-Grained Fish Localization
The localization net takes as input an entire image. The network first processes the whole image with several convolutional (conv) and max pooling layers to produce a conv feature map; a fixed-length feature vector is then extracted from the feature map in the pooling layer. An efficient training strategy is proposed which takes advantage of feature sharing during training. Max-pooling layers have also been used effectively for localization. An objectness map is defined as an image where each pixel value indicates the probability that the pixel belongs to an object of interest. CNNs are a popular area of research within the field of artificial intelligence and image classification: they can detect and classify objects by simply providing a computer program with enough data and setting up the network to train itself to solve the given classification problem. We extract deep convolutional features using a VGG-16 pre-trained CNN model. To adapt it to our particular domain and generate more discriminative features, we fine-tune the pre-trained CNN model. More specifically, we use a 5-way fc3 layer (the last fully connected layer) instead of a 1000-way fc3 classification layer, as shown in Figure 2. Previous work in the field relies on applying filters to the images to separate the fish from the background, or on sharpening the images by removing background noise. The images in the dataset are extracted from underwater cameras and contain noise, so classifying these images is challenging. As the convolutional neural network (CNN) has shown strong discriminative power, we use the feature maps in the CNN to estimate the presence of fish species. Applying average pooling on the conv or pooling layers, we find that the feature maps yield an image adequately representing the fish region. Most feature maps focus on the foreground fish region, although some feature maps show high activation on the background. Consequently, applying average pooling over the feature maps suppresses the activation on the background while gathering and highlighting the activation on the foreground region. VGG-16 is a convolutional neural network (CNN) architecture which is considered one of the best vision model architectures to date. The most distinctive aspect of VGG-16 is that instead of having a large number of hyper-parameters, it concentrates on convolution layers of 3x3 filters with stride 1 (always using the same padding) and max-pool layers of 2x2 filters with stride 2, following this arrangement of convolution and max-pool layers consistently throughout the whole architecture. Finally it has 2 FC (fully connected) layers followed by a softmax for the output. The 16 in VGG-16 refers to its 16 weight layers. This network is fairly large, with about 138 million (approx.) parameters.
Figure 2 Architecture of VGG-16 for localizing fish species
Once this CNN is fine-tuned, we extract its pooling feature maps to create the objectness map. The pooling features represent the responses of different convolutional filters on the object. As there are several pooling layers in the VGG-16 net, we analyze the objectness maps produced by different layers, e.g., pool2, pool3, pool4 and pool5. Compared with the objectness map created by the pool2 layer, the objectness maps produced by pool3 and pool4 have a higher response on background regions; consequently, it is more suitable to use pool2 to produce the objectness map. To create the objectness map for a given image I, we first resize it to 224x224 and then feed it into the CNN to extract its pool2 feature. The size of pool2 is 512x7x7. Global average pooling in the CNN is used for localizing the fish in images, instead of merely selecting the image patches relevant to fish, which can contain large areas of background noise.
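The objectness-map step can be sketched with torchvision: resize to 224x224, take an early pooling stage of a pretrained VGG-16 and average its channels. The slice used for "pool2" is an assumption; in torchvision's VGG-16 it yields a 128-channel 56x56 map, which differs from the 512x7x7 size stated above.

```python
import torch
from torchvision import models, transforms

vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()
pool2 = vgg[:10]  # conv/ReLU/pool blocks up to the second max-pool (assumed slice)

prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

def objectness_map(pil_image):
    with torch.no_grad():
        fmap = pool2(prep(pil_image)[None])  # (1, 128, 56, 56) feature tensor
    return fmap.mean(dim=1)[0]               # average pooling over channels
```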
3.2. Fish Classification
After localizing the fish in the given image, a DeepFish architecture [1] is used for the classification task, as shown in Figure 3. The deep architecture consists of a PCANet [2]. A convolutional deep neural network (ConvNet) is a key component of deep learning methods used in image classification. ConvNets are designed to recognize local patterns and consist of several trainable stages stacked on top of one another, followed by a supervised classifier; each stage consists of a convolutional filter bank layer, a non-linear processing layer and a feature pooling layer. The PCANet [2] contains three basic data processing elements: cascaded PCA, binary hashing and block-wise histograms. These elements form the three stages of PCANet: the first two stages are PCA convolutions and the final output stage is the combination of binary hashing and block-wise histograms. The convolution layer of the ConvNet corresponds to the PCA cascade, while hashing and histogramming are used instead of the non-linear processing layer and feature pooling layer. The number of cascaded PCA stages can be increased or decreased as demanded by the problem. Principal Component Analysis (PCA) is a very popular dimensionality reduction technique: given data with n dimensions, PCA reduces the data to k dimensions with the least possible reconstruction error. After completing the PCANet process, every training sample corresponds to a feature vector. These features are then used for the classification task, for which we use Support Vector Machine (SVM) and Random Forest classifiers.
Figure 3 DeepFish Architecture
3.2.1. SVM vs Random Forest
An essential part of machine learning is classification: predicting which class a sample belongs to. Data science provides plenty of classification algorithms, such as logistic regression, support vector machines, naive Bayes classifiers, and decision trees. The Support Vector Machine (SVM) is a Machine Learning (ML) algorithm used for classification. An SVM solves the classification problem by finding the best hyperplane that separates all data points of one class from those of the other class; the algorithm seeks to maximize the margin around a hyperplane that separates a positive class from a negative class. Mathematically, the SVM training algorithm finds an optimal hyperplane with the maximal margin separating the two classes by solving the optimization problem

    max_a  sum_i a_i - (1/2) sum_{i,j} a_i a_j y_i y_j K(x_i, x_j)
    subject to  0 <= a_i <= C  and  sum_i a_i y_i = 0,

where a_i is the weight allotted to the training sample x_i. If a_i > 0, x_i is known as a support vector. C is a regularization parameter used to trade off the training accuracy and the model complexity so that a superior generalization ability can be achieved. K is a kernel function, used to quantify the similarity between two samples.
Random Forest (RF) is an ensemble learning method that has been used for classification and regression. RF was developed by Breiman. The method works as a set of independent classifiers (usually decision trees), where every classifier picks up a particular pattern in a dataset, and majority voting is then used to determine the final class label. RF has several advantages over other learning methods; one of its important advantages is its robustness to noise and overfitting. A random selection of features is made at every node split when growing a tree. Random Forests build many classification trees: to classify a new object from an input vector, the input vector is run down each of the trees in the forest. Each tree gives a classification, and we say the tree "votes" for that class. The forest chooses the classification having the most votes over all the trees in the forest. The error rate of a random forest depends on two things:
- The correlation between any two trees in the forest: increasing the correlation increases the forest error rate.
- The strength of each individual tree in the forest: a tree with a low error rate is a strong classifier, and increasing the strength of the individual trees decreases the forest error rate.
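The final comparison step can be sketched as the same feature vectors fed to both classifiers. `train_feats`, `train_labels`, `test_feats` and `test_labels` are hypothetical arrays standing in for the PCANet/DeepFish output, and the hyperparameters are illustrative.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

svm = SVC(kernel="rbf", C=1.0).fit(train_feats, train_labels)        # SVM classifier
rf = RandomForestClassifier(n_estimators=100).fit(train_feats, train_labels)  # RF

for name, clf in [("SVM", svm), ("RF", rf)]:
    print(name, accuracy_score(test_labels, clf.predict(test_feats)))
```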
4. EXPERIMENTAL RESULTS
Figure 4 Fish Species used in this experiment
4.1. Datasets
The effectiveness of the architecture is evaluated using a dataset built from fish images from the Fish4Knowledge (F4K) Project [30]. This underwater live fish dataset is acquired from a live video dataset captured in the open sea. The proposed approach was evaluated using 509 images of 5 fish species; the dataset was split into a training set (484 images) and a testing set (25 images). Examples of the fish species are shown in Figure 4. The fish images come in various sizes, ranging from about 20x20 to about 200x200 pixels, and are resized to a common size. Images in the dataset vary in orientation, scaling and position; they also vary in color and texture. All images are resized to 47x47 for the DeepFish net.
4.2. Visualization of Pooling Layers
Figure 5. Feature maps generated by different pooling layers
To better understand the effect of the deep features, we visualized the feature maps of the pooling2 and pooling3 layers during testing, as shown in Fig. 5. Compared to pool3, pool2 is trained to produce a higher-level representation strongly related to the classification task, which needs to concentrate on the foreground region and filter out the background region. The feature maps of the pooling2 layer show several discrete blocks, indicating that these features are much more abstract than the earlier ones, which is important for the classification problem.
5. RESULTS AND DISCUSSION
5.1. Performance Metrics
Once our proposed framework has been constructed, the most significant question is how well it works; evaluation is the most important task, as it describes how good our predictions are. Here we examine some of the performance metrics used to assess our framework.
Table 1 Performance metrics for various techniques
Techniques            Accuracy  Sensitivity  Specificity  Precision  Recall  Fscore
Deep Fish+SVM         0.9733    0.9278       0.9846       0.8429     0.9278  0.8608
DeepFish+VGG+SVM      0.9947    0.95         0.9949       0.9946     0.95    0.9687
Deep Fish+RF          0.9627    0.6833       0.9651       0.9643     0.6833  0.7429
Deep Fish+VGG+RF      0.968     0.7167       0.9702       0.9685     0.7167  0.7881

1. Accuracy is the most intuitive performance measure: it is simply the ratio of correctly predicted fish classes to the total number of fishes. We obtained an accuracy of 97.33% with DF+SVM and 96.27% with DF+RF; DF+Vgg+SVM achieves the highest accuracy of 99.47%, and DF+Vgg+RF achieves 96.8%.
2. Precision describes how accurate our predictions are among those predicted positive, i.e. how many of them are actually positive. In our experiment, the precision for DF+SVM is 84.29%, DF+Vgg+SVM is 99.46%, DF+RF is 96.43% and DF+Vgg+RF is 96.85%. Of these, the proposed DF+Vgg+SVM outperforms all the other techniques.
3. Recall is the ratio of correctly predicted species to all observations in the actual classes. The obtained results were 92.78% for DF+SVM, 95% for DF+Vgg+SVM, 68.33% for DF+RF and 71.67% for DF+Vgg+RF. From the results we conclude that the recall of the proposed method is good compared with all the others.
4. Fscore may be a better measure when one needs a balance between precision and recall, and is usually more informative than accuracy in the case of an uneven class distribution. In our case the Fscore is 86.08% for DF+SVM, 96.87% for DF+Vgg+SVM, 74.29% for DF+RF and 78.81% for DF+Vgg+RF, so the proposed work achieves the highest Fscore. The results are listed in Table 1.
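A hedged sketch of how the metrics in Table 1 can be computed with scikit-learn; y_true and y_pred are toy placeholders for the test labels and the classifier output (e.g., from the DF+Vgg+SVM pipeline):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = ["speciesA", "speciesB", "speciesA", "speciesC"]   # toy ground truth
y_pred = ["speciesA", "speciesB", "speciesC", "speciesC"]   # toy predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
# Macro averaging treats every species equally, which matters for the
# uneven class distribution mentioned above.
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F-score  :", f1_score(y_true, y_pred, average="macro"))
```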
5.2. Comparison Graph
Figure 6 Comparison graph between Deep Fish SVM and Deep Fish VGG SVM
In Fig. 6 we compare two techniques, Deep Fish SVM and Deep Fish VGG SVM, plotting the performance metrics accuracy, sensitivity, specificity, precision, recall and F-score. It is observed that the F-score of the proposed work reaches a higher value than the conventional method. The F-score reaches its best value, meaning perfect precision and recall, at 1; the worst F-score, meaning lowest precision and lowest recall, is 0. From the F-score values, the proposed work performs better than the existing methods.
Figure 7 Comparison graph between Deep Fish RF and Deep Fish VGG RF
The chart in Fig. 7 compares Deep Fish RF and Deep Fish VGG RF. The performance metrics accuracy, specificity and precision give roughly the same values, while the F-score with VGG achieves a slightly higher value than the conventional method. From the comparison it is noted that our proposed method performs better than the existing techniques.
6. CONCLUSION
In this paper we propose a robust framework for underwater live fish recognition. To achieve specific localisation and produce more discriminative features, our method combines the pre-trained CNN model VGG16 with the Deep Fish architecture; for classification we use support vector machine and random forest classifiers. Experimental results show that the SVM classifier with VGG16 and the Deep Fish architecture achieved the best performance, with an accuracy of 99.47%.
REFERENCES
[1] Hongwei Qin, Xiu Li, Jian Liang, Yigang Peng, Changshui Zhang, "DeepFish: Accurate underwater live fish recognition with a deep architecture", Neurocomputing, 187:49-58, 2016.
[2] Y. Gan, J. Liu, J. Dong, and G. Zhong, "A PCA-based convolutional network", [Online]. Available: http://arxiv.org/abs/1505.03703, 2015.
[3] Spampinato, C., Chen-Burger, Y. H., Nadarajan, G., and Fisher, R. B., "Detecting, tracking and counting fish in low quality unconstrained underwater videos", The 3rd International Conference on Computer Vision Theory and Applications, 2008, 2:514-519.
[4] Spampinato, C., Giordano, D., Di Salvo, R., Chen-Burger, Y. H., Fisher, R. B., and Nadarajan, G., "Automatic fish classification for underwater species behavior understanding", Proceedings of the First ACM International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams, Firenze, Italy, 2010, 45-50.
[5] T. H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng, and Y. Ma, "PCANet: A simple deep learning baseline for image classification?", IEEE Trans. Image Process., vol. 24, no. 12, pp. 5017-5032, Dec. 2015.
[6] M. M. M. Fouad, H. M. Zawbaa, N. El-Bendary, and A. E. Hassanien, "Automatic nile tilapia fish classification approach using machine learning techniques", in Hybrid Intelligent Systems (HIS), 13th International Conference, 2013, pp. 173-178.
[7] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition", [Online]. Available: http://arxiv.org/abs/1409.1556, 2014.
[8] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition", [Online]. Available: http://arxiv.org/abs/1512.03385, 2015.
[9] T. Wiatowski and H. Bolcskei, "Deep convolutional neural networks based on semi-discrete frames", in Proc. IEEE ISIT, June 2015, pp. 1212-1216.
[10] Sun, X., Shi, J., Liu, L., Dong, J., Plant, C., Wang, X., and Zhou, H., "Transferring deep knowledge for object recognition in low-quality underwater videos", Neurocomputing, 275:897-908, 2018.
[11] Mao, X. J., Shen, C., and Yang, Y. B., "Image denoising using very deep fully convolutional encoder-decoder networks with symmetric skip connections", Eprint Arxiv, No. 1603.09056, 2016.
[12] Pan, S. J., and Yang, Q., "A survey on transfer learning", IEEE Transactions on Knowledge and Data Engineering, 22:1345-1359, 2010.
[13] Blanc FPK, Lingrand D, "Fish species recognition from video using SVM classifier", in LifeClef'14 Proceedings, http://www.imageclef.org/2014/lifeclef/fish, 2014.
[14] Boom BJ, He J, Palazzo S, Huang PX, Beyan C, Chou H-M, Lin F-P, Spampinato C, Fisher RB, "A research tool for long-term and continuous analysis of fish assemblage in coral-reefs using underwater camera footage", Ecological Informatics, 23(0):83-97, 2014.
[15] Boureau Y, "Learning hierarchical feature extractors for image recognition", Ph.D. dissertation, New York University, 2012.
[16] Huang P, Boom B, Fisher R, "Underwater live fish recognition using a balance-guaranteed optimized tree", in Computer Vision ACCV 2012, Springer, Daejeon, Korea, 2013, pp. 422-433.
[17] Huang P, Boom B, Fisher R, "Hierarchical classification with reject option for live fish recognition", Mach. Vis. Appl., 26(1):89-102, 2015.
[18] Torralba A, Fergus R, Freeman WT, "80 million tiny images: a large data set for non-parametric object and scene recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(11):1958-1970, 2008.
[19] F. Perronnin, J. Sánchez, T. Mensink, "Improving the fisher kernel for large-scale image classification", in: Computer Vision - ECCV, Springer, Heraklion, Crete, Greece, 2010, pp. 143-156.
[20] K. Kavukcuoglu, P. Sermanet, Y.-L. Boureau, K. Gregor, M. Mathieu, Y. L. Cun, "Learning convolutional feature hierarchies for visual recognition", in: Advances in Neural Information Processing Systems, Vancouver, Canada, 2010, pp. 1090-1098.
[21] N. Strachan, P. Nesvadba, A. R. Allen, "Fish species recognition by shape analysis of images", Pattern Recognit., 1990, 539-544.
[22] N. Strachan, "Recognition of fish species by colour and shape", Image Vis. Comput., 11, 1993, 2-10.
[23] D. White, C. Svellingen, N. Strachan, "Automated measurement of species and length of fish by computer vision", Fish. Res., 80, 2006, 203-210.
[24] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks", in Proc. NIPS, 2012, pp. 1097-1105.
[25] Wang Xinhua, Ouyang Jihong, Li Dayu, and Zhang Guang, "Underwater Object Recognition Based on Deep Encoding-Decoding Network", Germany, 2019.
[26] C. Spampinato, S. Palazzo, P. H. Joalland, "Fine-grained object recognition in underwater visual data", Springer Science+Business Media, New York, 2015.
[27] D. J. Lee, R. B. Schoenberger, D. Shiozawa, X. Xu, P. Zhan, "Contour matching for a fish recognition and migration-monitoring system", in: Optics East, International Society for Optics and Photonics, 2004, pp. 37-48.
[28] R. Larsen, H. Olafsdottir, B. K. Ersbøll, "Shape and texture based classification of fish species", Image Anal., 2009, 745-749.
[29] P. X. Huang, B. J. Boom, R. B. Fisher, "GMM improves the reject option in hierarchical classification for fish recognition", in: 2014 IEEE, Steamboat Springs CO., USA, 2014, pp. 371-376.
[30] B. J. Boom, P. X. Huang, J. He, R. B. Fisher, "Supporting ground-truth annotation of image datasets using clustering", in: 2012 21st International Conference on Pattern Recognition (ICPR), IEEE, Tsukuba, Japan.
Design Engineering ISSN: 0011-9342 | Year 2021 | Issue: 8 | Pages: 16652-16665
Bi-Level Fish Detection Using Novel Iterative Grouped Convolutional Neural Network
1R. Fathima Syreen
Research Scholar, Reg.no-18121192282003, Sarah Tucker College,
Manonmaniam Sundaranar University, Tirunelveli, India
2Dr. K. Merriliance
Associate Professor, Dept. of MCA, Sarah Tucker College, Tirunelveli, India
Abstract
In the marine environment, submerged cameras can be used for a variety of purposes.
In any case, we need to design equipment that can automatically detect and categorize species
captured on film in order to efficiently cope with the huge amount of data generated.
Significant differences in scenes owing to inadequate lighting, direction of fish due to
different movements, seabed structures, development of oceanic plants behind the scenes, and
picture diversity in the shape and texture of fish of various species should all be considered
while developing such a method. It is well known that grouping the convolutional filters in convolution layers reduces the number of parameters of a convolutional neural network (CNN). Convolutions can be replaced with group convolutions when deploying an advanced neural network on limited hardware; a significant memory saving is achieved with minimal loss of accuracy. Nevertheless, current implementations of grouped convolutions in modern deep learning frameworks are far from performing optimally in terms of speed. To detect fish species, we present a bi-level system. Foreground fish candidates are identified using a hybrid fusion of optical flow and VGG16 at level one; consequently, fewer non-fish entities are mistakenly classified as fish, increasing the detection accuracy of the system as a whole. An Iterative Grouped Convolution Network (IGCNet) is then used to separate all candidate regions into fish and non-fish entities. The LifeCLEF 2015 fish dataset, retrieved from the massive Fish4Knowledge underwater video repository, is used as a benchmark for our bi-level approach; a detection accuracy of 94.05 percent was found in our investigation.
Keywords: Convolutional Neural Network, bi-level system, optical flow, vgg16, object
detection, IGCNet
1. Introduction
The emphasis of research has shifted away from the land and toward the sky and the
sea. Robots are used for a variety of tasks that living organisms find difficult to perform, such
as underwater exploration and space exploration. It is useful to have a framework ready to
detect, localise, and classify fish and fish species from underwater images in certain marine
applications, such as fish checking or commercial fisheries. Despite this, underwater image
recognition is challenging due to environmental factors. Underwater images are poorly illuminated because water significantly attenuates the light, and they also contain a lot of noise.
Furthermore, because light is absorbed or dissipated by floating matter in the water,
underwater images are hazy. The underwater recordings are of poor quality, and the
background is complicated. Manual processing of video sequences by species experts is a time-consuming and expensive process that can result in fatigue errors. The main issue, and a very difficult task for underwater imagery analysis, is visual content recognition. Intra-class variability affecting the visual content includes views, scales, illumination, and non-rigid deformation. As with digitised images, the visual content becomes more uncertain as water depth increases, particularly for species identification and classification.
One of the most important and difficult aspects of research vision is recognizing and
limiting objects in images. Object discovery, for example, is used in security monitoring,
autonomous driving, drone scene analysis, and robotic vision to find instances of semantic
objects of a specific class. Object detection is defined as determining which class the object
belongs to and recognising the area of the object by outputting its position (Object
Localization). Single class object detection is the detection of a single instance of a class from
an image, whereas multi class object detection is the detection of the classes of all objects in
the image.
Object detection faces a number of difficulties, such as partial/complete occlusion,
varying lighting conditions, poses, scale, and so on. As a result, the pipeline of traditional
object location models consists of three stages: illumination region selection, feature
extraction, and classification. First, the detector should be able to distinguish between
foreground and background objects and label them appropriately. Second, the detector should
address the problem of object localization by assigning precise bounding boxes to various
objects. Both are relatively difficult because the detector encounters a high number of "close"
false positives, which correspond to "close but incorrect" bounding boxes. The detector
should be able to detect true positives while suppressing false positives in the vicinity. Fish
Detection is a technique for distinguishing between fish and non-fish objects in underwater
videos. Ocean grass beds, coral reefs, sea-going plants, sessile spineless creatures such as
sponges, ascidians, and gorgonians, as well as general background, are examples of non-fish
objects. Recognizing fish in murky ocean water is the most difficult interaction. Machine
learning algorithms have previously been used to solve the problem in underwater videos.
Because of the variety of lighting conditions, poor picture quality and noise, and changes in direction and size of the object of interest in the picture, it is difficult to separate fish in underwater videos/pictures. Because of advances in neural networks, particularly deep
learning, these methods of visual identification have been enormously successful. Deep
learning, also known as deep machine learning, has made significant advancements in digital
image processing for object detection and classification. Deep learning models have been
widely used in computer vision, including general object detection and domain-specific
object detection. CNNs, which serve as the foundation of object detection methods, are
extremely useful for learning the features used in object recognition.
Multilayer deep neural networks allow us to extract distinct, invariant, and powerful
fish-dependent features from images that have distortions and variability. We propose a bi-
level framework for identifying fish in vastly diverse and unconstrained submerged
environments, followed by a final detection phase using deep CNN IGCNet in this paper. The
motion-based feature extraction is used by the combination framework. Our research aims to
identify rapidly moving fish in dynamic backgrounds by combining motion indicators in
highly complex conditions with deep CNNs such as IGCNet. This method improves detection
accuracy, particularly for data with high ecological inconstancy, such as unconstrained
submerged fish recordings. IGCNet receives ROIs for foreground fish candidates derived
from the fusion of Optical flow and VGG16, which aids in the suppression of false detection
by precisely identifying the fish in an incredibly contrastive background.
Figure 1: Samples of underwater images with different and complex
background.
2. Related works
To save time and money, marine researchers and managers are looking for non-destructive and automated methods for fish sampling [2]. Several approaches, including non-destructive sampling and automatic fish detection and species classification in underwater videos, have been used [3][4].
Due to their lack of precision, these methods present a number of challenges. The impediments include poor light conditions and turbidity in the water, occlusions and jitter in the imagery, similarity in shape and texture among various fish species, moving ocean plants, and background clutter. Previously, automatic fish detection was accomplished through a variety of image and video processing algorithms. Using image processing and pattern recognition-based methods for unconstrained underwater fish classification [5][6] is one such approach. The temperature of a fish can also be used to determine its location [7]. To model background pixels in video frames, Gaussian Mixture Models (GMM) were used as a flexible background subtraction technique. When a specific region of a frame does not match the background model distribution, GMM detects movement in the video frames (most likely caused by fish). On a
dataset of several underwater videos recorded near the Southern Taiwan district, they achieved promising results on the fish detection task with an average success rate of 83.99 percent.
Another comparable method for modelling the covariance of the background and foreground (fish events) in the video frames is to use colour and scale texture features. This approach achieved a detection precision of 78.01 percent on four underwater videos with high contrast and illumination variations, strong background object movements, dynamic textures, and dense background instances.
Optical flow is another movement-based method that has been used in recent years for
submerged fish identification, as well as in [9], where they proposed robot fish motion-
tracking control using optical flow as an object detecting algorithm in the controlled
aquarium environment. The use of Deep Neural Networks (DNNs) in fish identification and
species classification tasks has recently increased. Convolution Neural Network (CNN) on
Fish4Knowledge LifeCLEF 2014 and LifeCLEF 2015 fish images is also compared to other
AI strategies. An algorithm for fish recognition and grouping based on YOLO (You Only
Look Once), a CNN variant [11], has been developed. Their work included the creation of a
YOLO network based on 829 images and the identification of 93 percent of fish species
based on 100 images. [12] In another study, YOLO was used to distinguish fish cases from
each other in three different datasets, with a mean average precision of 53.92 percent, while
YOLO with a parallel correlation filter was used to identify and track submerged fish
instances. Using two datasets of submerged fish, they demonstrated re-enactment results and
proposed the effectiveness of their fish tracker.
Once again, an improved deep learning framework with pre- and post-processing for
improved precision has been proposed for automatic fish identification and species
recognition on the LifeCLEF 2015 dataset [14] [15,16]. In their analysis, SSD [13] was used
to separate the foreground fish from the background fish. They obtained an F-score of 83.8
when they used CNN's ResNet-10 [15] classifier. LifeCLEF 2015 data was once again
incorporated into the GoogleNet CNN-based fish location and classification framework. They
received an F-score of 84.8 percent on the task of species classification [16].
YOLO, a deep neural network, is used to detect fast-moving fish in dynamic
environments using motion detectors in highly complex environments. They are then used to
supplement YOLO when it is unable to recognise fish due to cloudiness in the water or
because they are moving against a highly contrasted background due to their shape being
obscured by a haze.
Another advantage is that by detecting fish that are either stationary or moving
slowly, YOLO helps to suppress the fault alerts of GMMs and optical stream movement
locators [17]. For object location, a Convolutional Neural Network (CNN) called Fast R-
CNN was used, which effectively ignores background noise [18] the region recommendations
made by AlexNet as data sources, and achieves a mean normal exactness of 81.4 percent.
A CNN such as AlexNet, on the other hand, which has three fully connected layers and five convolutional layers, is pre-trained using a different method [19]. ImageNet's 1000-class dataset is used for pre-training, and the weights are then transferred to and fine-tuned on the Fish4Knowledge dataset. Each class has 50 images, and the Fish4Knowledge database has 10 classes. Image de-noising is used in the pre-
processing of Fish4Knowledge images, and the accuracy achieved on 1420 test images is
85.08 percent. Fish4Knowledge has so far achieved 98.64 percent detailed precision, which
was achieved by first using channels to remove the background from the first images, and
then using a CNN with a Support Vector Machine (SVM) for classification [20]. DeepFish is
the name of this approach, which consists of three standard convolution layers and three fully
connected layers. Previously, the images were usually pre-processed to remove as much noise
as possible and, in particular, to outline the shape of the fish [19, 20].
In general, previous methods included a preprocessing step to remove as much noise
from the designated image as possible and highlight the fish's outline [19, 20]. Although this
strategy can improve system performance, its methodology should be carefully tuned, as it
could result in the loss of valuable data and a negative impact on system performance. As a
result, different species may have distinct living areas that can be seen in the background.
During pre-processing, removing the species' background may result in the loss of useful
information.
To get the most out of the data and keep the results unaffected by background noise,
we need to use a robust methodology that can withstand noise and accommodate variety in
classification [21]. These techniques are known for training their networks end to end. We propose a deep learning-based alternative strategy for detecting and classifying temperate fish.
For example, underwater camera images and recordings have been used to detect fish, and a CNN with the recent Squeeze-and-Excitation (SE) design has been examined to classify the data in question.
The YOLO method was used to locate the fish, and once identified, the algorithm classified
the fish to its specific species. Because the Fish4Knowledge dataset only contains exotic fish
species, we gathered another dataset of temperate fish species for this investigation. To
classify fish, a pre-training stage on the Fish4Knowledge dataset was used to prepare the network to learn generic fish features. Following that, the learned weights were used as the starting point for additional training on the recently gathered dataset of temperate fish species, a process known as post-training. This two-step training procedure is known as transfer learning [23].
3. Materials and Methods
3.1. Dataset
Our proposed method has been validated using LifeCLEF 2015 (hereinafter called LCF-15) as a benchmark. The LCF-15 dataset [24] contains 93 annotated videos covering 15 distinct species; a few video frames from LCF-15 are shown in Figure 1. This dataset was drawn from Fish4Knowledge, a larger repository of underwater videos. Over a five-year period, Fish4Knowledge gathered more than 700,000 unconstrained underwater videos captured with fixed cameras in one of the most biodiverse places on the planet, home to more than 3000 distinct fish species. Marine experts use LCF-15 to benchmark video-based fish detection and species classification methods. Nine classes were chosen for this study, including species such as AmphiprionClarkii, ChaetodonTrifascialis, ChromisChrysura, HemigymnusMelapterus, MyripristisKuntee, Plectrogly-PhidodonDickii and Zebrasoma Scopas. In total there are 14,765 annotated video frames containing different fish instances, of which our proposed work uses 8,968 images: 70% (6,278 images) for training and 30% (2,690 images) for testing.
3.2. Proposed Method
This section explains the proposed bi-level system for underwater object detection. Figure 2 depicts the flowchart of the bi-level system. Level one identifies the regions of interest (ROIs) for foreground fish candidates from the combined fusion of optical flow and VGG16. The obtained regions of interest are fed into the Iterative Grouped Convolution Network (IGCNet) to categorize fish and non-fish entities at the second level. The following subsections describe optical flow, VGG16 and IGCNet.
Figure 2: Flow chart of the proposed bi-level method (input images → optical flow and VGG16 → combined fusion → Iterative Grouped Convolution Network → object detection: fish / non-fish)
3.2.1 Optical Flow
Optical flow is a powerful idea that has been used to improve video classification
accuracy while requiring less computational power to perform. It has been around since 1980
as a hand-crafted approach. Optical flow detects motion patterns generated by moving objects
in consecutive video frames and captures even minor movements, reducing missed
detections. However, it is vulnerable to false motion detection when it comes to aquatic plants that move in the background. The optical flow features in the
underwater videos are created entirely by motion. When objects move in three dimensions,
optical flow generates a two-dimensional motion vector [24]. Optical flow can be estimated
in a variety of ways.
Optical flow is a per-pixel prediction that assumes brightness constancy: it attempts to estimate how the brightness of the pixels moves across the screen over time. It is assumed that I(x, y, t) = I(x + Δx, y + Δy, t + Δt). In layman's terms, this means that the pixel characteristics at time t (i.e. RGB values) are the same at time t + Δt, but at a different location (denoted by Δx and Δy), where the change in location is predicted by the flow field. The intensity of a pixel is evaluated by inspecting successive frames. Apart from fishes, such an analysis is prone to false movement detection when applied to a dynamic background with moving aquatic plants and unexpected luminosity variation due to disturbance at the water surface.
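A minimal sketch of dense optical flow between two consecutive frames using OpenCV; Farneback's method is one common estimator and stands in here for whichever optical-flow variant the authors actually used, and the frame file names are placeholders:

```python
import cv2
import numpy as np

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frames
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# flow[y, x] is the 2-D motion vector (dx, dy) estimated for each pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2,
                                    flags=0)

magnitude, _angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
# Threshold on motion magnitude to obtain candidate foreground (fish) blobs.
motion_mask = (magnitude > 1.0).astype(np.uint8) * 255
```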
3.2.2 VGG16
The architecture of VGG16 is shown in Figure 3.
Figure3: Architecture of VGG 16
The input to this network is a fixed-size RGB image of 224 x 224 pixels. Each convolutional layer uses a small receptive field of 3x3 (the smallest size able to capture the notion of left/right, up/down and centre) for each channel. One of the configurations also uses a 1x1 convolution channel, which is a linear transformation of the input channels (followed by a non-linearity). After convolution, the spatial resolution is preserved by 1-pixel padding of the convolution layer input. Spatial pooling is carried out by five max-pooling layers, which follow some of the conv. layers (not all conv. layers are followed by max-pooling); max-pooling is performed over a 2x2 pixel window with stride 2. Since the VGG-16 net has a number of different pooling layers, we separate the objectness map created by each layer (pool2, pool3, etc.). In contrast to the objectness maps created by pool3 and pool4, the feature map generated by the pool2 layer has a higher response on background regions; hence it makes more sense to derive the objectness map using pool2. Pool2 has a size of 512x7x7. Instead of simply selecting image patches relevant to fish that contain vast regions of complex background, the network makes use of global average pooling in the CNN to localise the fish. This is followed by three fully-connected (FC) layers, the first two with 4096 channels each, while the third performs 5-way classification (one output for each class) instead of 1000-way. Lastly, there is a softmax layer. As a general rule, all the networks have a similar design for their fully connected layers, and the rectification (ReLU) non-linearity is applied to all hidden layers.
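A sketch of pulling pooling-layer feature maps out of a pre-trained VGG16 with torchvision, assuming PyTorch; note that the pool shapes printed here follow torchvision's own layer numbering (pool2 is 128x56x56 on a 224x224 input), which may differ from the pool indexing used in the text above:

```python
import torch
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

# Indices of the five MaxPool2d layers inside vgg.features.
pool_idx = [i for i, m in enumerate(vgg.features)
            if isinstance(m, torch.nn.MaxPool2d)]

x = torch.randn(1, 3, 224, 224)           # placeholder for a preprocessed frame
feature_maps = []
with torch.no_grad():
    for i, layer in enumerate(vgg.features):
        x = layer(x)
        if i in pool_idx:
            feature_maps.append(x)        # pool1 ... pool5 activations

for k, fm in enumerate(feature_maps, 1):
    print(f"pool{k}: {tuple(fm.shape)}")  # e.g. pool2 -> (1, 128, 56, 56)
```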
3.2.3 Image fusion
To generate a hypothetical RGB image, the blobs representing the foreground fish candidates extracted by VGG16 and by optical flow from the input video frames are combined. The green and red channels are filled with the blobs from VGG16 and optical flow respectively, while the third, blue channel is left black. As a result of this method, the regions of interest (RoIs) of the foreground fish candidates appear in the resulting image. Our novel network IGCNet then processes this image in order to differentiate between fish and non-fish entities.
Table 1 describes our new Iterative Grouped Convolution Network (IGCNet) for fish detection, shown in Figure 4. The network has a 46-layer architecture: eight convolutional layers, two fully connected layers, a softmax and finally a classification layer. Batch normalization is performed immediately after each convolution, and a ReLU is applied every time a batch is normalized. After each pooling, the data is split and aggregated twice more. The pooling layer is used to reduce the dimensionality of the data and refine the convolution output features. Convolution is performed separately for each group to reduce the number of computations and parameters to be learned.
Figure 4: Architecture of Iterative Grouped Convolution Network (IGCNet)
Table 1: Description of the IGCNet architecture (each convolutional layer is followed by batch normalization B and a ReLU layer R)

Conv1        5×5, 64, stride 1    |  Group conv1      [(3,3,1), c=64] × 3
MaxPooling   3×3, stride 2        |  Group conv3
Conv2        5×5, 64, stride 1    |  Group conv2      [(5,5,1), c=64] × 3
Conv3        3×3, 64, stride 1    |  Group conv4
Conv4        3×3, 64, stride 2    |  Average Pooling  3×3, stride 1
Conv5        5×5, 128, stride 1   |  Conv7            3×3, 64, stride 3
Conv6        3×3, 128, stride 1   |  Conv8            3×3, 64, stride 3
3.2.4 Iterative Grouped Convolution Network(IGCNet)
Grouped convolution is the process of using different sets of convolution filter groups
on the same image. This method is different from regular convolution in that the input
channels are divided up and a regular convolution is performed on each partition separately.
To reduce the computation budget and parameter cost, convolutions are performed
independently within each group. Due to its efficient compact structure and suitability for
mobile and embedded applications, group convolution has been gaining in popularity as a
way to compress and accelerate deep neural network architectures. The idea is to build a deep network with a certain number of layers and then replicate it so that there are multiple paths for convolutions on a single image.
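A short sketch of grouped convolution in PyTorch: with groups=g the 64 input channels are split into g partitions, each convolved independently, cutting the parameter count by roughly a factor of g:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)

standard = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=1)
grouped  = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=4)

print(sum(p.numel() for p in standard.parameters()))  # 36,928 weights + biases
print(sum(p.numel() for p in grouped.parameters()))   # 9,280 (~4x fewer)
print(grouped(x).shape)                               # torch.Size([1, 64, 56, 56])
```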
When IGCNet is trained, it is fed input images of dimensions 224x224x3. A 64-kernel 5x5x3 convolution layer with a stride of 1 pixel is applied to the input image. Batch normalization is used extensively throughout the model, including on activation inputs. This is followed by 3x3 max pooling with stride 2 before the blocks are divided. The result of the max pooling, 111x111x64, is then fed to group convolution1 and group convolution2 in separate blocks. A 64-channel group convolution is performed with 3 groups, using 3x3x1 and 5x5x1 kernels. Both blocks have a standard convolutional layer after the group convolution, and each convolutional layer (standard or group) is followed by batch normalization and a ReLU layer. To obtain the feature vector, the outputs of the two blocks are added element by element. The output of this summation step is fed to convolutional layer 8 (3x3x64, stride 3) followed by a ReLU layer, after which average pooling is performed. This is followed by a splitting operation as in the previous block. For the final image representation, the split-and-merge results are combined with the output of convolution layer 8. The result is then passed to a fully connected layer and a softmax. The classification layer has two outputs: one for fish and one for non-fish entities.
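A loose sketch (not the authors' exact code) of the split / group-convolve / merge pattern that IGCNet iterates, assuming PyTorch; the 3x3 and 5x5 kernels per group follow Table 1, but the channel count must be divisible by the group count in PyTorch, so this sketch uses 4 groups where the paper's Table 1 suggests 3:

```python
import torch
import torch.nn as nn

class SplitMergeBlock(nn.Module):
    def __init__(self, channels: int = 64, groups: int = 4):
        super().__init__()
        def branch(k):
            # group convolution -> batch norm -> ReLU, as described above
            return nn.Sequential(
                nn.Conv2d(channels, channels, k, padding=k // 2, groups=groups),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True))
        self.branch3 = branch(3)   # 3x3 grouped path
        self.branch5 = branch(5)   # 5x5 grouped path

    def forward(self, x):
        # Element-wise summation merges the two paths into one feature map.
        return self.branch3(x) + self.branch5(x)

block = SplitMergeBlock()
print(block(torch.randn(1, 64, 111, 111)).shape)  # torch.Size([1, 64, 111, 111])
```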
Figure 5: Output images from each step of the bi-level system for fish detection (original image, optical flow image, pooling image, fusion image, output from IGCNet)
4. Results and Discussions
4.1. Performance Analysis
The accuracy, sensitivity, specificity, precision, recall, and F1-score of the proposed bi-level system are used to assess its performance. The measures in Table 2 show that the proposed algorithm outperforms the existing algorithms.
The F-score is calculated as

F = 2 × (Precision × Recall) / (Precision + Recall)

where

Precision = TP / (TP + FP)

and

Recall = TP / (TP + FN)

with TP, FP and FN the numbers of true positives, false positives and false negatives.
These scores are based on the overlap of the areas of bounding boxes associated with
ground truths and detected fish. The proposed bi-level system achieved an overall detection
accuracy of 94.05 percent. Table 2 shows the proposed algorithm's performance analysis.
Table 2: Performance Analysis
Bi-Level system
Accuracy 94.05%
Sensitivity 92.19%
Specificity 99.49%
Precision 99.84%
Recall 92.19%
F1Score 95.86%
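Since these scores rest on bounding-box overlap, a minimal sketch of the usual intersection-over-union (IoU) criterion, assuming boxes are (x1, y1, x2, y2) tuples (the source does not state its exact overlap threshold; 0.5 is the common convention):

```python
def iou(a, b):
    # Intersection rectangle of the two boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

# A detection typically counts as a true positive when IoU >= 0.5.
print(iou((10, 10, 50, 50), (30, 30, 70, 70)))  # 0.1428...
```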
4.2. Comparison with competitive approaches
We perform fish detection using the bi-level system on the LCF-15 dataset and compare our method with some recent algorithms that use convolutional neural networks for object detection. The comparison results are shown in Figure 6 and Table 3. The accuracy and performance of our bi-level system are compared with Ahsan Jalal et al. [17], Inception-V3, ResNet-50, Inception-ResNet-V2, CNN-SENet, and CNN-SENet without the Squeeze-and-Excitation network, all of which are very recent developments in the deep learning field. Our bi-level system achieves an accuracy of 94.05%, improving accuracy by 11% compared to CNN-SENet without Squeeze-and-Excitation [26] and slightly better than Ahsan Jalal et al. [17]. The proposed bi-level method thus shows 94.05% accuracy, which is higher than the other algorithms.
Table 3: Accuracy comparison of various methods
Method Accuracy
Ahsan Jalal (et al.,) [17] 91.64%
Inception-V3 88.45%
ResNet-50 90.20%
Inception-ResNet-V2 82.39%
CNN-SENet [26] 87.74%
CNN-SENet without Squeeze and Excitation [26] 83.55%
Bi-level system 94.05%
Figure 6: Accuracy chart with comparison of all methods
5. Conclusion
We developed a bi-level architecture that combines optical flow, VGG16, and the Iterative Grouped Convolution Network (IGCNet) to detect and localise fish instances in unconstrained underwater videos with varying degrees of scene complexity. The main
contribution of this work is that it detects fish in complex backgrounds using a hybrid
approach that combines VGG16 and optical flow outputs. This fusion result is fed into the
novel IGCNet, which is used to fine-tune fish detection in the presence of non-fish entities.
This, as confirmed by the comparative study, helped to achieve cutting-edge results for the
fish detection task. In comparison to traditional computer vision and machine learning
techniques, the proposed hybrid system necessitates more computational resources but
achieves greater accuracy. Because of the advent of fast microprocessors and GPUs, deep
neural networks such as CNN can now be used for tasks that require near-real-time
processing. We believe that this research will assist fisheries scientists in adopting automatic
approaches for detecting, classifying, and tracking fish fauna in non-destructive sampling.
Furthermore, we plan to use a unified deep architecture capable of processing video
sequences in real-time in the future, thanks to rigorous optimization of our algorithm and
improved mathematical modelling. This type of setup will allow for fish detection as well as
species classification, allowing for effective fish fauna sampling.
References
[1] Jennings, S., Kaiser, M.J., 1998. The effects of fishing on marine ecosystems. In:
Advances in Marine Biology. Vol. 34. Elsevier, pp. 201–352.
[2] McLaren, B.W., Langlois, T.J., Harvey, E.S., Shortland-Jones, H., Stevens, R., 2015.
A small no take marine sanctuary provides consistent protection for small-bodied by-
catch species, but not for large-bodied, high-risk species. J. Exp. Mar. Biol. Ecol. 471,
153–163.
[3] Harvey, E., Shortis, M., 1995. A system for stereo-video measurement of sub-tidal
organisms. Mar. Technol. Soc. J. 29 (4), 10–22.
[4] Shortis, M., Abdo, E.H.D., 2016. A review of underwater stereo-image measurement
for marine biology and ecology applications. In: Oceanography and Marine Biology.
CRC Press, pp. 269–304.
[5] Rova, A., Mori, G., Dill, L.M., 2007. One fish, two fish, butterfish, trumpeter:
recognizing fish in underwater video. In: MVA, pp. 404–407.
[6] Spampinato, C., Giordano, D., Di Salvo, R., Chen-Burger, Y.-H.J., Fisher, R.B.,
Nadarajan, G., 2010. Automatic fish classification for underwater species behavior
understanding. In: Proceedings of the First ACM International Workshop on Analysis
and Retrieval of Tracked Events and Motion in Imagery Streams, ACM, pp. 45–50.
[7] Hsiao, Y.-H., Chen, C.-C., Lin, S.-I., Lin, F.-P., 2014. Real-world underwater fish
recognition and identification, using sparse representation. Ecol. Inform. 23, 13–21.
[8] Palazzo, S., Murabito, F., 2014. Fish species identification in real-life underwater
images. In: Proceedings of the 3rd ACM International Workshop on Multimedia
Analysis for Ecological Data. ACM, pp. 13–18.
[9] Shin, K.J., 2016. Robot fish tracking control using an optical flow object-detecting
algorithm. IEIE Trans. Smart Process. Comput. 5 (6), 375–382.
[10] Salman, A., Jalal, A., Shafait, F., Mian, A., Shortis, M., Seager, J., Harvey, E., 2016.
Fish species classification in unconstrained underwater environments based on deep
learning. Limnol. Oceanogr. Methods 14 (9), 570–585.
[11] Sung, M., Yu, S.-C., Girdhar, Y., 2017. Vision based real-time fish detection using
convolutional neural network. In: OCEANS 2017-Aberdeen. IEEE, pp. 1–6.
[12] Xu, W., Matzner, S., 2018. Underwater fish detection using deep learning for water
power applications. In: arXiv preprint arXiv:1811.01494.
[13] Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.,
2016. Ssd: Single shot multibox detector. In: European Conference on Computer
Vision. Springer, pp. 21–37.
[14] Zhuang, P., Xing, L., Liu, Y., Guo, S., Qiao, Y., 2017. Marine Animal Detection and
Recognition with Advanced Deep Learning Models, Working Notes of CLEF 2017.
[15] Hariharan, B., Girshick, R.B., 2017. Low-shot visual recognition by shrinking and
hallucinating features. In: ICCV, pp. 3037–3046.
[16] Choi, S., 2015. Fish identification in underwater video with deep convolutional
neural network: Snumedinfo at lifeclef fish task 2015. In: CLEF (Working Notes).
[17] Ahsan Jalal, Ahmad Salman, Ajmal Mian, Mark Shortis, Faisal Shafait, 2020. Fish detection and species classification in underwater environments using deep learning with temporal information. Ecological Informatics 57 (2020) 101088.
[18] X. Li, M. Shang, H. Qin, L. Chen, “Fast accurate fish detection and recognition of
underwater images with fast r-cnn “ in OCEANS’15 MTS/IEEE Washington (IEEE,
2015), pp. 1–5
[19] L. Jin, H. Liang, “Deep learning for underwater image recognition in small sample
size situations.” in OCEANS 2017-Aberdeen (IEEE, 2017), pp. 1–4.
[20] H. Qin, X. Li, J. Liang, Y. Peng, C. Zhang,” DeepFish: Accurate underwater live fish
recognition with a deep architecture”, Neurocomputing 187, 49 (2016)
[21] M. Pedersen, J.B. Haurum, R. Gade, T.B. Moeslund, “Detection of Marine Animals
in a New Underwater Dataset with Varying Visibility” in CVPR Workshops (2019)
[22] J. Yosinski, J. Clune, Y. Bengio, H. Lipson, CoRR “How transferable are features in
deep neural networks?” abs/1411.1792 (2014)
[23] Warren, D.H., Strelow, E.R., 2013. Electronic Spatial Sensing for the Blind:
Contributions from Perception, Rehabilitation, and Computer Vision. Vol. 99
Springer Science & Business Media.
[24] LifeClef2015 dataset link: https://www.imageclef.org/lifeclef/2015/fish
[25] Knausgård, K.M., Wiklund, A., Sørdalen, T.K. et al. Temperate fish detection and classification: a deep learning based approach. Appl. Intell. (2021). https://doi.org/10.1007/s10489-020-02154-9
www.jespublication.com | Vol 11, Issue 6, June 2020 | ISSN NO: 0377-9254
IMPLEMENTATION OF DEEP LEARNING TECHNIQUES TO
ADDRESS CORAL REEF DISEASES M. H. Ibrahim1 and M. Mohamed Sathik2
1Research Scholar, Sadakathullah Appa College, Tirunelveli, Tamilnadu, India
2Principal and Research Supervisor, Sadakathullah Appa College, Tirunelveli, Tamilnadu, India
Abstract— Coral reefs are an important and unavoidable attribute of the marine ecosystem and play a major role in keeping the ecosystem balanced. Coral reefs provide nutrition and proteins to fishes such as tuna, dolphin and other pelagic species. But due to industrial pollution, human-related activities, climate change and bleaching, the total amount of coral reefs has drastically reduced. According to the report of the Global Coral Reef Monitoring Network (GCRMN), around 19% of the world's coral reefs have been lost in the last 30 years, and another 17% is expected to be lost by 2030. It has also been predicted that by 2050 all the coral reefs in the world will be in danger. To address this alarming issue, this research work proposes a deep convolution neural network machine learning technique to identify infected or diseased corals from videos. Two datasets are used here. One, downloaded from the Mendeley open-source archive, is used to train the neural network to predict the type of the coral reef; the second is used to predict whether a particular coral is affected by white plague disease or not. The second, video dataset is downloaded from the official BBC Earth YouTube channel in HD 720p. These data are given as input to the Deep Convolution Neural Network (DCNN) to perform image classification. The proposed system automatically predicts diseases in coral reefs and helps in improving the coral ecosystem.
Keywords— Deep learning; Coral reefs;
Disease prediction; Coral ecosystem; neural
networks.
1. INTRODUCTION
Coral reefs are living, colorful, multi-faceted structures that are part of the marine ecosystem. Coral reefs are formed from calcium carbonate and are mostly found near the equator. They live in warm water at temperatures between 20 and 28 degrees Celsius and create colony-like structures to shelter species like fishes, crabs, sea stars, shrimps and algae [1]. It is estimated that around one lakh different species live among the coral reefs, which also provide nutrition and proteins to the species living in the ocean. Moreover, coral reefs help human beings by providing coral foods, shoreline protection and medicines. According to a recent report from the National Oceanic and Atmospheric Administration, the annual economic earnings of coral reef mining are about 30 billion US dollars.
Coral reefs host the greatest variety of life of all marine ecosystems. There are more than 2500 species of coral reefs in the world. Around 40 percent of corals are hard in nature: they have no motion or movement in the ocean and look like hard rock. The remaining 60 percent of corals are soft in nature [2]; they have flexible skeletons that allow motion in the water. Although they give several advantages to the ecosystem and to human beings, coral reefs are now in great danger due to industrial pollution, overfishing, destructive fishing and climate change. In some places on Earth the coral reefs are entirely destroyed, and in many places they are in an endangered situation.
Diseases spreading on the coral reefs are also an important reason for the degradation, including vibrio, white syndrome, white band and rapid wasting disease. Of these, white syndrome and white band are the most widespread diseases in the world. These diseases form a white-colored foam on the surfaces of the coral reefs and exhibit a pronounced division between the remaining coral tissue and the exposed coral skeleton. White-layer diseases such as white band and white syndrome can be categorized into Type I and Type II. Type I diseases do not show any bleaching on the upper layer of the coral reefs but change the color slightly, whereas Type II diseases bleach the surface completely [3].
In this research work, we intend to create a framework to identify coral types and their diseases automatically by applying deep learning techniques to a coral reef video dataset. Here we have used a Deep Convolution Neural Network (DCNN) to identify the coral type and its diseases.
A Deep Convolution Neural Network (DCNN) is a type of neural network that performs both forward and backward propagation to gather information, identify relationships and detect patterns in the data. A CNN has a fully connected multi-layer neuron structure, in which each neuron in a layer is connected to all the neurons in the adjacent layers, and it uses a linear convolution operation in at least one of its layers. A Convolution Neural Network (CNN) takes an image as input, assigns importance to various aspects of the image and classifies it [4].
In this research work, a coral reef image dataset (downloaded from the Mendeley open archive) is given as input for training, and a video dataset from BBC Earth is used to test the proposed DCNN framework.
This research work is organized as follows: Section 2 describes the related works on predicting the type of, and the diseases spreading in, coral reefs; Section 3 explains the proposed implementation and framework of the Deep Convolution Neural Network used; Section 4 evaluates the performance of the proposed approach; and Section 5 concludes and discusses future work.
2. RELATED WORKS
In the early days, information related to coral reefs was collected from underwater visual censuses (UVC) by scuba divers. The accuracy of the collected information depends highly on the depth, the duration and the experience of the scuba divers, and identifying the type, growth and diseases of the coral reefs was difficult. But after the rapid advancement in robotic technology and underwater cameras, information related to the coral reefs and the marine ecosystem is now captured using video-supported underwater rovers and automatic motion-capturing devices [5].
However, collecting and observing information about the coral reefs requires oceanic research specialists, and extracting information from a long video is a time-consuming and expensive operation. There is therefore an immediate need for a framework that performs automatic coral reef classification and disease prediction to improve the coral ecosystem. Images captured through underwater rovers are shown in Fig. 1, and an example image of the jellyfish rover is shown in Fig. 2. In order to achieve such an automated image classification method, deep learning techniques are used.
Fig. 1 Image captured through underwater
video rovers Fig. 2 Image of jelly fish rover
One of the important subsets of deep learning techniques is Neural Networks (NN). They have been used in several applications, including text classification, paraphrase detection, facial recognition, speech recognition and much more. In recent times they have also been used in the oceans to keep track of ocean wave power, endangered fishes and temperature.
Likewise, deep learning techniques are also used in monitoring coral reefs. Countries like Australia and the Philippines use deep-sea ocean rovers to capture images of the coral reefs at frequent intervals. In 2010 the government of Australia initiated an automated tool for capturing and analyzing marine images and videos called CATAMI [6], which used classification methods to classify the species in the marine ecosystem. However, due to the bad quality of the images and improper preprocessing techniques, the system did not fully automate the classification procedure.
Over the last decade, the performance of automatic identification of objects in images and videos has rapidly increased. But identifying coral reefs in underwater images is a challenging issue, as the images have very low brightness and irrelevant information (e.g., fishes covering the corals).
Shortis [7] proposed a fully automated technique to identify and measure the count of fish in a particular place in the ocean. It includes algorithms for detecting objects (fishes), verifying the trueness of the objects, measuring the size of the fish, classifying the category of the fish and finding the biomass value in underwater sequences. To classify the fishes into different classes, Artificial Neural Network (ANN), Nearest Neighbor and Support Vector Machine (SVM) algorithms are used. Joly [8] proposed a multimedia identification system to gather accurate knowledge about the identity, geographic location and living species in the ocean using Artificial Neural Networks (ANN).
In addition, it also evaluates the challenges related to multimedia information retrieval and fine-grained classification problems. Villon [9] presented two supervised machine learning methods to automatically detect and recognize coral reefs and fishes under water, using HOG features and an SVM classifier to classify the objects; the results of HOG and SVM are then compared to obtain the more accurate result, achieving an F-score of 88% within the same network architecture. Choi et al. [10] proposed a technique for video-based fish identification in the deep ocean, using a foreground detection method with selective search to extract information about the fish objects, and deep convolution neural networks to classify the fishes based on their size.
Kratzert et al. [11] presented an automatic fish species classification method to classify the underwater species in a FishCam video dataset, using Convolutional Neural Networks. Raw data collected from the video camera are directly fed to the machine learning system. They also observed that adding additional metadata increases the accuracy of the system, which achieves a maximum accuracy of 93% (with additional metadata such as fish length, location, count and migration details).
Ravanbakhsh [12] proposed an automated approach for fish detection using a shape-based level sets framework. Principal Component Analysis is used to gather knowledge about the fishes, and a HAAR classifier is used to identify the precise location of the head and the snout of the fishes. Two major deep learning techniques, patch-based convolutional neural networks (CNNs) and fully convolutional neural networks (FCNNs), are used for semantic segmentation of the underwater images of the coral reef ecosystem: the patch-based CNN enables single-entity classification, and the FCNN generates a semantically segmented output from the input images. The two are also compared to improve the accuracy of classification and semantic segmentation of the input images.
Villon [13] proposed a method to identify and count fishes in videos using Convolution Neural Networks, and compared human ability, in terms of speed and accuracy, with CNNs trained on different photographic databases. The accuracy of the CNN-based prediction is about 94.6%, against 89.3% for human (manual) identification. Deep learning methods can thus perform efficient fish identification on underwater images and offer promise for building new video-based protocols cheaply and effectively in a short time.
3. PROPOSED FRAMEWORK
The proposed approach intends to create a design for detecting the types of coral reefs and the effect of white band disease on the coral ecosystem using a Deep Convolution Neural Network (DCNN) on a video dataset. The workflow of the proposed deep learning technique is shown in Fig. 3.
The proposed deep learning technique uses two different datasets. To train the proposed DCNN model we have used an open-source image dataset collected from the Mendeley Open Archives, from Universitat de Girona - Campus de Montilivi; this dataset consists of a huge number of images and information about the different coral reef species. A second dataset is used to test the performance of the proposed method: a 10-hour video downloaded from the official BBC Earth YouTube channel in HD 1080p. To train the Deep Convolution Neural Network (DCNN), the chosen image dataset has to undergo a preprocessing step to remove erroneous data. The raw image dataset may be of low quality, as the images are captured under water; the preprocessing technique improves the quality and accuracy of the image dataset, which in turn impacts the overall accuracy of the proposed deep learning technique.
3.1 Data Pre-processing of test dataset
Pre-processing is an important part of the deep learning technique: it processes the raw data and converts it into a usable format that is later given as input to the Deep Convolution Neural Network (DCNN). Underwater images contain noise such as Gaussian noise, salt-and-pepper noise, shot noise, quantization noise, film grain noise, low-light noise, anisotropic noise and periodic noise.

Fig. 3. Workflow to identify the type and condition of white band disease affecting coral reefs using DCNN

To remove this noise and improve the quality of the images, two methods are used to preprocess the underwater images: image restoration and image enhancement. In image restoration, an arithmetic mean filter is used to de-noise the underwater images, and then an iterative deconvolution method is used to filter the images. Information such as attenuation coefficients, scattering and the depth estimation of the object (fishes) is fed as input along with the image. Later, in the image enhancement preprocessing step, disturbances or noises in the images are corrected sequentially. First, over-applied effects are removed using homomorphic filters to improve the quality of the image; this also removes the defects of non-uniform illumination and improves the contrast. Fig. 4 shows the raw and preprocessed images.
The preprocessed data are then fed to a specified Uniform Ratio Aspect (URF) module to crop the images into a square shape, because most neural networks operate on square-shaped images. For cropping, the middle part of the image is chosen: the preprocessing algorithm takes the middle part of the image, draws multiple square-shaped crops and finds the best fit.
Fig. 4 (a) Raw extracted image
Fig. 4 (b) Preprocessed image
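A hedged sketch of the preprocessing steps described above, assuming OpenCV: an arithmetic mean filter for de-noising (cv2.blur is exactly that), CLAHE contrast correction standing in for the homomorphic filtering, and a centre square crop; the file name is a placeholder:

```python
import cv2

raw = cv2.imread("reef_frame.png")                 # hypothetical extracted frame

denoised = cv2.blur(raw, (5, 5))                   # arithmetic mean filter

# CLAHE on the luminance channel evens out non-uniform illumination and
# improves contrast (a common stand-in for homomorphic filtering).
lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab = cv2.merge((clahe.apply(l), a, b))
enhanced = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

# Centre square crop, since the network expects square inputs.
h, w = enhanced.shape[:2]
s = min(h, w)
crop = enhanced[(h - s) // 2:(h + s) // 2, (w - s) // 2:(w + s) // 2]
```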
3.2 Feature selection
Feature selection is also an important and unavoidable task in deep learning. It aims to find a small number of features that describe the image dataset better than the original set of features. This is achieved by removing irrelevant and redundant features according to an importance criterion. To remove the irrelevant data from the raw dataset, the Relief feature selection algorithm is used. The images in the dataset are treated as instances, and the objects belonging to known classes are taken as the features of the dataset.
Consider a dataset with 'n' instances and 'p' features, where each feature is scaled between 0 and 1. Take the feature vector X of a randomly selected instance and find its nearest instances by Euclidean distance: the nearest instance of the same class is the near-hit, and the nearest instance of the other class is the near-miss. This operation is performed iteratively.
At each iteration, the weight of the feature vector is updated as

Weight(i) = Weight(i−1) − (X_i − nearHit_i)² + (X_i − nearMiss_i)²
After the weights of all feature vectors are computed, the median weight is set as the threshold for feature selection. Fig. 5 shows the procedure of the Relief feature selection.
Fig. 5 Relief feature selection algorithm of image dataset
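A minimal NumPy sketch of the Relief weighting described above might look as follows; the number of iterations and the binary-class assumption are illustrative choices, not specifics from the text.

import numpy as np

def relief(X, y, n_iter=100, seed=None):
    """Relief feature weighting: X is (n, p) with features scaled to [0, 1],
    y holds binary class labels. Returns one weight per feature."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        i = rng.integers(n)
        dists = np.linalg.norm(X - X[i], axis=1)
        dists[i] = np.inf                      # exclude the instance itself
        same = (y == y[i])
        near_hit = X[np.where(same, dists, np.inf).argmin()]
        near_miss = X[np.where(~same, dists, np.inf).argmin()]
        w = w - (X[i] - near_hit) ** 2 + (X[i] - near_miss) ** 2
    return w

# Keep features whose weight exceeds the median, as in the text:
# selected = np.where(w > np.median(w))[0]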
3.3. Deep Convolution Neural Networks
A Deep Convolution Neural Network is a class of deep neural networks used to analyze and classify image and video datasets. It is also called a Shift Invariant Artificial Neural Network (SIANN). It has been used in face recognition, image classification, recommendation systems, and medical image analysis.
Fig. 6 Structure of the Neural Networks
Each layer of the network contains several neurons. The CNN takes images as input, processes them, and classifies the input dataset into classes based on the features of the images. Here, the output of the feature selection algorithm is processed and the images are separated according to the type of the coral reefs. Each input in the dataset passes through a series of convolution layers, where each layer applies kernel filters and pooling, followed by a softmax function that classifies the image with a probabilistic value between 0 and 1. Convolution exploits the relationship between neighbouring pixels, operating on small square regions of the input to learn the
features of the image. The filters in each layer perform tasks such as edge detection, blur handling, and sharpening of the image.
Meanwhile, classification based on the convolution filters alone cannot give a perfect result. To improve the prediction result, padding is also used in the convolution layers. For example, with a 3*3 filter and an 8*8 input image, the filter cannot efficiently process the border pixels and the output shrinks to 6*6; in such a scenario, padding can be used.
The system artificially pads the border of the image with zeros so that the output of the convolution matches the input size. The added pixels have the value zero and therefore do not affect the convolution in any way. Kernel filtering and padding are performed iteratively, allowing the input data to propagate forward and backward so that the system learns about the input image.
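A small Keras model of the kind described, stacking convolution layers with zero ('same') padding, pooling and a final softmax, could be sketched as follows; the layer sizes, input resolution and number of classes are assumptions for illustration.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5  # hypothetical number of coral reef types

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    # 'same' padding adds a zero border so the 3x3 kernels
    # cover edge pixels without shrinking the feature map
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities in [0, 1]
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])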
Once the DCNN has learned the coral reefs and their types, the second dataset, i.e. the test dataset (video files), is given as input to the DCNN. The video dataset is sampled at a rate of 2 images per second (120 images per minute) over the 10 hours of video. The frame extraction is performed using the video capture function in OpenCV. The resulting test dataset is fed to the CNN to check the accuracy of the proposed DCNN-based automatic prediction technique.
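The frame extraction with OpenCV's video capture function could be sketched as below; the output naming scheme is a placeholder.

import cv2

def extract_frames(video_path, out_dir, per_second=2):
    """Sample `per_second` frames from each second of video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    step = max(int(round(fps / per_second)), 1)
    index, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:                 # keep every `step`-th frame
            cv2.imwrite(f"{out_dir}/frame_{saved:06d}.jpg", frame)
            saved += 1
        index += 1
    cap.release()
    return saved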
Once the type of the coral reef has been identified, a separate support vector machine (SVM) based machine learning technique is used to classify whether or not the coral reef is affected by white band disease.
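A hedged sketch of such an SVM classifier, using scikit-learn with an RBF kernel, is given below; the kernel choice and the random placeholder features are assumptions standing in for the descriptors actually extracted from the coral images.

import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data: in practice these would be per-image features for
# coral regions, labelled diseased (1) or healthy (0).
rng = np.random.default_rng(0)
features = rng.random((200, 64))
labels = rng.integers(0, 2, size=200)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # kernel is an assumption
clf.fit(features[:160], labels[:160])
print(clf.score(features[160:], labels[160:]))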
4. EXPERIMENTS
The experiments with the proposed Deep Convolution Neural Network, to identify the type of the coral reef and to derive information about white band disease on the surface of the coral reefs, are carried out with the open-source TensorFlow software. The dataset used for the implementation is taken from the Mendeley Imaging Archive site. The downloaded data contain several pieces of information related to the type of the coral reef. The first dataset contains around 1,512 coral reef images in total, of which 80%, i.e. 1,259 images, are chosen for training and the remaining 20% are used for testing. This first dataset is used to train the neural network to classify the type of the coral reefs. The second dataset is a video dataset downloaded from the BBC Earth YouTube channel, with a total run time of 3.12 hours of oceanography and underwater footage shot beneath the sea. A total of 22,462 images are derived from the video, i.e. two images per second. Due to the limited processing capacity of the computer system, 3,000 of these 22,462 images were chosen; of these 3,000 images, 80% are used as training data and 20% as test data. The second dataset is used to identify whether or not the coral reefs are infected by the white band disease.
The implementations were carried out on an Intel i7 processor system with 8 GB of RAM and 1 TB of storage. The overall performance of the proposed DCNN is measured using three major evaluation metrics: accuracy, sensitivity, and specificity.
To measure the accuracy of the proposed DCNN method, four counts are considered: true positives (TP), true negatives (TN), false negatives (FN) and false positives (FP). A true positive is a case in which the type of the coral reef is identified correctly. A true negative is a case in which the coral reef does not belong to a particular type and the DCNN correctly reports this. A false positive is a case in which the coral reef does not belong to a particular type but the DCNN wrongly predicts that it does. A false negative is a case in which the coral reef belongs to a particular type but the DCNN fails to find it. The formula to measure the accuracy of the proposed system is

Accuracy = (TP + TN) / (TP + TN + FP + FN)
The measured accuracy of the proposed DCNN model is 92.1%: out of 253 test images, the DCNN correctly predicted the type of 233 images. By comparison, the accuracy of the SVM algorithm on the same task is 80.6%, i.e. 203 images. The accuracy of the proposed DCNN method is measured and compared with other existing methods. Likewise, the sensitivity is also measured. Sensitivity is the proportion of actual positive cases that are
predicted as positive. It is measured by the following formula:

Sensitivity = TP / (TP + FN)
Moreover, the specificity is the proportion of actual negatives that are predicted as negative. The specificity of the proposed DCNN-based coral reef classification is calculated from the following formula:

Specificity = TN / (TN + FP)
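As a quick sanity check, the three metrics translate directly into a few lines of code (a sketch, not code from this work):

def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from the confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)     # true positive rate (recall)
    specificity = tn / (tn + fp)     # true negative rate
    return accuracy, sensitivity, specificity

# Example: 233 correct predictions out of 253 test images gives
# an accuracy of about 92.1%, matching the reported figure.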
The performance metrics (accuracy, sensitivity and specificity) of the proposed DCNN method for predicting the type of the coral reefs are compared with existing methods in Fig. 7. Likewise, the accuracy, sensitivity and specificity of the DCNN for predicting whether or not the coral reef is infected by the white band disease are shown in Fig. 8. From these results, it is clear that the proposed method outperforms the existing methods with respect to accuracy, sensitivity and specificity.
[Figure: bar chart of accuracy, precision and specificity (%) for Villon et al. [9], Joly et al. [8] and the proposed method]
Fig. 7 Comparison of the proposed DCNN method with existing approaches to predict the type of the coral reefs
[Figure: bar chart of accuracy, sensitivity and specificity (%) for Choi et al. [10], Joly et al. [8] and the proposed method]
Fig. 8 Comparison of the proposed DCNN with existing approaches to predict the white band disease in coral reefs
5. Conclusion
This research work develops an efficient deep convolution neural network to identify the type of a coral reef and to detect corals infected by white band disease. Initially, the images were preprocessed using image restoration and image enhancement techniques. Next, image features were detected and extracted from the images to train the neural network. After the training phase, the test data were given as input to the DCNN for image classification. The proposed system efficiently predicts the type of the coral reefs and then predicts whether or not a coral is infected by the white band disease, with an accuracy of 92.1%, a sensitivity of 92.8% and a specificity of 86.2%. The proposed method thus supports the monitoring of the coral ecosystem, outperforms the existing methods, and is practical to deploy in real time.
References
1. S. C. Doney, M. Ruckelshaus, J. E. Duffy, J. P.
Barry, F. Chan, C. A. English, H. M. Galindo, J. M.
Grebmeier, A. B. Hollowed, N. Knowlton,
“Climate change impacts on marine ecosystems,”
Marine Science, Vol. 4, 2012.
2. O. Hoegh-Guldberg, P. J. Mumby, A. J. Hooten,
R. S. Steneck, P. Greenfield, E. Gomez, C. D.
Harvell, P. F. Sale, A. J. Edwards, K. Caldeira, et
al., “Coral reefs under rapid climate change and
ocean acidification,” science, Vol. 318, No. 5857,
pp. 1737–1742, 2007.
3. T. C. Bridge, R. Ferrari, M. Bryson, R. Hovey,
W. F. Figueira, S. B. Williams, O. Pizarro, A. R.
Harborne, and M. Byrne, “Variable responses of
benthic communities to anomalously warm sea
temperatures on a high-latitude coral reef,” Plos
One Journal, Vol. 9, No. 11, pp. 1-10, 2014. DOI:
10.1371/journal.pone.0113079.
4. Aloysius, N., & Geetha, M. “A review on deep
convolutional neural networks” International
Conference on Communication and Signal
Processing (ICCSP), 2017
DOI:10.1109/iccsp.2017.8286426.
5. Boström-Einarsson L, Babcock RC, Bayraktarov
E, Ceccarelli D, Cook N, Ferse SCA, et al. (2020)
“Coral restoration – A systematic review of current
methods, successes, failures and future directions”
PLoS ONE 15(1), 2020. DOI:
10.1371/journal.pone.0226631
6. Althaus F, Hill N, Ferrari R, Edwards L,
Przeslawski R, Schönberg CHL, A Standardised
Vocabulary for Identifying Benthic Biota and
Substrata from Underwater Imagery: The CATAMI
Classification Scheme. PLoS ONE 10(10), 2015.
DOI: 10.1371/journal.pone.0141039.
7. M. Shortis, B. Ghanem, P.F. Culverhouse, D.
Edgington, D. Cline, M. Ravanbakhsh, J. Seager,
E.S. Harvey, Fish identification from videos
captured in uncontrolled underwater environments,
ICES J. Marine Sci.: J. Conseil (2016) 106
8. Joly, A., Goëau, H., Glotin, H., Spampinato, C.,
Bonnet, P., Vellinga, W.-P, Müller, H. (2015).
LifeCLEF 2015: Multimedia Life Species
Identification Challenges. Experimental IR Meets
Multilinguality, Multimodality, and Interaction,
462–483. DOI:10.1007/978-3-319-24027-5_46.
9. Villon, S., Chaumont, M., Subsol, G., Villéger,
S., Claverie, T., & Mouillot, D. (2016). Coral reef
fish detection and recognition in underwater videos
by supervised machine learning: Comparison
between deep learning and HOG+SVM methods,
Advanced Concepts for Intelligent Vision Systems.
ACIVS 2016. Lecture Notes in Computer Science,
Vol. 10016.
10. Sungbin Choi, "Fish Identification in Underwater Video with Deep Convolutional Neural Network: SNUMedinfo at LifeCLEF Fish task 2015," CLEF (Working Notes), 2015.
11. Kratzert, F., & Mader, H. (2017, April).
Advances of FishNet towards a fully automatic
monitoring system for fish migration. In EGU
General Assembly Conference Abstracts (Vol. 19,
p. 7932).
12. Mehdi Ravanbakhsh, Faisal Shafait, Euan S. Harvey, "Automated Fish Detection in Underwater Images Using Shape-Based Level Sets", 2015.
13. Villon, S., Mouillot, D., Chaumont, M.,
Darling, E. S., Subsol, G., Claverie, T., and
Ville´ger, S. 2018. A deep learning method for
accurate and fast identification of coral reef fishes
in underwater images. Ecological Informatics, 48:
238–244