Classification of Deforestation Factors Using Data Mining Techniques

CLASSIFICATION OF DEFORESTATION FACTORS USING DATA MINING TECHNIQUES

S. JYOTHI1, K. SARITHA2 & K. R. MANJULA3

1Professor, Department of Computer Science, Sri Padmavati Women’s University, Tirupati, Andhra Pradesh, India

2Research Scholar, Department of Computer Science, Sri Padmavati Women’s University, Tirupati, Andhra Pradesh, India

3Assistant Professor, Department of CSE, SASTRA University, Thanjavur, Tamil Nadu, India

ABSTRACT

Data mining techniques are widely used to extract knowledge from large amounts of data. Monitoring deforestation is of utmost importance for developing countries, and classifying deforestation is one of the primary objectives in the analysis of remotely sensed data. The present study focuses on obtaining accurate results on deforestation and forest degradation using classification techniques. We set up an experiment that runs several classification algorithms on the same data set and compares their results. The evaluation uses WEKA, an open source collection of machine learning algorithms for tasks such as classification, association and clustering. The main aim of the study is a comparative analysis of the classification algorithms to find the best one for our data set.

KEYWORDS: Deforestation, Data Mining, Classification, WEKA

INTRODUCTION

Knowledge discovery in databases is the nontrivial process of identifying valid, novel, useful and ultimately understandable patterns in data [7]. The process of automatic classification based on patterns obtained from a data set is referred to as data mining [5]. Classification is one of the data mining tasks; its objective is to build a model from a training data set that predicts the class of future objects whose class label is not known [2][13]. There are many classification algorithms, for example classification based on decision trees, Bayesian classification based on statistics, and classification based on neural networks [4]. Geospatial data mining is a process of geographic knowledge discovery in which spatial data sets are mined with data mining algorithms to identify interesting, previously unknown and potentially useful patterns [14].

The conversion of forest land into non-forested land, whether direct or indirect, is referred to as deforestation. Most deforestation results from human intervention or activities such as urbanization, road construction along forest areas, the expansion or shifting of agricultural land, mining of the resources available in forests, and logging of wood for fuel or other purposes. These are the major causes of deforestation.

Data mining techniques are applied here to classify the factors of deforestation. In this paper, several classification techniques are applied to our data, and the performance of each is measured and compared to identify the best classification technique.

PROBLEM DOMAIN

The study area covers 5000 square kilometers across Chittoor, Kadapa and Nellore districts. Kadapa district has an area of 15,379 square kilometers, comprising 51 Mandals and three revenue divisions; its boundary runs from a lower-left corner at 78°…′ E longitude and 13°…′ N latitude to an upper-right corner at 79°39′ E longitude and 14°33′ N latitude. Chittoor district lies between 12°37′ and 14°18′ N latitude and between 78°33′ and 79°55′ E longitude. It has an area of 13,076 square kilometers and is divided administratively into three revenue divisions and 46 Mandals. The data are derived from Manjula et al. [10][11] and consist of maps and tables prepared for the association technique. The data set consists of 5 attributes and 99 instances. The classification problem involves the factors of deforestation, namely Agriculture, Built-up, Mining and Roads, which cover the bulk of the forest area; these factors are used to identify the best algorithm for our data set. Figure 1 and Figure 2 show maps of the study area.

International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR)

ISSN 2249-6831

Vol. 3, Issue 4, Oct 2013, 159-172

© TJPRC Pvt. Ltd.

Figure 1: Map of Study Area

CLASSIFICATION TECHNIQUES

Classification is a typical task in data mining. A large number of classifiers are available, such as Bayesian, function-based, rule-based and decision tree classifiers. The goal of classification is to predict the correct value of a designated discrete class variable, given a vector of predictors or attributes [6]. In this paper we load our data into the WEKA machine learning tool to analyze the performance of different classification techniques.

Figure 2: Topographical Map of Study Area

Comparison of Classification Algorithms

Bayesian Methods

Bayesian methods are one family of classification techniques in data mining. Two main Bayesian methods are used in this paper, Naive Bayes and Bayesian networks, both as implemented in the WEKA software.


A Bayes classifier can be described as an independent feature model: a simple probabilistic classifier that applies Bayes' theorem with strong independence assumptions. Bayes' rule is applied to calculate the likelihood of each class. Several models make different assumptions when fitting Naive Bayes [12][8].

Evaluation on Training Set

Time taken to build model: 0 seconds

Correctly Classified Instances 99 100 %

Incorrectly Classified Instances 0 0 %

Kappa statistic 1

Mean absolute error 0.0194

Root mean squared error 0.0521

Relative absolute error 10.1457 %

Root relative squared error 16.922 %

Total Number of Instances 99

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 10 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

Stratified Cross-Validation


Correctly Classified Instances 98 98.9899 %

Incorrectly Classified Instances 1 1.0101 %

Kappa statistic 0.9867









=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 9 1 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

Naive Bayesian

The Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions. An advantage of the Naive Bayes classifier is that it requires only a small amount of training data to estimate the parameters necessary for classification.
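The idea can be illustrated with a minimal sketch of a categorical Naive Bayes classifier. This is not WEKA's implementation: the function names, the add-one (Laplace) smoothing, and the toy rows (presence of agriculture and roads) are assumptions chosen for illustration, and only the class labels A, R and AR are borrowed from the paper.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    feature_values = defaultdict(set)     # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values

def predict_naive_bayes(model, row):
    """Return the class with the highest log posterior under the independence assumption."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)  # class prior
        for i, value in enumerate(row):
            # Add-one smoothing (an assumption) keeps unseen values from zeroing the product.
            smoothed = (value_counts[(i, label)][value] + 1) / (count + len(feature_values[i]))
            log_prob += math.log(smoothed)
        scores[label] = log_prob
    return max(scores, key=scores.get)

# Hypothetical toy data: (agriculture present, road present) -> factor class.
rows = [("yes", "yes"), ("yes", "yes"), ("yes", "no"), ("no", "yes"), ("yes", "no")]
labels = ["AR", "AR", "A", "R", "A"]
model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("yes", "yes")))
```

Working in log space avoids numerical underflow when many attribute probabilities are multiplied together.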













Evaluation on Training Set

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 9 1 | g = ABR

0 0 0 0 0 0 0 7 | h = BR













Stratified Cross-Validation

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 3 0 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 9 1 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

Decision Tree

A decision tree is a flowchart-like hierarchical structure consisting of a root, a set of internal nodes, and terminal nodes called leaves. The root and the internal nodes act as decision stages, and each terminal node represents a final classification. The classification process yields a set of rules that trace a path from the root node to one terminal node, and each terminal node carries one class label for the object being classified.

Decision trees are powerful classification algorithms. Popular decision tree algorithms include Quinlan's ID3, C4.5 and C5, and Breiman et al.'s CART [3]. As the name implies, the technique recursively separates observations into branches to construct a tree that improves prediction accuracy. Most decision tree classifiers work in two phases: tree growing (or building) and tree pruning. Tree building proceeds top-down: the data are recursively partitioned until all the items in a partition belong to the same class label. In the pruning phase, the fully grown tree is cut back bottom-up to reduce over-fitting, which improves the prediction and classification accuracy of the algorithm. Compared with other data mining techniques, decision trees are widely applied in various areas because they are robust to data scales and distributions.
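The paper does not state how a split is chosen during tree growing. As a hedged illustration, the sketch below computes entropy and information gain, the criterion used by ID3 (C4.5 refines it into gain ratio). The function names and the toy rows are hypothetical and are not taken from the study's data set.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by partitioning the data on one attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: (agriculture present, road present) -> factor class.
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("yes", "no")]
labels = ["AR", "A", "R", "A"]
for index, name in enumerate(["agriculture", "road"]):
    print(name, round(information_gain(rows, labels, index), 4))
```

In the growing phase the attribute with the highest gain would be placed at the current node, and the procedure would then recurse on each partition until the partitions are pure.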

J48 is an open source Java implementation of the C4.5 algorithm in the WEKA data mining tool. C4.5, developed by Ross Quinlan, induces classification rules in the form of decision trees from a set of given examples. It extends and refines Quinlan's basic ID3 algorithm by accounting for unavailable values, continuous attribute value ranges, pruning of decision trees, rule derivation, and so on. Because the decision trees generated by C4.5 can be used for classification, C4.5 is often referred to as a statistical classifier [15].


Evaluation on Training Set

Time taken to build model: 0.02 seconds

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 9 1 | g = ABR

0 0 0 0 0 0 0 7 | h = BR














Stratified Cross-Validation

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 9 1 | g = ABR

0 0 0 0 0 0 2 5 | h = BR

K Nearest Neighbour

A nearest neighbor classifier assumes that all instances correspond to points in an n-dimensional space. During learning, all instances are simply stored. When a new point is classified, the k nearest points to it are found and used, each with a weight, to determine its class value. To increase accuracy, greater weights are given to closer points [9].
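A minimal sketch of distance-weighted k-nearest-neighbour voting follows. The inverse-distance weight, the Euclidean metric, the function name and the toy points are assumptions for illustration; WEKA's IBK offers its own weighting options, which the paper does not specify.

```python
import math
from collections import defaultdict

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by a weighted vote among its k nearest stored instances."""
    neighbours = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = defaultdict(float)
    for distance, label in neighbours[:k]:
        # Closer points get larger weights; the small constant avoids division by zero.
        votes[label] += 1.0 / (distance + 1e-9)
    return max(votes, key=votes.get)

# Hypothetical toy instances in a two-dimensional attribute space.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "AR", "AR", "AR"]
print(knn_predict(points, labels, (1, 0), k=3))
```

Because learning only stores the instances, all of the computational cost is paid at classification time, when distances to every stored point are evaluated.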





Evaluation on Training Set

Kappa statistic 1

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 10 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR













Stratified Cross-Validation

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

1 0 0 0 0 0 9 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

Artificial Neural Networks

Artificial Neural Networks (ANN) are another classification method in data mining. As the neural network based classifier, a Multi-Layer Perceptron (MLP) is used (Figure 3). An MLP is a feed-forward model that maps input data to output data, and it can include one or more hidden layers between the input and output layers. The structure of an MLP is shown in Figure 3 [1].

Figure 3: Multilayer Neural Network
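The feed-forward mapping from inputs to outputs can be sketched as a sequence of weighted sums, each passed through an activation function. The sigmoid activation, the layer representation and the zero weights below are assumptions chosen for a small, checkable example; they are not the trained network from this study.

```python
import math

def sigmoid(x):
    """Logistic activation squashing any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    """Propagate the inputs through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output, all weights zero.
layers = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),  # hidden layer
    ([[0.0, 0.0]], [0.0]),                   # output layer
]
print(mlp_forward([1.0, 0.0], layers))
```

Training, which adjusts the weights and biases by back-propagation, is omitted here; the sketch covers only the forward mapping described in the text.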





Evaluation on Training Set

Kappa statistic 1

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 10 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR











Stratified Cross-Validation

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 1 0 0 0 0 9 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

Simple CART

CART stands for Classification And Regression Trees; it is a data exploration and prediction algorithm. CART is a classification method that uses historical data to construct decision trees, which are then used to classify new data. The number of classes must be known a priori in order to use CART. To build the decision trees, CART uses a so-called learning sample, which is a set of historical data with pre-assigned classes for all observations [16].
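CART is commonly described as growing binary trees by choosing, at each node, the split that minimises Gini impurity. The paper does not spell this out, so the sketch below is an assumption about the criterion, with hypothetical function names and toy data, and it evaluates only splits of the form "attribute equals value".

```python
from collections import Counter

def gini(labels):
    """Gini impurity: chance that two random draws from `labels` disagree."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_binary_split(rows, labels):
    """Return (attribute index, value, weighted Gini) of the best equality split."""
    best = None
    total = len(labels)
    for i in range(len(rows[0])):
        for value in {row[i] for row in rows}:
            left = [lab for row, lab in zip(rows, labels) if row[i] == value]
            right = [lab for row, lab in zip(rows, labels) if row[i] != value]
            if not left or not right:
                continue  # a split must send instances to both branches
            score = len(left) / total * gini(left) + len(right) / total * gini(right)
            if best is None or score < best[2]:
                best = (i, value, score)
    return best

# Hypothetical learning sample: (agriculture present, road present) -> class.
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("yes", "no")]
labels = ["AR", "A", "R", "A"]
index, value, score = best_binary_split(rows, labels)
print(index, round(score, 4))
```

The pre-assigned classes in the learning sample are what make the impurity computable, which is why the classes must be known before the tree is built.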






Evaluation on Training Set

Kappa statistic 1

Mean absolute error 0

Root mean squared error 0

Relative absolute error 0 %

Root relative squared error 0 %

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 0 3 0 0 0 0 0 | c = R

0 0 0 3 0 0 0 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 10 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR













Stratified Cross-Validation

=== Confusion Matrix ===

a b c d e f g h <-- classified as

17 0 0 0 0 0 0 0 | a = ABM

0 42 0 0 0 0 0 0 | b = AR

0 3 0 0 0 0 0 0 | c = R

0 0 0 0 0 0 3 0 | d = ABMR

0 0 0 0 6 0 0 0 | e = AMR

0 0 0 0 0 11 0 0 | f = A

0 0 0 0 0 0 10 0 | g = ABR

0 0 0 0 0 0 0 7 | h = BR

EXPERIMENTAL RESULTS

In this section, we present the results of the different classification algorithms and analyze their performance to verify the effectiveness of each algorithm. The aim of this work is to identify the best algorithm for our data set. The algorithms are also compared across the training and cross-validation evaluations. Table 1 summarizes the computation time, the correctly classified instances, the kappa statistic and error measures such as MAE and RMSE for each classification algorithm, evaluated on the training set. With regard to computation time, Bayes Net, IBK and Naive Bayes build their models fastest, but a fast algorithm does not always give the best results: Naive Bayes misclassifies one instance and has the highest error measures. Bayes Net, by contrast, also shows the best performance when checked against kappa and the confusion matrix.

Setting computation time aside, an analysis of kappa and the other measures shows that Bayes Net, MLP and Simple CART play a major role in providing the best accuracy. Kappa is a chance-corrected measure of agreement between the predicted and the actual classes. If kappa = 1 there is perfect agreement; if kappa = 0 the agreement is no better than chance; a value greater than 0 means that the classifier is doing better than chance. Mean absolute error (MAE) is the sum of the absolute errors divided by the number of predictions. Root mean squared error (RMSE) is the square root of the sum of the squared errors divided by the number of predictions; it measures the differences between the values predicted by a model and the values actually observed. A smaller RMSE indicates a more accurate model, so the lower the MAE and RMSE, the better the prediction and accuracy.
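As a worked check of these definitions, the sketch below recomputes accuracy and kappa from the Bayes Net cross-validation confusion matrix reported in this paper and reproduces the published values. The function names are illustrative. The MAE and RMSE helpers follow the textual definitions on generic numeric predictions; WEKA computes them over predicted class probabilities, which are not available here.

```python
import math

# Bayes Net cross-validation confusion matrix from this paper.
# Rows are actual classes, columns are predicted classes,
# in the order ABM, AR, R, ABMR, AMR, A, ABR, BR.
BAYES_NET_CV = [
    [17, 0, 0, 0, 0, 0, 0, 0],
    [0, 42, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 0, 0, 0, 0],
    [0, 0, 0, 3, 0, 0, 0, 0],
    [0, 0, 0, 0, 6, 0, 0, 0],
    [0, 0, 0, 0, 0, 11, 0, 0],
    [0, 0, 0, 0, 0, 0, 9, 1],
    [0, 0, 0, 0, 0, 0, 0, 7],
]

def accuracy(matrix):
    """Fraction of instances on the diagonal (observed agreement)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def kappa(matrix):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def root_mean_squared_error(predicted, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

print(round(accuracy(BAYES_NET_CV) * 100, 4))  # 98.9899, as reported
print(round(kappa(BAYES_NET_CV), 4))           # 0.9867, as reported
```

With 98 of 99 instances on the diagonal, the chance-expected agreement is 2374/9801, which gives kappa = 7328/7427, or about 0.9867.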

Using Training Set

Table 1: Accuracy Results of all Methods in Training Set

Algorithm    Time (s)  Correctly classified  Incorrectly classified  Kappa   MAE     RMSE    RAE %    RRSE %
Bayes Net    0         99                    0                       1       0.0194  0.0521  10.1457  16.922
Naive Bayes  0         98                    1                       0.9867  0.0339  0.0861  17.6942  27.9577
J48          0.02      98                    1                       0.9867  0.0042  0.0459  2.197    14.9017
IBK          0         99                    0                       1       0.0019  0.0045  1.0033   1.4611
MLP          1         99                    0                       1       0.0089  0.0152  4.627    4.9526
SimpleCART   0.05      99                    0                       1       0       0       0        0

Figure 4: Comparison of Classifiers Based on Accuracy


Figure 4 compares the accuracy of the classifiers. The graph confirms that Bayes Net, IBK, MLP and Simple CART show the best results.

Table 2 summarizes the results of the classification techniques under k-fold cross-validation. In this case, computation time is highest for MLP and Simple CART and lower for Bayes Net, Naive Bayes, J48 and IBK, although for several of the faster algorithms the kappa statistic, MAE, RMSE and confusion matrix indicate lower accuracy. IBK (k-NN) is the exception and gives highly accurate results in all respects: it classifies all 99 training instances correctly (Table 1), and under cross-validation it misclassifies one instance while recording the lowest MAE and RMSE (Table 2). Interestingly, if the performance analysis is done independently of time, the MLP algorithm shows good accuracy in both cases.
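Stratified cross-validation keeps the class proportions of each fold close to those of the whole data set. The sketch below assigns instances to folds round-robin within each class, using the class sizes that appear in the confusion matrices of this paper. The fold count of 10 is an assumption (the paper does not state k), the function name is illustrative, and the random shuffling that a real evaluation would apply first is omitted.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Split instance indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        # Deal each class out round-robin so every fold gets its share.
        for index in by_class[label]:
            folds[position % k].append(index)
            position += 1
    return folds

# Class sizes read from the confusion matrices in this paper (99 instances in total).
CLASS_COUNTS = {"ABM": 17, "AR": 42, "R": 3, "ABMR": 3, "AMR": 6, "A": 11, "ABR": 10, "BR": 7}
labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]

folds = stratified_folds(labels, k=10)  # k=10 is an assumed fold count
for fold in folds:
    print(len(fold), dict(Counter(labels[i] for i in fold)))
```

Classes with only three instances, such as R and ABMR, cannot appear in every fold, so a model trained on the remaining folds sees few examples of them, which may contribute to lower cross-validation accuracy on those classes.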

Using Cross-Validation

Table 2: Accuracy Results of all Methods in Cross-Validation

Algorithm    Time (s)  Correctly classified  Incorrectly classified  Kappa   MAE     RMSE    RAE %    RRSE %
Bayes Net    0         98                    1                       0.9867  0.0247  0.0743  12.8792  24.5768
Naive Bayes  0         95                    4                       0.9458  0.0401  0.1055  20.8635  34.2137
J48          0         96                    3                       0.96    0.0081  0.0817  4.2058   26.4947
IBK          0         98                    1                       0.9867  0.0051  0.0438  2.035    14.2084
MLP          0.86      98                    1                       0.9866  0.0126  0.0502  6.5723   16.2596
SimpleCART   0.02      93                    6                       0.9185  0.0163  0.1106  8.4774   35.8587

Figure 5: Comparison of Classifiers Based on Performance

Figure 5 describes the performance of each classifier. From a detailed analysis of the above characteristics, Simple CART achieves the best classification performance on the training set, with zero error on every measure in Table 1; under cross-validation (Table 2), however, it classifies the fewest instances correctly.

CONCLUSIONS

A variety of classification methods has been applied to and tested on deforestation data. Our main aim is to identify the best algorithm for our data set. For this purpose, we compared the performance of different classification algorithms in WEKA, a machine learning tool. Selecting the best algorithm is important for obtaining accurate results. No single algorithm among those observed is best on every measure: some, such as Bayes Net, MLP and Simple CART, yield the best results, but their computation time, MAE and RMSE differ in each case. The results and findings of this study may be used to develop a new algorithm that reflects the best properties of the different classification algorithms. To obtain optimal results for our data set, we therefore propose, as future work, a hybrid algorithm that combines the best properties of the above algorithms.

ACKNOWLEDGEMENTS

The authors are grateful to the UGC for providing the funds for our work.


REFERENCES

1. D. A. Avellaneda, et al., "Natural Texture Classification: A Neural Network Models Benchmark," 2009, pp. 325-

329.

2. R. Bayardo, “Brute-force mining of high-confidence classification rules”, Proceedings of the 3rd International

Conference on Knowledge Discovery and Data Mining (KDD-97), AAAI Press, Newport Beach, CA, United

States, August 1997, pp. 123-126.

3. Bernhard Pfahringer, Geoffrey Holmes and Richard Kirkby, “Optimizing the Induction of Alternating Decision

Trees”, Proceedings of the Fifth Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining.

2001, pp. 477-487.

4. Chung-Chian Hsu, Sheng-Hsuan Wang, “An integrated framework for visualized and exploratory pattern discovery in mixed data”, IEEE Transactions on Knowledge and Data Engineering, 2006, 18(2): 161-173.

5. Desouza, K.C. (2001) Artificial intelligence for healthcare management In Proceedings of the First International

Conference on Management of Healthcare and Medical Technology Enschede, Netherlands: Institute for

Healthcare Technology Management.

6. Daniel Grossman and Pedro Domingos (2004), “Learning Bayesian Network Classifiers by Maximizing Conditional Likelihood”, Proceedings of the 21st International Conference on Machine Learning, Banff, Canada.

7. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P. From Data Mining to Knowledge Discovery: An Overview. In Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R. (eds.), Advances in Knowledge Discovery and Data Mining, MIT Press, 1-36, Cambridge, 1996.

8. Y. Herdiyeni, et al., "A Bayesian network approach for image similarity," International Conference on

Instrumentation, Communications, Information Technology, and Biomedical Engineering (ICICI-BME), 2009,

pp. 1-6.

9. T.M. Mitchell, Machine Learning, McGraw-Hill Companies, USA, 1997.

10. K.R. Manjula, Dr. S. Jyothi, S. Anand Kumar Varma, “Analysing the factors of deforestation using GIS”.

11. K.R. Manjula, Dr. S. Jyothi, S. Anand Kumar Varma, Dr. S. Vijaya Kumar, “Construction of Spatial Dataset from Remote Sensing using GIS for Deforestation Study”, International Journal of Computer Applications (0975 – 8887), Volume 31, No. 10, October 2011.

12. G. Qiang, "An Effective Algorithm for Improving the Performance of Naive Bayes for Text Classification," Second International Conference on Computer Research and Development, 2010, pp. 699-701.

13. J. Quinlan, “C4.5: Programs for machine learning”, San Mateo, CA: Morgan Kaufmann, 1993.

14. Shekhar, S., et al., Trends in Spatial Data Mining, in Data Mining: Next Generation Challenges and Future Directions, AAAI Press, 2004.

15. http://www.c4.5-Wikipedia, the free encyclopedia.

16. http:// www.CART-Wikipedia, the free encyclopedia.
