Applied Acoustics Paper
-
Upload
independent -
Category
Documents
-
view
0 -
download
0
Transcript of Applied Acoustics Paper
Elsevier Editorial System(tm) for Applied Acoustics Manuscript Draft Manuscript Number: Title: On contour-based classification of dolphin whistles by type Article Type: Research Paper Section/Category: Contributors in the America Keywords: Support vector machines; Fourier descriptor; Non-linear kernels; Pattern recognition; Dolphin whistles Corresponding Author: Mr. Mahdi Esfahanian, Ph.D Candidate Corresponding Author's Institution: Florida Atlantic University First Author: Mahdi Esfahanian, Ph.D Candidate Order of Authors: Mahdi Esfahanian, Ph.D Candidate; Hanqi Zhuang, Ph.D; Nurgun Erdol, Ph.D Abstract: Classification of cetacean vocalizations may help marine biologists study their behavioral context in different environments yet automatic classification of vocalizations for their information content has not been adequately addressed in the literature. Since classifier performance has a strong dependence on the extent to which features cluster, we, in this paper, explore the effect of two feature sets on two classifiers and assess their performance and computational complexity. We choose two feature sets that are exemplary of very different methods: The first set consists of Tempo-Frequency Parameters (TFPs) that are hand-picked to describe the spectral whistle contours. The second feature set embodies spectral information measured with the Fourier Descriptors (FD) commonly used in image processing for contour representation. The computed feature vectors are fed into the K-nearest neighbor (KNN) and Support Vector Machine (SVM) classification algorithms. The KNN in its basic form is a simple classifier that works well if feature clusters have clear margins and SVM uses a data dependent margin chosen for optimal performance. We argue that KNN serves to accentuate the effect of the feature sets and the SVM acts as the scientific process control. Experimental results show best results with the combination of the TFP feature extractor and the SVM classifier, suggesting a future research direction of developing non-linear kernels for SVM.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
1
On contour-based classification of dolphin whistles by type
Mahdi Esfahanian*, [email protected]
Hanqi Zhuang, [email protected]
Nurgun Erdol, [email protected]
Department of Electrical and Computer engineering and Computer Science, Florida Atlantic
University, 777 Glades road, Boca Raton, FL 33431, USA
* Corresponding author. Tel: +1 561 929 6394
E-mail address: [email protected]
ManuscriptClick here to view linked References
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
2
1. Introduction
The Southeast National Marine Renewable Energy Center within the College of Engineering and
Computer Science at Florida Atlantic University is investigating the potential of harnessing
ocean currents such as the Gulf Stream to generate base-load electricity, thereby making a
unique contribution to a broadly diversified portfolio of renewable energy for the nation’s future.
Commercial turbines are being considered for installation in the Straits of Florida which is home
to a large population of the Atlantic bottlenose dolphin (Tursiops truncatus), and several species
of endangered whales are known to migrate through the region. Farms of underwater turbines
with rotor diameters of 20 meters may disrupt migration patterns and will likely act as fish
aggregation devices, attracting predators such as the dolphin. The ability to classify vocalization
patterns of these marine mammals for signs of distress and disruption and distortion of their
communication whistles will prove to be invaluable in predicting changes in their natural life
cycle and health and in developing turbine control measures to protect them.
The over reach of anthropogenic activity in the oceans is not unique to the Florida coast. The
near-shore location of the wave resource along the coast of northern California and Oregon, for
example, is known to be a migration route for the California gray whale (Eschrichtius robustus),
and placement of extensive "farms" of wave harvesting equipment, including networks of cables,
could present obstacles to the historical migration patterns of this protected species. Studying the
vocal patterns in the vicinity of planned installations would provide vital input to mitigate
possible disruptions to their behavior patterns.
Association of vocalizations with natural communication and behavioral messages has been
the subject of research in bioacoustics for nearly three decades [1-3]. Recent studies matching
behavioral activity to vocalizations have been reported in [4] by Panova et al. who investigated
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
3
the relationship between different whistle types, pulsed tones, click series and noise vocalizations
and six behavioral contexts by using the Kruskal-Wallis criterion and the Mann-Whitney U-test.
Test results of Fisher’s exact test and the Pearson test to correlate dolphin whistles and 15
behavioral descriptions were reported in [5]. Classification of dolphin whistle types is considered
as a major step for biologists studying dolphin recognition and behavior [6]. Acoustic parameters
of whistles were analyzed as potential indicators of stress in bottlenose dolphins under various
capture-release events and undisturbed conditions [7].
Dolphin whistles are narrow-band frequency-modulated sounds centered on a time-varying
fundamental frequency commonly with several harmonic components [8]. They are believed to
serve the purpose of conveying information and communication. They have sufficient
distinguishing features that they may be categorized by function [9]. Dreher and Evans [10]
argued that dolphins shared a function-specific repertoire of whistles, and each whistle was used
in a particular behavioral context. The possibility of individual “dolphin identification” is also
considered based on the existence of a distinctive signature whistle transmitting the identity
information independent of the caller’s voice or locations [11, 12].
Based on the dataset used in the research, we have identified four types of whistle signatures
in the time-frequency domain to comprise an exhaustive set to classify bottlenose dolphin
whistles. Time-frequency spectrograms are convenient displays of long and short term data
characteristics and capture their time-evolutionary spectral features. Treating them as images
avails us of the use of archival image processing techniques. This paper proposes using Fourier
descriptors (FD) on the spectrogram as a means of extracting contour features for their automatic
classification. Fourier descriptors are translation-, scale-, rotation-, and start-point invariant
signature vectors of contours, edges, and similar shapes encountered in images. We compare the
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
4
performance of the FD feature vectors with the more direct but tedious method of deriving
Tempo-Frequency Parameters (TFPs) from traced contours. The classifiers we have chosen are
the K-Nearest Neighbor (KNN) and the Support Vector Machine (SVM) classifiers. We have
chosen SVM because its class boundaries are data adaptive and are chosen to maximize the
margin width between them. Increasing the margin width increases the measure of confidence in
classifiers and reduces misclassification. KNN is a simple classifier that functions on majority
voting. It is chosen as a “control” element and its output is used as a reference for judging
SVM’s performance. To the authors’ best knowledge, this is the first paper that reports the use
of the FD as whistle contour feature vectors and the first paper to use the SVM to classify whistle
types of dolphins and whales. That we have applied them to the classification of bottlenose
dolphins’ vocalizations by type does not preclude their application to classification by species or
to other vocalizations displaying narrowband contours. To identify dolphin whistle types
effectively, we tested combinations of the two feature extraction and two classification
algorithms under consideration with the data measured by passive acoustic sensors and report the
results and their analysis.
The steps involved in the automatic classification of dolphin whistles are detection, pre-
processing, contour tracing, feature extraction and classification (refer to Fig. 1). In the
preprocessing stage we use two-dimensional band-pass filters to isolate the fundamental whistle
and denoise the band-limited spectrogram. Under controlled environments of data acquisition and
when real-time or even pseudo real-time processing are not the object, the easiest way to denoise
is to do some kind of contour tracing in the spectrogram. The isolation significantly improves the
overall signal-to-noise ratio.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
5
The remainder of the paper is organized as follows. Section 2 describes the pre-processing and
tracing stages. The outputs are spectrograms of isolated dolphin whistles fundamental
components after they have been time windowed and band-pass filtered. Section 3 describes
algorithms for extracting relevant features. The classification methods relevant to this research
are presented in Section 4. Discussion of results from comparative experimental studies is given
in Section 5.
2. Preliminaries: Representation, Denoising and Extracting Whistle Contours
An effective method of displaying dolphin vocalizations is the time-frequency domain
spectrogram. Spectrogram images of dolphin whistles show (refer to Fig. 2) that each whistle is
adequately modeled by a Fourier series with a time-varying fundamental frequency. The
frequency of the fundamental whistle varies in the range of 4 and 20 kHz over 250-500 ms. with
2-4 harmonics mimicking the frequency sweep at integer multiples of the fundamental. The
harmonic relationship renders the higher harmonics redundant for characterization of the whistle
so that we only need to work with the fundamental whistle. Data corpus used in this study shows
that four distinct whistles shapes define an exhaustive set of classes although the proposed method
is scalable to include more classes presented by a larger corpus.
Preprocessing steps depicted in Fig. 3 are used in order to extract contours of the fundamental
whistle and include a cascade of high- and low-pass filters that limit the useful frequency range
to 4 - 16 kHz. This is a generous bandwidth to contain all the fundamental whistles we have
observed in our data set and to exclude low frequency environmental noise. Clicks and some
parts of the second harmonic that remain in the pass-band are smoothed by a running-average
filter. Reduced background noise and the absence of nearly all the non-fundamental harmonics
are clearly visible. In this work all spectrograms were obtained using a Hamming window of
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
6
1024 length (~13 ms windows) with 50% overlap. It is also assumed that the start and end points
of dolphin whistles have been previously identified.
In most cases, fundamentals remain as the highest intensity contours, as shown in Fig. 4; thus
contour tracing can be obtained from its frequency location as a function of the time. Spectral
peak picking along each temporal, and some modifications are needed if the second harmonic’s
peak competes, yield contours such as the one shown in dashed lines in Fig. 5.
3. Feature Extraction
Feature extraction for whistle identification may be parametric or non-parametric. In the first
category, we derive seven Tempo-Frequency Parameters that describe the contour shape in the
spectrogram are defined as follows: (1) Minimum frequency, (2) Maximum frequency, (3) Start
frequency, (4) End frequency, (5) Frequency range, (6) Time duration, (7) Number of inflection
points. The last parameter is defined as the number of times where the slope of whistle contour
changes sign over the time.
The non-parametric methods are distilled from a transformation of the data that concentrates
the defining information into vectors that cluster effectively. In this category, we use the Fourier
Descriptor which is an image processing technique of extracting signatures of image shapes.
Many FD methods have been utilized in areas such as shape analysis [13, 14], character
recognition [15, 16], shape coding [17], and shape retrieval [18, 19]. FD coefficients
characterize the whistle shapes to be used in the classification stage. For pedagogical reasons and
completeness, we present a brief description of the scheme next.
The key strategy is to consider the thi point on the whistle contour as a complex number
i i iz u j v (1)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
7
where the contour coordinates iu and iv are time and frequency, respectively, in a spectrogram.
The Fourier descriptors are derived from the discrete Fourier transform samples kq of iz . FDs
can be made translation-, scale-, rotation-, and start-point invariant hence are good signature
vectors for whistle contours. To make complex coordinates signature invariant to the position,
all the N descriptors except the first one (DC component) are needed to index the whistle shape.
Since the DC component depends only on the position of the shape, it is not useful in describing
shape and is thus discarded [20]. As most of the dolphin whistles are within a given frequency
range, scale normalization is achieved by dividing the magnitude values of all the other
descriptors by the magnitude value of the second descriptor. The invariant feature vector used to
index the shape is then given by:
2 3 1
1 1 1
, ,...,Nq q q
fq q q
(2)
where f is the feature vector of whistle needed for classification. Directly related to the Discrete
Fourier transform, the lower frequency FD samples represent the general features of the shape,
and the higher frequency descriptors characterize fine details of the shape. Although the number
of coefficients generated from the transform is usually large, a subset of the coefficients is
usually sufficient to capture the overall features of a shape.
4. Classification Methods
KNN and SVM are the two classifiers used in this research. The KNN algorithm classifies an
object to the class of the majority of its K nearest neighbors. The method fails for objects that are
on or near class boundaries unless the margin between the class boundaries is of sufficient width,
and misclassified objects bias the subsequent classifications.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
8
The SVM algorithm uses data adaptive class boundaries that are chosen to maximize the
margin width between them. SVM [21] is a decision making algorithm that determines a class
boundary by maximizing the margin of the class boundary. Its decision function surfaces are
obtained directly from the training data [22]. An illustration of choosing the maximum margin
for objects in two dimensions is given in Fig. 7. Of the infinite number of straight lines that
separate the data, the one chosen maximizes a distance measure to the closes data on the
boundary of each class. The dashed lines define the boundaries on each side, and the region
between them is called the margin. The data points on the boundaries are the support vectors.
They are the closest points to the decision boundary (separating hyperplane) and play the most
important role in determining the margin by which the two classes are distinguished.
Let the training inputs , 1,...,ix i M have the associated labels 1iy for Class 1 and
1iy for Class 2. If these data are linearly separable, one can define the hyperplane
( ) TD x w x b (3)
where w is the weighting vector perpendicular to the hyperplane and b is the scalar bias. The
distance between the dashed boundary planes is inversely proportional to w and the
optimization problem is equivalent to minimizing the function
21( , )
2Q w b w (4)
subject to the constraints that the training data satisfy
( ) 1, 1,..., .T
i iy w x b for i M (5)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
9
The choice of the Euclidean norm w in Eq. (4) makes it possible to solve the problem with a
quadratic programming algorithm. The optimal hyperplane is a function of the support vectors
satisfying Eq. (5) and therefore adapts to new training data. The minimization problem of Eq. (4)
may be solved by standard optimization procedures.
It is worthwhile to point out that for soft-margin problems where training samples are linearly
inseparable, a nonnegative slack variable is added to the constraints and the objective function.
If 0 1 , the training data does not have the maximum margin. Also the aforementioned
linear discriminant function may not work well for some data sets. So a nonlinear kernel may be
introduced to map the data into a higher dimensional space by following Mercer’s theorem.
Polynomial kernel, Gaussian (RBF) kernel and sigmoid kernel are some examples frequently
used by researchers [23].
The SVM classifier is usually known as a binary classifier and may not be used directly for
problems with more than two classes [24]. A conventional remedy to the problem is the so
called one-against-all method [25] which separates one class from the other classes at each time
leading to run M binary SVMs for classification of M classes. In our study, after whistles of the
thi class in both training and testing data sets are extracted, feature vectors of training samples
are fed into the thi SVM to find the optimal hyperplane. In this technique, support vector weights
and bias are calculated for each class. Subsequently the testing data from different classes go
through the SVM trained for the thi class and are classified according to their membership to the
thi class.
5. Results and Discussions
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
10
To evaluate the performance of the proposed algorithms, a collection of 100 dolphin whistles
belonging to one of the four different types was extracted from underwater passive acoustic
recordings of bottlenose dolphins. To avoid any bias in selection of the training and testing
samples, all were randomly chosen. Half of this collection was declared as the training set and
the remaining half used as testing data. Only those whistles with traceable contours were
analyzed so that tracing would not be a factor in testing the overall whistle-type classification
scheme.
The classification results using the FDs and the TFPs (refer to Section 3) are displayed in
Table 1. Overall classifier accuracy for KNN is 92% with FD and 96% with TFP, whereas
accuracy scores with linear SVM show a large variation at 80% with FD and 100% with TFP.
Even though we worked with a small size corpus, the difference is large enough to be significant.
One may conjecture from the above results that using physical parameters extracted from the
whistle contours as features has the potential of providing more accurate and reliable
classifications. They are more closely related to a human expert in determining which whistle
belongs to which class.
As to the selection of classifiers for dolphin whistle-type classification, there are often cases
where data in the feature space may not be linearly separable causing misclassifications. KNN is
a linear classifier which is not very effective when the data cannot be separated linearly. On the
other hand, SVM as one of the promising classification methods has the ability to utilize
nonlinear kernel functions instead of linear ones to overcome this issue. However, depending on
a specific problem, some kernels may perform better than others. In this study, in order to seek a
better kernel for whistle classification, three nonlinear kernels, Radial Basis Function (RBF)
kernel, polynomial kernel and sigmoid kernel, were considered. The confusion matrices in
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
11
Tables 2-4 illustrate that when the FDs were used as features, SVM with the RBF kernel, the
polynomial kernel, and the sigmoid kernel achieved an accuracy of 86%, 82% and 60%,
respectively. Another important point is that the RBF SVM is the only classifier being capable of
completely correctly classifying the third dolphin whistles type. On the other hand, the sigmoid
SVM has the highest accuracy for second-class whistles despite of its low overall performance.
As to why the combination of TFPs and SVMs performs the best, we conjecture that the
reason lies in the fact that the TFP features are hand-picked for the task guaranteeing near perfect
success similar to a human classifier. The task of classifying dolphin vocalizations by types, like
human speech recognition, is best done by humans, and the goal of machine classification is to
match the human performance. The TFP set has seven elements that were chosen by
spectrogram observation and analysis as well as consideration of all the features that completely
describe the contour. One could fathom a non-linear method of extracting TFP features from the
data set. Performance of the SVM classifier using non-linear kernels on the nonlinear features
would be indicative of the evolution of its success.
6. Conclusions
A systematic scheme has been proposed toward the goal of automatic classification of dolphin
whistle types. The concept of categorizing feature extraction algorithms into two groups: one
based on tempo-frequency characteristics of dolphin whistles and another on their global spectral
information, has been presented. As an example for each feature extraction algorithms, the TFP
and FD feature extractors have been introduced. For classification, the SVM algorithm has been
outlined. To investigate the effectiveness of the proposed approach, an experimental study based
on the data collected by passive acoustic sensors has been conducted. The experimental results
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
12
show that the combination of the TFP feature extractor and the kernel SVM classifier
outperforms any other options tested. We conjecture that it is mainly due to the fact that the
seven tempo-frequency parameters are nonlinear in nature, which integrates naturally to the
SVM with an RBF kernel. The application of these algorithms on a more expansive set of
vocalizations from a wider variety of marine mammals is left as future work. In our future
research, we will also explore ways of classifying dolphin whistle types without tracing contours
from spectrograms of dolphin whistles.
Acknowledgments
The authors would like to thank the financial support by Southeast National Marine Renewable
Energy Center in this research. We thank Drs. George Frisk and Edmund Gerstein of Florida
Atlantic University for their valuable comments about the research, and Drs. Laela Sayigh and
Mary Ann Daher of Woods Hole Oceanographic Institution (WHOI) for their assistance in our
acquisition of the acoustic data used in this research.
References
[1] J. H. Poole, K. Payne, Jr.W.R. Langbauer, C.J. Moss, The social context of some very low
frequency calls of African elephants, Behav. Ecol. Sociobiol. 22 (1988) 385–392.
[2] B.L. Sjare, T.G. Smith, The vocal repertoire of white whales, Delphinapterus leucas,
summering the Cunningham Inlet, Northwest Territories, Can. J. Zool. 64 (1986a) 407–415.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
13
[3] B.L. Sjare, T.G. Smith, The relationship between behavioral activity and underwater
vocalizations of the white whale, Delphinapterus leucas, Can. J. Zool. 64 (1986b) 2824–
2831.
[4] E.M Panova, R.A. Belikov, A.V. Agafonov, V.M. Bel'Kovich, The relationship between the
behavioral activity and the underwater vocalization of the beluga whale (Delphinapterus
leucas), Oceanology 52 (2012) 79-87.
[5] R. Ferrer-i-Cancho, B. McCowan, A law of word meaning in dolphin whistle types, Entropy
11(1995) 688–701.
[6] G. Rui, M. Chitre, S.H. Ong, E. Taylor, Automatic template matching for classification of
dolphin vocalizations, MTS/IEEE Kobe-Ocean (2008) 1-6.
[7] H. Carter Esch, L.S. Sayigh, J.E. Blum, R.S. Wells, Whistles as Potential Indicators of Stress
in Bottlenose Dolphins (Tursiopstruncatus), Journal of Mammalogy 90 (2009) 638-650.
[8] A.N. Propper, Sound emission and detection by Delphinids, in: L. M. Herman, Cetacean
Behavior: Mechanisms and Functions, Wiley, New York, 1980, pp. 1-52.
[9] P.L. Tyack, Development and social functions of signiture whistles in bottlenose dolphins
Tursiops truncatus, Bioacoustics 8 (1997) 21-46.
[10] J. J. Dreher, W.E. Evans, Cetacean communication, in: W.N. Tavolga, Marine Bioacoustics,
Pergamon Press, Oxford, 1964, pp. 373–393.
[11] M.C. Caldwell, D.K. Caldwell, P.L. Tyack, Review of the signature-whistle hypothesis for
the Atlantic Bottlenose dolphin, in: S. Leatherwood and R.R. Reeves, The Bottlenose
Dolphin, Academic Press, San Diego, 1990, pp. 199–234.
[12] V.M. Janik, P.J.B. Slater, Context-specific use suggests that bottlenose dolphin signature
whistles are cohesion calls, Anim. Behav. 56 (1998) 829–838.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
14
[13] C.T. Zahn, R.Z. Roskies, Fourier descriptors for plane closed curves, IEEE Trans.
Computers 21 (1972) 269-281.
[14] P.J. Van Otterloo, A Contour-Oriented Approach to Shape Analysis, Prentice-Hall,
Englewood Cliffs, New Jersy, 1991.
[15] E. Persoon, K-S. Fu, Shape discrimination using Fourier descriptors,” IEEE Trans. Systems,
Man and Cybernetics 7 (1977) 170-179.
[16] T.W. Rauber, Two-dimensional shape description, Technical Report: GRUNINOVA-RT-
10-94, University Nova de Lisboa, Portugal, 1994.
[17] R. Chellappa, R. Bagdazian, Fourier coding of image bouderies, IEEE Trans. Pattern
Analysis and Machine Intelligence 6 (1984) 102-105.
[18] G. Lu, A. Sajjanhar, Region-based shape representation and similarity measure suitable for
content-base image retrireval, Multimedia Systems 7 (1999) 165-174.
[19] C-L. Huang, D-H. Huang, A content-based image retrieval system,” Image and Vision
Computing 16 (1998) 149-163.
[20] H. Kauppinen, T. Seppänen, M. Pietikäinen, An experimental comparison of autoregressive
and Fourier-based descriptors in 2D shape classification,” IEEE Trans. on Pattern
Recognition Analysis and Machine Iintelligence 17 (1995) 201-208.
[21] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[22] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Knowledge
Discovery Data Mining 2 (1998) 121–167.
[23] I. Steinwart, A. Christmann, Support Vector Machines, Springer-Verlag, New York, 2008.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
15
[24] C.W. Hsu, C.J. Lin, A comparision of methods for multiclass support vector machines,
IEEE Trans. Neural Networks 13 (2002) 415-425.
[25] L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. Muller, E.
Sackinger, P. Simard, V. Vapnik, Comparison of classifier methods: A case study in
handwriting digit recognition, in Proceedings of the International Conference on Pattern
Recognition, (1994) 77-87.
Fig. 1. Block diagram of the proposed approach to classify different types of dolphin whistles
Fig. 2. (Color reproduction on the Web and black-and-white in print) Spectrogram of four
distinct types of dolphin whistles. Different patterns and their harmonics are observable as well
as the broadband vertical lines denoting clicks.
Figure
Fig. 3. The steps need to be taken in the pre-processing stage to isolate cleaner whistles
Fig. 4. (Color reproduction on the Web and black-and-white in print) The output of pre-
processing block. The spectrogram contains a nice whistle pattern where most of the noises
including clicks were filtered out.
Fig. 5. (Color reproduction on the Web and black-and-white in print) The whistle contour
extracted from the resulting denoised spectrogram indicated by dashed line.
Fig. 6. (Color reproduction on the Web and black-and-white in print) Linear separation of data
using the maximum margin criteria. The solid line denotes the optimal hyperplane and those
points lying on the dashed lines on both sides of the optimal hyperplane are called support
vectors which play the most important role in the performance of SVM classifier.
1
Table 1
Results accuracy of two classifiers using TFPs and FDs as the feature vectors
TFPs
FDs
KNN
(%)
Linear SVM
(%)
KNN
(%)
Linear SVM
(%)
First class 100 100 100 100
Second class 100 100 94 65
Third class 67 100 50 50
Fourth class 100 100 100 93
Table
2
Table 2
Confusion matrix of SVM classifier with RBF kernel defined as 2exp g g for 0.2
using FDs as feature vectors
First class
(%)
Second class
(%)
Third class
(%)
Fourth class
(%)
First class 100 0 0 0
Second class 0 65 0 35
Third class 0 0 100 0
Fourth class 0 7 0 93
3
Table 3
Confusion matrix of SVM classifier with 3rd
degree polynomial kernel defined as 0( . )dg g c
for 0 0, 3c d using FDs as feature vectors
First class
(%)
Second class
(%)
Third class
(%)
Fourth class
(%)
First class 100 0 0 0
Second class 0 93 0 7
Third class 0 100 0 0
Fourth class 0 14 0 86