Applied Acoustics Paper

Elsevier Editorial System(tm) for Applied Acoustics Manuscript Draft

Manuscript Number:

Title: On contour-based classification of dolphin whistles by type

Article Type: Research Paper

Section/Category: Contributors in the America

Keywords: Support vector machines; Fourier descriptor; Non-linear kernels; Pattern recognition; Dolphin whistles

Corresponding Author: Mr. Mahdi Esfahanian, Ph.D Candidate

Corresponding Author's Institution: Florida Atlantic University

First Author: Mahdi Esfahanian, Ph.D Candidate

Order of Authors: Mahdi Esfahanian, Ph.D Candidate; Hanqi Zhuang, Ph.D; Nurgun Erdol, Ph.D

Abstract: Classification of cetacean vocalizations may help marine biologists study their behavioral context in different environments yet automatic classification of vocalizations for their information content has not been adequately addressed in the literature. Since classifier performance has a strong dependence on the extent to which features cluster, we, in this paper, explore the effect of two feature sets on two classifiers and assess their performance and computational complexity. We choose two feature sets that are exemplary of very different methods: The first set consists of Tempo-Frequency Parameters (TFPs) that are hand-picked to describe the spectral whistle contours. The second feature set embodies spectral information measured with the Fourier Descriptors (FD) commonly used in image processing for contour representation. The computed feature vectors are fed into the K-nearest neighbor (KNN) and Support Vector Machine (SVM) classification algorithms. The KNN in its basic form is a simple classifier that works well if feature clusters have clear margins and SVM uses a data dependent margin chosen for optimal performance. We argue that KNN serves to accentuate the effect of the feature sets and the SVM acts as the scientific process control. Experimental results show best results with the combination of the TFP feature extractor and the SVM classifier, suggesting a future research direction of developing non-linear kernels for SVM.

On contour-based classification of dolphin whistles by type

Mahdi Esfahanian*, [email protected]

Hanqi Zhuang, [email protected]

Nurgun Erdol, [email protected]

Department of Electrical and Computer Engineering and Computer Science, Florida Atlantic

University, 777 Glades Road, Boca Raton, FL 33431, USA

* Corresponding author. Tel: +1 561 929 6394

E-mail address: [email protected]

1. Introduction

The Southeast National Marine Renewable Energy Center within the College of Engineering and

Computer Science at Florida Atlantic University is investigating the potential of harnessing

ocean currents such as the Gulf Stream to generate base-load electricity, thereby making a

unique contribution to a broadly diversified portfolio of renewable energy for the nation’s future.

Commercial turbines are being considered for installation in the Straits of Florida which is home

to a large population of the Atlantic bottlenose dolphin (Tursiops truncatus), and several species

of endangered whales are known to migrate through the region. Farms of underwater turbines

with rotor diameters of 20 meters may disrupt migration patterns and will likely act as fish

aggregation devices, attracting predators such as the dolphin. The ability to classify vocalization

patterns of these marine mammals for signs of distress and disruption and distortion of their

communication whistles will prove to be invaluable in predicting changes in their natural life

cycle and health and in developing turbine control measures to protect them.

The overreach of anthropogenic activity in the oceans is not unique to the Florida coast. The

near-shore location of the wave resource along the coast of northern California and Oregon, for

example, is known to be a migration route for the California gray whale (Eschrichtius robustus),

and placement of extensive "farms" of wave harvesting equipment, including networks of cables,

could present obstacles to the historical migration patterns of this protected species. Studying the

vocal patterns in the vicinity of planned installations would provide vital input to mitigate

possible disruptions to their behavior patterns.

Association of vocalizations with natural communication and behavioral messages has been

the subject of research in bioacoustics for nearly three decades [1-3]. Recent studies matching

behavioral activity to vocalizations have been reported in [4] by Panova et al. who investigated

the relationship between different whistle types, pulsed tones, click series and noise vocalizations

and six behavioral contexts by using the Kruskal-Wallis criterion and the Mann-Whitney U-test.

Test results of Fisher’s exact test and the Pearson test to correlate dolphin whistles and 15

behavioral descriptions were reported in [5]. Classification of dolphin whistle types is considered

a major step for biologists studying dolphin recognition and behavior [6]. Acoustic parameters

of whistles were analyzed as potential indicators of stress in bottlenose dolphins under various

capture-release events and undisturbed conditions [7].

Dolphin whistles are narrow-band frequency-modulated sounds centered on a time-varying

fundamental frequency, commonly with several harmonic components [8]. They are believed to

convey information and support communication, and they have sufficiently

distinguishing features that they may be categorized by function [9]. Dreher and Evans [10]

argued that dolphins shared a function-specific repertoire of whistles, and each whistle was used

in a particular behavioral context. The possibility of individual “dolphin identification” is also

considered based on the existence of a distinctive signature whistle transmitting the identity

information independent of the caller’s voice or locations [11, 12].

Based on the dataset used in the research, we have identified four types of whistle signatures

in the time-frequency domain to comprise an exhaustive set to classify bottlenose dolphin

whistles. Time-frequency spectrograms are convenient displays of long and short term data

characteristics and capture their time-evolutionary spectral features. Treating them as images

avails us of the use of archival image processing techniques. This paper proposes using Fourier

descriptors (FD) on the spectrogram as a means of extracting contour features for their automatic

classification. Fourier descriptors are translation-, scale-, rotation-, and start-point invariant

signature vectors of contours, edges, and similar shapes encountered in images. We compare the

performance of the FD feature vectors with the more direct but tedious method of deriving

Tempo-Frequency Parameters (TFPs) from traced contours. The classifiers we have chosen are

the K-Nearest Neighbor (KNN) and the Support Vector Machine (SVM) classifiers. We have

chosen SVM because its class boundaries are data adaptive and are chosen to maximize the

margin width between them. Increasing the margin width increases the measure of confidence in

classifiers and reduces misclassification. KNN is a simple classifier that functions on majority

voting. It is chosen as a “control” element and its output is used as a reference for judging

SVM’s performance. To the authors’ best knowledge, this is the first paper that reports the use

of the FD as whistle contour feature vectors and the first paper to use the SVM to classify whistle

types of dolphins and whales. That we have applied them to the classification of bottlenose

dolphins’ vocalizations by type does not preclude their application to classification by species or

to other vocalizations displaying narrowband contours. To identify dolphin whistle types

effectively, we tested combinations of the two feature extraction and two classification

algorithms under consideration with the data measured by passive acoustic sensors and report the

results and their analysis.

The steps involved in the automatic classification of dolphin whistles are detection, pre-

processing, contour tracing, feature extraction and classification (refer to Fig. 1). In the

preprocessing stage we use two-dimensional band-pass filters to isolate the fundamental whistle

and denoise the band-limited spectrogram. Under controlled data acquisition conditions, and

when real-time or even pseudo real-time processing is not the objective, the easiest way to denoise

is to trace the whistle contour in the spectrogram. The isolation significantly improves the

overall signal-to-noise ratio.

The remainder of the paper is organized as follows. Section 2 describes the pre-processing and

tracing stages. The outputs are spectrograms of the isolated fundamental components of dolphin

whistles after they have been time windowed and band-pass filtered. Section 3 describes

algorithms for extracting relevant features. The classification methods relevant to this research

are presented in Section 4. Discussion of results from comparative experimental studies is given

in Section 5.

2. Preliminaries: Representation, Denoising and Extracting Whistle Contours

An effective method of displaying dolphin vocalizations is the time-frequency domain

spectrogram. Spectrogram images of dolphin whistles show (refer to Fig. 2) that each whistle is

adequately modeled by a Fourier series with a time-varying fundamental frequency. The

frequency of the fundamental whistle varies between 4 and 20 kHz over 250-500 ms, with

2-4 harmonics mimicking the frequency sweep at integer multiples of the fundamental. The

harmonic relationship renders the higher harmonics redundant for characterization of the whistle

so that we only need to work with the fundamental whistle. The data corpus used in this study shows

that four distinct whistle shapes define an exhaustive set of classes, although the proposed method

is scalable to include more classes presented by a larger corpus.

Preprocessing steps depicted in Fig. 3 are used in order to extract contours of the fundamental

whistle and include a cascade of high- and low-pass filters that limit the useful frequency range

to 4 - 16 kHz. This is a generous bandwidth to contain all the fundamental whistles we have

observed in our data set and to exclude low frequency environmental noise. Clicks and some

parts of the second harmonic that remain in the pass-band are smoothed by a running-average

filter. Reduced background noise and the absence of nearly all the non-fundamental harmonics

are clearly visible. In this work all spectrograms were obtained using a Hamming window of

1024 samples (~13 ms windows) with 50% overlap. It is also assumed that the start and end points

of dolphin whistles have been previously identified.
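The framing step behind these spectrograms can be sketched as follows. This is a minimal illustration, assuming only the stated window length (1024 samples) and 50% overlap; the function names and the constant test signal are hypothetical, and the Fourier transform of each frame, the band-pass filtering and the smoothing are left out.

```python
import math

def hamming(n):
    """Hamming window of n samples (standard 0.54 / 0.46 coefficients)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_starts(num_samples, win_len=1024, overlap=0.5):
    """Start index of every full frame for the given window length and overlap."""
    hop = int(win_len * (1 - overlap))  # 512 samples for 50% overlap
    return list(range(0, num_samples - win_len + 1, hop))

def windowed_frames(signal, win_len=1024, overlap=0.5):
    """Cut the signal into overlapping frames and taper each with the window."""
    window = hamming(win_len)
    return [
        [s * w for s, w in zip(signal[start:start + win_len], window)]
        for start in frame_starts(len(signal), win_len, overlap)
    ]

# Hypothetical input: 4096 samples of a constant signal (not real whistle data).
signal = [1.0] * 4096
frames = windowed_frames(signal)
print(len(frames), len(frames[0]))
```

Each tapered frame would then be Fourier transformed to give one column of the spectrogram.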

In most cases, fundamentals remain as the highest intensity contours, as shown in Fig. 4; thus

the contour can be traced from the frequency location of the peak as a function of time. Spectral

peak picking along each temporal frame, with some modifications when the second harmonic’s

peak competes, yields contours such as the one shown by the dashed line in Fig. 5.
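A minimal sketch of spectral peak picking is given below. It assumes the spectrogram is already band-limited and stored as a list of time frames; the name `trace_contour` and the toy values are hypothetical. The modifications needed when the second harmonic competes are not included.

```python
def trace_contour(spectrogram, freqs_hz):
    """Return the peak frequency (Hz) of each time frame.

    spectrogram: list of frames, each a list of magnitudes per frequency bin.
    freqs_hz: centre frequency of each bin, same length as a frame.
    """
    contour = []
    for frame in spectrogram:
        # Index of the strongest bin in this frame.
        peak_bin = max(range(len(frame)), key=lambda k: frame[k])
        contour.append(freqs_hz[peak_bin])
    return contour

# Toy band-limited spectrogram: 4 frames x 5 bins (values are made up).
freqs_hz = [4000, 7000, 10000, 13000, 16000]
spectrogram = [
    [0.1, 0.9, 0.2, 0.1, 0.0],
    [0.1, 0.3, 0.8, 0.1, 0.0],
    [0.0, 0.2, 0.4, 0.7, 0.1],
    [0.0, 0.1, 0.6, 0.3, 0.1],
]
contour = trace_contour(spectrogram, freqs_hz)
print(contour)
```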

3. Feature Extraction

Feature extraction for whistle identification may be parametric or non-parametric. In the first

category, we derive seven Tempo-Frequency Parameters that describe the contour shape in the

spectrogram. They are defined as follows: (1) minimum frequency, (2) maximum frequency, (3) start

frequency, (4) end frequency, (5) frequency range, (6) time duration, (7) number of inflection

points. The last parameter is defined as the number of times the slope of the whistle contour

changes sign over time.
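The seven parameters can be computed from a traced contour along the following lines. This is a sketch under stated assumptions: the contour is given as parallel lists of times and frequencies, flat steps (zero slope) are ignored when counting sign changes, and the function name, dictionary keys and sample values are hypothetical.

```python
def tempo_frequency_parameters(times_s, freqs_hz):
    """Compute the seven TFPs of a traced contour."""
    slopes = [b - a for a, b in zip(freqs_hz, freqs_hz[1:])]
    # Keep only the sign of each non-zero slope (assumption: flat steps are skipped).
    signs = [1 if s > 0 else -1 for s in slopes if s != 0]
    inflections = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return {
        "min_freq": min(freqs_hz),
        "max_freq": max(freqs_hz),
        "start_freq": freqs_hz[0],
        "end_freq": freqs_hz[-1],
        "freq_range": max(freqs_hz) - min(freqs_hz),
        "duration": times_s[-1] - times_s[0],
        "inflection_points": inflections,
    }

# Made-up contour: rises, falls, then rises again (two slope sign changes).
times_s = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25]
freqs_hz = [6000, 9000, 12000, 10000, 8000, 11000]
tfp = tempo_frequency_parameters(times_s, freqs_hz)
print(tfp)
```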

The non-parametric features are distilled from a transformation of the data that concentrates

the defining information into vectors that cluster effectively. In this category, we use the Fourier

Descriptor, which is an image processing technique for extracting signatures of image shapes.

Many FD methods have been utilized in areas such as shape analysis [13, 14], character

recognition [15, 16], shape coding [17], and shape retrieval [18, 19]. FD coefficients

characterize the whistle shapes to be used in the classification stage. For pedagogical reasons and

completeness, we present a brief description of the scheme next.

The key strategy is to consider the $i$th point on the whistle contour as a complex number

$$z_i = u_i + j v_i \qquad (1)$$

where the contour coordinates $u_i$ and $v_i$ are time and frequency, respectively, in a spectrogram.

The Fourier descriptors are derived from the discrete Fourier transform samples $q_k$ of $z_i$. FDs

can be made translation-, scale-, rotation-, and start-point invariant hence are good signature

vectors for whistle contours. To make complex coordinates signature invariant to the position,

all the N descriptors except the first one (DC component) are needed to index the whistle shape.

Since the DC component depends only on the position of the shape, it is not useful in describing

shape and is thus discarded [20]. As most of the dolphin whistles are within a given frequency

range, scale normalization is achieved by dividing the magnitude values of all the other

descriptors by the magnitude value of the second descriptor. The invariant feature vector used to

index the shape is then given by:

$$f = \left[ \frac{|q_2|}{|q_1|}, \frac{|q_3|}{|q_1|}, \ldots, \frac{|q_{N-1}|}{|q_1|} \right] \qquad (2)$$

where f is the feature vector of whistle needed for classification. Directly related to the Discrete

Fourier transform, the lower frequency FD samples represent the general features of the shape,

and the higher frequency descriptors characterize fine details of the shape. Although the number

of coefficients generated from the transform is usually large, a subset of the coefficients is

usually sufficient to capture the overall features of a shape.
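The descriptor computation can be sketched as follows, assuming the DC term is indexed as $q_0$ so that the second descriptor is $q_1$. The function name, the optional truncation argument `n_keep` and the sample contour are hypothetical, and a plain discrete Fourier transform is used for clarity rather than speed.

```python
import cmath

def fourier_descriptors(times, freqs, n_keep=None):
    """Magnitude-normalized Fourier descriptors of a contour.

    Each contour point is treated as z_i = u_i + j*v_i, the DFT samples q_k
    are computed, the DC term q_0 is dropped, and the remaining magnitudes
    are divided by |q_1|.
    """
    z = [complex(u, v) for u, v in zip(times, freqs)]
    n = len(z)
    q = [
        sum(z[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]
    mags = [abs(c) for c in q]
    features = [m / mags[1] for m in mags[2:]]  # |q_2|/|q_1| ... |q_{N-1}|/|q_1|
    return features if n_keep is None else features[:n_keep]

# Made-up contour in arbitrary units.
times = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [5, 6, 8, 9, 8, 6, 5, 4]
base = fourier_descriptors(times, freqs)

# The same contour shifted and uniformly scaled gives the same feature vector.
moved = fourier_descriptors([2 * t + 100 for t in times], [2 * f + 3 for f in freqs])
print(max(abs(a - b) for a, b in zip(base, moved)))
```

The last lines illustrate the translation and scale invariance: a shift changes only the discarded DC term, and a uniform scaling cancels in the ratio.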

4. Classification Methods

KNN and SVM are the two classifiers used in this research. The KNN algorithm classifies an

object to the class of the majority of its K nearest neighbors. The method fails for objects that are

on or near class boundaries unless the margin between the class boundaries is of sufficient width,

and misclassified objects bias the subsequent classifications.
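A minimal sketch of the majority-voting rule is shown below. The Euclidean distance, the value K = 3 and the toy feature vectors are assumptions made for illustration; the text does not state which distance or K was used.

```python
import math
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Assign the query to the majority class among its k nearest neighbors."""
    # Sort training samples by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy two-class training set (made-up feature vectors).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [1, 1, 1, 2, 2, 2]
print(knn_classify(train_x, train_y, (0.5, 0.5)))
print(knn_classify(train_x, train_y, (5.5, 5.5)))
```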

The SVM [21] is a decision-making algorithm whose class boundaries are data adaptive and are

chosen to maximize the margin width between the classes. Its decision surfaces are

obtained directly from the training data [22]. An illustration of choosing the maximum margin

for objects in two dimensions is given in Fig. 6. Of the infinite number of straight lines that

separate the data, the one chosen maximizes a distance measure to the closest data points on the

boundary of each class. The dashed lines define the boundaries on each side, and the region

between them is called the margin. The data points on the boundaries are the support vectors.

They are the closest points to the decision boundary (separating hyperplane) and play the most

important role in determining the margin by which the two classes are distinguished.

Let the training inputs $x_i$, $i = 1, \ldots, M$, have the associated labels $y_i = 1$ for Class 1 and

$y_i = -1$ for Class 2. If these data are linearly separable, one can define the hyperplane

$$D(x) = w^T x + b \qquad (3)$$

where $w$ is the weighting vector perpendicular to the hyperplane and $b$ is the scalar bias. The

distance between the dashed boundary planes is inversely proportional to $\|w\|$ and the

optimization problem is equivalent to minimizing the function

$$Q(w, b) = \frac{1}{2} \|w\|^2 \qquad (4)$$

subject to the constraints that the training data satisfy

$$y_i (w^T x_i + b) \geq 1, \quad \text{for } i = 1, \ldots, M. \qquad (5)$$

The choice of the Euclidean norm $\|w\|$ in Eq. (4) makes it possible to solve the problem with a

quadratic programming algorithm. The optimal hyperplane is a function of the support vectors

satisfying Eq. (5) and therefore adapts to new training data. The minimization problem of Eq. (4)

may be solved by standard optimization procedures.
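The roles of the decision function, the constraints of Eq. (5) and the support vectors can be checked numerically on a small example. The sketch below does not solve the quadratic program; it assumes a hand-picked $w$ and $b$ for a made-up two-dimensional data set and uses the standard result that the margin width equals $2/\|w\|$. All function names are hypothetical.

```python
import math

def decision(w, b, x):
    """Linear decision function D(x) = w^T x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the two boundary planes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_constraints(w, b, xs, ys):
    """Check y_i (w^T x_i + b) >= 1 for every training sample."""
    return all(y * decision(w, b, x) >= 1 - 1e-12 for x, y in zip(xs, ys))

def support_vectors(w, b, xs, ys, tol=1e-9):
    """Samples lying exactly on a boundary plane, where the constraint is tight."""
    return [x for x, y in zip(xs, ys) if abs(y * decision(w, b, x) - 1) < tol]

# Made-up separable data and a hand-picked hyperplane (not a trained model).
xs = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
ys = [1, 1, -1, -1]
w, b = (0.5, 0.5), -1.0

print(satisfies_constraints(w, b, xs, ys))
print(support_vectors(w, b, xs, ys))
print(margin_width(w))
```

In this example the two support vectors are the closest points of the two classes, and the margin width equals the distance between them.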

It is worthwhile to point out that for soft-margin problems where training samples are linearly

inseparable, a nonnegative slack variable $\xi_i$ is added to the constraints and the objective function.

If $0 < \xi_i < 1$, the corresponding training sample does not attain the maximum margin. Also, the aforementioned

linear discriminant function may not work well for some data sets. So a nonlinear kernel may be

introduced to map the data into a higher dimensional space by following Mercer’s theorem.

Polynomial kernel, Gaussian (RBF) kernel and sigmoid kernel are some examples frequently

used by researchers [23].
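The three kernels can be written compactly as below. The functional forms are the conventional ones and are an assumption here; in particular, the sigmoid kernel is taken to be the hyperbolic tangent, as is common in SVM software. The default parameter values (0.2, 0 and 3) are those quoted in the captions of Tables 2-4, and the function names are hypothetical.

```python
import math

def dot(g, h):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(g, h))

def rbf_kernel(g, h, gamma=0.2):
    """Gaussian (RBF) kernel exp(-gamma * ||g - h||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(g, h)))

def polynomial_kernel(g, h, c0=0.0, d=3):
    """Polynomial kernel (g . h + c0)^d."""
    return (dot(g, h) + c0) ** d

def sigmoid_kernel(g, h, gamma=0.2, c0=0.0):
    """Sigmoid kernel, assumed here to be tanh(gamma * (g . h) + c0)."""
    return math.tanh(gamma * dot(g, h) + c0)

# Made-up feature vectors.
g, h = (1.0, 2.0), (3.0, 4.0)
print(rbf_kernel(g, h), polynomial_kernel(g, h), sigmoid_kernel(g, h))
```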

The SVM classifier is usually known as a binary classifier and may not be used directly for

problems with more than two classes [24]. A conventional remedy is the

so-called one-against-all method [25], which separates one class from all the others in turn,

so that M binary SVMs are trained to classify M classes. In our study, after whistles of the

$i$th class in both training and testing data sets are extracted, feature vectors of training samples

are fed into the $i$th SVM to find the optimal hyperplane. In this technique, support vector weights

and bias are calculated for each class. Subsequently the testing data from different classes go

through the SVM trained for the $i$th class and are classified according to their membership in the

$i$th class.
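The one-against-all scheme can be sketched as follows for linear SVMs. The decision rule used here, assigning the sample to the class whose SVM returns the largest decision value, is a common convention and an assumption; it is not necessarily the exact membership rule used in this study. The weights and biases are hypothetical, not trained values.

```python
def one_against_all_predict(models, x):
    """Pick the class whose binary SVM gives the largest decision value.

    models: list of (w, b) pairs, one linear SVM per class, where the i-th
    SVM was trained to separate class i from all other classes.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in models]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores

# Hypothetical (w, b) pairs for three classes in a two-dimensional feature space.
models = [((1.0, 0.0), 0.0), ((-1.0, 0.0), 0.0), ((0.0, 1.0), -2.0)]
print(one_against_all_predict(models, (3.0, 1.0)))
print(one_against_all_predict(models, (0.0, 5.0)))
```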

5. Results and Discussions

To evaluate the performance of the proposed algorithms, a collection of 100 dolphin whistles,

each belonging to one of four different types, was extracted from underwater passive acoustic

recordings of bottlenose dolphins. To avoid any bias in selection of the training and testing

samples, all were randomly chosen. Half of this collection was declared as the training set and

the remaining half used as testing data. Only those whistles with traceable contours were

analyzed so that tracing would not be a factor in testing the overall whistle-type classification

scheme.

The classification results using the FDs and the TFPs (refer to Section 3) are displayed in

Table 1. Overall classifier accuracy for KNN is 92% with FD and 96% with TFP, whereas

accuracy scores with linear SVM show a large variation at 80% with FD and 100% with TFP.

Even though we worked with a small size corpus, the difference is large enough to be significant.

One may conjecture from the above results that using physical parameters extracted from the

whistle contours as features has the potential of providing more accurate and reliable

classifications. These parameters are closer to the cues a human expert uses in determining which whistle

belongs to which class.

As to the selection of classifiers for dolphin whistle-type classification, there are often cases

where data in the feature space may not be linearly separable, causing misclassifications. KNN,

which relies on majority voting among nearest neighbors, is not very effective when classes overlap near their boundaries. On the

other hand, SVM as one of the promising classification methods has the ability to utilize

nonlinear kernel functions instead of linear ones to overcome this issue. However, depending on

a specific problem, some kernels may perform better than others. In this study, in order to seek a

better kernel for whistle classification, three nonlinear kernels, Radial Basis Function (RBF)

kernel, polynomial kernel and sigmoid kernel, were considered. The confusion matrices in

Tables 2-4 illustrate that when the FDs were used as features, SVM with the RBF kernel, the

polynomial kernel, and the sigmoid kernel achieved an accuracy of 86%, 82% and 60%,

respectively. Another important point is that the RBF SVM is the only FD-based classifier capable of

classifying every whistle of the third type correctly. On the other hand, the sigmoid

SVM has the highest accuracy for second-class whistles despite its low overall performance.

As to why the combination of TFPs and SVMs performs the best, we conjecture that the

reason lies in the fact that the TFP features are hand-picked for the task guaranteeing near perfect

success similar to a human classifier. The task of classifying dolphin vocalizations by types, like

human speech recognition, is best done by humans, and the goal of machine classification is to

match the human performance. The TFP set has seven elements that were chosen by

spectrogram observation and analysis as well as consideration of all the features that completely

describe the contour. One could envision a non-linear method of extracting TFP features from the

data set. Performance of the SVM classifier using non-linear kernels on the nonlinear features

would be indicative of the evolution of its success.

6. Conclusions

A systematic scheme has been proposed toward the goal of automatic classification of dolphin

whistle types. The concept of categorizing feature extraction algorithms into two groups, one

based on tempo-frequency characteristics of dolphin whistles and another on their global spectral

information, has been presented. As an example of each group of feature extraction algorithms, the TFP

and FD feature extractors have been introduced. For classification, the SVM algorithm has been

outlined. To investigate the effectiveness of the proposed approach, an experimental study based

on the data collected by passive acoustic sensors has been conducted. The experimental results

show that the combination of the TFP feature extractor and the kernel SVM classifier

outperforms any other options tested. We conjecture that it is mainly due to the fact that the

seven tempo-frequency parameters are nonlinear in nature, which integrates naturally with the

SVM with an RBF kernel. The application of these algorithms on a more expansive set of

vocalizations from a wider variety of marine mammals is left as future work. In our future

research, we will also explore ways of classifying dolphin whistle types without tracing contours

from spectrograms of dolphin whistles.

Acknowledgments

The authors gratefully acknowledge the financial support of the Southeast National Marine Renewable

Energy Center for this research. We thank Drs. George Frisk and Edmund Gerstein of Florida

Atlantic University for their valuable comments about the research, and Drs. Laela Sayigh and

Mary Ann Daher of Woods Hole Oceanographic Institution (WHOI) for their assistance in our

acquisition of the acoustic data used in this research.

References

[1] J.H. Poole, K. Payne, W.R. Langbauer Jr., C.J. Moss, The social context of some very low

frequency calls of African elephants, Behav. Ecol. Sociobiol. 22 (1988) 385–392.

[2] B.L. Sjare, T.G. Smith, The vocal repertoire of white whales, Delphinapterus leucas,

summering in Cunningham Inlet, Northwest Territories, Can. J. Zool. 64 (1986a) 407–415.

[3] B.L. Sjare, T.G. Smith, The relationship between behavioral activity and underwater

vocalizations of the white whale, Delphinapterus leucas, Can. J. Zool. 64 (1986b) 2824–

2831.

[4] E.M. Panova, R.A. Belikov, A.V. Agafonov, V.M. Bel'Kovich, The relationship between the

behavioral activity and the underwater vocalization of the beluga whale (Delphinapterus

leucas), Oceanology 52 (2012) 79-87.

[5] R. Ferrer-i-Cancho, B. McCowan, A law of word meaning in dolphin whistle types, Entropy

11 (2009) 688–701.

[6] G. Rui, M. Chitre, S.H. Ong, E. Taylor, Automatic template matching for classification of

dolphin vocalizations, MTS/IEEE Kobe-Ocean (2008) 1-6.

[7] H. Carter Esch, L.S. Sayigh, J.E. Blum, R.S. Wells, Whistles as Potential Indicators of Stress

in Bottlenose Dolphins (Tursiops truncatus), Journal of Mammalogy 90 (2009) 638-650.

[8] A.N. Popper, Sound emission and detection by Delphinids, in: L.M. Herman, Cetacean

Behavior: Mechanisms and Functions, Wiley, New York, 1980, pp. 1-52.

[9] P.L. Tyack, Development and social functions of signature whistles in bottlenose dolphins

Tursiops truncatus, Bioacoustics 8 (1997) 21-46.

[10] J.J. Dreher, W.E. Evans, Cetacean communication, in: W.N. Tavolga, Marine Bioacoustics,

Pergamon Press, Oxford, 1964, pp. 373–393.

[11] M.C. Caldwell, D.K. Caldwell, P.L. Tyack, Review of the signature-whistle hypothesis for

the Atlantic Bottlenose dolphin, in: S. Leatherwood and R.R. Reeves, The Bottlenose

Dolphin, Academic Press, San Diego, 1990, pp. 199–234.

[12] V.M. Janik, P.J.B. Slater, Context-specific use suggests that bottlenose dolphin signature

whistles are cohesion calls, Anim. Behav. 56 (1998) 829–838.

[13] C.T. Zahn, R.Z. Roskies, Fourier descriptors for plane closed curves, IEEE Trans.

Computers 21 (1972) 269-281.

[14] P.J. Van Otterloo, A Contour-Oriented Approach to Shape Analysis, Prentice-Hall,

Englewood Cliffs, New Jersey, 1991.

[15] E. Persoon, K-S. Fu, Shape discrimination using Fourier descriptors, IEEE Trans. Systems,

Man and Cybernetics 7 (1977) 170-179.

[16] T.W. Rauber, Two-dimensional shape description, Technical Report: GRUNINOVA-RT-

10-94, University Nova de Lisboa, Portugal, 1994.

[17] R. Chellappa, R. Bagdazian, Fourier coding of image boundaries, IEEE Trans. Pattern

Analysis and Machine Intelligence 6 (1984) 102-105.

[18] G. Lu, A. Sajjanhar, Region-based shape representation and similarity measure suitable for

content-based image retrieval, Multimedia Systems 7 (1999) 165-174.

[19] C-L. Huang, D-H. Huang, A content-based image retrieval system, Image and Vision

Computing 16 (1998) 149-163.

[20] H. Kauppinen, T. Seppänen, M. Pietikäinen, An experimental comparison of autoregressive

and Fourier-based descriptors in 2D shape classification, IEEE Trans. Pattern

Analysis and Machine Intelligence 17 (1995) 201-208.

[21] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.

[22] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Knowledge

Data Mining and Knowledge Discovery 2 (1998) 121–167.

[23] I. Steinwart, A. Christmann, Support Vector Machines, Springer-Verlag, New York, 2008.

[24] C.W. Hsu, C.J. Lin, A comparison of methods for multiclass support vector machines,

IEEE Trans. Neural Networks 13 (2002) 415-425.

[25] L. Bottou, C. Cortes, J. Denker, H. Drucker, I. Guyon, L. Jackel, Y. LeCun, U. Muller, E.

Sackinger, P. Simard, V. Vapnik, Comparison of classifier methods: A case study in

handwritten digit recognition, in Proceedings of the International Conference on Pattern

Recognition, (1994) 77-87.

Fig. 1. Block diagram of the proposed approach to classify different types of dolphin whistles

Fig. 2. (Color reproduction on the Web and black-and-white in print) Spectrogram of four

distinct types of dolphin whistles. Different patterns and their harmonics are observable as well

as the broadband vertical lines denoting clicks.

Fig. 3. The steps taken in the pre-processing stage to isolate cleaner whistles

Fig. 4. (Color reproduction on the Web and black-and-white in print) The output of the

pre-processing block. The spectrogram contains a clean whistle pattern in which most of the noise,

including clicks, has been filtered out.

Fig. 5. (Color reproduction on the Web and black-and-white in print) The whistle contour

extracted from the resulting denoised spectrogram, indicated by the dashed line.

Fig. 6. (Color reproduction on the Web and black-and-white in print) Linear separation of data

using the maximum margin criterion. The solid line denotes the optimal hyperplane, and the

points lying on the dashed lines on both sides of the optimal hyperplane are called support

vectors, which play the most important role in the performance of the SVM classifier.

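Table 1

Classification accuracy of the two classifiers using TFPs and FDs as the feature vectors

| Whistle type | TFPs, KNN (%) | TFPs, linear SVM (%) | FDs, KNN (%) | FDs, linear SVM (%) |
| --- | --- | --- | --- | --- |
| First class | 100 | 100 | 100 | 100 |
| Second class | 100 | 100 | 94 | 65 |
| Third class | 67 | 100 | 50 | 50 |
| Fourth class | 100 | 100 | 100 | 93 |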

Table 2

Confusion matrix of SVM classifier with RBF kernel defined as $\exp(-\gamma \|g - g'\|^2)$ for $\gamma = 0.2$, using FDs as feature vectors

| | First class (%) | Second class (%) | Third class (%) | Fourth class (%) |
| --- | --- | --- | --- | --- |
| First class | 100 | 0 | 0 | 0 |
| Second class | 0 | 65 | 0 | 35 |
| Third class | 0 | 0 | 100 | 0 |
| Fourth class | 0 | 7 | 0 | 93 |

Table 3

Confusion matrix of SVM classifier with 3rd degree polynomial kernel defined as $(g \cdot g' + c_0)^d$ for $c_0 = 0$, $d = 3$, using FDs as feature vectors

| | First class (%) | Second class (%) | Third class (%) | Fourth class (%) |
| --- | --- | --- | --- | --- |
| First class | 100 | 0 | 0 | 0 |
| Second class | 0 | 93 | 0 | 7 |
| Third class | 0 | 100 | 0 | 0 |
| Fourth class | 0 | 14 | 0 | 86 |

Table 4

Confusion matrix of SVM classifier with sigmoid kernel defined as $\mathrm{sigmoid}(\gamma (g \cdot g') + c_0)$ for $\gamma = 0.2$, $c_0 = 0$, using FDs as feature vectors

| | First class (%) | Second class (%) | Third class (%) | Fourth class (%) |
| --- | --- | --- | --- | --- |
| First class | 100 | 0 | 0 | 0 |
| Second class | 0 | 100 | 0 | 0 |
| Third class | 83 | 17 | 0 | 0 |
| Fourth class | 0 | 100 | 0 | 0 |