VOCAL TRACT FLEXIBILITY AND VARIATION IN THE VOCAL OUTPUT IN WILD INDRIS

This article was downloaded by: [Universita degli Studi di Torino] On: 26 April 2012, At: 02:38 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Bioacoustics: The International Journal of Animal Sound and its Recording Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tbio20 VOCAL TRACT FLEXIBILITY AND VARIATION IN THE VOCAL OUTPUT IN WILD INDRIS MARCO GAMBA a , LIVIO FAVARO a , VALERIA TORTI a , VIVIANA SORRENTINO a & CRISTINA GIACOMA a a Department of Animal and Human Biology, University of Torino, Via Accademia Albertina 13, 10123, Torino, Italy Available online: 13 Apr 2012 To cite this article: MARCO GAMBA, LIVIO FAVARO, VALERIA TORTI, VIVIANA SORRENTINO & CRISTINA GIACOMA (2011): VOCAL TRACT FLEXIBILITY AND VARIATION IN THE VOCAL OUTPUT IN WILD INDRIS, Bioacoustics: The International Journal of Animal Sound and its Recording, 20:3, 251-265 To link to this article: http://dx.doi.org/10.1080/09524622.2011.9753649 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/ terms-and-conditions



Bioacoustics The International Journal of Animal Sound and its Recording, 2011, Vol. 20, pp. 251-266 © 2011 AB Academic Publishers

VOCAL TRACT FLEXIBILITY AND VARIATION IN THE VOCAL OUTPUT IN WILD INDRIS

MARCO GAMBA, LIVIO FAVARO, VALERIA TORTI, VIVIANA SORRENTINO AND CRISTINA GIACOMA

Department of Animal and Human Biology, University of Torino, Via Accademia Albertina 13, 10123, Torino, Italy

INTRODUCTION

Acoustic output in human speech is commonly considered to result from the combination of a source of sound energy modulated by the transfer function determined by the supralaryngeal vocal tract. This process is central to the source-filter theory of voice production (Fant 1960) and applies well to the description of vocal sounds produced by other terrestrial mammals (see McComb & Reby 2009 for a review). The theory postulates that phonation consists of the combination of two independent events. First, air passes from the lungs through the glottis, where a harmonic signal ("source signal") is generated by the vibration of the vocal folds. Then the source signal passes through the supralaryngeal vocal cavities, where vocal tract resonance properties selectively attenuate or reinforce certain harmonics of the source signal. In this second step, the vocal tract acts as a filter. The most emphasized frequencies are called 'formants'.

In non-human primates, including lemurs, the resulting signal, a linear combination of the two independent mechanisms described above, finally radiates through the mouth or nostrils into the environment (Fitch & Hauser 1995). Previous research also demonstrated that articulation of the mandibles may affect formants (Gamba & Giacoma 2006); thus the acoustic structure of a vocalization is directly affected by both the anatomy of the vocal tract and the articulation of the vocal apparatus.

From voice research, we know that the length of the vocal tract of a speaker affects both formant position and formant spacing (Titze 1994). In humans, singers often open their mouths as wide as possible to produce powerful high tones. This happens because the vocal tract can adopt specific shapes to project certain pitches and resonance overtones better (Joliveau et al. 2004).


The ability to tune the vocal tract to the source signal of the vocalization is, however, not exclusive to humans (Fitch 1997; Riede et al. 2005; Beckers et al. 2004). Birds show complex communicative behaviours, and a recent study showed that the vocal tract of Northern Cardinals Cardinalis cardinalis can assume different shapes depending on the type of syllable emitted. These birds modify the vocal tract so that its first resonance coincides with the fundamental frequency (Riede et al. 2006).

Studies on non-human primate vocal communication have shown that there is a degree of flexibility in the non-human oral tract (Fitch 1997; Zuberbühler 2005), but the extent of this flexibility has never been examined in prosimians.

Long-distance calls, frequently called "loud calls", are widespread among primates (Waser 1982; Hohmann & Fruth 1995; Zimmermann 1995; Geissmann 2002) and also relatively common in lemurs, usually exhibiting high amplitudes and relatively high frequencies (Zimmermann 1995). The Indri Indri indri represents a unique case: it is not only the largest of all living lemurs but also produces impressively loud howling cries, commonly known as "the song of the Indri".

The song is a complex sequence of vocalizations emitted by group members in a coordinated manner. During the song, two types of vocalizations are easily recognizable: harsh noisy sounds (roars) and harmonic frequency-modulated notes. All adult and subadult group members produce the modulated notes, varying both head orientation and mouth opening between different emissions occurring within the same song (Pollock 1986).

Several functions have been proposed for the Indri's song. It has been suggested that the song informs neighbouring groups about the occupation of a territory (Petter et al. 1977; Petter & Charles-Dominique 1979; Pollock 1986; Geissmann & Mutschler 2006) and also that it has a cohesion function for the group members (Pollock 1986).

In Indri, the larynx is enormously elongated when compared to those of other lemur species (Grandidier 1875; Harrison 1995; Gamba pers. observ.) and a dorsally positioned air sac is present (Grandidier 1875; Hewitt et al. 2002; Favaro et al. 2008). Indri has the ability to change the configuration of its lips in a manner that is different to that of other lemurs (Varecia spp., Eulemur spp., Hapalemur spp.). In fact, in other species lip protrusion is absent (Gamba & Giacoma 2006; 2007).

This study represents a first step towards understanding formant patterns in the vocalizations of Indri during singing. The paper investigates formant variation in three different mouth-opening configurations by means of formant measurements from the natural calls and computational modelling of the vocal tract. The paper also investigates whether F0 and F1 showed simultaneous changes in vocalizations showing the different configurations.

METHODS

Subjects and study groups

Indris live in small family groups (2-6 individuals) and produce a typical song, which includes vocalizations of 2-4 individuals calling at the same time.

The Indris' song is a coordinated series of vocalizations given by all group members. Individuals produce vocalizations that may or may not overlap with those given by another group member. The song consists of a sequence of harmonic units, or "notes", usually preceded by a short series of harsh sounds termed "roars". During the song, changes in mandible position and mouth opening were commonly observed between subsequent vocalizations, whereas they were uniform within the same utterance. All emitters showed different mouth openings depending on the notes given, not on their sex or on the number of utterances.

All individuals uttered calls with different configurations, and sampling of suitable vocalizations was driven simply by the absence of overlapping calls produced by other group members. This study included data from 8 wild Indris (5 males, 3 females, belonging to 4 social groups), inhabiting the Analamazaotra Special Reserve of the National Parc Andasibe-Mantadia and Station Forestière Mitsinjo, in Madagascar. We studied a total of 122 utterances. Samples from all individuals were used for each vocal tract configuration we considered in this study.

Computational models

The oral vocal tract of the Indri was simulated on the basis of drawings and anatomical analysis of this species and other indrids. This simulated vocal tract was then split into various concatenated tubes, following a methodology used in previous studies to reproduce vocal tract resonance in other lemur species (Gamba & Giacoma 2006). Articulation of the mandible and degree of mouth opening (hereafter, DMO) were simulated on the basis of observations and video recordings of wild vocalizing Indris (Figure 1a).

Figure 1. Schematic representation of the articulatory configurations observed in wild Indris vocalizing during the song (a) and spectrograms of the notes associated with the different mouth-opening configurations (b). Mouth opening exceeding 30 degrees was referred to as mouth completely open (MCO), mouth opening between 15 and 30 degrees as intermediate (IMO), and mouth opening between approximately 0 and 15 degrees as mouth barely open (MBO). Below each configuration, the spectrogram of a typical note is shown (b). All sounds were emitted by the same Indri and the spectrograms were generated in Praat with the following parameters: window length: 0.01 s; frequency range: 0-11000 Hz; maximum: 80 dB/Hz; dynamic range: 45 dB; pre-emphasis: 0.0 dB/Oct; dynamic compression: 0.0.

The number of concatenated tubes changed according to oral tract length in 3 different configurations: mouth barely open (MBO, DMO < 15°); intermediate mouth opening (IMO, 15° < DMO < 35°); and mouth completely open (MCO, DMO > 35°). Cross-sectional areas of the sections, which also varied according to DMO, and section lengths were used to build the oral tract area function that served as input to MATLAB-based vocal tract modelling software using a frequency-domain model (Zhang & Espy-Wilson 2004). We modelled the vocal tract according to the anatomical data, but also modelled tubes in which areas and lengths were increased and decreased by 10%. This range is congruent with the head crown size variability described in wild Indri populations (Zaonarivelo et al. 2007). For each computational model, the first four estimated formants were considered (F1, F2, F3, F4).
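The idea of deriving formants from an area function of concatenated tubes can be illustrated with a minimal sketch. This is not the VTAR software used in the study: it is a lossless chain-matrix model with a closed glottis and an ideally open mouth, and the speed of sound, the air density, the function names and the interpretation of the Table 1 values as cm and cm^2 are all assumptions made for illustration. Because losses, wall effects and lip radiation are ignored, its output should not be expected to match the formants reported in the paper.

```python
import numpy as np

SPEED_OF_SOUND = 35000.0  # cm/s; assumed value, not taken from the paper
AIR_DENSITY = 0.00114     # g/cm^3; assumed value, not taken from the paper


def chain_matrix_d(freqs_hz, areas, seg_len):
    """D entry of the glottis-to-lips chain matrix of a lossless tube model.

    Each segment has the matrix [[cos(kl), j*z*sin(kl)], [j*sin(kl)/z, cos(kl)]]
    with z = rho*c/area. Products of such matrices keep the form
    [[A, j*B], [j*C, D]] with A, B, C, D real, so only real arrays are tracked.
    """
    kl = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / SPEED_OF_SOUND * seg_len
    cos_kl, sin_kl = np.cos(kl), np.sin(kl)
    A, B = np.ones_like(kl), np.zeros_like(kl)
    C, D = np.zeros_like(kl), np.ones_like(kl)
    for area in areas:  # segments ordered from glottis to lips
        z = AIR_DENSITY * SPEED_OF_SOUND / area
        b, c = z * sin_kl, sin_kl / z
        A, B, C, D = (A * cos_kl - B * c, A * b + B * cos_kl,
                      C * cos_kl + D * c, D * cos_kl - C * b)
    return D


def lossless_formants(areas, seg_len, f_max=12000.0, step=1.0, n_formants=4):
    """Resonances of a tube closed at the glottis and ideally open at the lips.

    With zero pressure at the lips, U_lips / U_glottis = 1 / D, so the
    resonances are the frequencies at which D changes sign.
    """
    freqs = np.arange(step, f_max, step)
    d = chain_matrix_d(freqs, areas, seg_len)
    crossings = np.where(d[:-1] * d[1:] < 0.0)[0]
    # Linear interpolation between the two grid points around each sign change.
    roots = freqs[crossings] - d[crossings] * step / (d[crossings + 1] - d[crossings])
    return roots[:n_formants].tolist()


# Area values copied from the "MCO, Original" row of Table 1; treating them as
# cm^2 with 1.00 cm segments is an assumption of this sketch.
mco_original = [3.59, 2.89, 1.94, 5.00, 3.58, 3.94, 6.38, 5.36]
print([round(f) for f in lossless_formants(mco_original, 1.00)])
```

In this simplified model, lengthening every segment by 10% lowers all resonances by the same factor of 1/1.1, whereas scaling all areas by a common factor leaves them unchanged, since only the ratios between adjacent areas enter the calculation.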


Vocalization recording

We conducted all recordings without any manipulation of the individuals and without the use of playback stimuli. Recordings were taken using two Sennheiser shotgun microphones (ME 60 and ME 66), one facing the focal animal and the other facing other members of its family group. Microphone output signal was recorded using a solid-state Marantz digital audio recorder (PMD671) at a sampling rate of 48 kHz. The sound and video images of the focal animal's emissions were also recorded by a MiniDV camcorder (Canon MV800i).

Acoustic analysis

All the raw WAV files from the memory cards were transferred onto a PC hard drive. Among all the audio recordings, segments containing vocalizations were edited using Praat 4.6.31 (Paul Boersma & David Weenink, University of Amsterdam) and copied to a single audio file for each song (in AIFF format). A silence of 0.5 s was inserted at the beginning and at the end of the sound file.

At first, vocal signals occurring during the song were labelled according to the vocalisers by means of the video recordings. We concentrated on the focal animal and split each of its emissions into a single file (Giacoma et al. 2010; Sorrentino et al. in press). We used customised software to screen automatically through the saved sound files and extract vocalizations from the animal of interest based on the previously set labels and textgrids. We discarded all vocalizations showing high noise levels and, after manual examination, retained only vocalizations exhibiting no overlap with other individuals' notes. Vocalizations that passed these two criteria were then classified according to the degree of mouth opening that the vocalisers showed during emission. The spectrograms of the sounds associated with different mouth opening configurations are shown in Figure 1b. The degree of mouth opening was evaluated as a factor predicting formants in the units occurring during the song. To ensure that mouth opening was constant across the windows taken into account, a section within the unit was specified, over which the formant measurements were performed. To estimate formant location, we generated one cepstral-smoothed spectrum at the beginning of each note.

Cepstral-smoothing of signal spectra is a common technique of parameterization of speech (Pribil & Madlova 2000) and animal sounds (Favaretto et al. 2006) and it is a more reliable method for formant estimation in high-pitched signals when compared to LPC (Rabiner & Juang, 1993).


The first four formants (F1, F2, F3, F4) were measured by processing spectra, with the "cepstral smoothing" command in Praat, for a portion of 0.50 s. This was done to ensure that the measured formant values were relative to congruent air sac inflation. The bandwidth parameter was set to a frequency value 10% higher than F0 (Boersma, personal communication) and ranged between 730 and 1220 Hz in this study.

Formants and frequency values were saved to a text file and all sections were double-checked whilst viewing spectrograms (frequency range: 0-12000 Hz; maximum: 50 dB/Hz; dynamic range: 30 dB; pre-emphasis: 6.0 dB/Oct; dynamic compression: 0.0) and video recordings at the same time.

After running the Praat analysis program, the text file results were saved and then imported into a customized SPSS file. The study required the formants of each note to be measured and later compared with model formants.

To allow comparison between F1 and fundamental frequency (F0), F0 was collected in Praat using the autocorrelation method ["Sound: To Pitch (ac)..."] after adjusting the analysis parameters according to the range of variation in each vocalization. F0 measurements were appended to the text file containing formants for each vocalization analyzed.
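The autocorrelation principle behind this kind of F0 measurement can be sketched as follows. This is a simplified illustration, not Praat's algorithm (which applies window corrections and interpolation); the function name and the test signal are hypothetical. The search range plays the role of the analysis parameters that were adjusted to each vocalization.

```python
import numpy as np


def estimate_f0_autocorr(signal, sample_rate, f_min, f_max):
    """Pick the autocorrelation peak inside the allowed pitch-lag range."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags 0 .. N-1
    lag_min = int(np.ceil(sample_rate / f_max))           # shortest period allowed
    lag_max = int(np.floor(sample_rate / f_min))          # longest period allowed
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Hypothetical test note: 1000 Hz fundamental plus a weaker second harmonic.
sample_rate = 48000
t = np.arange(2400) / sample_rate
note = np.sin(2 * np.pi * 1000.0 * t) + 0.5 * np.sin(2 * np.pi * 2000.0 * t)
f0 = estimate_f0_autocorr(note, sample_rate, f_min=500.0, f_max=1500.0)
print(f"Estimated F0: {f0:.1f} Hz")
```

Because the lag is restricted to whole samples, the frequency resolution of this sketch is coarse at high F0 (about 20 Hz per sample of lag near 1 kHz at a 48 kHz sampling rate), which is one reason production tools interpolate around the peak.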

Statistics

Formants and F0 were averaged for each individual across the sounds emitted with the same mouth-opening configuration before the statistical analyses.

The Levene test was used to assess homogeneity of variance, and the Tukey post hoc test was used to analyze formant changes between vocal tract configurations. The Friedman test and Kendall's W were used to determine whether significant differences could be found in formants measured across the three vocal tract configurations of the same individual. The Friedman test ranked the mean formant values for the 8 subjects for each of the three configurations, and then compared the average ranks of each of the formant values to determine if they were significantly different from one another. Kendall's W was used as a measure of concordance alongside the Friedman test. All tests were performed in SPSS 14 for Windows (SPSS Inc.).


RESULTS

The models

Vocal tract area functions derived from the anatomical reconstruction were used to generate computational models of the oral vocal tract of Indri using the transmission-line model (Stevens 1998) in the different configurations we observed. The tubes are approximations of the anatomical components: from the glottal constriction, through the upper larynx, to the oral cavity and lips.

The computational models for the supraglottal vocal systems consisted of 8 to 11 concatenated tubes of 10.0 mm corresponding to total tract length ranging from 80.0 mm to 110.0 mm (Table 1).
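As a rough plausibility check on these tract lengths, the resonances of a uniform tube closed at the glottis and open at the lips can be computed with the quarter-wavelength formula F_n = (2n - 1)c / 4L. This is only a first-order sketch: the real area functions are not uniform, the speed of sound used here is an assumed value, and the function name is hypothetical. The lengths correspond to the 8, 10 and 11 segments of 10.0 mm listed in Table 1.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value, not taken from the paper


def uniform_tube_formants(length_m, n_formants=4, c=SPEED_OF_SOUND_M_S):
    """Quarter-wavelength resonances of a uniform closed-open tube."""
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_formants + 1)]


# Tract lengths implied by the number of 10.0 mm segments in Table 1.
for label, length_mm in (("MCO", 80.0), ("IMO", 100.0), ("MBO", 110.0)):
    formants = uniform_tube_formants(length_mm / 1000.0)
    print(label, [round(f) for f in formants])
```

Under these assumptions the first resonance falls from about 1094 Hz for an 80 mm tube to about 795 Hz for a 110 mm tube, which shows how strongly a change of a few centimetres in effective tract length can move the formants.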

Vocal tract area functions for the original reconstruction and respective formant estimates are shown in Figure 2.

Figure 2. Vocal tract area functions and estimated formants for the models based on the measurements we originally derived from the anatomical reconstruction. Area functions and estimated formants are shown for the three vocal tract models representing different mouth opening configurations: MCO (a and b); IMO (c and d); MBO (e and f).

TABLE 1

Input data for VTAR (Zhang & Espy-Wilson 2004) derived from the reconstruction of the Indri's vocal tract and from calculating estimates of mouth opening. We calculated cross-sectional areas at increments of 10.0 mm from glottis to lips (Original). We also modelled tubes in which areas and lengths were increased (Increased) and decreased (Decreased) by 10%. Number of segments (NS), length of each segment (LS) and segment number with its cross-sectional area (S1-S11) are reported in columns.

Config.  Model      NS  LS    S1    S2    S3    S4    S5    S6    S7    S8    S9    S10   S11
MCO      Original    8  1.00  3.59  2.89  1.94  5.00  3.58  3.94  6.38  5.36  -     -     -
MCO      Increased   8  1.10  3.95  3.17  2.14  5.50  3.94  4.33  7.02  5.90  -     -     -
MCO      Decreased   8  0.90  3.23  2.60  1.75  4.50  3.22  3.55  5.74  4.83  -     -     -
IMO      Original   10  1.00  3.59  2.89  1.94  5.00  4.53  2.97  4.28  3.54  8.23  8.73  -
IMO      Increased  10  1.10  3.95  3.17  2.14  5.50  4.98  3.26  4.71  3.89  9.06  9.60  -
IMO      Decreased  10  0.90  3.59  2.89  1.94  5.00  4.53  2.97  4.28  3.54  8.23  8.73  -
MBO      Original   11  1.00  3.59  2.89  1.94  5.00  4.53  2.97  4.28  3.54  4.94  5.45  4.49
MBO      Increased  11  1.10  3.95  3.17  2.14  5.50  4.98  3.26  4.71  3.89  5.43  6.00  4.94
MBO      Decreased  11  0.90  3.23  2.60  1.75  4.50  4.07  2.67  3.85  3.19  4.45  4.91  4.04

Formant estimates for MCO, IMO and MBO configurations in the models we examined are shown in Table 2 (original reconstruction, areas and lengths increased and decreased by 10%).

Formant analysis

Formants measured from the vocalizations showing the different vocal tract configurations are shown in Table 2. We applied the Levene test to verify the homogeneity of variance. Although only F3 showed significantly different variances among the vocal tract configurations (p < 0.013), we used a non-parametric test to investigate differences in the related samples.

For all formants measured from the natural calls, the Friedman and Kendall's W tests showed that at least one vocal tract configuration differs from the others (F1, N = 8, d.f. = 2, Chi-Square = 14.250, Kendall's W = 0.891, p = 0.001; F2, N = 8, d.f. = 2, Chi-Square = 13.000, Kendall's W = 0.813, p = 0.002; F3, N = 8, d.f. = 2, Chi-Square = 13.000, Kendall's W = 0.813, p = 0.002; F4, N = 8, d.f. = 2, Chi-Square = 12.250, Kendall's W = 0.766, p = 0.002).
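The relationship between the Friedman chi-square and Kendall's W can be made explicit with a short sketch. For N subjects ranked over k conditions, W = chi-square / (N(k - 1)), so with N = 8 and k = 3 a chi-square of 14.250 corresponds to W = 0.891. The per-subject values below are hypothetical, chosen only so that the rank sums reproduce the reported F1 statistic; they are not the study's data and do not imply that the real ranks followed this pattern. The function name is also an assumption, and ties are not handled.

```python
import math


def friedman_kendall(rows):
    """Friedman chi-square and Kendall's W for N subjects x k conditions (no ties)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])   # indices from lowest to highest
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    w = chi2 / (n * (k - 1))
    return rank_sums, chi2, w


# Hypothetical mean F1 values (Hz) per individual, columns: MBO, IMO, MCO.
hypothetical_f1 = [
    [780, 890, 1100], [760, 880, 1090], [800, 900, 1120], [770, 870, 1080],
    [790, 910, 1110], [750, 860, 1070], [810, 905, 1130], [895, 885, 1105],
]
rank_sums, chi2, w = friedman_kendall(hypothetical_f1)
# For k = 3 (2 degrees of freedom) the chi-square tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2.0)
print(rank_sums, chi2, round(w, 3), round(p_value, 3))
```

The closed-form p-value used here is valid only for two degrees of freedom; for other numbers of conditions a chi-square survival function from a statistics library would be needed.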

Further analyses were performed using the Tukey post hoc test to verify which formants actually differed among groups (excluding F3, not showing equal variances). In all these tests, MCO formants were greater than those shown in IMO/MBO configurations (F1, N = 37, F = 11.081, p < 0.001 - Tukey, p < 0.001 and p = 0.014; F2, N = 37, F = 11.081, p < 0.001 - Tukey, p < 0.001; F4, N = 37, F = 6.723, p = 0.003 - Tukey, p = 0.003).

Changes in F0 and formants across notes

Fundamental frequency measurements were compared with F1 values in natural vocalizations.

A comparison between first formant values and average F0, as measured for the 8 Indris in the different vocal tract configurations, is shown in Figure 3. Variation in fundamental frequency and first formant reflects coordinated variation across vocal tract configurations. Average F0 changed across vocalizations with different degrees of mouth opening, from 729 ± 58 Hz in MBO and 856 ± 59 Hz in IMO to 1047 ± 62 Hz in MCO. The relationships between average F0 and F1 in natural calls showed significant correlation in all configuration groups (maximum r² = 0.901, df = 22, p ≤ 0.001).
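The coordinated variation of F0 and F1 can be illustrated with simple arithmetic on the reported means. The sketch below uses the average F0 values given in the text and the natural-call F1 means from Table 2; the variable names are hypothetical, and the ratio is computed on group means only, so it says nothing about note-by-note tuning.

```python
# Mean values in Hz: F0 from the Results text, F1 of natural calls from Table 2.
mean_f0_hz = {"MBO": 729.0, "IMO": 856.0, "MCO": 1047.0}
mean_f1_hz = {"MBO": 787.0, "IMO": 892.0, "MCO": 1104.0}

ratios = {cfg: mean_f1_hz[cfg] / mean_f0_hz[cfg] for cfg in mean_f0_hz}
for cfg, ratio in ratios.items():
    print(f"{cfg}: F1/F0 = {ratio:.3f}")
```

In all three configurations the mean F1 lies roughly 4-8% above the mean F0, even though both values rise by more than 300 Hz from MBO to MCO.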

TABLE 2

Average formants estimated from the computational models (Models) and from the notes of wild Indris (Natural Calls). Formants of models were averaged across different computations and natural formants were averaged within and then between individuals (mean ± standard deviation).

Configuration        MCO          IMO          MBO

Models
  F1 (Hz)      1133 ± 105    961 ± 88     826 ± 78
  F2 (Hz)      2694 ± 253   2391 ± 208   2163 ± 208
  F3 (Hz)      4343 ± 416   3329 ± 300   3278 ± 318
  F4 (Hz)      7471 ± 719   5244 ± 503   4923 ± 478

Natural Calls
  F1 (Hz)      1104 ± 48     892 ± 35     787 ± 73
  F2 (Hz)      2381 ± 253   1786 ± 94    1619 ± 174
  F3 (Hz)      4057 ± 356   2963 ± 262   2847 ± 543
  F4 (Hz)      5847 ± 537   4726 ± 328   4616 ± 338

Figure 3. Variation of fundamental frequency (F0, Δ) and first formant (F1, •) across vocalizations with different vocal tract configurations (averaged within and then across individuals).

DISCUSSION

Our study combined both acoustic analyses and vocal tract computational models on the vocal output of Indri during its song, a complex acoustic behaviour unique to this prosimian species. During the song, vocalizing Indris emit a number of harmonic vocalizations showing various acoustic structures and different degrees of mouth opening. Having selected those calls not overlapping other group members' vocalizations, we compared formants estimated from the natural calls with the formants calculated from computational models of the vocal tract.

Even if the configuration of the vocal tract during the song was stable within the same vocalization, we decided to focus on the initial part of the utterance to avoid potential changes due to deflation of the air sac. Given the scarce information on the air sac morphology and phonation processes in Indri, and the lack of morphological investigation in terms of size and shape of the vocal apparatus across individuals of the two sexes, we attempted to focus on comparable conditions across individuals. Thus, we focused on the onset of the vocalization.

Three different models were considered because vocal tract shape and length vary depending on lip position and mouth opening. In agreement with previous studies (Favaro et al. 2008), we found that models simulating resonance in the supraglottal cavities appropriately explained formant positions across most of the natural vocalizations. Variation occurring due to changes in mouth opening and lip position was congruently described by means of a model simulating mouth-opening configurations. This is in agreement with evidence from anatomically correct vocal tract modelling in other lemur species (Gamba & Giacoma 2006).

Natural formant estimates and model calculations were in agreement for both the MCO and IMO configurations, but showed some differences in the MBO configuration. There are at least two possible explanations. First, a longer vocal tract can become increasingly sensitive to details of the area function and thus generate less accurate calculations of formant positions. Segmentation of the vocal tract always introduces a certain margin of error due to the thickness of each tube portion. This error could by itself explain the discrepancies between the natural and the model formants. Second, lip protrusion as it occurs in the MBO configuration may need more detail in its description (e.g. anatomical models, Gamba & Giacoma 2006). Lip position is not easily observed and could have introduced some uncertainty about what constitutes the end of the vocal tract. Better video resolution and a bigger dataset should help with defining a more standardized model. Moreover, as the impact of the air sac can vary across different configurations, we can hypothesize that anatomical investigation of the laryngeal cavities should greatly improve model precision in determining formants in the MBO configuration. Formant frequencies are sensitive to minimal changes in cross-sectional area at constriction points, and in non-human vocal tracts the points of constriction always occur between the vocal folds and the beginning of the oral cavity (Harrison 1995; Lieberman 2006). Again, the MBO configuration, having a longer tract, could be more sensitive to this fact.

Starting from the considerations above, we encourage readers not to interpret discrepancies in tract length as an implication of errors derived from incorrect length estimation, but to see them in the light of the multi-tube modelling of the vocal tract. Differences in contiguous tubes' cross-sectional areas can generate errors in the computed formants.

However, we are confident that the most important determinant of the Indri's formant pattern is mouth opening, as also emerges from the study of human female singers, who share with Indris the remarkably high pitch of their songs (Sundberg 1975).

This is the first step towards modelling the Indri's vocal tract and towards understanding the articulation changes occurring during its most striking and complex display, the song.

Concerning variation of F0 and F1, the first formant varies remarkably from one configuration to the other, showing variation congruent with that exhibited by F0. Previous studies have demonstrated that human singers open their mouths as wide as possible to tune the vocal tract to certain frequencies and produce powerful high tones to be heard over the orchestra (Joliveau et al. 2004). Something similar may occur during the Indri's song to increase the broadcast area of the signal in its dense tropical rain forest environment. In the case of the Indri's song, this is certainly something that would matter, as its primary role is, presumably, to exchange information between different social groups in the forest.

Many aspects deserve further investigations: the acoustic role of the air sacs and dispersion of higher formants, for example.

ACKNOWLEDGEMENTS

All the research reported in this manuscript adhered to the principles for the ethical treatment of non-human primates. All research objectives and protocols were reviewed and approved by the Ministère de l'Environnement, des Forêts et du Tourisme, which gave us a legal permit to conduct the research. This research was supported by the University of Torino and by grants from the Parco Natura Viva - Centro Tutela Specie Minacciate. We thank Prof. Clement Rabarivola of the University of Mahajanga, Association Nationale pour la Gestion des Aires Protegees Madagascar, Association Mitsinjo, Dr Cesare Avesani Zavorra and Dr Caterina Spiezio for their help and support. We are also grateful to Zhaoyan Zhang, Gianni Pavan and Caroline Harcourt for their suggestions and to two anonymous reviewers for their important feedback.

REFERENCES

Beckers, G., Nelson, B. & Suthers, R. (2004). Vocal-tract filtering by lingual articulation in a parrot. Curr. Biol., 14(17), 1592-1597.

Fant, G. (1960). Acoustic Theory of Speech Production. Mouton: The Hague.

Favaretto, A., De Battisti, R., Pavan, G. & Piccin, A. (2006). Acoustic features of red deer (Cervus elaphus) stags vocalizations in the Cansiglio Forest (NE Italy, 2001-2002). Advances in Bioacoustics 2, Dissertationes Classis IV: Historia Naturalis, Slovenian Academy of Sciences and Arts (Ljubljana), XLVII-3, 125-138.

Favaro, L., Gamba, M., Sorrentino, V., Torti, V. & Giacoma, C. (2008). Singers in the forest: acoustic structure of indri's loud calls and first evidence of vocal tract tuning in a prosimian primate. Atti del 35° Convegno Nazionale dell'Associazione Italiana di Acustica, Milano.

Fitch, W.T. (1997). Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J. Acoust. Soc. Am., 102, 1213-1222.

Fitch, W.T. & Hauser, M.D. (1995). Vocal production in nonhuman primates: acoustics, physiology and functional constraints on honest advertisement. Am. J. Primatol., 7, 191-219.


Gamba, M. & Giacoma, C. (2006). Vocal Tract Modeling in a Prosimian Primate: The Black and White Ruffed Lemur. Acta Acust. United Acust., 92, 749-755.

Gamba, M. & Giacoma, C. (2007). Quantitative acoustic analysis of the vocal reper­toire of the crowned lemur. Ethol. Eco. Evol., 19(4), 323-343.

Geissmann, T. (2002). Duet-splitting and the evolution of gibbon songs. Biol. Rev., 77, 57-76.

Geissmann, T. & Mutschler, T. (2006). "Diurnal distribution of loud calls in sympatric wild indris (Indri indri) and ruffed lemurs (Varecia variegata): Implications for call functions". Primates, 47, 393-396.

Giacoma, C., Sorrentino, V., Rabarivola, C. & Gamba, M. (2010). Sex differences in the song of Indri indri. Int J. Primatol., 31, 539-55.

Grandidier, A. (1875). Histoire Physique, Naturelle et Politique de Madagascar. Paris: Hachette.

Harrison, D.F.N. (1995). The Anatomy and Physiology of the Mammalian Larynx. Cambridge: Cambridge University Press.

Hewitt, G., MacLarnon, A. & Jones, K.E. (2002). The Functions of Laryngeal Air Sacs in Primates: A New Hypothesis. Folia Primatol., 73, 70-94.

Hohmann, G. & Fruth, B. (1995). Long-distance calls in great apes: sex differences and social correlates. In: Current topics in primate vocal communication (Ed. by E. Zimmermann, J. Newman & U. Jürgens), pp. 161-184. New York: Plenum Press.

Joliveau, E., Smith, J. & Wolfe, J. (2004). Tuning of vocal tract resonance by sopranos. Nature, 427, 116.

Lieberman, P. (2006). Limits on tongue deformation. Diana monkey formants and the impossible vocal tract shapes proposed by Riede et al. (2005). J. Hum. Evol., 50, 219-221.

McComb, K. & Reby, D. (2009). Communication in terrestrial animals. In: Encyclopedia of Neuroscience (Ed. by L.R. Squire), 2, pp. 1167-1171. Oxford: Academic Press.

Petter, J.J., Albignac, R. & Rumpler, Y. (1977). Mammifères lémuriens (Primates prosimiens). Faune de Madagascar, 44. Paris: ORSTOM-CNRS.

Petter, J.J. & Charles-Dominique, P. (1979). Vocal communication in prosimians. In: The study of prosimian behaviour (Ed. by G.A. Doyle & R.D. Martin), pp. 247-305. New York: Academic Press.

Pollock, J.I. (1986). The song of Indris (Indri indri; Primates, Lemuroidea): Natural history, form and function. Int. J. Primatol., 7, 225-267.

Pribil, J. & Madlova, A. (2000). Computational complexity of two methods based on cepstral parameterization of speech signal. Proceedings of the 5th International Conference New Trends in Signal Processing, Liptovsky Mikulas (Slovakia), pp. 248-251.

Rabiner, L.R. & Juang, B.H. (Eds.) (1993). Fundamentals of Speech Recognition. Englewood Cliffs: Prentice Hall.

Riede, T., Bronson, E., Hatzikirou, H. & Zuberbühler, K. (2005). Vocal production mechanisms in a non-human primate: morphological data and a model. J. Hum. Evol., 48, 85-96.

Riede, T., Suthers, R.A., Fletcher, N.H. & Blevins, W.E. (2006). Songbirds tune their vocal tract to the fundamental frequency of their song. PNAS, 103(14), 5543-5548.

Sorrentino, V., Gamba, M. & Giacoma, C. (submitted). A quantitative description of the vocal types emitted in the indri's song. In: Leaping Ahead: Advances in Prosimian Biology (Ed. by J.C. Masters, M. Gamba & F. Genin). New York: Springer Science + Business Media.

Stevens, K.N. (1998). Acoustic Phonetics. Cambridge: MIT Press.

Sundberg, J. (1975). Formant technique in a professional female singer. Acustica, 32, 89-96.

Titze, I.R. (1994). Principles of voice production. Englewood Cliffs: Prentice Hall.

Waser, P.M. (1982). The evolution of male loud calls among mangabeys and baboons. In: Primate communication (Ed. by C.T. Snowdon, C.H. Brown & M.R. Petersen), pp. 117-143. Cambridge: Cambridge University Press.

Zaonarivelo, J.R., Andriantompohavana, R., Engberg, S.E., Kelley, S.G., Randriamanana, J-C., Louis, E.E. & Brenneman, R.A. (2007). Morphometric data for Indri (Indri indri) collected from ten forest fragments in eastern Madagascar. Lemur News, 12, 19-23.

Zhang, Z. & Espy-Wilson, C.Y. (2004). A vocal tract model for American English /l/. J. Acoust. Soc. Am., 115, 1274-1280.

Zimmermann, E. (1995). Loud calls in nocturnal prosimians: structure, evolution and ontogeny. In: Current topics in primate vocal communication (Ed. by E. Zimmermann, J. Newman & U. Jürgens), pp. 47-72. New York: Plenum Press.

Zuberbühler, K. (2005). The phylogenetic roots of language: evidence from primate communication and cognition. Curr. Dir. Psychol. Sci., 14, 126-130.
