Rhythmic modification in child directed speech

Supplementi alla Biblioteca di Linguistica

directed by Massimo Arcangeli

9

Biblioteca di Linguistica

series directed by Massimo Arcangeli

The series comprises a number of volumes, entrusted to various specialists, devoted to essential aspects of linguistics and to some of the major themes of contemporary linguistics. Each volume will consist of an introductory theoretical part, an extensive anthology and an annotated glossary, and will give special prominence to Italian linguistics. An annotated Dictionary of Linguistics will ultimately bring together all the dictionaries attached to the individual volumes. As a useful complement to the series, a set of supplements exploring individual topics in greater depth is also planned.

Volume published with the support of ministerial funding for scientific research (BQR2009 = Bonus Qualité Recherche) awarded by the University of Paris 8, and with a grant from UMR 7023-CNRS, Structures formelles du langage

Prosodic Universals

edited by

Michela Russo

Comparative studies in rhythmic modeling and rhythm typology

Copyright © MMX ARACNE editrice S.r.l.

[email protected]

via Raffaele Garofalo, 133/A–B, 00173 Roma

(06) 93781065

isbn 978–88–548–2710–3

All rights of translation, electronic storage, reproduction and adaptation, in whole or in part and by any means, are reserved for all countries.

Photocopying without the written permission of the Publisher is strictly prohibited.

First edition: November 2010


Contents

Michela Russo
Introduction

William J. Barry and Bistra Andreeva
Losing the trees in the wood? Reflections on the measurement of spoken-language rhythm

Pier Marco Bertinetto and Chiara Bertini
Towards a unified predictive model of Natural Language Rhythm

Antonio Romano and Paolo Mairano
Speech rhythm measuring and modelling: pointing out multi-layer and multi-parameter assessments

Petra Wagner
A Time-delay Approach to Speech Rhythm Visualisation, Modeling and Measurement

Elinor Payne, Brechtje Post, Lluïsa Astruc, Pilar Prieto and Maria del Mar Vanrell
Rhythmic modification in Child Directed Speech

Michela Russo and William John Barry
Il Pairwise Variability Index (PVI e PVIs): valori ritmici per i dialetti italiani e per l’italiano regionale. Implicazioni tipologiche

Antonio Pamies Bertrán
Quelques malentendus à propos du concept de rhythme en linguistique

Haike Jacobs
Quantity-Insensitive Stress and the Need for OT with Candidate Chains

Acknowledgments


Introduction

Rhythm has traditionally been associated with the idea of isochronic (regular) intervals between sequences of speech units – either syllables or feet (the isochrony claim). This is based on the assumption that there is either no shortening or lengthening of syllables as a function of stress (syllable timing) or a shortening of unstressed and lengthening of stressed syllables (stress timing). However, instrumental analyses focussing on these traditional claims of syllable isochrony and accented syllable (foot) isochrony have never been satisfactory, even if differing degrees of lengthening and shortening as a function of stress have been found. One of many steps in defence of the isochrony claim has been to retreat to the level of perceived regularity, i.e. from the objective to the subjective. However, little systematic work has been undertaken to support the idea.

Failure to verify the equal-intervals assumption objectively has led to a reassessment of what lies behind the auditory impressions which prompted the original differentiation of languages as syllable- or stress-timed. Bertinetto (1981) and Dauer (1983, 1987) proposed a number of structural properties which can differ in their manifestation across languages, and which lent some support to the originally assumed dichotomy while also allowing mixed language types between the two extremes. Although it would appear logical to attribute rhythmic differences to phonological properties in languages, the validity of methods for quantifying the properties in rhythmic terms is still not clear.

In a delayed reaction to the structural reassessment proposed by Bertinetto and Dauer, the search for an objective basis shifted direction from a search for isochrony to a search for differences in variability, initially taking the duration of the vocalic and consonantal components of the syllable for its calculations. Simple measures of the variability of vowel intervals (ΔV, PVI-V) and inter-vocalic intervals (ΔC, PVI-C) were tested (Ramus 1999, 2002; Ramus et al. 1999; Low et al. 2000; Grabe and Low 2002). These reflect phonetic consequences of many of the syllable-based structural properties and prosodic differences between languages. With two such measures within a bi-axial space, languages of differing syllabic complexity or with differing tendency for reduction of unstressed syllables can be separated in a way which appears to reflect the traditional rhythm types. As might be expected with a multi-factorial structure, not all languages fulfil all criteria, so that mixed rhythm types are found (cf. Nespor 1990; Grabe and Low 2002). A satisfying scatter of measures has been found for different languages between what one might consider prototypical extremes. The use of purely durational measures, in contrast to the more differentiated structural criteria, is defensible to a certain extent. Many of the factors identified by Bertinetto and Dauer are known to correlate with duration, and even melodic change has been shown to influence perceived length (Lehiste 1977; van Dommelen 1990). And in any case, whatever else may contribute, the essence of rhythm is one of temporal patterning.
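For orientation, the two families of measures named above are commonly written as follows. This is a sketch in generic notation, not a quotation from the works cited: the symbols $d_k$ (duration of the $k$-th vocalic or inter-vocalic interval in an utterance), $m$ (number of such intervals) and $\bar{d}$ (their mean) are introduced here for illustration, and the choice of the population rather than the sample form of the standard deviation is an assumption.

```latex
\Delta = \sqrt{\frac{1}{m}\sum_{k=1}^{m}\left(d_k-\bar{d}\right)^{2}},
\qquad
\mathrm{PVI} = \frac{1}{m-1}\sum_{k=1}^{m-1}\left|d_k-d_{k+1}\right|
```

Applied to the vocalic intervals these give the vocalic delta and PVI-V; applied to the inter-vocalic intervals they give the consonantal delta and PVI-C. The delta is a global measure over the whole utterance, whereas the PVI only compares neighbouring intervals.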

Understandable objections to the use of uncoordinated sub-syllabic units led some researchers to return to a direct measure of syllable variability (see for example work by Deterding 2001, Russo and Barry or Mok and Dellwo 2008), with the loss of the convenient two-dimensional space for visualisation. This has been restored with the logical extension of the measures to the relation of syllable variability to foot variability (Nolan and Asu 2009).

Thus, there have not only been extensions to the type of unit measured but also changes to the way the variability of sub-syllabic units is calculated. To cite just two prominent examples, Dellwo (2004) has explicitly addressed variation stemming from tempo differences, and Bertinetto and Bertini (2008) have made the relationship between the duration and the number of underlying ‘slots’ in the consonantal and vocalic intervals explicit in their variability measure. They have also explicitly distanced themselves from the persistent ‘syllable- vs. stress-timed’ distinction, making the rhythmic consequences of production timing secondary to a typology of production control.
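The tempo problem mentioned above can be illustrated with a minimal sketch. It assumes that tempo normalisation is done by dividing the standard deviation of the interval durations by their mean (a coefficient of variation). Whether this is exactly the normalisation proposed in the work cited is an assumption, and the function names and durations are invented for illustration.

```python
from statistics import mean, pstdev


def delta(intervals):
    """Standard deviation of interval durations (population form, an assumption)."""
    return pstdev(intervals)


def rate_normalised_delta(intervals):
    """Coefficient of variation in percent: 100 * standard deviation / mean."""
    return 100.0 * pstdev(intervals) / mean(intervals)


# Hypothetical consonantal interval durations in milliseconds (invented for illustration).
fast = [60.0, 110.0, 75.0, 140.0]
# The same utterance at half the tempo: every interval is twice as long.
slow = [2.0 * d for d in fast]

# The raw measure doubles with the slower tempo.
print(delta(fast), delta(slow))
# The rate-normalised measure stays the same.
print(rate_normalised_delta(fast), rate_normalised_delta(slow))
```

Under this assumption a uniform slowing of speech changes the raw measure but not the normalised one. Real tempo changes are rarely uniform across interval types, so the sketch only shows the principle.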

Consideration of the principles involved in quantifying rhythm and categorizing languages according to rhythm type has led to the above-mentioned extension of the rhythm units measured, and it has also uncovered a number of as yet unresolved problems. The speaker, the speaking style and the speech material can change the position of a language within the rhythm space relative to other languages, and the spatial relations between languages can change when they are quantified by different methods (see Grabe and Low 2002; Deterding 2001; Ramus 2002; Steiner 2003 for discussion about the principles behind the different methods). Seen positively, this indicates that the issue of rhythm quantification is still alive.

The structural basis of the measurements implies dependency on the nature of the speech material. Representativeness of the corpus used for quantification is an overriding issue. If it is large enough, has enough different speakers, enough different utterances spoken in enough different styles of speech, it must by definition claim to be representative of the language. But, at the same time, to move towards representativeness is to increase the scatter of measures from the multitude of utterances, making any single average measure typologically almost meaningless. There are no reliable criteria for limiting the selection of utterances while still guaranteeing representative coverage.

In summary, the shift in the understanding of rhythm is not unproblematic: a) Dauer’s structural properties are abstract and far removed from the surface phenomena of continuous speech production, which is the level of rhythmic experience; b) Ramus, Low, and the subsequent research by those following them reduce the complex of structural properties which Dauer implicates in rhythmic characterisation to purely durational measures; c) the effect of message-dependent speech material is greatly underestimated. Fluctuations in measurement can be shown to relate to the lexical and intonational structure of the individual utterances, thus placing the concept of rhythm typology in the same sphere as other perceptually based generalisations on human behaviour. Statistical analysis shows, however, that the factor language still remains a significant differentiator. It is unclear, however, to what extent the differences in discriminatory power between the approaches really reflect significant properties of the languages. And are we capturing rhythm in any way?

It is not easy to predict where languages should be placed relative to one another on the rhythm continuum. Only the syllable-complexity criterion offers clear differentiation along traditional lines. All measures capture some aspect of differences in syllable complexity, so they are basically suitable for comparing languages (and such differences affect the time spent articulating a syllable): PVI-C and ΔC measures are sensitive to differences in the onset and coda structure, PVI-V and ΔV measures reflect differences between languages with only single-slot syllabic nuclei and those with single- and double-slot nuclei. A language with more variable onset and coda structure, long and short vowels, and reduction of unstressed/unaccented syllables will generate higher variability measures than a language without such features. But there will also be a larger scatter of PVI or Delta measures over the utterances of any given corpus, because some utterances consist of just simple structures with only small changes from one syllable to another. So the measures are extremely text-dependent. The same measures, calculated for the same language, but from two different corpora, can result in radically different typological associations. This suggests that the rhythm values are not reliable indicators of the rhythmic status of a language, or even of a regional variety in typological terms, but rather reflect the language material that occurs in the utterances produced and the style in which they are produced.
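The text dependence described above can be made concrete with a deliberately simplified sketch. It assumes, purely for illustration, that every consonant slot takes the same fixed time, so that the duration of a consonantal interval depends only on how many consonants it contains. The constant, the function name and the two utterances are hypothetical and not taken from any corpus.

```python
from statistics import pstdev

# Assumption for illustration only: every consonant slot lasts the same fixed time.
MS_PER_SLOT = 70.0


def delta_c_from_slots(slot_counts):
    """Consonantal delta (population standard deviation, in ms) of a sequence of
    consonantal intervals, each given as the number of consonant slots it contains."""
    durations = [n * MS_PER_SLOT for n in slot_counts]
    return pstdev(durations)


# Two hypothetical utterances from one and the same language.
simple_utterance = [1, 1, 1, 1]    # only single consonants between vowels
complex_utterance = [1, 3, 1, 2]   # clusters of varying size

print(delta_c_from_slots(simple_utterance))   # no variability at all
print(delta_c_from_slots(complex_utterance))  # clearly higher variability
```

Both utterances belong to the same hypothetical language, yet the measure differs sharply between them. A corpus dominated by utterances of the first kind would place the language very differently from one dominated by the second kind.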

Furthermore, these recent measures, which are based on variability rather than isochrony (i.e., on the durational consequences of differences in syllable structure and phrasal modification), have a much less obvious link between the auditory impression and the physical measures. Different units (Cs, Vs, syllables) have different inherent relationships to the basis of perceived rhythm. But perception experiments are rare, and are not related to individual rhythm measures of the utterances.


Losing the trees in the wood? Reflections on the measurement of spoken-language rhythm

William John Barry and Bistra Andreeva
Institut für Phonetik, Universität des Saarlandes

Abstract: This paper takes a critical look at language rhythm from the perspective of what it might actually be in auditory and functional terms. Querying the reasonableness of expecting any sort of temporal regularity in the prominent syllables produced in natural, communicatively meaningful speech, the discussion probes the link between instrumental measures and auditory impression that appears to be taken for granted. It also examines recent evidence for the assumption that the rhythmic structure of an utterance is a decoding aid, arguing that regularity is not essential if the multi-level nature of the decoding process is taken into consideration. In the final section, the multi-parametric acoustic basis of rhythm-carrying prominences is reviewed. It is argued that the production-interdependence of duration, fundamental frequency (f0), intensity and spectral definition does not imply the mutual predictivity of these parameters. Reduction of the analytic measurements to duration alone, which has been the custom in rhythm studies, is therefore not an acceptable option. An example of duration – f0 compensation providing trading-relation support for perceived rhythmic strength underlines this view.

1. Preamble – the developing field

One of the intriguing and, at the same time, frustrating things about the instrumental analysis of spoken-language rhythm is that it has stubbornly survived, without empirical justification it would seem, for a human life-span. In dynastic terms, we should be well into the third generation, traditionally a guarantee of the approaching demise. However, first-generation studies, wedded to the concept of ‘isochrony’ – mainly of either the syllable or the foot – continued for more than half a century, albeit with increasing scepticism (cf. Classe 1939; Pointon 1995). They explored, without success, the many crevices of possible articulatory, acoustic and perceptual explanatory edifices (cf. the comprehensive and quirkily structured discussion in Bertinetto 1989), before a methodological sleight of hand, heralding the second generation, turned everything on its head.

‘Isochrony’ was usurped from its role as the blaue Blume [1] of rhythm research, and replaced by variability. At the same time, the syllable and the foot (and later the mora), as alternative base units of rhythmic identity, were discarded in favour of the universal nuclear (sonantal) and non-nuclear (con-sonantal) sub-units of syllabic structure. Two measures emerged towards the end of the 1990s: a) Frank Ramus’ delta values (ΔC and ΔV) (cf. Ramus et al. 1999), which were the standard deviation of the vocalic and consonantal intervals within an utterance, together with a measure of the vocalic proportion of the utterance (%V), b) Ee Low’s Pairwise Variability Indices (PVI), developed in collaboration with Esther Grabe and Francis Nolan (Low et al. 2000). A vocalic and a consonantal measure expressed the average difference between the members of all the pairs of consecutive vocalic and consecutive intervocalic intervals over the course of an utterance. An alternative, locally normalized difference measure was generally preferred for the vowel intervals to correct for possible rate changes. Both the Ramus and the Low measures allowed the two-dimensional presentation of the values for any given language in a sort of rhythm space (see Fig. 1).
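The verbal definitions above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the cited authors' own implementation: the function names are invented, the standard deviation is taken in its population form, the local normalisation divides each difference by the mean of the pair, and the scaling of the normalised index by 100 is assumed as a common convention. The durations are hypothetical.

```python
from statistics import pstdev


def delta(intervals):
    """Standard deviation of interval durations within one utterance."""
    return pstdev(intervals)  # population form; the sample form is equally plausible


def percent_v(vocalic, consonantal):
    """Vocalic proportion of the utterance, in percent."""
    return 100.0 * sum(vocalic) / (sum(vocalic) + sum(consonantal))


def raw_pvi(intervals):
    """Mean absolute difference between consecutive intervals."""
    pairs = list(zip(intervals, intervals[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def normalised_pvi(intervals):
    """As raw_pvi, but each difference is divided by the mean of its pair,
    then scaled by 100 (the scaling is an assumed convention)."""
    pairs = list(zip(intervals, intervals[1:]))
    return 100.0 * sum(abs(a - b) / ((a + b) / 2.0) for a, b in pairs) / len(pairs)


# Hypothetical durations in milliseconds, invented for illustration.
vowels = [100.0, 50.0, 100.0]
consonants = [80.0, 120.0, 70.0]

# One axis pairing: delta values plus the vocalic proportion.
print(delta(consonants), delta(vowels), percent_v(vowels, consonants))
# The other axis pairing: raw consonantal PVI and normalised vocalic PVI.
print(raw_pvi(consonants), normalised_pvi(vowels))
```

Each pair of printed values corresponds to one point in a two-dimensional rhythm space for a single utterance. Values for a language would be obtained by averaging over many utterances.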

While this space provides a plausible visualization of the stress-timed – syllable-timed continuum, and even provides a ‘diagnostic’ frame for identifying the dimension (vocalic or consonantal) along which any language might deviate from the prototypical, there is no theoretically predictable location for the third category, the mora-timed language.

1. The blaue Blume (blue flower) of the German romantic movement, which had its origin in Novalis’ novel Heinrich von Ofterdingen, symbolized the yearning for an (unattainable) understanding of Nature.


Acknowledgements

Some of the articles which appear in this volume are revised versions of papers presented at the International Workshop on Prosodic Universals, Confrontation sur l’état des recherches en modélisation du rythme et typologies rythmiques, held in Paris on October 15, 2008. The workshop was organized by Michela Russo (University of Paris 8 / UMR 7023, C.N.R.S. Structures formelles du Langage) in collaboration with Sophie Wauquier (University of Paris 8 / UMR 7023) and Annie Rialland (Laboratoire de Phonétique et Phonologie UMR 7018, C.N.R.S. / Sorbonne-Nouvelle) and under the academic sponsorship of the Programme Pluri-Formation Fédération (PPF) Typologie et universaux linguistiques, directed by Stéphane Robert. I wish to thank all the keynote speakers at the workshop: Carlos Gussenhoven, François Dell, Haike Jacobs, Frank Ramus, Fred Cummins, Petra Wagner and William John Barry, as well as the institutions that sponsored the event. I would also like to thank the external reviewers and colleagues from the department for dedicating their time and expertise to the reviewing of the papers. Special thanks are due to Jean-François Bourdin for technical assistance. The other contributors to this volume have been invited to elaborate on crucial points in the overall account of Language and Speech Rhythm. I would like to thank all the contributors for their patience, their hard work and their ability to react to the editorial requirements. Finally, I would like to express my thanks to the institutions that sponsored the publication of this volume: the UMR 7023 Structures formelles du Langage (C.N.R.S.) and the Scientific Committee of the Paris 8 University (for the grant of a BQR, Bonus Qualité Recherche, 2009). The publication of this volume would not have been possible without their co-operation.

M.R.


William John [email protected] of the Saarland
FR. 4.7 Phonetics
P.O. Box 151150
D-66041 Saarbruecken
Germany

Bistra [email protected] of the Saarland
FR. 4.7 Phonetics
P.O. Box 151150
D-66041 Saarbruecken
Germany

Pier Marco [email protected] Normale Superiore
P.zza dei Cavalieri, 7
56127 Pisa
Italy

Chiara [email protected] Normale Superiore
P.zza dei Cavalieri, 7
56127 Pisa
Italy

Antonio [email protected]à degli Studi di Torino
Dipartimento Scienze del Linguaggio
Laboratorio di Fonetica Sperimentale “Arturo Genre”
via Sant’Ottavio 20, 10124 Torino
Italy

Paolo [email protected]à degli Studi di Torino
Dipartimento Scienze del Linguaggio
Laboratorio di Fonetica Sperimentale “Arturo Genre”
via Sant’Ottavio 20, 10124 Torino
Italy

Petra [email protected]ät Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
Postfach 10 01 31
33501 Bielefeld
Germany

Elinor [email protected] Lab and St Hilda’s College
University of Oxford
41 Wellington Square
Oxford OX1 2JF
UK

Brechtje [email protected] and Jesus College
University of Cambridge
English Faculty Building
9 West Road
Cambridge CB3 9DB
UK

Lluisa [email protected] Open University and Dept of Spanish and Portuguese
University of Cambridge
Sidgwick Avenue
Cambridge CB3 9DA
UK

Pilar [email protected] de Traducció i Ciències del Llenguatge, ICREA
Universitat Pompeu Fabra (Barcelona)
Campus de la Comunicació
C/Roc Boronat, 138
08018 Barcelona
Spain

Maria Del Mar Vanrell [email protected] de Filologia Catalana
Universitat Autònoma de Barcelona
Campus de la UAB
08193 Bellaterra (Cerdanyola del Vallès)
Spain

Michela [email protected]é Paris 8
UFR Sciences du Langage
2, rue de la Liberté
93526 St-Denis Cedex
France

Antonio Pamies Bertrá[email protected] Università di Granada
Departamento de Lingüística General y Teoría de la Literatura
Facultad de Letras
Campus de Cartuja s/n
Universidad de Granada 18071
Spain

Haike [email protected] Universiteit Nijmegen
Romaanse Talen en Culturen
Erasmusplein 1, E 05.06
6525 HD Nijmegen
The Netherlands
