
Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2011

The Role of Segmentation and Expectation in the Perception of Closure

Crystal Peebles

Follow this and additional works at the FSU Digital Library. For more information, please contact [email protected]

THE FLORIDA STATE UNIVERSITY

COLLEGE OF MUSIC

THE ROLE OF SEGMENTATION AND EXPECTATION IN THE PERCEPTION OF

CLOSURE

By

CRYSTAL PEEBLES

A dissertation submitted to the

College of Music

in partial fulfillment of the

requirements for the degree of

Doctor of Philosophy

Degree Awarded:

Fall Semester, 2011

Copyright © 2011

Crystal Peebles

All Rights Reserved


Crystal Peebles defended this dissertation on October 31, 2011.

The members of the supervisory committee were:

Nancy Rogers

Professor Directing Dissertation

Michael Kaschak

University Representative

James Mathes

Committee Member

Matthew Shaftel

Committee Member

The Graduate School has verified and approved the above-named committee members, and

certifies that the dissertation has been approved in accordance with university requirements.


For my parents, John and Lola Peebles


ACKNOWLEDGEMENTS

I first must thank my committee members for their unending support and guidance

through the duration of this project. The unique perspectives contributed by Jim Mathes, Matt

Shaftel, and Mike Kaschak positively shaped my initial inquiry and the subsequent outcome. I

especially want to thank Mike for his assistance in designing the experiments, recruiting and

testing participants, and analyzing the data. Finally, to the head of my committee, Nancy Rogers,

I am immeasurably grateful for her constructive criticism, her keen attention to detail, and her

encouragement, both during this project and throughout my graduate career.

My experience as a graduate student in the music theory program at The Florida State University has shaped me personally and professionally in more ways than I can count. I am

indebted to the entire music theory faculty for providing opportunities for me to grow both as a

teacher and as a scholar. I also value the meaningful friendships and professional relationships I

have formed with my fellow graduate students at Florida State. I will always cherish the time we

spent together in the T.A. office, in the library, and at the local pizza joint.

Within the Tallahassee community, I would also like to thank my violin students and

their parents for their untempered enthusiasm for music as well as the lovely people at St. Paul’s

United Methodist Church. You have been my second family during my six years in Tallahassee,

offering an unwavering community of love and support. Indeed, I thank God every time I

remember you.

I cannot begin to express my gratitude for my family’s love and encouragement. My

parents, John and Lola Peebles, have consistently supported me in all I do, even when it takes me

halfway across the country. A special note of thanks also goes to Rachel McCleery, whose

loving companionship during my graduate studies has made an indelible mark on my life. I will

always treasure our many meaningful conversations shared over a delicious dinner or a warm

cup of coffee. Your critical commentary and insight certainly influenced this project and are reflected in these pages.

Finally, I would like to acknowledge Boosey & Hawkes and European American Music

Distributors for their permission to reproduce copyrighted excerpts of twentieth-century works

by Bartók, Copland, and Webern.


TABLE OF CONTENTS

List of Tables ............................................................................................................................vii

List of Figures............................................................................................................................ix

List of Examples .........................................................................................................................x

Abstract ....................................................................................................................................xii

CHAPTER 1: INTRODUCTION................................................................................................1

CHAPTER 2: MUSICAL CHARACTERISTICS OF CLOSURE.................................................6

Closure as the Completion of a Goal-Directed Process ........................................................6

Closure as the Segmentation of Musical Experience..........................................................11

Hierarchy and Closure.......................................................................................................15

Style and Closure ..............................................................................................................18

CHAPTER 3: MUSICAL EXPECTATION AND CLOSURE ..................................................22

Formation of Expectations: Statistical Learning ................................................................22

Expectation .......................................................................................................................26

Expectation in Music Theory ...................................................................................26

Types of Expectation and Schema............................................................................29

Expectation and Memory: An Alternative View.......................................................38

An Expectation-based Model of Closure ...........................................................................41

Three Analytical Vignettes................................................................................................47

Schumann’s “Widmung” .........................................................................................47

Webern’s “Der Tag ist vergangen”...........................................................................57

Copland’s “The World Feels Dusty” ........................................................................60

CHAPTER 4: EVENT SEGMENTATION THEORY...............................................................67

Event Segmentation ..........................................................................................................67

Event Segmentation Theory ..............................................................................................73

Event Segmentation Theory and Musical Closure .............................................................77

Experiment Overview .......................................................................................................81

Experiment 1 ...........................................................................................................81

Experiment 2 ...........................................................................................................81

Experiment 3 ...........................................................................................................82

CHAPTER 5: EXPERIMENT 1................................................................................................85

Method .............................................................................................................................86

Participants ..............................................................................................................86

Stimuli .....................................................................................................................86

Coding Procedure ....................................................................................................88

Participant Procedure.............................................................................................101

Results ............................................................................................................................102

General Results......................................................................................................104


Experiment 1a: Bartók Results...............................................................................116

Experiment 1b: Mozart Results ..............................................................................124

Grouping Analysis .................................................................................................135

Discussion.......................................................................................................................149

CHAPTER 6: EXPERIMENT 2..............................................................................................153

Method ...........................................................................................................................154

Participants ............................................................................................................154

Stimuli...................................................................................................................154

Procedure...............................................................................................................157

Results ............................................................................................................................157

Discussion.......................................................................................................................162

CHAPTER 7: EXPERIMENT 3..............................................................................................165

Method ...........................................................................................................................166

Participants ............................................................................................................166

Stimuli...................................................................................................................166

Procedure...............................................................................................................167

Results ............................................................................................................................170

Discussion.......................................................................................................................179

CHAPTER 8: CLOSURE........................................................................................................181

APPENDIX A: SEGMENTATION RESPONSES IN EXPERIMENT 1.................................186

APPENDIX B: ANNOTATED SCORES FOR EXPERIMENT 3 ...........................................201

APPENDIX C: COPYRIGHT PERMISSION LETTERS........................................................208

APPENDIX D: IRB APPROVAL LETTER AND INFORMED CONSENT LETTER ...........211

REFERENCES .......................................................................................................................215

BIOGRAPHICAL SKETCH...................................................................................................222


LIST OF TABLES

2.1 Lerdahl and Jackendoff’s Grouping Preference Rules .......................................................12

3.1 Text and Translation of “Widmung” .................................................................................49

4.1 Lerdahl and Jackendoff’s Grouping Well–Formedness Rules ............................................69

5.1 Musical Stimuli Characteristics for Experiments 1a and 1b ...............................................88

5.2 Window Construction in Each Window.............................................................................89

5.3 Arrival and Change Features in Bartók..............................................................................90

5.4 Arrival and Change Features in Mozart .............................................................................91

5.5 Total Number of Responses and Percentage Used in Data Analysis (Bartók)...................105

5.6 Total Number of Responses and Percentage Used in Data Analysis (Mozart)..................106

5.7 ANOVA Means for Interactions between Starting Task and the Nested Structure

(Bartók)..........................................................................................................................110

5.8 Mixed Models Regression Analysis: Latency Time.........................................................110

5.9 Mixed Logit Regression Analysis: Number of Changes...................................................111

5.10 Mixed Logit Regression Analysis: Ending Type..............................................................114

5.11 Mixed Logit Regression Analysis: Arrival Features, Third Movement ............................117

5.12 Mixed Logit Regression Analysis: Arrival Features, Fifth Movement .............................118

5.13 Mixed Logit Regression Analysis: Change Features, Third Movement............................120

5.14 ANOVA Means for Interactions in Change Feature Analysis, Third Movement ..............120

5.15 Percentage of Responses at Complete Silence in Coarse 2, Third Movement...................122

5.16 Mixed Logit Regression Analysis: Change Features, Fifth Movement.............................123

5.17 ANOVA Means for Interactions in Change Feature Analysis, Fifth Movement ...............124

5.18 Mixed Logit Regression Analysis: Arrival Features, No. 19............................................125

5.19 ANOVA Means for Interactions in the Arrival Feature Analysis, No. 19.........................127

5.20 Percentage of Responses at PACs in Fine 2, No. 19 ........................................................128

5.21 Mixed Logit Regression Analysis: Arrival Features, No. 21............................................130

5.22 ANOVA Means for Interactions in the Arrival Feature Analysis, No. 21.........................131

5.23 Mixed Logit Regression Analysis: Change Features, No. 19 ...........................................132

5.24 Mixed Logit Regression Analysis: Change Features, No. 21 ...........................................132

5.25 Section Divisions in Bartók, String Quartet No. 5, Fifth Movement ................................139

5.26 Mixed Logit Regression Analysis: Grouping Analysis ....................................................147


5.27 ANOVA Means for Interactions in the Grouping Analysis ..............................................148

6.1 Exposure Excerpts ..........................................................................................................155

6.2 Mixed Models Regression Analysis: Rating ....................................................................158

7.1 Mixed Logit Regression Analysis: Cadence and Hypermeter ..........................................172

7.2 Mixed Logit Regression Analysis: Cadence Types ..........................................................173

7.3 Mixed Models Regression Analysis: Response Time and Cadences ................................177

7.4 Mixed Models Regression Analysis: Response Time and Cadence Types .......................177

7.5 Mixed Models Regression Analysis: Ratings...................................................................178

7.6 Mixed Models Regression Analysis: Ratings and Response Time ...................................179


LIST OF FIGURES

3.1 Continuum of Expectations ...............................................................................................33

3.2 Continuum of Expectations with Schemata .......................................................................34

3.3 Continuum of Expectation: Closural Expectations.............................................................44

4.1 Schematic Depiction of the Event Segmentation Theory ...................................................74

4.2 Schematic Depiction of the Segmentation Process as Posited by EST................................76

5.1 Interactions between Subject Group and Consistency (Mozart) .......................................109

5.2 Interaction between Starting Condition and Consistency (Mozart)...................................109

5.3 Interactions between Subject Group and Number of Changes (Bartók)............................112

5.4 Interactions between Subject Group and Number of Changes (Mozart) ...........................113

5.5 Interactions between Subject Group and Ending Type (Bartók).......................................115

5.6 Interactions between Subject Group and Ending Type (Mozart) ......................................116

5.7 Possible Phrase Structure Analyses .................................................................................139

5.8 Mozart, String Quartet No. 19, mvmt. 4: Grouping Analysis of the Exposition................144

6.1 Two-way Interaction between Subject Group and the Exposure Composer......................159

6.2 Three-way Interaction between Subject Group, the Exposure Composer, and the Rated

Composer........................................................................................................................160

6.3 Two-way Interaction between Participant Group and the Ratings for Composer (cadential

excerpts) .........................................................................................................................161

6.4 Two-way Interaction between Participant Group and the Ratings for Composer (all

excerpts) .........................................................................................................................162

7.1 Excerpt No. 6 from Mozart’s String Quartet in G Major (K. 156), third movement .........166

7.2 Interactions between Subject Group and the Presence of a Cadence ................................172

7.3 Interactions between Subject Group and Cadence Type...................................................175

A.1 Bartók, String Quartet No. 4, third movement .................................................................187

A.2 Bartók, String Quartet No. 4, fifth movement..................................................................190

A.3 Mozart, String Quartet No. 19, fourth movement.............................................................195

A.4 Mozart, String Quartet No. 21, second movement ...........................................................199


LIST OF MUSICAL EXAMPLES

2.1 Beethoven, String Quartet, Op. 130, second movement, mm.1–8 (analysis after Meyer) ...15

3.1 Beethoven, God Save the King, WoO 78, mm. 1–6 ...........................................................40

3.2 Schumann, Myrthen, “Widmung” Op. 25, No. 1, mm. 1–13 ..............................................50

3.3 Schumann, Myrthen, “Widmung” Op. 25, No. 1, mm. 14–29 ............................................54

3.4 Schumann, Myrthen, “Widmung” Op. 25, No. 1, mm. 37–44 ............................................56

3.5 Webern, Vier Lieder, “Der Tag ist vergangen,” Op. 12, No. 1, mm. 1–11..........................59

3.6 Webern, Vier Lieder, “Der Tag ist vergangen,” Op. 12, No. 1, mm. 18–21........................60

3.7 Copland, Twelve Poems of Emily Dickinson, “The World Feels Dusty,” mm. 1–2.............63

3.8 Copland, Twelve Poems of Emily Dickinson, “The World Feels Dusty,” m. 27..................65

5.1 Mozart, String Quartet No. 19, fourth movement, mm. 89–93 ...........................................92

5.2 Mozart, String Quartet No. 19, fourth movement, mm. 67–70 ...........................................93

5.3 Mozart, String Quartet No. 19, fourth movement, mm. 76–78 ...........................................93

5.4 Mozart, String Quartet No. 21, second movement, mm. 15–20 .........................................94

5.5 Bartók, String Quartet No. 4, third movement, mm. 20–23................................................95

5.6 Bartók, String Quartet No. 4, third movement, mm. 40–41................................................96

5.7 Bartók, String Quartet No. 4, fifth movement, mm. 235–239.............................................96

5.8 Bartók, String Quartet No. 4, fifth movement, mm. 330–332.............................................97

5.9 Bartók, String Quartet No. 4, fifth movement, mm. 74–76 ................................................97

5.10 Bartók, String Quartet No. 4, fifth movement, mm. 279–284.............................................98

5.11 Bartók, String Quartet No. 4, third movement, mm. 6–35 (cello).....................................121

5.12 Mozart, String Quartet No. 21, second movement, mm. 1–8 (violin 1) ............................131

5.13 Bartók, String Quartet No. 4, fifth movement, mm. 11–18 ..............................................140

5.14 Bartók, String Quartet No. 4, fifth movement, mm. 102–108...........................................140

5.15 Bartók, String Quartet No. 4, fifth movement, mm. 238–249...........................................141

5.16 Mozart, String Quartet No. 19, fourth movement, mm. 1–34 (violin 1 and cello).............143

5.17 Mozart, String Quartet No. 19, fourth movement, mm. 118–135 (violin 1 and cello).......144

5.18 Mozart, String Quartet No. 21, second movement, mm. 40–44........................................146

6.1 Motive x from Bartók’s String Quartet, No. 4, first movement, m. 7................................156

6.2 Motive y from Bartók’s String Quartet, No. 4, fifth movement, mm. 16–18 .....................156


7.1 Mozart String Quintet No. 4 in G Minor (K. 516), third movement, mm. 1–13................168

7.2 Mozart’s Sonata for Piano and Violin in B♭ Major (K. 454), third movement,

mm. 1–16.......................................................................................................................170

B.1 Mozart, Quartet No. 3 in G Major, K. 156, third movement.............................................201

B.2 Mozart, String Quartet No. 8 in F Major, K. 168, third movement...................................203

B.3 Mozart, String Quartet No. 13 in D Minor, K. 173, third movement ................................205


ABSTRACT

In the musicological discourse, “closure” can refer to a variety of musical phenomena,

but the language describing closure usually involves at least one of two common metaphors:

closure is the completion of a musical process, or closure is the segmentation of musical

experience. Along with these two descriptions of closure, musicians also recognize that closure’s

markers vary between musical styles and that some moments of closure are stronger than others,

articulating a composition’s hierarchical construction. These four characteristics of closure,

gleaned from the musicological literature, inform my definition of closure: an anticipated end to

a musical segment.

This dissertation will empirically investigate the role of expectation in the perception of

closure. I hypothesize that closure is not something intrinsic to a piece of music; rather, it relies

on an individual’s previous musical encounters. This previous experience gives rise to musical

expectations, and closure is experienced when a listener is accurately able to anticipate the end of

a musical segment, on any hierarchical level. The degree of perceived closure correlates with a

listener’s ability to predict an ending, coupled with relatively weak expectations for what will

occur next. This perspective is informed by recent research in Event Segmentation Theory

(EST), a theory from the field of cognitive psychology that investigates the segmentation of

everyday non-musical events.

Three experimental studies test this hypothesis. The first study determines whether

listeners segment music according to the predictions made by EST. The results from this study

corroborate previous research: listeners consistently use musical features to segment an ongoing

composition, and the fine segmentation results are nested within the coarse segmentation results.

The learning task in the second study ascertains whether exposure to an unfamiliar musical style

will change a listener’s perception of closure in that style. While the data do not entirely confirm

this hypothesis, results from this study suggest the importance of previous experience in the

perception of closure. The third study finds a correlation between predicted endings in a familiar

style and the rating of a listener’s perceived strength of closure. Results from all three studies

support an expectation-based model of musical segmentation and the perception of closure.


CHAPTER 1

INTRODUCTION

Listeners and musicians have intense aesthetic opinions regarding the degree to which a

particular musical ending sounds satisfying. “Closure” is the oft-used term to describe the

listener’s feeling of satisfaction or completeness. Although the concept of musical closure is

ubiquitous in the scholarly discourse, the exact meaning of closure can vary widely among

authors. Despite these different meanings, similarities in closural metaphors speak to our shared

experience of finality. This study examines that shared experience: exploring characteristics of

closure, situating these characteristics in a listener’s musical expectations, and creating a

cognitive model of musical closure that transcends stylistic boundaries. In the second half of this

dissertation, I discuss a series of three experiments that test this model.

I define closure as the feeling of finality that occurs at the anticipated end of a musical

segment. This definition brings up three points that will be expanded throughout the course of

this project. First, I am primarily concerned with the listener’s perception of closure, not with

how a composer achieves closure in a composition or with pinpointing the moment at which

closure occurs. By taking a listener’s perspective, I am free to explore how listeners experience

closure in different repertoires without being bogged down by stylistic differences. Instead of

specifically talking about musical signs of closure, I examine how these signs evoke the feeling

of finality in a listener. Second, musical experience is segmented into discrete events that have a

beginning and end. Being able to segment the continuous stream of acoustical input is a

prerequisite to experiencing closure in the first place, for without endings there would be no

closure. Finally, a listener must be able to anticipate the placement and content of an ending in

order to experience finality. These expectations may not be conscious, and a listener will not

always be able to anticipate the exact content of an ending, but expectation is an integral part of

my model of closure.

In this chapter, I illustrate various meanings of closure seen through a handful of recent

contributions to the fields of music theory and musicology, where closure shapes the theoretical or analytical narrative. “Closure” is regularly used as a synonym for “resolution,” particularly a cadence-defining V–I progression in tonal music, but at other times musical closure is highlighted

because it advances the theoretical aims of the author, illustrates the stylistic tendencies of a

composer, or supports the author’s overarching analytical narrative. Despite methodological

differences, common metaphors of closure emerge. By far, the most common metaphor for

closure is likening it to a goal at the end of a musical pathway, experienced as a point of finality

or stasis, but closure can also be conceived as a boundary separating musical entities.1 These

common metaphors speak to two “actions” of closure: (1) closure marks the achievement of a

musical goal (usually coupled with a feeling of finality), and (2) closure segments a listener’s

musical experience. While these metaphors are not necessarily explicit within the discourse, they

do shape the concept of closure.

James Hepokoski and Warren Darcy’s oft-cited Elements of Sonata Theory

reconceptualizes sonata form as movements toward genre-defined goals (2006). These two main

goals are the essential expositional closure (EEC)—the obligatory perfect authentic cadence

(PAC) located near the end of the exposition—and the essential structural closure (ESC)—the

PAC in the corresponding place in the recapitulation. This perspective shifts the focus of sonata

form from a schematic script to a more dynamic process. According to Hepokoski and Darcy’s

theory, the EEC is usually the first PAC following the initiation of the secondary theme: it is the

moment towards which the preceding secondary theme has been “aiming” (120). The authors are

careful to note that the EEC may not be the strongest cadence in the exposition; the EEC merely

represents the goal of the exposition.

One should not determine an EEC on the basis of what one imagines an EEC should

“feel” like in terms of force or unassailably conclusive implication. Nor should one

assume that we are making grand claims regarding either the completeness or the degree

of the closure implied by the EEC. Its “closure” may not in fact be absolute or “fully

satisfying” from the perspective of the larger proportions of or other telling factors within

the exposition as a whole. The first PAC closing the essential exposition is primarily an

attainment of an important generic requirement—nothing more and nothing less. (124)

Not only does the EEC mark the first confirmation of the new key (the goal of the exposition), it also separates the secondary theme from the closing theme, dividing the exposition into smaller parts. “Closure” in this sense is not dependent on listener perception; rather, it is a compositional construct (according to Hepokoski and Darcy) that marks the end of a theoretically defined process.

1 These two metaphors of closure are similar to two of Brower’s (2000) embodied image schemas: SOURCE-PATH-GOAL and CONTAINER. These yield the metaphoric concepts of musical motion and musical space. While a detailed study of an embodied basis of musical closure is outside the scope of this project, I speculate that our shared experience of closure is rooted in embodiment.

In some cases, a listener’s perception of closure contradicts the interpretation of closure

from a theoretical perspective. Such is the case in Edward Pearsall’s (1999) article, which

explores the analytical process through the lens of current cognitive theories. The second half of

this article analyzes “Nun ich der Riesen Stärksten überwand” by Alban Berg (Op. 2, No. 3).

Pearsall notes that the song ends on the dominant, an unexpected harmony that could signify a

lack of closure. Yet, as he states, “when we listen to the song, we have the sense that the piece

does end with finality” (246). In order to reconcile the perception of finality with the apparent

open-ended conclusion, Pearsall reconstructs the analysis to make the ending “goal-directed”

(253, footnote 20). By reinterpreting the last note of the penultimate melodic unit as an upper

neighbor to the last pitch of the song, he is able to reinterpret previously unexplained pitches

from earlier in the composition as semitone neighbors to the last pitch as well. In this example,

the author constructs a goal-directed process toward the last note to account for his feeling of

finality, and the analytical whole is shaped by this interpretation. This process of reconciling the

perception of finality with the theoretical lack of closure (defined by not ending on a tonic

harmony) illustrates Pearsall’s stance that music is not “a collection of immutable structures” but

rather “a subject for intentional creative perception” (231).

The achievement or denial of closure frequently supports a larger narrative, especially

when closure is conceived as achieving a goal in the music, thus playing an integral role in

conveying musical meaning. When closure is problematized in analysis, often it is because the

expected goal of a musical process is postponed or even completely denied. Alternatively, as

seen in Pearsall’s article, the experience of finality may contradict prevailing theories regarding

closure. In other cases it is the very denial of closure that carries the expressive content of the

composition. For instance, Ramon Satyendra (1997) examines four works by Liszt, each

implying a key but lacking adequate tonic resolution; all four compositions prolong a dominant

harmony that never resolves to a tonic chord on the same structural level. While Satyendra

argues that these pieces are contextually closed (beginning and ending with the same harmony),


they remain open because the dominant harmony remains unresolved. This paradox (the “tension

between contextual closure and tonal openness”) reflects the Romantic aesthetic (194).

Other authors examine closure as a reflection of a particular composer’s stylistic

tendencies. W. Dean Sutcliffe (2010) argues that Haydn’s slow movements written in the 1770s

are marked by an “expressive ambivalence” that defines his style (98). One way Haydn creates

this ambivalence, according to Sutcliffe, is by using the same passage to evoke two different

affective attributes. In the Andante of Haydn’s Symphony No. 52, the same musical material

both opens and closes the first phrase, creating “two different, apparently opposed, meanings”

(102). Closure is somewhat obscured, compared with Mozart’s more “punctual” endings from

the same period (110), but in Haydn’s slow movements this helps create the impression of

ambivalence because the same gestures engender feelings of both initiation and finality.

Closure in Mahler’s symphonies contributes expressively to the unfolding musical drama,

according to Seth Monahan (2011). In his analyses, recapitulatory success or failure (as defined

by Hepokoski and Darcy) correlates with the expressive outcome of the movement, but Mahler

moves away from a “closure-oriented” tonal narrative in his later compositions (38). Thus, the

expressive meaning of closure changes as Mahler’s style changes. Monahan states that Mahler’s

earlier “sonata dramas are oriented specifically around the ability of the secondary theme (S) to

attain tonic closure,” where “the S-theme acts like a musical agent bent on controlling its own

modal/tonal fate, seeking to secure closure in the tonic major while avoiding a ‘tragic’ collapse

into minor” (40). This goal-oriented perspective of closure supports Monahan’s dramatic

narrative.

At the annual meeting of the Society for Music Theory in 2010, John Roeder presented

his analysis of Saariaho’s song for two sopranos, “The claw of the magnolia,” which is the third

of five settings from Sylvia Plath’s poem “Paralytic.” As a part of his narrative, Roeder

demonstrated how a gesture can become associated with closure as the piece unfolds and how

surrounding pitch material may imbue this gesture with meaning. He noted that the tritone in

m. 2 (between F4 and B4) sounds like an ending retrospectively because it precedes the first

simultaneous attack of the song (F#4 and A#4). This reading elevates the tritone to a cadential

gesture of sorts—a harmonic entity regularly segmenting phrases. Another analysis of this same

passage suggests that two different diatonic sets are superimposed, one with a B tonic, the other


with a B♭ tonic. As Roeder noted, such a reading seems contradictory to hearing the tritone as an

ending feature: “[T]he tonal focus modulates again to A#/B♭ and then to B. Such intuitions raise

an interpretative problem: they do not attribute repose to a tritone, and so they run contrary to

hearing stability at the {B, F}s that terminate the phrases.” To reconcile these readings, Roeder

interpreted the {F#, A#} dyad as implying both tonalities simultaneously, combining the

dominant of B with the tonic of B♭, which allows the tritone to be interpreted as including both

the dominant of B♭ and the tonic of B. By recasting this gesture in a larger analytic narrative, this

same tritone, which concludes the entire song, “provides both convincing closure and an

ingenious musical expression of the paralytic’s mentality.” Closure in this sense is not the

realization of a musical goal, for as Roeder stated “the mezzo’s last F4 wants to fall again to A#3

tonic, but this goal … fails to be realized, leaving the listener musically, like the paralytic

literally, in a state of suspended animation”; instead, closure here refers to a fitting ending

considering the meaning of the text.

This small sample of musicological discourse demonstrates how closure shapes the

analytic and theoretic narrative. All of these examples serve to emphasize how important the

feeling of closure is in our musical experience, and how the common metaphoric descriptions of

closure speak to a shared experience of closure. By shifting the focus from how a composition or

composer achieves closure to how a listener experiences closure, I create an expectation-based

model for this shared experience. The satisfaction and feeling of finality associated with closure

is tied to how a listener expects the piece to unfold and, more importantly, how and when a

listener expects a composition (or a segment of a composition) to end. In the remainder of the

first part of my dissertation, I explore four common characteristics of closure (Chapter 2) and

describe how these characteristics are accounted for in an expectation-based model of closure

(Chapter 3). This emphasis on expectation is supported by recent theories in event segmentation,

which outline a possible cognitive process for the segmentation of music and the perception of

closure (Chapter 4).


CHAPTER 2

MUSICAL CHARACTERISTICS OF CLOSURE

Leonard Meyer conveys an inclusive understanding of closure in his influential books

Emotion and Meaning in Music (1956) and Explaining Music (1973). In these books, Meyer

enumerates various musical parameters relating to closure, recognizes that closure occurs on

different hierarchical levels, and considers closure in post-tonal music. He suggests four

characteristics of closure, which will be fleshed out in more detail throughout this chapter:

(1) closure is a completion of a goal-directed process resulting in an arrival of relative

stability or rest;

(2) closure segments a continuous musical stream into discrete events;

(3) the strength of closure depends on many musical variables and plays an integral role

in the hierarchic construction of a composition; and

(4) closure is stylistically dependent.

The first two characteristics of closure were discussed as common metaphors of closure (closure

is a directed goal and closure is a segmenting agent) in the previous chapter. The final two

characteristics can also be inferred through a close reading of those same analyses; for instance,

Hepokoski and Darcy (2006) recognize that PACs have varying degrees of finality, and even

within the small sample of analyses in the previous chapter, authors highlight different signs of

closure appropriate for the musical style. The extent to which an analyst emphasizes one of these

concepts over another colors the resulting musical discourse, resulting both in different

metaphorical descriptions of closure and in different evaluations of closure’s meaning within the

analytic narrative.

Closure as the Completion of a Goal-Directed Process

For both Robert Hopkins and Leonard Meyer, the completion of a musical process is the

most important marker of closure; both state that without a process, the music will just stop and

not close (Hopkins 1990, 4; Meyer 1956, 139). In other words, closure occurs upon completion

of a commonly recognized musical process or event as defined through an analytical theory. For

Meyer, the musical syntax of a particular style determines the processes that drive toward closure

and is manifested through its primary musical parameters. In tonal music, these primary


parameters would include melody, rhythm, and harmony while timbre, dynamics, and register

would be secondary parameters (1973, 88). Meyer values primary parameters over secondary

parameters because these primary parameters point to a particular moment of cadential closure in

tonal music, while secondary parameters merely contribute to the perceived strength of this

closure.2 Closure defined through these goal-directed processes is understood as syntactical

closure.

For many theorists looking at tonal literature, the completion of the Schenkerian Ursatz

defines syntactical closure, with the arrival of scale degree 1̂ marking the attainment of a goal. This syntactic

understanding of closure is pervasive in the musicological discourse. Mark Anson-Cartwright

(2007) acknowledges that “one assumption about closure has abided in discourse about tonal

music: the idea that it is synonymous with tonal (or structural or syntactic) closure—a state of

rest articulated by a cadence, usually very close to, or even coinciding with, the ‘end’ of a piece

or movement” (1). Even Patrick McCreless (1991), who addresses four types of closure,

privileges syntactic closure over other varieties. In his analysis of Beethoven’s Piano Sonata in

C minor (Op. 10, No. 1), McCreless reveals this preference by first “locating the point of

syntactic closure,” because syntactic closure is “primary” (65).

While Anson-Cartwright and McCreless are mainly concerned with closure of the entire

piece or movement (or, using Agawu’s term (1987), “global closure”), tonal processes can create

closure at the end of a phrase as well, with the goal being the last chord in a cadence. Although

tonal processes at the local or intermediate level are less specifically prescribed, from a

Schenkerian perspective, harmonic and contrapuntal structures of the Ursatz are projected onto

lower structural levels (Cadwallader and Gagné 2006). Alternatively, the basic phrase model

provides another source of smaller-scale musical processes: since the basic phrase model is a

formulaic succession of sonic events (contextually defined by a particular musical style), the

musical process consists of completing the step-by-step model. Even more simply, syntactic

closure often serves as a synonym for the resolution of V to I on any structural level of a

composition.

2 According to Meyer (1956), being able to predict when closure will occur is a prerequisite for being able

to perceive closure.


While tonal closure from a Schenkerian perspective remains a theoretical construct,

empirical research has not clearly established the extent to which listeners perceive large-scale

closure adhering to Schenkerian norms (e.g., a composition beginning and ending in the same key).3

Nicholas Cook (1987) found that listeners indicated a sense of closure in recomposed works even

when the music ended in a different key, suggesting that large-scale tonal closure is not

necessarily the only marker of closure. These studies, however, did not ask whether a listener

could detect these recomposed passages; Elizabeth West Marvin and Alexander Brinkman’s

(1999) study demonstrated that expert musicians were able to detect whether musical passages

began and ended in the same key.

Michael Graubart (2003) recognizes the importance of tonal music’s goal-directed nature,

noting “the sense of completion when, after setting-up of a charged dominant region and various

ensuing tonal adventures and misadventures, the tonic key is firmly re-established” (34). This

type of goal direction is missing in twelve-note music, so Graubart appeals to the completion of a

twelve-note row, proposing that the twelfth note of the pattern would supply the needed “goal of

the musical process” (34).4 Hence, this compositional method might substitute for tonal closure

in non-tonal works, although Graubart admits that a listener may have trouble recognizing this

pattern completion (specifically at the beginning of a row, given that a listener would not be able

to anticipate the ending pitch-class until the row approached its end). Although Graubart perhaps

carries his comparison between closure at the completion of a twelve-tone row and closure at a

tonal cadence to an extreme, his proposal speaks to the pervasive use of a goal-directed syntactic

concept of closure in musicological literature.

Not all musicians, though, subscribe to this goal-oriented view of musical closure. Anne

Hyland (2009) finds fault with syntactic closure as Hepokoski and Darcy’s (2006) main defining

factor of sonata form. In her analysis of the first movement from Schubert’s C-Major String

Quartet (D. 46), Hyland argues that the bias of goal-directed trajectories in Hepokoski and

Darcy’s theory (e.g., the exposition moves towards the EEC) does not accurately describe this

3 I do not think that any analytic theory needs empirical validation. However, since the main crux of my

study will focus on listener perception of closure, I will use empirical studies to support my model of the perception

of musical closure.

4 As Graubart states, “twelve-note rows may give back to atonal music a goal-directed force and the

possibility of closure” (36).


movement, which instead relies on rhetorical signs of closure. In contrast to syntactic closure,

rhetorical closure consists of “those signals of a work’s finality or closure which are not tonal in

nature” (113). While syntactical closure, defined through primary parameters, may be sufficient

for teleologically-composed pieces in the tonal style, other musical features may contribute to a

sense of goal-direction and the perception of closure. Like Hyland, Hopkins (1990) and Bryden

(2001) turn to secondary parameters to explain a process of closure.

In musical styles beyond common practice, secondary features may signify goal

completion. In his study of Mahler’s music, Hopkins (1990) proposes the concept of

“abatement” to explain closure for a composer who at times eschews traditional cadences. In

order to depict abatement, Hopkins creates graphs that show the composite dynamics, registral

pitch, durations, and concordance (i.e., consonance) of the entire texture along with the number

of voices. For Hopkins, closure occurs when these parameters abate (or “descend”): for instance,

durations increase, harmonies become more consonant, and dynamics soften. Hopkins, however,

criticizes his own theoretic concept of “abatement” for not technically being a goal-directed

process. He explicitly states that a goal-directed process is a requirement for closure because “for

closure to occur, it is necessary — but not sufficient — for a discernible process or pattern in one

or more musical parameters to imply a particular point of conclusion” (4; emphasis added).

Hopkins recognizes that a listener cannot predict the moment closure will occur using his

abatement principles, but there seems to be an eventual goal of “dying out” that the listener

perceives. Even though Hopkins seems to be contradicting himself (i.e., abatement is the process

of closure in Mahler, but because we cannot predict the moment at which the process is

completed, closure can never occur), he suggests that a listener’s feeling of “finality” may

depend on expectations other than predicting the exact moment of an ending.

Kristy Bryden (2001) uses a similar model in her dissertation on closure in late twentieth-

century chamber works. She begins with six characteristics of closural processes that transcend

stylistic boundaries.


Closural processes are

1) temporal and may operate on both local and larger more global levels,

2) lines of increasing intensity followed by lines of decreasing intensity,

3) the creation and either the fulfillment or postponement of expectations,

4) a summary of past events,

5) the highlighting of concluding moments, and

6) transitional techniques leading into or foreshadowing the following event. (i)

The second of her six definitions outlines a theoretical process based on secondary parameters.

Bryden’s intensity curves represent processes of musical growth and decline. She creates a

graphical representation of her score analysis by mapping various musical elements: dynamics,

registral height, textural space, frequency of attack, density, composite rhythm, and pulse. These

are then averaged to create a composite curve.5 She posits that closure occurs when a rise in

tension is followed by a decrease in tension, with a possible parallel in tonal music: the rise in

harmonic tension followed by a decrease in tension at the end of a phrase.
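The arithmetic behind Bryden's composite curve can be sketched as follows. This is an illustrative reconstruction in Python, not Bryden's actual procedure; the parameter names and measurement values are hypothetical, chosen only to show the normalize-then-average step she describes.

```python
# Illustrative sketch (not Bryden's tool): normalize each parameter's
# time series to a 1-10 scale, then average across parameters at each
# time point to produce a composite intensity curve.
def normalize(series, lo=1.0, hi=10.0):
    s_min, s_max = min(series), max(series)
    if s_max == s_min:                       # flat series: map to midpoint
        return [(lo + hi) / 2] * len(series)
    return [lo + (x - s_min) * (hi - lo) / (s_max - s_min) for x in series]

def composite_curve(parameters):
    """parameters: dict of name -> list of raw measurements (equal length)."""
    normalized = [normalize(v) for v in parameters.values()]
    n = len(normalized)
    return [sum(vals) / n for vals in zip(*normalized)]

curve = composite_curve({
    "dynamics":       [40, 55, 70, 60, 35],  # hypothetical measurements
    "attack_density": [2, 4, 6, 3, 1],
    "register":       [60, 64, 72, 67, 55],
})
# A rise followed by a fall in `curve` is Bryden's cue for closure.
```

On this toy data the curve peaks at the third time point and falls to its minimum at the last, the rise-then-decline shape Bryden associates with closure; the sketch also makes the criticism in footnote 5 concrete, since the unweighted average treats very different units identically.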

Bob Snyder (2000) echoes this description of the influence of secondary parameters on

the perception of closure, stating

“decreases in intensity can establish closure. If we look at changes in intensity of the

elements in a melodic, temporal, or formal grouping, we find that all other things being

equal, a grouping feels more closed the more the intensity of its various musical

parameters decreases at the end.” (emphasis Snyder’s; 63)6

Snyder argues that conceptualizing the musical surface in terms of our own bodily experiences

influences our perception of closure, comparing the feeling of repose in music to the way our

bodies feel after we have completed an action. This embodied perspective of closure implies that

we understand the metaphors of goal direction, completion, and finality through the way in

which our bodies interact with the world.

Many authors observe that the completion of a musical process results in a feeling of

repose or finality (or, in Meyer’s words, an “arrival at relative stability” (1973, 81)). For

5 Two problems weaken Bryden’s methodology. First, she awards equal weight to each parameter in the overall average when one

parameter may exert more influence in projecting a close or continuation. Second, even though the data from these

parameters are normalized to fit on a scale from 1–10, the units of measurement are so different that an increase of a

single unit does not mean the same thing across parameters. Both of these render the composite curve almost

meaningless because it attempts to capture too much different information. While the methodology may not achieve

a “good” curve, Bryden’s idea of using secondary parameters to inform music analysis certainly has merit.

6 This emphasis on abatement and relaxation overlooks other instances of finality where there is more of a

feeling of excitement.


instance, in his 2007 critical study of concepts of tonal closure, Anson-Cartwright posits that the

feeling of rest is a result of tonal resolution. Bryden (2001, 1) also notes that the perception of

decreasing intensity at the moment of closure results from a “dynamic temporal process”

modeled by her intensity curves. This feeling of finality that accompanies the completion of a

goal-directed process is further explored by David Huron (2006), who states that a listener

perceives closure at the expected completion of some process, and—because the listener is less

able to predict what will occur next—this completion is followed by a perceived loss of forward

continuation. From Huron’s perspective, a feeling of repose is not created by musical parameters

that diminish in intensity or the resolution of a “tense” harmony; rather, the decrease in

predictability for subsequent events causes the perception of repose. Eugene Narmour (1990)

agrees, defining closure as “syntactic events whereby the termination, blunting, inhibiting, or

weakening of melodic implication occurs” (102).

Although the completion of a goal-directed process and a feeling of finality are

connected, it is important not to conflate these two characteristics of closure. A musical process

need not be consciously perceptible, whereas the feeling of finality is a psychological

experience. From a listener’s perspective, closure is the sense of finality that occurs at an

anticipated ending. This perspective shifts the focus away from the music per se and toward an

individual’s musical expectations. The completion of a goal-directed process does not itself elicit

a feeling of finality; rather, this feeling stems from the combination of an anticipated ending with

lessening of expectation for subsequent events.

Closure as the Segmentation of Musical Experience

The point at which a listener experiences finality segments the musical experience.

Closure marks the end of a musical event, resulting in a perceived boundary between two

musical entities. Early in his book, Snyder (2000) defines closure as the establishment of a

grouping boundary, allowing a listener to segment musical events (33). Meyer (1973) suggests

that closure creates relatively stable musical entities (90), although many musicians might

dispute the notion that all event boundaries necessarily correspond with a feeling of closure: for

instance, analysts normally do not consider motives to have closure despite their local-level

event boundaries. Not surprisingly, there is a close relationship between closure and

segmentation, but it is not necessarily a causal relationship.


Meyer (1973) uses both primary and secondary parameters to establish musical grouping

structure, focusing on the completion of harmonic units, changes in the musical surface, and

repetition as a means to create event boundaries. Meyer freely uses the term “closure” when

referring to segmentation at any level, including the identification of discrete motives. This sense

of closure comes from what Meyer calls “patterning,” and he provides a list of various factors

that delineate musical patterns:

1. the presence of similarity and difference between successive events within a

particular parameter. Both complete uniformity and total heterogeneity preclude

syntactic organization, and hence establish no stability-instability relationships;

2. the separation of one event from another in time, pitch, or both; or through clear

differences in dynamics, timbre, or texture;

3. immediate repetition, whether varied or exact, of part or all of a pattern;

4. the completion of previously generated implications;

5. harmonic cadence and tonal stability. (83)

Many of these ideas regarding grouping are incorporated into Lerdahl and Jackendoff’s

Grouping Preference Rules (GPRs), which reflect the principles of Gestalt psychology. While

Lerdahl and Jackendoff (1983) do not specifically discuss closure, they do use secondary

parameters such as attack points, register, dynamics, articulation, and duration to create grouping

boundaries. A complete list of GPRs is provided in Table 2.1; notice that GPRs 2 and 3 use

Meyer’s secondary parameters.

Table 2.1: Lerdahl and Jackendoff’s Grouping Preference Rules

GPR 1: Avoid analyses with very small groups—the smaller, the less preferable.

GPR 2: Group boundaries are heard at a slur or rest, as well as points where there is a

greater attack-point time interval.

GPR 3: A change in register, dynamics, articulation, and note lengths can distinguish

group boundaries.

GPR 4: Where the effects of GPR 2 and 3 are relatively more pronounced, a larger-level

group boundary may be placed.

GPR 5: Prefer grouping analyses that most closely approach the ideal subdivision of

groups into two parts of equal length.

GPR 6: Where two or more segments of the music can be construed as parallel, they

preferably form parallel parts of groups.

GPR 7: Prefer a grouping structure that results in more stable time-span and/or

prolongational reductions.
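The intuition behind GPR 2 can be illustrated with a minimal sketch. This is my own simplification (treating a boundary as a local maximum among inter-onset intervals), not Lerdahl and Jackendoff's formal statement of the rule, and the onset values are hypothetical.

```python
# Minimal sketch of the GPR 2 intuition: prefer a grouping boundary
# after a note whose following inter-onset interval is greater than
# both neighboring intervals (a local gap in the attack-point stream).
def gpr2_boundaries(onsets):
    """onsets: ascending attack times; returns indices of notes after
    which a boundary is preferred."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    boundaries = []
    for i in range(1, len(iois) - 1):
        if iois[i] > iois[i - 1] and iois[i] > iois[i + 1]:
            boundaries.append(i)             # boundary after note i
    return boundaries

# Eighth notes, a longer gap (a rest), then eighth notes again:
print(gpr2_boundaries([0.0, 0.5, 1.0, 2.5, 3.0, 3.5]))  # [2]
```

The gap between the third and fourth attacks exceeds its neighbors, so the sketch places a boundary after the third note, mirroring the rule's preference for boundaries at rests and longer attack-point intervals.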


Narmour (1990) exclusively focuses on the establishment of closure within a three-note

grouping. In his Implication-Realization Model, Narmour suggests that the third note of a

sequence creates closure when the interval formed between the second and third notes does not

create any new implications (102). He also identifies six parametric conditions of closure, similar

to Meyer’s list, which include:

1. a rest or repetition;

2. strong metric emphasis;

3. resolution of dissonance;

4. increase in duration;

5. smaller intervallic motion;

6. change of registral direction. (11–12)

Snyder (2000) builds on Narmour’s work, stating, “continuity is nonclosural and progressive,

whereas reversal of implication is closural and segmentive” (148). This type of closure, Snyder

argues, does not necessarily end anything, but it “helps articulate the contour of the

phrase” (148). He differentiates this from closure at the end of the phrase with the designation

“soft closure,” which refers to “having any kind of segmentation, however weak” (148).

In his article on segmentation in post-tonal music, Christopher Hasty (1981) segments the

musical surface using parameters such as timbre, dynamics, intervallic associations, register and

contour. These parameters are described as musical domains, and Hasty suggests that groupings

are formed by discontinuities in at least one domain. He also claims that music is typically

segmented in such a way that the emerging groups are similar to one another; for instance,

groups in a stronger segmentation may contain the same number of constituents and may share

intervallic content. Hasty uses the term “closure” to describe a return to some musical quality

from a prior segment within a phrase, like an overall aba form, and reserves this expression as a

marker for higher-level formal divisions, such as a phrase or section. In a later article, he revisits

closure’s role in creating phrase segments, stating that, “closure is itself the articulation of the

unit since unrelated elements are thereby segregated” (1984, 172). “Closure” in this sense creates

a phrase-like entity containing musical elements that are related to each other (e.g., notes

segmented into the same set class, or notes in the same register). A phrase is “closed” off,

becoming its own entity, not including unrelated elements.

Dora Hanninen more carefully balances differentiation, similarity, and the role of music

theory in her 2001 article, which explicates her general theory on music segmentation. Hanninen


posits three types of criteria for segmenting music: (1) sonic, which rely on disjunction between

adjacent events or non-adjacent events, (2) contextual, which are based on associative

relationships between two possible groups, and (3) structural, which reflect a theoretical

orientation that remains purely conceptual until paired with sonic or contextual cues, resulting in

a musical segment.

Both Meyer (1973) and Hopkins (1990) also indicate that segmentation in music depends

on a plethora of musical markers; however, they divide these markers into two types: primary

parameters that allow a listener to project the moment of closure, and secondary parameters that

can strengthen the presence of an ending or retrospectively mark a moment of closure. Changes

in secondary parameters can create a sense of a new beginning, and hence a boundary, but

retrospectively recognized closure and anticipated closure may be different psychological

phenomena. The degree to which a listener is able to anticipate closure can vary as well, and it is

important to recognize that the capacity to predict an ending is quite different from arriving at an

ending and subsequently recognizing it as the end of a segment.

Huron suggests an even closer connection between musical segmentation and closure: we

perceive boundaries because we have experienced closure—a fulfillment of musical expectation.

Thus, from the perspective of segmentation, anything that creates a separable unit is closed

(2006). Meyer concurs, stating,

A motive, a phrase, or a period is defined by some degree of closure. On the level of its

closure—the level on which it is understood as a separable event—it is a relatively stable,

formal entity. Though it contains and is defined by internal processes, once closed, it is

not a process but a palpable ‘thing.’ (1973, 90)

This understanding allows for closure in motives, units that most musicians would not consider

closed, as well as closure in segments that do not evoke any particular expectation for a specific

ending point.

These two characteristics of closure discussed thus far (i.e., closure as completing a

musical process and closure as marking the segmentation of musical events) hark back to the

metaphors examined in the first chapter. This shared understanding of closure, especially from a

listener’s perspective, will form the basis of my expectation-based model of closure. Before

discussing segmentation and expectation from the perspective of cognitive psychology, I first

describe two related characteristics of closure: (1) closure has varying strengths, creating


segments that group together on different hierarchical levels, and (2) goal-directed processes and

specific closural expectations are defined by musical style.

Hierarchy and Closure

Having enumerated the various parameters that affect closure (both primary and

secondary), Meyer states, “the degree of closure … depends upon the shaping of the particular

parameters at work, the degree of articulation contributed by each, and the number of parameters

promoting or preventing closure” (1973, 88). To illustrate his point, Meyer presents an analysis

of Beethoven’s String Quartet, Op. 130, second movement (reproduced here as Example 2.1),

indicating that the sense of finality at the end of m. 4 is stronger than the sense of finality at the

end of mm. 1 and 2 because m. 4 is articulated by rhythmic closure.7 Furthermore, the half

cadence (HC) is weaker than the perfect authentic cadence (PAC) that follows in m. 8. The

reason for this, according to Meyer, is that in a HC not all the musical elements are implying

closure at the same time: the rhythm implies closure, but the harmony does not. As Meyer states,

“a semicadence is a case of parametric noncongruence which has become archetypal in the

stylistic syntax of tonal music” (85). Other authors further discuss how parametric congruence

and noncongruence vary closure’s perceived strength, focusing especially on the role of

secondary parameters in confirming or weakening a syntactical close (Hopkins 1990, Snyder

2000, Hyland 2009).

Example 2.1: Beethoven String Quartet, Op. 130, second movement, mm. 1–8

(analysis after Meyer [1973])

McCreless (1991) uses formal and rhetorical markers of closure to determine the location

of structural (or syntactical) closure, implying that formal schema and rhetorical emphasis can

7 Meyer’s precise meaning is unclear, but I suspect Meyer may be referring to hypermetrical expectations.


modulate the strength of closure. If a composition has several possible points of syntactic

closure, formal expectations can emphasize one of these PACs as the structural close. McCreless

further states that rhetorical closure uses rhythmic and registral extremes to highlight the end of

the melody, which can make one ending sound more conclusive than another ending.

Kofi Agawu (1987) also discusses closure as occurring on various musical levels. While

his definition of closure, “the tendency to close” (2), seems a bit circular, it emphasizes that

closure is not synonymous with an ending, but rather is “dependent for its effect on the listener’s

experience of the entire composition” (4). From Agawu’s perspective, an “ending” describes

“local elements in the musical structure, whereas closure denotes a global mechanism” (4). This

view of closure is similar to Anson-Cartwright’s third concept of closure, “that condition of

immanent rest or finality which a piece or movement possesses as a temporal whole, by virtue of

all the tendencies to close projected within that whole” (2007, 3). Global closure, according to

Agawu, “secures closure for the entire piece” and fulfills these tendencies to close (6). There is

only one global closure in a composition, and nested within this closure are subordinate closes on

the local and intermediate levels. While global closure is the most decisive close, local closure

“articulates the smallest meaningful units of the piece” and intermediate closure “nests one or

more local closes” (6). In all of these cases, Agawu requires a syntactic V-I gesture at the end,

but his idea of nested closes could be applied to music outside the tonal idiom.

William Caplin (2004) discusses closure primarily at the phrase level, stating that the

“cadence effects formal closure at middle-ground levels in the structural hierarchy of a

work” (56). He goes on to explain that a “cadence creates musical closure, but not all closure in

music is cadential,” reserving the term “cadence” for a limited number of hierarchic levels (56).8

For Caplin, cadences close specific musical processes (harmonic and melodic, in the case of a

PAC); most importantly, cadences elicit a sense of formal closure. According to Caplin, a

cadence follows the structural beginning of a group on the same hierarchic level, and any

cadential harmonic paradigm must include a root-position dominant chord. While Caplin’s

insistence that cadences must be on the same hierarchical level as the phrase beginnings is a bit

8 Caplin specifically states that the Ursatz does not end with a cadence because there is no true beginning to

the Ursatz. Also, because he reserves the term “cadence” as the means to close a theme, the fact that the Ursatz’s

structural close is on a higher hierarchical level precludes his use of this term. Caplin does indicate that a cadence

can occur at the same time as the structural close.

idiosyncratic, it does emphasize that a combination of V followed by I will not achieve closure

unless it concludes a formal unit.

There is some interaction between musical hierarchy and the perceived strength of

closure. While the strength of closure may depend on local musical cues, as previous research

has indicated, it also depends on the boundary’s role in the overall musical hierarchy (Joichi

2006).9 Meyer further states, “every composition, then, exhibits a hierarchy of closures. The

more decisive the closure at a particular point, the more important the structural articulation ….

The way in which a particular parameter acts in articulating structure may be different on

different hierarchic levels” (1973, 89). It seems that, according to Meyer, the strength of closure

for a particular segment determines the hierarchical structure of the piece, and markers of closure

at one level can vary from markers at another level. It follows that closure at the end of a phrase

is weaker than closure at the end of a section, which in turn is weaker than closure at the end of a

piece.10

I am not convinced that there is a simple correspondence between formal hierarchy and

the perceived strength of closure at the end of any particular segment. Some research has shown

that top-down knowledge of formal design can influence the perceived strength of a point of

closure (Joichi 2006). Agawu (1987) also emphasizes listener knowledge of how a composition

should unfold: its “scheme.” Comparing poetry to music, he suggests,

… the trained reader (or listener) approaches a lyric genre such as the Shakespearean

sonnet with a set of expectation regarding its length, meter and rhyme scheme. The

awareness of this scheme mediates the experience of the poem, and therefore of closure.

The same is true of musical genres such as minuet and trio, nocturne, concerto, and

prelude, genres in which various types of signs—some conventional, others arbitrary—

are used to inform the listener of how and when a piece is going to end. (4)

The completion of a schematic formal unit thus elicits a feeling of finality based on expectations

generated by previous experience, and this knowledge could lead to the various completions

9 Joichi also notices that the length of the preceding context influences decisions regarding the strength of

closure (longer contexts are rated as having stronger closure), but longer segments ending with a higher hierarchical

boundary are rated as more closed than are longer segments ending with a lower hierarchical boundary.

10 Caplin (2004) would qualify that statement, arguing that although the cadence located at the end of a

piece may seem stronger than previous cadences, the cadence itself does not close the composition. “A cadence

typically presumed to close an entire movement is often accorded a high degree of foreground rhetorical emphasis

… [which] renders such cadential arrivals so prominent and forceful that they can give the impression that they must

be concluding something more structurally significant than a thematic region alone” (64–65).

within a composition having varying strengths of finality. Meyer’s bottom-up view (the

hierarchical arrangement of these closes depending solely on the “number of parameters

promoting or preventing closure”) and this top-down view (underlying formal schemata

influencing a listener’s perception) will be re-examined in Experiment 3, located in Chapter 7.

Style and Closure

Several authors have claimed that knowledge of style—even if only implicit—is a

prerequisite for perceiving closure. According to Mary Louise Serafine (1988), closure is marked

by stasis and rest compared to the surrounding material, and the factors that generate movement

and stasis vary among styles. While markers of closure differ among styles, it may be possible to

experience closure in unfamiliar styles by applying knowledge gained through experience with a

familiar style. While markers of closure are highly conventionalized in

tonal music (cadences, stepwise descent to scale degree 1, etc.), they are more variable in recent music. This

variability has led to two approaches to discussing closure in post-tonal music: (1) authors retain

tonal models of closure even for music in a clearly non-tonal style (Kurth 2000, Pellegrino 2002)

or (2) authors turn to alternative goal-directed processes, such as abatement and intensity curves

(Hopkins 1990, Bryden 2001).

Richard Kurth’s (2000) analysis of Schoenberg’s fourth string quartet is particularly

revealing in regards to the first approach. He states that “memory is one of the general conditions

for musical closure” (139), both within a work, where memory engages elements to create

musical forms, and between works, where memory invokes materials from earlier pieces or

compositional approaches. From this perspective, Kurth argues that latent tonal tendencies are

present in Schoenberg’s fourth string quartet, suggesting that Schoenberg’s compositional

background (and presumably a listener’s abundant experience with tonal music) allows tonal

implications and realizations to serve as markers of closure in this work. Kurth states that closure

occurs when “fluctuating tonal latencies can no longer be kept in a state of balanced suspension.

The latency of one or several individual tonalities is then revealed … and itself becomes an

attribute of closure, in moments that are characterized by vivid qualities of incipience and

expectancy” (159). While other factors, such as duration and dynamic level, may also contribute

to closure at those moments in the quartet, Kurth raises an important point: knowledge of

structures within a work and between works can influence the way a listener perceives closure.

In her article on closure in John Adams’ music, Catherine Pellegrino (2002) specifically

states that closure at the end of a work primarily depends upon tonal organization. Although she

acknowledges that Adams’ music is not tonal, she suggests that discernible pitch patterns

emerge, and the completion of these patterns contributes to closure. She states,

[I]f the end of a work is to be experienced as closure and not simply as an arbitrary

stopping point, the nature and placement of the point of closure must be anticipated. In

other words, for closure to occur, the tonal organization of the music must either define

its own endpoint or participate in a system in which a given endpoint is already defined.

(150)

Pellegrino likens this experience of closure to achieving scale degree 1 in the melody over the tonic harmony

at the end of a tonal work. She also recognizes that other factors contribute to closure in this

repertoire: the completion of a well-known formal structure and rhetorical, stereotypical ending

gestures; she maintains, however, that these are subservient to tonal closure.

In contrast to these approaches, Robert Clifford (2005) suggests that we abandon the

notion of tonal closure in defining closure in atonal music, specifically addressing compositions

by Webern. He suggests that we instead redefine our expectations for closure in this style of

music based on compositional elements found in the piece.

For isn’t tonal closure really about expectations set into motion by the composer? …

Should we expect in atonal music, then, with its radically different melodic and harmonic

landscape, the same type of musical experience, the same solid confirmation of musical

expectations? I think not. (29)

The processes set in motion at the beginning of a work will differ from those in other

compositions, and could include symmetrical arrangement of pitches around a center pitch or a

series of gestures that balance each other (e.g., a rising gesture balanced by a descent). While

Clifford questions whether these types of processes are perceptible, he emphasizes that there are

alternative means of achieving closure besides those that are tonally motivated.

Abatement (Hopkins 1990) and intensity curves (Bryden 2001) were addressed

previously, but this approach to describing closure in non-tonal styles warrants further

discussion. Returning to Meyer and Hopkins, the emphasis on goal direction as a marker of

closure across styles leads to a disturbing conclusion: there can be no closure in music without a

goal-directed process. Setting aside the obvious difficulties in defining what exactly constitutes a

goal-directed process, I think most musicians would agree that the musical features determining

a goal-directed process depend on music style. The issue of style becomes increasingly

problematic throughout the twentieth and twenty-first centuries, because works by different

composers (and sometimes even works by the same composer) do not typically share the same

musical syntax. If closure requires the completion of a goal-directed process, there must first be a

goal-directed process to complete.

Intensity curves, and the like, are similar to the phenomenon McCreless (1991) describes

as rhetorical closure, “the importation of closural conventions or the use of harmonic, melodic,

rhythmic, textural, orchestrational, dynamic, articulative, or registral extremes as a means of

dramatizing the end of a piece” (51). Although some of these rhetorical conventions transcend

styles (like Bryden’s closural processes), repertoires and composers can have their own

idiosyncratic rhetorical ending gestures. For instance, Gretchen Wheelock (1991), George

Edwards (1991), and Floyd Grave (2009) all focus on rhetorical indicators of closure in Haydn’s

string quartets, while Wye Allanbrook (1994) looks at a “tune” (as defined in her article) as a

closural sign in Mozart.

In contrast to these composer-specific signs of closure, some theories of closure and

segmentation attempt to define musical characteristics of closure that are not style specific (most

notably, Lerdahl and Jackendoff 1983; Narmour 1990). One such example is durational closure

(Joichi 2006), defined by rests, pauses, and longer durations at the end of a segment. While

Narmour cites these characteristics as closural, Elizabeth Margulis (2007) found that the

interpretation of silence (how much tension the silence carries) varies with context. This suggests

that the interpretation of closure is based on more than changes in acoustic input—that the

meaning of these supposedly cross-stylistic cues varies based on context and listener experience.

Thus, stylistic competency is a direct manifestation of a listener’s experience, where

expectations gathered through statistical learning (Huron 2006) influence the perception of

closure. From the perspective of probabilistic learning, conventional signs of closure (such as

cadential patterns) begin as recurring surface features that are gradually incorporated into a

listener’s stylistic knowledge. As a listener experiences the same harmonic/melodic paradigms

ending musical units, such paradigms begin to evoke the feeling of closure. Listeners are better

able to anticipate endings in musical styles where they have sufficient experience.

Along with stylistic considerations in the perception of closure, the other three

characteristics of closure explored in this chapter are dependent in some fashion on a listener’s

previous musical experience. It is musical expectations engendered from these knowledge

structures that contribute to the sense of goal-direction, the segmentation of musical experience,

and the recognition of differing strengths of closure. The formation of expectations and their

influence on a listener’s perception of finality will be further explored in the next chapter.

CHAPTER 3

MUSICAL EXPECTATION AND CLOSURE

As discussed in the previous chapter, there is a relationship between expectation and

closure. To this effect, Eugene Narmour asserts that closure is a fulfillment of musical

expectation followed by an absence of expectation for what will follow, or, in Narmour’s terms,

closure is the realization of a melodic implication that does not create some new implication.

Leonard Meyer (1956) even goes so far as to state that without any expectation of when and how

a musical segment will end, the music will always sound incomplete; it will merely “stop” and

will not “close.” David Huron, in his seminal study on musical expectation (2006), also

acknowledges the role of expectation in the perception of closure. Building on the work of Huron

and others, I propose a model of closure that explains how various characteristics of closure

(completion of a goal-directed process, segmentation of musical experience, hierarchical

construction, and stylistic dependence—see Chapter 2) are derived from expectation. This

chapter concludes with my model of musical closure and illustrates its predictions in three short

songs: Robert Schumann’s “Widmung,” Anton Webern’s “Der Tag ist vergangen,” and Aaron

Copland’s “The World Feels Dusty.”

Formation of Expectations: Statistical Learning

Previous research has shown that we are experts at extracting statistical regularities from

auditory input, and this process of statistical learning leads to expectations for musical events

(Krumhansl 1990; Huron 2006). An individual’s musical experience will therefore determine

how closure is perceived; for example, the more often a person hears a certain harmonic or

melodic unit at the conclusion of a musical segment, the more that the listener will associate that

unit with closure. This is supported by a study (Eberlein and Frick 1992) that asked musicians to

rate the strength of closure projected by cadential patterns from a variety of historical periods.

The ratings correlated with individuals' self-assessed stylistic competency, confirming

that increased exposure to a musical style influences the perception of closure.

Since it is well documented that different musical styles have their own characteristic

tokens of closure, these results are hardly surprising,11

but they leave an important question

unanswered: how do listeners form an association between musical cues and a feeling of finality?

A mere exposure effect (whereby listeners perceive closure because similar patterns have ended

musical segments in the past) is an insufficient explanation: this simplistic view cannot account

for how listeners segment music into meaningful units. A possible solution may be found in

research on language acquisition, which has shown that an auditory stimulus is segmented based

on sequential probabilities. As in language acquisition, these sequential probabilities may

influence a listener’s segmentation of music and, thus, contribute to the perception of closure.

Children acquiring a language are faced with a daunting task. Before they can even begin

to learn semantic meaning and grammatical syntax, they must first learn to discern boundaries

between words. This is a difficult task when based on acoustical cues alone, because word

boundaries are not consistently marked in fluent speech (Saffran, Aslin, and Newport 1996).

Saffran and her colleagues demonstrated that infants as young as eight months can extract

transitional probabilities (the probability that one event will follow another) between spoken

syllables. As an example, imagine that an infant hears the phrases “pretty baby” and “pretty

flower.” The transitional probability between pre and ty is higher than the transitional probability

between ty and ba simply because the former sounds have been heard in sequence more often.12

Saffran, Aslin, and Newport (1996) created a speech stream of three-syllable nonsense words

where every syllable was spoken without accentual stress and at a steady tempo. The only cues

to the location of word boundaries were the transitional probabilities between the sounds. In two

separate experiments, infants were able to distinguish between “words” and “non-words,” as well

as between “words” and “part-words,” after only a two-minute exposure period.

11 As Robert Gauldin (1988) wrote in his eighteenth-century counterpoint text, "Each period of music

history has devised clichés associated with cadential formulas. These may include stereotyped soprano and bass

melodic movements, harmonic progressions, rhythmic figuration, non-harmonic activity, and suspensions” (13).

Following this statement, Gauldin presents cadential paradigms common to the late-Baroque period. Similar lists of

stylistically appropriate cadential paradigms are included in his sixteenth-century counterpoint text as well (1985, 27

and 87).

12 The formula for calculating the transitional probability of x followed by y is

P(y | x) = frequency of xy / frequency of x. In the limited example above, the probability of pre being followed by ty is

2/2 = 1.0, while the probability of ty being followed by ba is 1/2 = 0.5.
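To make the footnote's arithmetic concrete, here is a minimal sketch in Python; the syllable stream is a hypothetical rendering of "pretty baby" and "pretty flower," not the actual Saffran stimuli:

```python
from collections import Counter

def transitional_probability(sequence, x, y):
    """P(y | x) = frequency of the bigram xy / frequency of x."""
    bigrams = Counter(zip(sequence, sequence[1:]))
    unigrams = Counter(sequence)
    return bigrams[(x, y)] / unigrams[x]

# Syllables of "pretty baby" and "pretty flower" with word boundaries removed.
stream = ["pre", "ty", "ba", "by", "pre", "ty", "flo", "wer"]

print(transitional_probability(stream, "pre", "ty"))  # 1.0 (within a word)
print(transitional_probability(stream, "ty", "ba"))   # 0.5 (across a word boundary)
```

The within-word transition scores higher than the cross-boundary transition, matching the 1.0 and 0.5 values computed in the footnote.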

Of course, there are other acoustical cues that assist in the perception of word boundaries

(accentual patterns, intonational profiles, pauses, etc.), so it is especially remarkable that these

infants showed significant learning despite such impoverished stimuli. This experiment has been

replicated with adult participants (Saffran 2001), as well as with stimuli consisting of action

sequences (Baldwin et al. 2008) and tones (Saffran et al. 1999). A more recent study replicated

these results using chord sequences derived from an artificial harmonic syntax. Jonaitis and

Saffran (2009) found that after a two-day exposure period listeners were able to generalize

syntactical rules from the transitional probabilities governing chord succession, and subsequently

to differentiate novel correct harmonic progressions from progressions that did not adhere to the

artificial harmonic syntax. Compared to actual music, the stimuli were quite impoverished (lacking,

for instance, metrical regularities), yet listeners were able to extract transitional probabilities

between chords after sufficient exposure.

Although the experiment outlined above does not explicitly address the inference of

musical segments based on transitional probabilities, it does suggest that a similar learning

mechanism is used for both language and music. Comparable to word boundaries in language,

musical boundaries are formed when two musical events have a relatively low transitional

probability in a particular style. Such events are not limited to pitch and harmonic material

(although these elements are the most explored in the literature), but can extend to timbre,

rhythm, loudness, articulation, etc.13

The analytical preference to segment music at a point of change in the musical surface

can be explained with the help of transitional probabilities. Recall from Chapter 2 Hanninen’s

(2001) theory of segmentation, which posits that a change in a musical parameter (e.g., register,

instrumental timbre, or articulation) creates a boundary in the sonic domain. Other authors

(Lerdahl and Jackendoff 1983; Meyer 1973) have also used differentiation to segment musical

experience. These authors imply that listeners expect continuity in all musical domains, an

expectation that is formed through statistical learning. For instance, a large melodic leap could

signify a boundary between two musical groups (similar to Lerdahl and Jackendoff’s GPR 3).

13 When using transitional probabilities, the grain at which probabilities are extracted must be specified. In

music, this depends on the time window, or event type, in question when describing statistical regularities.

Transitional probabilities can be calculated between motivic cells, chords, and individual notes, but one can also

reduce the window size to calculate the transitional probabilities within a single sound. Such a fine grain of division,

though, will not necessarily provide interesting results.

Folk songs from a variety of musical cultures reveal that smaller intervals occur more frequently

(Huron 2006), increasing the transitional probability for pitches that are closer together in pitch

space. Although I know of no database documenting the transitional probabilities of non-pitch

domains, given that expectation is informed by statistical learning, it is likely that other musical

domains have a transitional probability profile similar to that of intervallic succession, where no

change or slight changes occur more frequently than do drastic sonic changes. That said, not

every sonic disjunction will result in a meaningful musical boundary and a feeling of closure.

The mind unconsciously uses statistical learning to extract transitional probabilities of

musical events, which in turn guide segmentation.14
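As a toy illustration of how low transitional probabilities could mark segment boundaries, the following sketch starts a new segment wherever the probability of the next event dips below a threshold. The threshold value and the syllable data are my own assumptions for illustration, not part of any cited study:

```python
from collections import Counter

def segment_by_dips(events, threshold=0.75):
    """Start a new segment wherever P(next | current) falls below threshold."""
    bigrams = Counter(zip(events, events[1:]))
    unigrams = Counter(events)
    segments, current = [], [events[0]]
    for x, y in zip(events, events[1:]):
        if bigrams[(x, y)] / unigrams[x] < threshold:
            segments.append(current)
            current = []
        current.append(y)
    segments.append(current)
    return segments

# Syllables of "pretty baby" and "pretty flower", boundaries removed.
stream = ["pre", "ty", "ba", "by", "pre", "ty", "flo", "wer"]
print(segment_by_dips(stream))
# [['pre', 'ty'], ['ba', 'by', 'pre', 'ty'], ['flo', 'wer']]
```

Note that with so little data the "by"-to-"pre" transition looks deterministic, so one true word boundary is missed; this is precisely the data-sparsity problem that extended exposure overcomes.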

Along with extracting transitional

probabilities, a listener also becomes sensitive to the likelihood that a particular sound will occur

somewhere in a composition. Huron (2006) nicely summarizes this point by differentiating

between inclusional probabilities and transitional probabilities.15

Inclusional probabilities convey

the likelihood that a particular sound element will be present in the style, regardless of the

preceding events, while transitional probabilities represent the likelihood that a sound element

will occur based on the previous event. In tonal music, members of the tonic triad occur more

frequently than do other scale degrees (inclusional probability), and the tonic chord usually

follows a dominant harmony (transitional probability). Both sets of probabilities would give rise

to musical expectations, but I posit that a segmentation resulting in the strongest feeling of

finality, or closure, depends specifically on transitional probabilities. In addition to creating

perceptual boundaries, information gleaned through statistical learning is generalized into a

broad set of musical expectations called “schematic expectations.”
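Huron's distinction between the two kinds of probability can be illustrated with a short sketch; the Roman-numeral chord sequence below is invented for the example:

```python
from collections import Counter

# A hypothetical Roman-numeral chord sequence, invented for illustration.
chords = ["I", "IV", "V", "I", "vi", "IV", "V", "I", "ii", "V", "I"]

unigrams = Counter(chords)
bigrams = Counter(zip(chords, chords[1:]))

# Inclusional ("zeroth-order") probability: how often tonic occurs at all.
p_tonic = unigrams["I"] / len(chords)

# Transitional ("first-order") probability: how often V moves on to I.
p_tonic_after_V = bigrams[("V", "I")] / unigrams["V"]

print(round(p_tonic, 2))  # 0.36
print(p_tonic_after_V)    # 1.0
```

In this toy corpus the tonic is merely the most common chord overall, but it follows the dominant every single time, which is why a V-I succession is the stronger cue for closure.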

14 In an analytical narrative, transitional probabilities are not the only means of segmenting a musical

stream. Take, for instance, Hanninen’s other criteria, contextual and structural. Forming associations between

groups in a composition requires listeners to remember past material in order to form new groupings. This dynamic

listening process is not based on a generalization of transitional probabilities. Structural criteria are based on a

theoretical framework and are applied consciously to the sonic and contextual domains to create musical segments.

Because statistical learning occurs unconsciously, using conscious theoretical knowledge to create groupings is a

different phenomenon.

15 Huron labels inclusional probabilities as “zeroth-order probabilities” and transitional probabilities as

“first-order probabilities.”

Expectation

Being able to anticipate upcoming musical events is an integral part of musical

experience. This experience has not only been captured empirically through various experimental

paradigms, but is also reflected in musical discourse. Empirical research indicates that not all of

these expectations are explicit and that expectations can reflect different types of musical

knowledge, such as generalized knowledge applicable to different works, or exact knowledge of

a particular work. While I do not provide a comprehensive overview of musical expectation (see

Huron 2006; Ockelford 2006), I first discuss how the concept of expectation informs discourse in

the discipline of music theory, especially with regard to musical closure. After this summary, I

turn to Huron’s four types of expectation (schematic, veridical, dynamic, and conscious),

outlining concepts essential to an expectation-based model of musical closure.

Expectation in Music Theory

As Schmuckler (1989) states, “almost all contemporary music-theoretic analyses have

adopted implicit or explicit ideas of expectation” (111). While “almost all” may seem like an

overstatement, I believe that the aims of music theory, as a discipline, are indeed rooted in

expectation. Some theorists strive to bring unconscious expectations to consciousness, while

others create alternative sets of musical expectations, allowing listeners to experience music in

new ways. This is readily evident in the language used in music theory pedagogy, analytical

discourse, and theoretic systems.

A common topic in which we invoke expectation in the music theory classroom is the

deceptive cadence (or deceptive resolution, or deceptive motion). The term “deceptive” clearly

indicates the use of an unexpected chord in place of the expected chord, and textbooks usually

spell out clearly that listener expectations have been thwarted. Take, for instance, Clendinning

and Marvin (2005):

Bach’s solution at the end of the first phrase is to replace the expected tonic

harmony with a tonic substitute, the submediant triad, to make a deceptive

cadence: V7-vi. The name of this cadence is appropriate, since the drama of this

harmonic “deception” can be striking. (300; emphasis added)

Other labels for musical phenomena reveal a foundation in expectation. Some musical

vocabulary implicitly relies on expectation; consider “tendency tone” and “anticipation.”

Tendency tones require a particular resolution in common-practice tonality, expressing recurring

patterns of dissonance resolution. Anticipations, a term that describes the early arrival of scale degree 1

(usually), anticipate the next harmony. In the realm of harmonic syntax, textbooks frequently

organize the common pre-dominant chords into a hierarchy based on “strength” or how likely a

given chord is to progress to dominant harmony. It is the higher transitional probability from

ii6-V compared to IV-V that makes the ii6 chord a "stronger" pre-dominant than the IV chord. In

a similar vein, the transitional probability between V6/5/V and V is even higher, resulting in an even

"stronger" pre-dominant harmony.

In aural skills classrooms, some teachers advise students taking dictation to rely

strategically on the conscious application of theoretical patterns to guide the listening experience.

For instance, Rogers (1984) suggests that instructors train students to chunk the musical surface

into memorable musical patterns to assist in melodic dictation, where the student learns to expect

patterns taught in the written theory classroom. This “intelligent guessing” allows students to fill

in missing pitches based on theoretical expectations. In their recent aural-skills textbook, Jones

and Shaftel (2009) encourage students to fill in the harmonic content of cadences early in the

listening process because “the harmonies in these measures will be very predictable” (3-10). In

both written theory and aural skills classes, textbooks and instructors regularly appeal to student

expectations—those formed through previous musical experiences and those formed within the

classroom.

Other writings about music regularly draw upon a hypothetical listener’s expectation,

either explicitly or implicitly. As an example, Sarver (2010) explores how chromatic passages

interact with prolongational processes in works by Richard Strauss. Her analysis of the chromatic

passage in mm. 34–40 of “Säusle, liebe Myrthe” makes explicit use of expectation.

The digression leads to a cadential six-four in E♭ minor, which establishes the

expectation for local closure in E♭ in the measures that follow. The illusory

cadential six-four, however, is thwarted in m. 39 by an upward chromatic shift

that leads to a surprising cadence in E major. (83; emphasis added)

For this short passage, Sarver draws on conventional tonal expectations to explain the “surprising

cadence” that follows when expectations for closure are “thwarted.” Denial of closural

expectations set up by a listener’s extensive familiarity with tonal syntax is integral to Sarver’s

analytical methodology and ensuing narrative.

Forrest’s 2010 article “Prolongation in the Choral Music by Benjamin Britten” implies a

slightly different view of expectation. Exploring the means by which a musical entity might be

prolonged in a non-functional, yet triadic, musical style, Forrest makes the case for surface-level

triads prolonging symmetrical middle-ground interval cycles, which in turn promote pitch

centricity. Central to this discussion of musical expectation is his analysis of the fifth movement

of Britten’s Ad majorem Dei gloriam, entitled “O Deus, Ego Amo Te.” In his analysis, Forrest

shows how the first two sections of the movement establish the precedent for unbroken interval

cycles, setting up the implicit expectation that the final two sections of the piece will also include

complete interval cycles. When an ic3 cycle begins in the third phrase (starting with B major and

moving to D major), he suggests that, “This incomplete cycle creates a strong expectation of F

major, the next step in the cycle” (21; emphasis added). However, this projected expectation is

not immediately satisfied: the ic3 cycle is temporarily suspended with a strong arrival on E♭ in

the last section of the movement. The concluding phrase of the movement finally resumes the

interval cycle, arriving on an F-major chord, which itself is prolonged by a complete ic3 cycle.

To this effect, Forrest states, “Both voices then proceed through their familiar minor-third cycle

to cadence ultimately on the pitch classes which began the piece, thereby completing the

interrupted cycle…” (22). While the expectation for a complete ic3 cycle may not be shared by

as many musicians as are expectations for syntactical tonal language, the expectation for

continuity, especially continuity on deeper structural levels, is shared among music theorists.

This expectation for continuity is built into many of our theoretic systems, especially

theoretical narratives that rely on organicism or self-similarity. In Schenkerian analysis, we

expect to find foreground musical structures in the middleground, and in transformational theory

we expect transformations to relate sound objects to each other. Further, a Reti-style analysis

would encourage listeners to find “homogeneity both between the movements and between the

parts of one movement” within a multi-movement composition (Reti 1951 [reprinted 1978], 5).

In short, any theoretical system creates expectations for how the structure of the music can be

experienced or explained.

The concept of expectation pervades the entire discipline of music theory—its pedagogy,

analytical discourse, and theoretical systems. These expectations need not be empirically

grounded: while empirical evidence is necessary for cognition studies, music analysis is


interpretative. An analyst draws upon explicit and implicit expectations (formed through

previous musical engagements) and methodological choices to create an individual

interpretation. However, as we saw with closure, the ubiquity of expectation in this discipline

speaks to the general human experience of musical engagement. While listeners might not be

able to verbalize their expectations, they have definite opinions regarding how music should go in

a particular style.

Types of Expectation and Schema

Different experiences of expectation emerge from this brief survey of music theory

literature. Music theory pedagogy and Sarver’s thwarted expectation for phrase closure, for

example, imply that listeners have generalized expectations regarding how harmonic syntax

should proceed and how phrases should progress in common-practice tonality. In contrast,

Forrest’s analytical expectation of ic3 cycle completion is based on prior events in that particular

composition.16

This latter experience could also be considered a conscious expectation: an expectation dictated by a theoretical system or other musical knowledge. Expectations stemming from previous knowledge of a particular composition capture yet another type of expectation. These different experiences of expectation have led scholars, such as Bharucha,

Huron, and Margulis, to categorize various types of expectation.

Schematic expectations represent broadly enculturated patterns of events. According to

Bharucha and Stoeckig (1987), these automatically formed expectations generalize musical

patterns from a large musical corpus. Such patterns range from the consistent hypermeter and

harmonic syntax of the Classical style to the timbre and riffs associated with punk music. I want

to emphasize that these are generalized expectations: although a listener may have specific

expectations for ensuing events (for instance, a V4/2 chord in a Mozart piano sonata will lead a listener specifically to expect a I6 chord), they are not based on knowledge of that particular

work. Rather, these expectations are formed gradually by listening to many exemplars of a

particular musical style and accumulating knowledge of their recurring patterns.

16. There is no clear boundary between piece-specific expectations and general expectations. Someone familiar with a large corpus of Britten’s works might form more general expectations for interval cycles, just as someone familiar with tonal music has general expectations for harmonic syntax.

In contrast, veridical expectations are formed by specific knowledge about the sequence

of events in a single composition. Both Bharucha and Stoeckig (1987) and Huron (2006) use

these two types of expectation to explain the musical surprise associated with a deceptive

cadence (V-vi) in a well-known piece of music. Schematic expectations guide listeners to

anticipate a tonic harmony following a V chord (the transitional probability for the progression

V-I is higher than that of any other chordal succession), even if they veridically expect a vi chord

based on previous experience with this particular composition. This violation of schematic

expectations results in a feeling of “deception” even when a listener expects the surprising

harmony.
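The transitional probabilities invoked in this explanation can be made concrete with a toy corpus. The progressions below are invented for illustration (no claim is made about real corpus frequencies), but they show how aggregated first-order statistics favor V–I even when one piece veridically continues V–vi:

```python
# A toy sketch of first-order (transitional) probabilities over an invented
# corpus of Roman-numeral progressions.
from collections import Counter, defaultdict

corpus = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "vi", "IV", "V", "I"],
    ["I", "IV", "V", "vi", "ii", "V", "I"],  # contains one deceptive motion, V-vi
]

transitions = defaultdict(Counter)
for progression in corpus:
    for a, b in zip(progression, progression[1:]):
        transitions[a][b] += 1

total = sum(transitions["V"].values())
for chord, count in transitions["V"].most_common():
    print(f"P({chord} | V) = {count}/{total} = {count / total:.2f}")
# P(I | V) = 4/5 = 0.80
# P(vi | V) = 1/5 = 0.20
```

A listener whose traces resemble this corpus schematically expects I after V; the single V–vi continuation remains the weaker, piece-specific prediction.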

Put another way, general schematic expectations derive from learned categories of

musical experience. Both psychologists and music theorists use the term schema to describe

these learned categories. Like “closure,” “schema” is an often used but seldom defined term.

Even among musicians, “schema” encompasses different shades of meaning, due in part to

disciplinary differences (whether the focus of the study is psychological or musicological). Three

of the most common uses of the word are exemplified in the writings of Meyer, Gjerdingen, and

Huron. (For a comprehensive and more nuanced discussion of “schema,” see Byros 2009,

especially chapter 5, part 1.)

In Explaining Music (1973), Meyer posits that melodies in Western music derive from a

limited set of melodic processes. Melodic processes represent basic archetypes, his term for “an

innate or universally valid schemata” (Gjerdingen 1988, 7). Two archetypes that Meyer

subsequently tested with Rosner are the gap-fill and changing-note archetypes. A gap-fill

archetype consists of an initial upward melodic leap subsequently filled in by a stepwise descent,

while the changing-note archetype comprises two melodic dyads, the first leading away from the tonic triad, the second leading back to the tonic triad (for instance, 1–7 . . . 4–3). Rosner and

Meyer (1982) found that listeners could abstract the archetype from musical exemplars of each

category and then, in a forced-choice paradigm, could identify the archetype present in novel

musical clips. A later study (Rosner and Meyer 1986) expanded this work and found that these

archetypes also influenced similarity judgments between musical excerpts.17

17. In a more recent article, Paul von Hippel (2000) questions the perceptual validity of these archetypes by re-examining the results from the 1982 and 1986 studies. Von Hippel concludes that gap-fill does not influence melodic shape, nor does it influence the classification of melodies to the extent that Rosner and Meyer suggest.

Gjerdingen’s (1988) work on schema builds upon the foundation laid by Meyer,

especially Meyer’s changing-note archetype. Gjerdingen limits his study to schemata that

generalize a musical event-sequence (notes and rhythms notated in the score). He differentiates

scripts, which outline an event sequence, from plans, which contain information regarding

intentionality rather than implying a particular series of events. Relating these two types of

schemata to Meyer’s archetypes, a changing-note archetype is an example of a script, while a

gap-fill archetype would be a plan. This distinction can account for style change; for instance,

according to Gjerdingen, the eighteenth-century’s scripted phrase construction evolved into the

nineteenth-century’s plan-like phrase. Although my own discussion of schematic expectations

will not distinguish scripts from plans, it is important to recognize that schemata can vary in their

specificity.

Huron offers the most inclusive definition of schema, “a mental preconception of the

habitual course of events” (2006, 419). Huron likens schemata to semantic categories, where

“schemas are generalizations formed by encountering many exemplars. Our most established

schemas reflect the most commonly experienced patterns” (225). Without schemata (which guide

schematic expectations) it would be impossible to have any expectations for a novel work;

listeners could only have expectations for a work after listening to it. Schemata also aid in

encoding and remembering music; for instance, pitches presented in a tonal context are more

easily remembered than those presented in a non-tonal context (see the discussion in Hébert,

Peretz, and Gagnon 1995, 194). Further support for the existence of schemata comes from

instances in which these broad generalizations create an incorrect expectation or musical

memory, as Huron discusses (2006, 210–16). For instance, a schema could be overly general

(e.g., not providing specific enough expectations) or misapplied (e.g., approaching Non-Western

music with Western expectations). Since schemata aid in remembering music, listeners tend to

misremember an atypical musical pattern in a way that conforms to a more common schema.

This brief overview shows that schemata can range in specificity, from less specific

expectations (pitch proximity, stylistic timbres, and behavior of scale degrees in the major mode)

to more specific expectations (a particular chord succession). The narrower understandings of “schema” posited by Meyer (universal archetypes) and Gjerdingen (style-specific harmonic/melodic progressions) are easily subsumed within Huron’s broad definition of “schema.” Margulis (2005) addresses this range of schema types, positing that schematic

expectation itself encompasses more than one type of expectation.

Specifically, schematic expectations inhabit a continuum from relatively deep to

relatively shallow, where depth relates to availability for direct access (from little to

much availability), susceptibility to change through exposure (from little to much

susceptibility), and scope of application (from more universal to more limited). Examples

of increasingly shallow schematic expectations might be: expectations for closure;

expectations for cadential closure in tonal music; expectations for common cadence types

in music from the classical period; and expectations for common cadence figures in the

music of Mozart, where these expectations are increasingly available for access,

increasingly susceptible to change through exposure to new pieces within the relevant

repertoire, and increasingly limited in scope. (666)

I agree that schematic expectations exist along a continuum, but, as noted by authors who

specifically discuss deeply schematic expectations, there seems to be a question about the extent

to which music alone informs the creation of these expectations. Expectations for pitch proximity

and melodic regression transcend musical culture (for the most part), suggesting a

perceptual preference for small melodic intervals and post-skip reversals. Indeed, many of these

expectations do not apply just to musical stimuli, but adhere to the broad perceptual laws set

forth by Gestalt theories of perception. Instead of teetering dangerously close to a chicken-or-egg

question—which came first, a preference for small melodic intervals (evident in musical

composition), or small melodic intervals in musical composition (informing a preference for

these intervals)—I simply posit that these deeply schematic expectations are different from other

types of schematic expectation because they are not distinctive to music and arise from general

perceptual processes. Music-specific schematic expectations are then derived solely from

musical experience.18 Both types of expectation are applicable to music, but only the latter is solely applicable to music.

18. Musical experience here is understood in the broadest sense: listening to music, performing music, bodily engaging music, and understanding the limits the body may place on performance can all contribute to these music-specific expectations.

Figure 3.1 shows this continuum between deep schematic expectations (Margulis: “deeply schematic expectations”) and surface schematic expectations (Margulis: “shallowly schematic expectations”). The difference in shading indicates that the deepest of the deep expectations transcend musical culture and are cross-modal; however, there is no clear boundary between these expectations and ones that are unquestionably influenced by a particular musical culture. As Narmour explains (1990), these cross-modal expectations are formed through a

bottom-up cognitive system (consisting of Gestalt principles), while the music-specific

expectations are formed through a top-down cognitive system. Even so, these cross-modal

expectations are evident as statistical regularities.

Pearce and Wiggins (2006) present another perspective on the creation of these

regularities, stating “patterns of expectation that do not vary between musical styles are

accounted for in terms of simple regularities in music whose ubiquity may be related to the

constraints of physical performance” (378). Whether these statistical regularities are determined

by performance or perceptual limitations, and whether such expectations are formed through

statistical learning or are innate, it remains that listeners expect continuity of sound (in terms of

pitch proximity, location, timbre, etc.). Regardless of its origins, continuity is the cross-modal

bedrock on which other expectations are constructed.

Figure 3.1: Continuum of Expectations

Figure 3.2 shows the continuum again with possible schemata located along the right

side. The expectations on this continuum are implicit, evident in statistical regularities in the


music. Expectations for pitch proximity and melodic regression (Huron 2006) are considered

deep expectations because they are the most widely applicable, providing generalized

expectations. Style-specific schemata can range from general tonal and rhythmic expectations to

more specific expectations for a particular harmonic progression. Within a composer’s oeuvre,

his or her characteristic fingerprint may result in statistical regularities that differentiate these

works from those of other composers writing within the same style. In general, a greater quantity

of compositions informs the creation of deep schematic expectations while a smaller quantity of

compositions informs the creation of surface schematic expectations.

Figure 3.2: Continuum of Expectations with Schemata


Empirical research addressing expectation has predominantly focused on schematic

expectations. For instance, Schellenberg (1996, 1997) found empirical support for aspects of

Narmour’s Implication-Realization theory, which makes relatively specific predictions for deep

melodic expectation, particularly pitch proximity and pitch-reversal. Many authors have found

resounding empirical support for expectations of pitch proximity (Shepard 1964; Deutsch 1991;

Aarden 2003) using a variety of experimental methodologies and stimuli. Despite overwhelming

evidence that pitch proximity is preferred, the expectation for pitch proximity can be influenced by other expectations. In their probe-tone study examining listener preference for tones at the ends of musical phrases, Hébert, Peretz, and Gagnon (1995) found that scale degrees predicted the results better than did pitch proximity alone. So, while pitch proximity represents a

deep schematic expectation, more accessible surface-level expectations (such as the tonal

system) can exert greater influence on listener expectations.

Surface schematic expectations, especially those pertaining to expectations within the

tonal system, have been extensively explored. Along with Hébert, Peretz, and Gagnon (1995),

who focused on melodic phrase endings, other authors have examined more general melodic

expectations (Carlsen 1981), harmonic expectations (Bharucha and Krumhansl 1983; Bharucha

and Stoeckig 1986), and a combination of both (Schmuckler 1989). Other studies have examined

types of musical expectation beyond a tonal context, and they still confirm the existence of

expectations that are not work-specific, but applicable to a wider body of music (Cuddy and

Lunney 1995). Finally, a series of cross-cultural expectation studies illustrates how these surface

expectations are formed through previous exposure (Krumhansl et al. 1999; Krumhansl et al.

2000).

Dynamic expectations, which Huron also discusses, exploit a listener’s short-term

memory to form predictions of likely future events within a musical composition while it is

being heard. Dynamic expectations are so named because they are relatively volatile compared to

schematic and veridical expectations, and arise from brief exposures to a stimulus. Huron states,

“as the events of a musical work unfold, the work itself engenders expectations that influence

how the remainder of the work is experienced” (227). Adaptive expectations of this sort have

been explored in empirical studies demonstrating that listeners unfamiliar with a particular style

can extract statistical regularities after a short exposure period. Kessler, Hansen, and Shepard (1984) performed a cross-cultural study using a probe-tone method to explore schemata based on

tonal structure. Their Balinese and American participants listened to three melodies, one based

on the Western scale and two based on Balinese scales (Pelog and Slendro), then rated the

goodness of fit between the material they had just heard and a probe tone. In general, the results showed that pre-existing schemata based on first-order probabilities influenced ratings by

enculturated listeners, while ratings by naïve listeners were based on pitch frequency (inclusional

probabilities). Even listeners completely unfamiliar with the style were able to extrapolate

statistical regularities from the given music to form a set of expectations.
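The two kinds of probability at issue here, inclusional (how often a pitch occurs at all) and transitional (how often one pitch follows another), can be separated in a few lines. The melody below is hypothetical, chosen only to keep the two computations distinct:

```python
# A toy contrast between zeroth-order (inclusional) and first-order
# (transitional) probabilities over an invented melody.
from collections import Counter, defaultdict

melody = ["C", "D", "E", "C", "G", "E", "C", "D", "C"]

inclusional = Counter(melody)        # raw pitch frequency
transitional = defaultdict(Counter)  # what follows what
for a, b in zip(melody, melody[1:]):
    transitional[a][b] += 1

print(inclusional["C"] / len(melody))                            # P(C) overall
print(transitional["C"]["D"] / sum(transitional["C"].values()))  # P(D | C)
```

Naïve listeners in the Kessler study behaved as if tracking the first quantity; enculturated listeners behaved as if tracking the second.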

These three types of expectation (schematic, veridical, and dynamic) are very much

related, as illustrated by Ockelford’s (2006) zygonic model of musical expectation. His model

shows how implications (and hence expectations) can arise while listening to a musical

composition. Ockelford differentiates between expectations within a musical segment and those

between musical segments. Within a segment, there are implications of pitch proximity (we tend

to expect small intervals), while expectations between segments are formed through four

different experiences:

1) other material or materials occurring within the same hearing of the same performance of

the same piece;

2) a different hearing or hearings of the same performance of the same piece (in the case of

recorded music);

3) a hearing or hearings of a different performance or performances of the same piece; or

4) a hearing or hearings of a performance or performances of a different piece or pieces.

(110)

The first expectation refers to dynamic expectations, which are molded by the ongoing

composition. The second and third experiences result in piece-specific expectations, or veridical

expectations, while the fourth results in generalized learning, influencing schematic expectations.

Ockelford relates the interaction of structures within a segment (which he calls within-

group structures) and the interaction of structures between two different segments (called

between-group structures). “Previous structures,” which inform schematic and veridical

expectations, are stored in long-term memory while “current structures” are encoded in short-term memory. Previous structures form between-group expectations and can also be a general or

specific indication of future events. Dynamic expectations guide between-group expectations,


while pitch proximity forms within-group expectations.19

As a listener hears a composition

multiple times, more specific expectations are formed, but these expectations are always

contextualized by schematic expectations. Although these expectations seem categorically

discrete, all three operate concurrently, allowing a listener to experience musical expectations in

unfamiliar music as well as thwarted expectations in well-known music.

According to Huron and Ockelford, the differences between schematic and veridical

expectations arise from the way in which the information is encoded in long-term memory;

however, the difference between schematic and dynamic expectations is less clear. First, both

types of expectation are formed through a common learning mechanism: statistical learning.

Second, assumptions behind what is valued in dynamic expectations seem to be rooted in

schematic expectations. Expectations for continuity and repetition are deep schematic

expectations, but the specific musical elements to be repeated or continued are piece-specific. As

discussed earlier, Forrest’s analytical narrative implies that the expectation for interval cycle

completion is formed as Britten’s composition progresses. Given that dynamic expectations are

considered implicit, and this particular pattern could be captured by transitional probabilities, a

listener could indeed form an unconscious expectation for interval-cycle completion. However,

the foundation for this expectation would be the listener’s schematic expectation for repetition in

a work, blurring the line between these two types of expectation. Furthermore, since Britten uses

interval cycles regularly in his works, a listener could unconsciously form a general schema for

this compositional characteristic after exposure to many exemplars.20

Even though dynamic

expectations may sometimes be difficult to distinguish from schematic expectations, the term

remains useful because the concepts behind dynamic expectations strongly influence music

analytical discourse.

Ockelford emphasizes that the expectations in his model are not explicit and are formed

unconsciously; however, there are times when expectations rise to the surface of consciousness.

19. Ockelford distinguishes pitch proximity from other forms of schematic expectation, much as my own representation of the continuum distinguishes various types of schematic expectation. Ultimately, I still maintain (along with Margulis [2005] and Huron [2006]) that pitch proximity is a deeply schematic expectation.

20. Whether the completion of an interval cycle can be an implicit expectation is debatable. It is more likely that conscious knowledge of Britten’s compositional preference for interval cycles shapes a listener’s hearing of a work. In any case, Forrest uses the language of dynamic expectation to shape his narrative.

A knowledgeable listener hearing the first movement of a Mozart piano sonata will have specific

formal and structural expectations that, assuming some training and technical vocabulary, could

be explicitly articulated (e.g., the second theme will probably have a lyrical character, and the

recapitulation will probably be preceded by a prolonged dominant harmony). Conscious

expectations “arise from conscious reflection and conscious prediction” (Huron 2006, 235),

bringing us full circle back to the role of expectation in music theory. Theoretical systems ask us to carry another set of expectations into our listening, expectations that can enhance our musical experience. Many music-theoretical systems invoke the language of expectation, but we should

be careful not to confuse conscious expectations reflecting implicit expectations with those that

arise solely from abstract theories. I do not presume that abstract theoretical expectations are

invalid, but it is important—given the pervasive references to expectation in our scholarship and

teaching—to recognize the distinctive varieties of expectation.

Expectation and Memory: An Alternative View

While these different types of expectation seem to capture our experience with music, the

underlying assumption that they reside in different memory structures is problematic. Huron

(2006) posits two different types of memory guiding schematic and veridical expectations.

Knowledge of individual pieces is stored in episodic memory (also known as autobiographical

memory), while auditory generalizations are stored in semantic memory. Huron notes that this

distinction is problematic for two reasons (225). First, memories of familiar works may lack biographical episodic content: although we can remember a composition along with the context in which it was heard, we are also capable of recalling a composition without explicitly recalling that context. Second, all auditory generalizations began as single exemplars of a recurring pattern, suggesting that all semantic memories began as a large collection of episodic information.

Rather than positing two distinct systems, Hintzman offers a multiple-trace memory model that can account both for schema abstraction and for veridical expectations. In Hintzman’s

multiple-trace memory model (1986, 1988, 2010), each experience is recorded in long-term

memory as a separate memory trace—in contrast with models where subsequent exposures to a

given stimulus strengthen an existing memory trace for that stimulus. Hintzman suggests that

schema abstraction of everyday concepts (like “chair” or “table”) is determined by a person’s

exposures to many exemplars of a category, each exposure laying down a memory trace. The

memory traces encode various features of each exemplar, such as the context in which the


exemplar was experienced, along with other sensory characteristics. Abstract concepts are then

derived from this pool of episodic traces. When a retrieval cue interacts with all of these traces

simultaneously, it activates traces according to their similarity with the cue. Traces that are more

similar to the cue are more strongly activated than other traces, and the summed content of these

activations represents the information retrieved from memory (Hintzman 1986). This concept of

memory can also reflect the results of statistical learning: the number of activated traces informs

inclusional probabilities, while the number of activated traces involving a particular event

succession informs transitional probabilities.
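Hintzman’s retrieval mechanism can be sketched in a few lines. The code below is a loose simplification of his MINERVA 2 simulations; the random feature vectors stand in for encoded experiences and are not drawn from any real stimuli:

```python
# A minimal multiple-trace sketch (after Hintzman's MINERVA 2, simplified).
# Every experience is a separate trace; a cue activates all traces in parallel,
# and the activation-weighted sum of their content forms the retrieved "echo".
import random

random.seed(0)
N_TRACES, N_FEATURES = 100, 20
traces = [[random.choice([-1, 0, 1]) for _ in range(N_FEATURES)]
          for _ in range(N_TRACES)]

def retrieve(cue, traces):
    n = sum(1 for f in cue if f != 0)          # features present in the cue
    echo = [0.0] * len(cue)
    for trace in traces:
        similarity = sum(c * t for c, t in zip(cue, trace)) / n
        activation = similarity ** 3           # cubing sharpens similar traces' influence
        for i, feature in enumerate(trace):
            echo[i] += activation * feature    # content weighted by activation
    return echo

cue = traces[0]          # cue memory with a previously stored experience
echo = retrieve(cue, traces)
# The echo's sign pattern largely reproduces the cued trace's features.
```

Because every trace contributes in proportion to its similarity, the same pool of episodes yields both generalization (many weakly activated traces) and specific recall (one strongly activated trace), which is precisely the property exploited in the discussion that follows.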

This memory model can also be applied to music cognition. Although Hintzman has not

explored the creation and content of memory traces for temporal experiences, if each musical

experience results in a separate memory trace, then this model can account for both veridical and

schematic expectations without relying on two separate memory systems.21

When a person

listens to music, each musical trace is activated in parallel. Traces that are most similar to the

current auditory input are activated more strongly and will have the greatest impact on listener

expectations; less similar traces, while present, will influence expectations to a lesser degree. In

this model, generalized schematic expectations are based on the activation of memory traces

from many different pieces of music.

The range of specificity for expectation depends on the degree to which these traces share

the same subsequent events. A large number of memory traces lead us to expect pitch proximity,

but these traces do not imply the same pitch or scale degree because the context of each trace is

so different. In contrast, harmonic progressions conforming to tonal syntax may activate fewer

memory traces, but these traces will overwhelmingly involve a tonic chord following a dominant

harmony, creating a more specific expectation. Memory traces for a particular composition

provide even more specific expectations, of course. Because all these traces respond in parallel, a

listener can have several different expectations for a single piece of music.

21. The creation of memory traces would probably be influenced by the way in which a listener interacts with music. A listener’s ability to use language to label musical events, the extent to which a listener is paying attention to the music, and how the listener is parsing the musical surface would probably influence the creation and content of the memory traces. Also, the way in which a memory trace is activated would differ from Hintzman’s original conception because listening to music is a temporal experience.

Once again, consider a deceptive cadence, which, as previously discussed, can be surprising even in a well-known composition. When a listener hears a dominant chord, all

memory traces containing an experience of this harmony are activated, including the memory

trace(s) for the work itself. In the vast majority of these traces, the dominant chord is followed by

a tonic harmony, eliciting a strong and relatively specific expectation for tonic. This expectation

is in direct conflict with the even more specific expectation for the submediant chord that stems

from the listener’s experience with this particular piece.

Increased exposure to a particular composition can lessen the effect of a surprising

musical feature, like the deceptive cadence, and Hintzman’s model can account for this type of

experience. Consider, for example, the opening phrase of “America” (“My Country ’tis of Thee”;

Beethoven’s arrangement of the identical “God Save the King” is shown in Example 3.1). This

six-measure phrase contains a deceptive resolution in the fourth measure. The abnormal length of

the phrase (six measures instead of the more typical four) and the deceptive motion violate

schematic expectations and should therefore elicit surprise. However, “America” is so familiar in

our culture that listeners have multiple memory traces for this song. The sheer number of traces

correctly predicting the features of the phrase and the greater activation of these traces based on

their similarity to the incoming input lessens the influence of the generalized expectations based

on a listener’s cumulative musical experience. Because this model relies on specificity of

expectation as well as a listener’s cumulative musical experience, it can account for the

simultaneous occurrence of different expectations as well as a listener’s changing expectations

with increased exposure.

Example 3.1: Beethoven, “God Save the King,” WoO 78, mm. 1–6

Hintzman’s conception of memory allows us to consider two distinct factors influencing

our expectations: the specificity of an expectation and the number of times a listener has been

exposed to a particular pattern. Figure 3.2 depicted both generalized expectations and specific

expectations for particular pieces (that is, both schematic and veridical expectations). The

specificity and quantity of memory traces determine the strength of an expectation, where higher specificity and greater trace quantity result in stronger expectations. For instance, tonal

syntax makes highly specific predictions about the harmonic and melodic content of a

composition, and the considerable exposure to music conforming to these norms creates strong

expectations in Western listeners.

An Expectation-based Model of Closure

The concept of expectation can also provide insight into the listener’s experience of

musical closure. This idea is nothing new; recall Meyer’s (1956) statement about closure: “A

stimulus series which develops no process, awakens no tendencies, will . . . always appear to be

incomplete” (139). Whether expectations are implicit or explicit, Meyer is quite adamant that a

listener must anticipate the point at which a musical segment will terminate in order to

experience a feeling of finality. Recalling the definition of closure from Chapter 1 (the expected

end to a musical segment, resulting in a feeling of finality), closure necessarily depends upon

expectation. The characteristics of closure—completion of a goal-directed process, segmentation

of musical experience, hierarchical construction, and stylistic dependence—describe the

listener’s experience. To capture the various experiences of closure proposed by this model, I

suggest three types of closure: anticipatory closure, arrival closure, and retrospective closure.

These types describe a listener’s phenomenological perception of closure: each draws upon a different set of expectations, resulting in a different feeling of closure.

Segmentation is a prerequisite to experiencing musical closure. After unconsciously

learning the transitional probabilities of various musical domains, a listener is able to segment

the musical surface into meaningful musical units. Some of these segmentations reflect deep

schematic expectations (e.g., expectations for continuity and proximity), while other

segmentations depend on stylistic knowledge. If a style consistently uses a formulaic pattern at

boundary locations, then this pattern itself becomes associated with endings. Consider for a

moment the classical cadence, V-I. Nothing inherently links this harmonic succession with

closure; only through a learned schematic expectation has the authentic cadence—especially the

perfect authentic cadence (PAC)—become synonymous with closure. Because musical

boundaries are formed at points with a lower transitional probability (and hence a higher

prediction error), the uncertainty of subsequent events leads to the closing off of the musical

segment, resulting in a feeling of finality.
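This boundary-finding mechanism can be sketched computationally. The toy example below is illustrative only: the mini-corpus, chord labels, and threshold are invented, and real statistical learning operates over far richer input. It estimates first-order transitional probabilities from a handful of phrase-like chord successions and marks a boundary wherever the next event is improbable, that is, wherever prediction error spikes.

```python
from collections import defaultdict

def transition_probs(sequences):
    """Estimate first-order transitional probabilities P(next | current)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
            for cur, nxts in counts.items()}

def boundaries(sequence, probs, threshold=0.3):
    """Mark a boundary after any event whose continuation is improbable
    (low transitional probability = high prediction error)."""
    return [i for i, (cur, nxt) in enumerate(zip(sequence, sequence[1:]))
            if probs.get(cur, {}).get(nxt, 0.0) < threshold]

# Invented mini-corpus: short phrases that end with the formulaic V-I cadence.
corpus = [
    ["I", "IV", "V", "I"],
    ["I", "ii", "V", "I"],
    ["I", "vi", "IV", "V", "I"],
]
probs = transition_probs(corpus)

# Two phrases heard back to back: the succession across the seam (I -> I)
# never occurs within the learned phrases, so its probability is low.
heard = ["I", "IV", "V", "I", "I", "ii", "V", "I"]
print(boundaries(heard, probs))  # -> [3]: a boundary after the first cadence
```

The same logic would apply to any domain (melodic intervals, durations, timbres) in which segment boundaries coincide with low-probability successions.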


According to Huron, this feeling of finality is “a direct consequence of learned first-order

probabilities” (2006, 167). He associates “closure” or a feeling of “home” with the positive

emotions resulting from a correct prediction based on these transitional probabilities. A listener

misattributes this positive feeling to the sound itself (in the case of a PAC, to the tonic triad).

This understanding of the relationship between expectation and closure works well for tonal

music, but more recent music exhibits far fewer first-order probabilities that transcend an

individual composition. Expectations for twentieth- and twenty-first-century works typically

result from fewer episodic traces (surface schematic expectations) or tend to be overly general

(deep schematic expectations). Specific expectations supported by many memory traces (from the

middle of the continuum) are not present in this repertoire, unlike in tonal music. Without a

strong prediction effect, listeners do not have as strong a sense of finality.

Huron notes that the uncertainty following a cadence also contributes to the sense of

finality. In my model, more specific expectations for the end of a segment lead to a greater

change in the listener’s ability to predict subsequent events. Once again, consider the PAC in the

Classical style. A knowledgeable listener will have strong expectations for the tonic arrival, but

not necessarily for subsequent events, resulting in a large increase in prediction error. Compare

this to a listener who is not fluent in the Classical style and therefore does not strongly expect this

same tonic arrival. Both listeners may have similar uncertainty for events following the cadence,

but the knowledgeable listener will experience a greater decrease in the ability to predict

subsequent events, resulting in a greater feeling of finality for the same passage of music.
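One crude way to quantify this comparison—purely illustrative, with invented probabilities rather than measured ones—is to treat the feeling of finality as the drop from the listener's predictive confidence at the expected arrival to the confidence about what follows it:

```python
def finality(p_arrival, p_after):
    """Toy index of felt finality: the drop from the listener's confidence
    in the cadential arrival to their confidence about what follows it."""
    return p_arrival - p_after

# Invented probabilities for illustration only.
knowledgeable = finality(p_arrival=0.9, p_after=0.2)  # strong PAC schema
unfamiliar    = finality(p_arrival=0.4, p_after=0.2)  # same uncertainty after
print(knowledgeable > unfamiliar)  # -> True: more closure for the fluent listener
```

Both hypothetical listeners face the same uncertainty after the cadence; only the fluent listener's high confidence going into it produces a large drop, matching the claim above.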

Considering that the feeling of finality is the product of a successful prediction (and its

resulting positive valence) followed by a rise in uncertainty for subsequent events, and that

memory traces for music are activated in parallel, the experience of musical closure is directly

related to a listener’s prior experience and the implicit expectations derived from this

experience.[22] The multiple-trace memory model can account for a comparative rise in uncertainty even in well-known pieces, given that general schematic expectations are an amalgamation of all active memory traces for the musical context. While not all traces are activated to the same degree (the ones that share the most features with the incoming sonic input are more strongly activated), the sheer variety of musical content following a typical ending gesture results in a higher prediction error for the next event.[23]

[22] Listeners also experience closure when an ending conforms to conscious expectations. For instance, a listener might expect a segment-concluding figure that occurs early in a composition to return in a significant way, closing another segment later in the work. Also, conscious knowledge of musical structure could lead a listener to anticipate when an ending is likely to occur.

As discussed in Chapter 2, many aspects of closure are style dependent, but broader

schematic expectations allow the experience of closure not only in unfamiliar pieces but in

unfamiliar styles as well. As Huron notes, however, the crossover that occurs with schematic

expectations between styles does not always provide contextually sensitive expectations. More

general expectations, like the expectation that phrases normally relax at the end, can transcend

styles (Hopkins 1990), but whether this expectation is a product of understanding music through

embodied motion (Snyder 2000) or acquired through statistical learning is a question that

warrants more research.

I would argue that the strongest sensation of closure is typically associated with music-

specific schematic expectations (as opposed to cross-domain schematic expectations), which

may account for the common description among music theorists that closure occurs at the end of

a goal-directed process. This feeling of goal direction is often misattributed to the music (or,

more specifically, to an actively unfolding musical process), when in fact the feeling is an

artifact of the listener’s ability to predict, with increasing certainty, subsequent events within a

musical segment. Consider, for instance, a schema that provides a general outline of successive

events (such as Gjerdingen’s do-ti-re-do schema or Byros’s le-sol-fi-sol schema). As each event in

such a schema occurs, the transitional probability for subsequent elements increases in that

context. More specific listener expectations correspond to higher transitional probabilities between

successive elements, resulting in a greater feeling of goal direction and, ultimately, closure.

The continuum of expectation specificity also applies to closure, as depicted in

Figure 3.3. On the deepest level, we expect some sort of musical closure—a feeling of finality at

[23] To explore how traces can be activated to different degrees (as opposed to a binary activation), consider a categorization task. Suppose you see a chair, and for this illustration this chair has five features: “sitability,” four legs, no arms, a back, and a location close to the ground. The memory trace for a different chair that shares all of these features will be more strongly activated than the memory trace for a three-legged barstool, which would only share the “sitability” and no-arms features with the perceived stimulus. While these features are binary in nature, they may be weighted differently in determining the degree of activation of a particular memory trace (in this example, “sitability” is more of a requirement for something to be a chair than whether the chair has arms). In actuality, the mind probably keeps track of many more features than this simple example does, including context. In music, the process may be more complicated due to the temporal nature of music and listener limitations in remembering the preceding musical context.


the end of a music segment or piece. Margulis (2005) agrees that closure is a deep schematic

expectation; it is also a cross-modal expectation (in general, we expect sounds to end eventually,

and, from a Gestalt perspective, we see broken objects as completed wholes), not a music-

specific expectation. This deeply schematic expectation is extremely general, applicable to all

music.

Figure 3.3: Continuum of Expectation: Closural Expectations

Appearing just above these expectations are general expectations for phrase length,

ending gestures, and performative signs of closure, all of which contribute to a basic phrase

shape—usually corresponding to an increase in tension and a subsequent relaxation. These


general expectations are applicable across styles, but musical patterns found in a smaller

collection of musical works (style-specific, genre-specific, or composer-specific patterns) also

shape expectations. Endings that conform to the surface end of the schematic expectation

continuum result in a greater feeling of finality because of the relatively high specificity of the

expectations they engender. With the frequent use of a formulaic cadence in certain styles, a

listener has more exact expectations for the end, resulting in a better ability to predict when

phrases will end and consequently an increased feeling of goal direction. The number of memory

traces supporting a particular progression would also influence the feeling of goal direction,

where an increased number of traces would increase the anticipation of the ending.

The more specific an expectation is, the less applicable it will be to a variety of pieces:

for instance, there are forms of closure specific to a particular period, genre, or even composer.

At the top of the diagram are expectations for closure based on previous encounters with a

particular composition. Even though these work-specific expectations allow a listener to

anticipate the end of a musical segment accurately, the quantity of these specific episodic traces

is dwarfed by a listener’s more general cumulative musical experiences; the feeling of finality

would be less intense than the feeling of finality following signs of closure from the middle of

the continuum.

Because the feeling of finality results from a change in the listener’s ability to predict

subsequent events, particularly specific expectations leading into the end of a segment will produce

a greater subsequent decrease in predictability. For instance, the feeling of closure following the

end of a harmonic schema is stronger than an ending indicated by a long note because the

expectations are more specific (in regards to content and timing), resulting in a larger difference

in a listener’s ability to predict subsequent events. Deeper schematic expectations are too general

to elicit specific expectations for the end of a segment, resulting in a smaller change in prediction

error. At the same time these expectations are unfolding, listeners still carry surface schematic

expectations and perhaps veridical expectations for closure. Similar to the experience of a

deceptive cadence, an ending that conforms to a listener’s veridical expectations for a particular

piece, but that overall denies schematic expectations, could lead to a less intense feeling of

finality because some expectations imply continuation while others imply an ending. The


proportion of continuation and ending implications results in various degrees of closure, creating

a sense of hierarchically nested closes within a piece.

I posit three different types of closure, based on the type of expectations governing the

feeling of finality and the amount of change in the prediction error, which results in closes with

different strengths. Anticipatory closure occurs when a listener can predict when the musical

segment will be completed. Anticipatory closure can be experienced through all the types of

expectation, but, as mentioned earlier, schematic expectations on the surface end of the spectrum

seem to create the greatest feeling of finality. Anticipatory closure is the strongest type of closure

because the expectations leading into the final event are so strong, resulting in a large rise in

prediction error. Veridical and dynamic expectations can also lead a listener to experience

anticipatory closure, even if a composer eschews conventions set forth by schematic

expectations; however, without the combined weight of many exemplars, the experience of

closure may not be as strong.[24]

Arrival closure occurs when a listener experiences finality before the beginning of the

next segment. This is best understood in Narmour’s definition of closure: a musical event that

creates no further implications. Like anticipatory closure, arrival closure depends upon a

listener’s schematic expectations and requires a decrease in the ability to predict subsequent

events, but in arrival closure the listener does not know when a musical segment is going to end

until it actually ends. Sonic features (learned through statistical learning) such as an extended

note or silence signify the ending’s arrival, but the transient increase in prediction error is not as

high as in anticipatory closure because the expectations approaching a particular ending are not

as strong.

Retrospective closure can only barely be classified as closure, since it refers to

experiencing an ending solely because a new beginning has occurred. Meyer would not classify

this as closure at all, although he does suggest that closure occurs prior to a recognized

beginning, such as the beginning of a parallel consequent phrase (Meyer 1973, 86). In my view,

rather than experiencing finality because the next event is more difficult to predict than previous events, we instead experience an ending because the current event defied expectations for continuation implied by the previous event. Retrospective closure can be characterized as a failure to recognize an ending precisely when it occurs. Deeply schematic expectations are at work here, based mainly on sonic disjunction. While multiple listenings may change a once-retrospective close to an arrival or anticipated close, these veridical expectations will not elicit the same effect as shallow-end schematic expectations.

[24] An evaded (or deceptive) cadence would be an example of denied anticipatory closure. A listener is primed to expect an ending following the dominant harmony, but the chord that follows implies continuation instead. While a listener may experience some finality with this harmony, like the closing off of a subphrase, the feeling is not nearly as strong as it would have been had the expected harmony occurred instead.

Three Analytical Vignettes

This section will briefly examine closure in three songs that represent different musical

styles and carry different sets of schematic expectations: Robert Schumann’s “Widmung” from

Myrthen, Op. 25, No. 1; Anton Webern’s “Der Tag ist vergangen” from Vier Lieder für

Singstimme und Klavier, Op. 12, No. 1; and Aaron Copland’s “The World Feels Dusty” from

Twelve Poems of Emily Dickinson. My analyses, representing the perspective of an experienced

listener, will focus on schematic, dynamic, and conscious expectations; obviously, multiple

repetitions of a single work would also create more specific veridical expectations. I will

highlight goal-directed processes and surface features that most compellingly project closure in

each song, briefly exploring how schematic and dynamic expectations influence closure.

Because the text can contribute to closure through repetition, rhyme scheme, and semantic

meaning, I chose these three texted works to illustrate the role of expectation in the perception of

closure.

Schumann’s “Widmung”

Many authors portray the completion of a tonal process as the main marker of closure;

however, these tonal markers of closure, such as cadences and the completion of the Ursatz,

combine with several musical elements to create a sense of closure. Other markers of ending,

such as increased duration and strong metrical articulation, interact with pitch-based signs of

closure to create a hierarchy of segmentation, ranging from subphrase to phrase to section to the

entire song. While subphrases are not typically considered “closed,” they are usually marked by

musical characteristics that also accompany phrase endings. These elements (such as change of

texture, increase in rhythmic duration, and silence) create perceptual boundaries in the musical

surface because of the discontinuity in the sound. Other surface characteristics (such as a


decrease in tempo, softer dynamics, and falling pitch contour) also signify endings, usually

coinciding with the end of a subphrase or a phrase. All of these features, which reflect common

ending gestures, contribute to the perception of closure.

From a tonal standpoint, “Widmung” is relatively straightforward. It begins in A♭ major,

and the first part of its simple ternary form contains a single phrase. The B section of the ternary

form begins with a common-tone modulation to E major (♭VI). Over the course of this section,

the music slips back to A♭ by reinterpreting the A-major chord in m. 25 as a B♭♭ chord (♭II in A♭

major). The B section sets up the return of the A section with a prolonged dominant, which could

be read from a Schenkerian perspective as the arrival of 2̂ in an interrupted structure. The A'

section begins in m. 30, exactly repeating the music and text from the beginning until m. 35.

Here Schumann tonicizes ii instead of IV, and the lyrics return to the text from the end of the B

section. The tempo broadens and the vocal line rhetorically leaps up to F5–E♭5 (covering the structural 2̂) before falling conclusively to 1̂ on the strong beat of m. 39, achieving tonal closure

(from a Schenkerian perspective, completing the Ursatz).

The text (provided along with an English translation in Table 3.1) is a setting of

F. Rückert’s poem. In the first half of the poem, set in the A section of Schumann’s song, the

author acknowledges that his love is his entire life. The anaphora preceding each characteristic of

his lover (du meine) is contrasted with the anaphora “du bist” in the second half of the poem (set

in Schumann’s B section). In this second half, the author describes the transformative power of

his lover’s affection. The twelve-line poem consists of rhymed couplets, where both lines within

each couplet have the same number of syllables. The poem alternates between couplets of eight

and couplets of nine syllables until the last couplet, which breaks the pattern and substitutes an

eight-syllable couplet in place of the expected nine-syllable couplet. Rückert’s consistent syllabic

structure and rhyme scheme within each couplet provide a listener with very specific

expectations for the ending of each couplet.

The A section has only one true ending, as dictated by tonal schemata: the perfect

authentic cadence in m. 13 (see Example 3.2) completes the goal-directed harmonic process

initiated at the beginning of the song. Other musical markers, such as the middleground

descending step progression and the relatively long vocal note followed by a rest (coupled with

the diminuendo and ritardando in the piano) strengthen the feeling of closure at this point. These


musical features represent deep schematic expectations, which apply to a larger variety of works

than do the more surface schema for tonal syntax. Other points in the A section (such as in

mm. 5 and 10) also include a descending pitch contour, a longer rhythmic duration, and

diminuendos to mark endings, but these points do not evoke the same feeling of finality as does

the cadence in m. 13. The more surface schemata of tonal process allow a listener to predict a

more specific type of ending (a tonic chord on a strong beat) than do more general schematic

signs of ending (or, to use Meyer’s term, “secondary parameters”).

Table 3.1: Text and Translation of “Widmung” Poem by F. Rückert [trans. by Reinhard 1989, 128-9]

The number of syllables in each line is noted to the left of the text

8 Du meine Seele, du mein Herz, You my soul, you my heart,

8 Du meine Wonn', o du mein Schmerz, You my bliss, oh you my grief,

9 Du meine Welt, in der ich lebe, You my world in which I live,

9 Mein Himmel du, darin ich schwebe, My heaven, you, therein I soar,

8 O du mein Grab, in das hinab Oh you my grave down into

8 Ich ewig meinen Kummer gab! I eternally gave my sorrow!

9 Du bist die Ruh, du bist der Frieden, You are repose, you are peace,

9 Du bist vom Himmel mir beschieden. You are bestowed to me by heaven.

8 Daß du mich liebst, macht mich mir wert, Your love for me makes me worthy to myself,

8 Dein Blick hat mich vor mir verklärt, Your gaze has transfigured me within my own eyes,

8 Du hebst mich liebend über mich, You lift me above myself with your love,

8 Mein guter Geist, mein beßres Ich! My good spirit, my better self!

A listener could segment the first phrase into smaller units; as Meyer (1973) explains, a

hierarchy of closure emerges based on the proportion of musical parameters projecting an ending

to the parameters projecting continuation. A listener’s previous musical encounters will

determine whether the features of Schumann’s first phrase project closure or continuation.

Because harmony is one of the best predictors of endings in this style, an experienced listener

will presumably first sense closure in m. 13. The subphrase-concluding harmonies prior to this

point imply continuation because an experienced listener will have specific expectations for

subsequent harmonies. A subphrase in this analysis consists of a grouping on a lower hierarchic

level than the harmonically driven phrase.[25]

[25] While in this analysis subphrases are marked by an initiating and concluding gesture occurring on the same hierarchical level, subphrases in other pieces could be formed on the basis of discontinuity within the phrase. Further, in this analysis there are different levels of subphrases: larger subphrases can be divided into smaller subphrases.

The first two subphrases (mm. 1–3 and mm. 4–5; refer to Example 3.2) are similarly

delineated by rests and end with a relatively long note on a strong beat. The first subphrase

ascends to C5 over a prolonged tonic harmony, and a change of harmony initiates the second

subphrase, which reaches beyond the registral boundary established by the first subphrase before

stepping down to 2̂ (harmonized with the borrowed iiø6/5 chord) in m. 5. Both the first and second

subphrases clearly set up the expectation that more music will follow, but in the local context

they group together to form a higher hierarchical unit. Although musical elements at the end of

the second subphrase also imply continuation (setting up sentential expectations and arriving on

an unstable scale degree and harmony), the descending gesture in mm. 4–5 balances the rising

gesture in mm. 2–3, creating a stronger boundary at m. 5. The text influences my interpretation

as well: the last word’s rhyme confirms the grouping of these subphrases. In other words, I am

able to predict the conclusion of the second subphrase better than the first because of the rhyme

scheme and repeated syllabic pattern.

Example 3.2: Schumann, Myrthen, “Widmung,” Op. 25, No. 1, mm. 1–13. The annotations under the score represent the two levels of subphrase analysis discussed in the text.

Prefix to the opening subphrase


Example 3.2 (continued): Schumann, Myrthen, “Widmung” Op. 25, No. 1, mm. 1–13

Both the subphrase that begins with the V4/2 chord in the second half of m. 5 and the

subphrase that begins with the V4/2/IV chord in the second half of m. 7 use the same harmonic

progression, first in the home key, then tonicizing IV. Even without a rest separating the two

events, the repetition of the harmonic progression and basic melodic outline (2̂–1̂–7̂–2̂), the 4–3 suspension on the strong beat, and the dynamic expectation of subphrase length (established by

the first two subphrases) suggest that the V4/2/IV chord in the second half of m. 7 begins a new

event. Again, these subphrases group together because of their musical similarity; a listener

would have strong expectations for the location of the end of this second subphrase because of

the harmonic and melodic repetition. A dynamic expectation of subphrase length coupled with

the number of syllables present in each line of text also provides fairly specific expectations for

this point of conclusion. The arrival on the pre-dominant harmony with the rhyming final word

of the couplet in m. 9 does initially sound like the conclusion of the subphrase, fulfilling a

listener’s expectation for an ending.

However, the material that immediately follows (end of m. 9–beginning of m. 10) does

not sound like a complete subphrase: it sounds like a stronger ending than the resolution of the

4–3 suspension in m. 9. Despite confounding dynamic expectations for subphrase length, hearing

the arrival on 2̂ in m. 10 as an ending confirms other dynamic expectations. “O du mein Grab”

begins with the same words and melodic material as the end of the second subphrase (mm. 4–5),

producing specific expectations for an ending. One possible reading of the grouping structure is

that this short segment groups with the previous subphrase to prolong the pre-dominant area,

creating a longer subphrase in mm. 8–10, which, when combined with the previous subphrase in

mm. 6–7, essentially replicates the harmonic motion in mm. 2–5. Even though both measures

conclude with a descent down to 2̂, m. 10 sounds more implicative than does m. 5. Perhaps the

premature stop in the middle of a line and the absence of a complete rhyming couplet project

continuation at the surface level, while the expanded pre-dominant chord increases anticipation

for an upcoming cadence at the middleground level. This anticipated completion of a harmonic

schema and the unexpected subphrase length overshadow other signs of ending.

This section’s last subphrase is longer than the previous ones, concluding in m. 13 with a

PAC. Interestingly, Schumann disguises the end of the fifth line, which occurs on the downbeat

of m. 11 (the expected place for the line to end, as implied by the previous subphrase lengths).

The rising melodic contour and short note values make “in das hinab” sound more like a musical

beginning than an ending. The cadential arrival coincides with the conclusion of a rhymed

couplet, and, although the end of the couplet’s first line was not musically articulated, Schumann

takes advantage of the internal rhyme in line five. In this first section, Schumann’s compositional


decisions and the text of the poem influence the perceived grouping structure of the subphrases,

the conclusiveness of the endings, and the sense of closure at the end of the A section.

Even though Schumann concludes the first phrase with a 3̂–2̂–1̂ melodic motion coupled

with a strong harmonic cadential gesture, an experienced listener is unlikely to hear this as the

final close of the piece. There has not been enough music; a listener familiar with Romantic

Lieder would know that a contrasting section, or at least another stanza, usually follows.

Furthermore, the poem itself remains open; the speaker has only listed the characteristics of his

love. Musicians who take an organicist analytical position may further state that the full

implication of the borrowed ♭6̂ has not been fully realized, necessitating more music.

The common-tone modulation to !VI (enharmonically respelled as E major) begins the

song’s next section (see Example 3.3). This section is differentiated from the first not only by

this key change but also by a change in the accompaniment pattern and the longer durations in

the melodic line. After the half cadence (HC) in m. 21, harmonic and melodic patterns from the

first section recur followed by another common-tone modulation back to the home key. Despite a

ritardando and a return of the original accompaniment figure, the return to the original key does

not sound like the third part of a ternary form. Part of the reason is that elements from the A

section gradually emerge from the B section, blurring the boundary from the perspective of tonal

and thematic content. It is not until the HC in m. 29, which concludes the phrase begun in m. 22,

that there is a strong enough anticipated ending to close the B section of this Lied. The rhymed

couplet, the dominant prolongation preceding this point, and the ritardando that occurs in the

previous measure all contribute to the close of this section. However, the very nature of a HC

and, to a lesser extent, the ascending melodic line implies continuation. In his paper at the 2010

meeting of the Society for Music Theory, Poundie Burstein explored this very issue of

continuation at a half cadence, noting that the implicative nature of the V chord sometimes blurs

the distinction between a half-cadential ending and an elided authentic cadence.


Example 3.3: Schumann, “Widmung” from Myrthen, Op. 25, No. 1, mm. 14–29

From a theoretical perspective, a HC can end a phrase, but the phrase is not considered

“closed.” If closure is seen as the completion of a goal-directed process, the most common such process in tonal music is the motion away from tonic and then the motion back towards

tonic. For instance, in his textbook Harmonic Practice in Tonal Music, Gauldin describes half

cadences as open cadences compared to closed authentic cadences (2004, 132). However,

listeners do experience some sense of finality at a HC because of the role of expectation in the

perception of closure. While a HC does not conclude on a “restful” tonic chord, it does signify

the end of a common harmonic paradigm. As previously discussed, listeners misattribute the

feeling of a goal-directed process to the increased prediction success when successive elements

in a script-like schema are confirmed. Since the ending on V can be anticipated, there is a degree

of finality at the arrival of a HC, and yet a HC does not sound as closed as an authentic cadence

because the dominant harmony itself implies continuation. Despite the prevalence of half-

cadential harmonic patterns, there are still many more traces in long-term memory where the V

chord does, in fact, proceed to a tonic harmony.[26]

Because Schumann’s A' section replicates many elements from the beginning, I will not

provide a thorough analysis of this section, but a few words about the structural close and the

codetta are warranted. To heighten the expectation of closure in m. 39, Schumann uses rhetorical

and formal markers of closure. Returning to the last two lines of the poem, Schumann concludes

the song’s narrative by reiterating the transformative power of the poet’s love. The dramatic leap

up to 6̂ before the leap down to the final tonic covers the structural descending step progression.

This rhetorical flourish points towards the approaching cadence, whose finality is further

emphasized by a ritardando.

An experienced listener would know that the song is unlikely to end with this structural

close, because most Lieder conclude with a piano codetta. The passing 4/2 chord following the

cadence implies a continuation, which is realized by the return of the accompaniment figure from the beginning. This codetta divides into two segments (see Example 3.4), each one

concluding with a V-I harmonic motion. Adopting Caplin’s (2005) perspective on phrase and

cadence, these two-bar units do not constitute a phrase: they are merely an external phrase

extension, repeating the cadential gesture from the previous phrase.[27] The last one sounds more final because it lacks the passing 4/2 chord on the third beat of m. 43 and the tonic harmony is sustained as the tempo slows down and the music fades out.[28]

[26] There are instances where the half cadence can sound more conclusive. Most notably, the Phrygian half cadence (typically iv6–V) is a common harmonic formula in Baroque music that may conclude non-final movements in multi-movement works.

[27] According to Caplin, a cadence must end a formal unit. These two-bar units do not end anything; rather, they are extra endings tacked onto the end of the phrase.

Example 3.4: Schumann, “Widmung” from Myrthen, Op. 25, No. 1, mm. 37–44

In this analysis, the completion of harmonic schemata projected the strongest feeling of

closure. Harmonic schemata, by their very nature, are broad generalizations formed by many

exemplars and tend to elicit rather specific expectations, allowing a listener to experience

anticipatory closure at cadential points. Signs of closure such as silence and increased duration

contribute to a work’s grouping structure, but perhaps not its sense of closure (especially at the

level of the subphrase) because the arrival of these features isn’t as strongly anticipated.[29]

28 One can point to the final sonority of this song, an A♭-major triad in second inversion, as evidence that this song does not have a satisfying ending; however, I hear the low A♭2 at the beginning of the measure as the functional bass note for the entire measure. Further, when viewed as a part of the entire song cycle, this opening song contains implicative elements in terms of the overall narrative of the cycle, but in this analysis, I only examined closure within the context of this single song. My analysis of Copland's song "The World Feels Dusty" will consider closure within the context of the entire song cycle.

29 A listener with less experience in this musical style might have quite a different impression of endings. I imagine that sonic disjunctions might exert much more influence in the song's segmentation and in the listener's perception of closure.


Returning to Meyer’s (1973) suggestion that the strength of a particular ending can influence the

perception of form, it follows that the strongest closes conclude the larger sections of the ternary

form, while less strong endings delineate and group the subphrases. Furthermore, sophisticated

listeners may use genre-specific knowledge—the relationship between the piano and voice,

stylistic knowledge about typical Romantic era harmonic language, and conscious knowledge

about prototypical formal construction—to shape expectations, affecting their perception of

closure for this work.30

Webern’s “Der Tag ist vergangen”

In my analysis of closure in “Widmung,” I relied primarily on tonal paradigms. Although

segmentation in post-tonal styles may seem more reliant on differences in the musical surface than it is in tonal music, I believe that schematic knowledge structures still provide a top-down guide to

segmentation. One such structure is an increase in tension followed by a decrease in tension,

marking a complete unit (this is also discussed by Bryden [2001] and Hopkins [1990]). This deep

schematic expectation does not imply a specific type of ending or point of ending, but refers to a

general phrase “shape” based on exposure to many different types of music.31

Both Meyer’s

primary and secondary musical parameters can influence our learned association between

“falling” or “decreased intensity” and our perception of closure. Even in tonal music, musicians

maintain that we experience an increase in harmonic intensity followed by a relaxation at

cadences. According to Huron (2006), this feeling of intensity and relaxation is misattributed: it arises from our strong first-order expectations at the dominant chord, followed by the reward for a correct

prediction at the tonic chord. Some other musical elements (discussed in the previous analysis),

such as falling pitch contour and a decrease in tempo and dynamics, can also contribute to a

sense of lessening intensity.

Violations of the even deeper expectation for continuity in sound also play a role in the

segmentation of music. Abrupt changes in pitch and articulation can segment a work into motivic

units, usually occurring within a phrase, while other changes, such as the intrusion of silence and

30 This interaction between a "bottom-up" construction of form and pre-existing "top-down" knowledge of formal structure will be explored further in Chapter 7.

31 Some authors derive this structure from our embodied experience (Snyder 2000), while others attribute it to the perception of movement by musical forces (Bryden 2001). In either case, statistical generalizations of how musical segments should end are accumulated in long-term memory.


the lengthening of durations, usually occur at the end of a phrase. It is the degree to which these

elements are anticipated, if they are at all, that gives rise to feelings of closure.

Schematic knowledge of phrase shape and discontinuity in the sound influence a

listener’s sense of closure (and therefore formal structure) in Webern’s “Der Tag ist vergangen.”

Dynamic expectations also play a role in establishing which musical parameters will contribute

to the sense of closure. Consider, for instance, the song’s opening measures (reproduced in

Example 3.5), which illustrate the parameters that contribute to the shape of phrases in this work:

dynamics and pitch register, where an increase in dynamic level and pitch height builds musical tension, followed by relaxation as the dynamic level drops and the pitches descend.

This basic shape is expanded in the first vocal phrase, where the vocal line has two

distinct peaks, one in each line of the poem, with the second peak reaching a step higher than the

first. The piano line has a similar shape in mm. 5–6, but it does not align with the vocal shape.

The vocal line is foregrounded in this phrase, due, in part, to the higher register, making the

vocal rest at the end of the phrase more salient than the silence separating gestures in the piano

part. A listener who did not recognize a phrase ending with the arrival of the G4 in m. 6 (arrival

closure) would almost certainly realize the phrase had concluded with the first instance of silence

in the vocal line (retrospective closure).32

If this silence is insufficient, the piano punctuates the

end of the phrase with a two-chord gesture that collapses in pitch space (a span of 56 semitones

contracting to 34 semitones).

The first phrase creates a dynamic expectation for phrase length and basic phrase shape.

The second phrase traverses more pitch space, reaching past the registral boundaries of the first

phrase, but still descends in pitch space at the end of the phrase. The end of the second phrase

sounds more conclusive because its length exactly matches that of the first phrase, confirming a

listener’s expectations for when the phrase should end. Coupled with the rhyme scheme, a

listener can experience arrival closure on “mir”—or even anticipatory closure, if the listener is

paying attention to phrase length.

32 There are elements present in this first phrase that could elicit anticipatory closure: the foreshortening of the contour segments and the syntax of the text.


Webern VIER LIEDER, OP. 12

© 1925 by Universal Edition A.G., Wien; © Renewed; All rights reserved

Used by permission of the European American Music Distributors LLC,

U.S. and Canadian agent for Universal Edition A.G., Wien

Example 3.5: Webern, Vier Lieder, “Der Tag ist vergangen,” Op. 12, No. 1, mm. 1–11

In the first stanza, an increase in the dynamic level and pitch height correlates with an

increase in tension, which is relaxed as the dynamic level and pitch drop, but this pattern does

not hold true at the end. While the dynamics decrease in intensity towards the end (Example 3.6),

the vocal line leaps a diminished octave up to F5. One could argue that the pitch is displaced an

octave and it essentially moves only a half step from the preceding F#, but even aside from this

explanation there are other elements that clearly contribute to the sense of closure here. First, the

dynamic expectations established by the first stanza suggest that the length of the second stanza

will be similar, which it is, and also that it will probably end with a word conforming to the


poem’s rhyme scheme, which it does. The descending perfect fifth interval between the last pitch

of phrase one (G4) and that of phrase two (C4) is inversionally related (in pitch space) to the last

pitches of phrases three and four (B♭4 and F5). This intervallic balance may also contribute to

closure.33

Webern VIER LIEDER, OP. 12

© 1925 by Universal Edition A.G., Wien; © Renewed; All rights reserved

Used by permission of the European American Music Distributors LLC,

U.S. and Canadian agent for Universal Edition A.G., Wien

Example 3.6: Webern, Vier Lieder, “Der Tag ist vergangen,” Op. 12, No. 1, mm. 18–21

Incidentally, ending a song with an upward leap is not uncommon within Webern’s own

oeuvre; for example, the well-known songs “Wie bin ich froh” and “Vorfrühling” end with rising

vocal leaps. While this otherwise unusual interval may be a composer-specific sign of closure,

the small number of examples of this gesture ending a musical segment compared with the vast

number of more “typical” endings may still imply “openness” or “continuation” for even the

most expert of Webern listeners.34

Copland’s “The World Feels Dusty”

In his setting of Emily Dickinson’s poem “The world feels dusty,” Copland manipulates a

listener’s expectation for closure. His consistent use of two-bar groupings and stereotypical tonal

patterns at the beginning of the song creates a set of dynamic expectations that are left unfulfilled

33 An analysis could also imply "latent tonal tendencies" (Kurth 2000) by pointing to allusions to a tonal relationship between these two intervals (suggesting that the first and third phrases end with a kind of HC that is answered by a PAC). While this is a possible hearing, I think other factors discussed in my analysis play a larger role in projecting closure.

34 One could argue that regarding a large ascending leap as strong closure would constitute an attempt to normalize music that is intended to be unusual. The very idea of the text—"Give to the deceased eternal rest"—suggests eternal stasis, not closure at all, so Webern may be manipulating closural expectations in order to explore meaning within the text.


as the song progresses. These unfulfilled expectations for closure might lead the listener to an

understanding of the poem that differs from reading Dickinson’s words without music, and

presumably Copland’s compositional choices reflect his own interpretation of the poetry. Of

course, we might also understand this song differently in the context of the entire song cycle.

The text for this song is taken from the 1929 Bianchi edition of Dickinson's poems, which includes quite a few changes from Dickinson's original poem. For instance, the last two

lines were changed from “And Hybla Balms—Dews of Thessaly, to fetch—” to “Dews of thyself

to fetch and Holy balms.” According to Cherlin (1991), the 1929 edition also removes the

possibility of enjambment by inserting periods at the end of each stanza, thereby limiting the

number of possible readings of this poem.

The dashes at the ends of stanzas equivocate, no less than those internal to quatrains. At

the ends of stanzas dashes avoid the strong sense of closure that the unfortunate editorial

choice of 1929, periods, bring. Hence, “Honors—taste dry—Flags—vex” are separated

by versification, yet connected by belonging to the same short catalogue of things dry,

vexing, things not needed that can crowd out those most needed. (58)

Still, multiple readings of the text are possible. Looking just at the text, this poem can be

understood as leaving behind the dusty “honors” and “flags” of this world and longing for close

companionship while death is embraced. Another interpretation places the poem within the

context of Copland’s song cycle.35

“The world feels dusty” immediately follows the poem “Why

do they shut me out of heaven?” Taken in this context, Baker (2003) argues that the speaker

longs for the “dews” of this world, since she will not receive the “honors” of the next.

At the moment of death, when the good Christian is said to weary of the world and seek

heavenly waters for spiritual relief, Dickinson argues that one thirsts for the dew of this

world, not the next. “We want the dew then Honors taste dry.” The honors that come with

the “privilege” and “victory” of death in anticipation of salvation—to invoke the

metaphors of the nineteenth-century Calvinist—“taste dry” on the tongue of one who has

just learned in song three that she will be denied happiness in the afterlife. Dickinson

looks to the soothing waters of worldly friendship as a “holy balm” that might restore one

to life on earth. (11)

35 Soll and Dorr's (1992) research indicates that Copland intended for the song cycle to be heard as a whole (100). Their subsequent analysis reveals features supporting their cyclic reading, where "the intricate musical and textual materials combine to create a highly organized and unique overall structure" (101).


In either case, Copland’s manipulation of a listener’s expectation for closure can color the

meaning of the text. This analysis will focus on musical expectations Copland creates in the

beginning of the song and what the confirmation or denial of these expectations might mean.

Although diatonic, this song does not project a clear tonal center in the manner of

Schumann’s “Widmung,” but the diatonic (or perhaps very weakly tonal) nature of this piece

creates tonal implications, unlike Webern’s highly chromatic song “Der Tag ist vergangen.”

F# minor seems to be prevalent in the piano part, but the voice emphasizes D major in the first

few measures. This conflict is also present at the end of the first stanza, where the voice clearly

implies B minor (with a skip down a fifth in the vocal part to emphasize pitch-class B), but the piano part does not corroborate this tonal center. Fleeting tonal

implications would activate traces of previous experiences with tonal music, eliciting tonal

expectations.

The piano introduction establishes an expectation for two-measure subphrases with an

emphasis on the second beat of each measure (see Example 3.7). The vocal line continues this

two-measure division in the first two lines of the poem, and both of these subphrases create a

dynamic expectation that registral extremes in the vocal line will correlate with segmentation.

The established subphrase length from the piano introduction projects a boundary after “dusty”

in m. 4, even though the contour ascends to this pitch and the rhythmic syncopation emphasizes

the weak syllable of this word. The stronger boundary in m. 6 concludes a higher-level subphrase

unit. The features contributing to the close in m. 6 include the completion of a thought in the

text, the slightly longer note on the word “die,” and a longer break in the vocal line. The pitch in

m. 6 is higher than in m. 4, confirming a dynamic expectation that a range extreme correlates

with the end of musical segments.

Copland changes the subphrase grouping in the remainder of the first stanza. In m. 7, the

vocal line articulates a one-measure group, with the durational accent beginning on the second

beat of the measure, echoing the sighing piano gesture from the beginning. The piano part

supports this reading: the recurring gesture is transposed up a third for just this measure before

slipping back down to the original pitch level in m. 8. The increased pace is surprising,

differentiating this subphrase from the previous material.


The World Feels Dusty by Aaron Copland, text by Emily Dickinson

© Copyright 1951 by The Aaron Copland Fund for Music, Inc.

Copyright Renewed. Boosey & Hawkes, Inc., Sole Licensee.

Reprinted by Permission.

Example 3.7: Copland, Twelve Poems of Emily Dickinson, “The World Feels Dusty,” mm. 1–2

The last line of text arpeggiates a B-minor triad, using the same syncopated rhythm from

m. 4, and concludes on the lowest vocal pitch sounded thus far (B3). While this creates an arch-

shaped phrase (the register gradually ascends through B4 in m. 4 to the E5 in m. 7 before falling

quickly in the last line), other musical parameters do not follow the pattern of abatement

suggested by Hopkins (1990). The accompaniment at this cadential point subtly changes from a

sighing gesture to a rising step gesture in an inner voice, but perhaps the most obvious

continuational parameters are the increased tempo and dynamics.36

In summary, the first stanza sets up these expectations:

1) Text: four lines in a stanza, abcb rhyme scheme, 5+5+5+4 syllabic pattern

2) Two-measure subphrases (except for the last three measures in the vocal line)

creating balanced larger groups (2+2)

3) Diatonic, but not strongly tonal, melodic line

4) Generally, registral extremes mark endings

5) More specifically, phrases end with a descending gesture, concluding with the pitch B3

However, in the second stanza, many of these dynamic expectations are altered to match

the changes in the text. The rhyme scheme changes to abbc and the syllabic pattern slightly shifts

to 6+4+5+4. These changes, along with the descending contour in mm. 12–13 and the similar

melodic gestures in mm. 14 and 15, lead me to group the subphrases a bit differently from the

first phrase. I hear a 1+2+1 grouping instead of the balanced 2+2 structure I heard earlier. The

last note of this section overshoots the anticipated B3 from the first section, arriving on A#3.

Cherlin (2003) recognizes that

the low A# on ‘rain’ is clearly a substitute for the B of ‘dry,’ which is the ‘expected’ note—the fresh note for refreshing ‘rain.’ This touch is quite effective, and it works in conjunction with a modulation (extending the term slightly so that it can fit this not-quite-tonal context) projected by a shift in the piano's ostinato. (73)

36 The use of secondary parameters to imply continuation at cadential points is not atypical for Copland's settings; see "There Came a Wind Like a Bugle" and "Heart, We Will Forget Him."

These changes in the second stanza, along with the piano’s registral change (now

reaching down to G2), the louder dynamics, and the faster tempo, set this section apart from the

first. Copland’s B section creates an expectation for ternary form, confirmed by the return of the

original tempo in m. 17 and the opening sonority (minus pitch-class B) in m. 19. However, any

expectation for a clear return to the A section is thwarted by the lingering A# in the piano part

and the new vocal line. A literal repetition of the A section would not capture the poet’s

transformed perspective: that friendship provides needed comfort at the time of death.

The material in mm. 23–24 imitates the pitch and rhythmic content of mm. 7–8 (where

the pace increased in the first phrase), both setting similar texts—“we want the dew then” (mm.

7–8) and “Dews of thyself to fetch” (mm. 23–24). The listener may consequently develop strong

expectations for the phrase in this final section to close on B3, similar to the ending in m. 9;

however, the phrase instead arrives on A#3. Along with this surprising pitch, the abrupt change of

pitch collection (G Aeolian) and the final rising gesture in the piano (m. 27; see Example 3.8)

suggest that this song is not closed at all; it is just one in a cycle. While elements at the end of

this song imply finality, and multiple hearings of the song can allow a listener to experience a

sense of closure, the first encounter might leave a listener unsatisfied. Copland may have chosen

to undermine strong closure at the end of this song in order to underscore the text. The poem

ends trapped in the time between life and death; strong closure would imply death. Here the poet

wants to hold onto life in this world as long as possible.

This lack of satisfactory closure leads me to consider further how this poem functions in

the larger context of the song cycle. Baker (2003) argues that its placement after “Why do they

shut me out of heaven?” shows that the poet chooses a worldly life, ministering to friends, over

the afterlife. This reading is supported in Copland’s last song of the cycle, “The Chariot,” where

the protagonist will not “stop for Death.” Although this reading creates a single trajectory from

the beginning of the cycle until the end, I tend to favor the interpretation of the song offered in

the previous paragraph. For me, this song cycle explores various aspects of the human

experience: our relationships with nature, with each other, and, ultimately, with death. Copland’s


blatant thwarting of rather specific expectations created in the first stanza captures the

timelessness Dickinson drafted, denying death as long as possible in order to enjoy earthly

friendship.

The World Feels Dusty by Aaron Copland, text by Emily Dickinson

© Copyright 1951 by The Aaron Copland Fund for Music, Inc.

Copyright Renewed. Boosey & Hawkes, Inc., Sole Licensee.

Reprinted by Permission.

Example 3.8: Copland, Twelve Poems of Emily Dickinson, “The World Feels Dusty,” m. 27

In the three preceding examples, we have seen that expectation informs a listener’s sense

of closure, segmenting musical experience. The perception of goal direction is an artifact of

transitional probabilities formed through a listener’s familiarity with a musical style. The

positive valence resulting from a confirmation of a listener’s expectation for the end of a

segment followed by a weakening of expectations for subsequent events (i.e., an increase in

prediction error) causes a feeling of finality (or momentary repose). The perceived strength of

closure will correlate with the specificity of expectation and number of traces in long-term

memory, which together will influence the difference between the transitional probabilities for

sonic entities approaching and leaving an ending. From here, I hypothesize that there are two

main factors that influence our perception of closure:

1) Knowledge structures will guide our expectations for when and how musical

segments will conclude. These knowledge structures give rise to schematic, veridical,

and conscious expectations, promoting anticipatory and arrival closure.

2) At the same time, sonic disjunctions in the musical surface will segment

musical experience. We expect musical surface features to continue in a similar

manner (deep schematic expectation); when they don’t, we retrospectively perceive

closure.
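The transitional probabilities invoked above can be made concrete with a small sketch. The code below (the toy corpus, note names, and function name are invented for illustration and drawn from no analysis in this study) estimates first-order probabilities by counting bigrams in a set of melodies; a high P(C | B) models the strong expectation formed at a leading tone, whose fulfillment at the tonic corresponds to the "reward for a correct prediction" discussed earlier.

```python
from collections import Counter, defaultdict

def transition_probs(melodies):
    """Estimate first-order transitional probabilities P(next | current)
    from a corpus of melodies (each a list of pitch names)."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for cur, nxt in zip(melody, melody[1:]):
            counts[cur][nxt] += 1
    return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
            for cur, c in counts.items()}

# A toy corpus (invented): in these melodies B, the leading tone in C major,
# always resolves to C, so P(C | B) is high. A listener exposed to such a
# corpus would form a strong first-order expectation at B.
corpus = [
    ["C", "D", "E", "D", "C", "B", "C"],
    ["E", "D", "C", "B", "C"],
    ["G", "A", "B", "C"],
]
probs = transition_probs(corpus)
```

In this toy corpus the probability of C following B is 1.0, while D continues to C only two times out of three; the difference between such probabilities approaching and leaving a boundary is what the summary above calls the specificity of expectation.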

Musical surface features can either support or contradict the expectations formed through

knowledge structures, and the weight of elements projecting closure compared to those


projecting continuation creates a hierarchical grouping structure. I predict that sophisticated

listeners with more experience in a particular style will tend to rely more on knowledge

structures than on sonic features. Expectations from both of these types of features are

incorporated in the Event Segmentation Theory (EST), which is a cognitive model of

segmentation. The next chapter explores this cognitive model and applies it to the perception of

musical structure and closure.


CHAPTER 4

EVENT SEGMENTATION THEORY

Chapters 2 and 3 illustrated that segmentation is closely related to closure, comprising an

integral part of my definition of closure—the anticipated end of a musical segment.

Segmentation in music depends both on the bottom-up processing of sensory features and on the

top-down processing of knowledge structures. Deep schematic expectations for sound continuity

are reflected through the bottom-up sensory features that segment musical experience (for

instance, the musical features listed in Lerdahl and Jackendoff’s GPRs 2 and 3). Knowledge

structures inform surface schematic expectations, allowing a listener to anticipate endings and

providing a framework for musical segmentation. Event Segmentation Theory (EST), as

proposed by Christopher Kurby and Jeffrey Zacks (2007), incorporates segmentation based on

learned knowledge structures and changes in sensory features into a single cognitive model.

Kurby and Zacks’ proposed process by which we segment everyday events is grounded in

expectation, which, when applied to music, can account for the perception of musical closure.

Event Segmentation

An “event,” as defined by Kurby and Zacks (2007), is “a segment of time at a given

location that is conceived by an observer to have a beginning and an end” (72). This definition is

extremely similar to how musicians define a phrase, which can be considered a discrete event in

music. For instance, Lerdahl and Jackendoff (1977) define a phrase as “the lowest level of

grouping which has a structural beginning, a middle, and a structural ending (a cadence)” (123).

If we carry this line of thinking to its logical conclusion, then any musical “object” is an “event”

since it exists as a temporal experience; for instance, a single tone has a beginning and end, so by

the definition posited by Kurby and Zacks, it is an event. But this conclusion goes too far

because musical tones are not usually heard in isolation; instead, tones are usually grouped

together to create larger units. Because I am examining the perception of musical closure, I

examine grains of segmentation resulting from the creation of formal units, starting with

subphrases through entire pieces.


Event segmentation is “the process by which people parse a continuous stream of activity

into meaningful events” (Zacks and Swallow 2007). Research has shown that event segmentation

is an automatic, hierarchical process, and is essential for guiding memory and learning. In many

event segmentation experiments, participants designate perceived boundaries through an explicit

task. Although this does not directly support the idea that event segmentation is automatic,

widespread agreement among individuals on the location of event boundaries suggests that

individuals may be tapping into ongoing event processing (Kurby and Zacks 2007; Zacks, Speer,

and Reynolds 2009). Even music segmentation studies reveal a high level of agreement among

participants on the location of boundaries (Joichi 2006). Listeners broadly agree on the location

of musical segments across a variety of literature (Deliège 1987; Deliège et al. 1996; Krumhansl

1996). Even when confronted with ambiguous stimuli, participants using the same segmentation

strategy still generally agree (Pearce, Müllensiefen, and Wiggins 2010), implying a shared

underlying cognitive process. However, an explicit task may “change the nature of the perceptual

processing” (Zacks and Swallow 2007, 80); Zacks, Tversky, and Iyer (2001) found that

participants’ segmentation behavior varies depending on whether the participants verbally

described the event and, to a lesser extent, their familiarity with the stimulus.

Brain-imaging data provide much stronger evidence for an automatic process. In one

such study, participants passively watched movies of actors portraying everyday events while

their brains were scanned. The fMRI showed increased brain activity at points later designated

by the same subjects as event boundaries (Zacks et al. 2001). A related music study found similar

results. Musically untrained participants listened to excerpts from a multi-movement orchestral

work by William Boyce while their brain activity was recorded with fMRI (Sridharan et al.

2007). There was increased brain activity in two distinct regions of the brain coinciding with

movement boundaries: first there was an increase in activity in the ventral network,

corresponding to violations of musical expectancy, followed by an increase in activity in the dorsal

network, corresponding to the processing of new musical information. Despite not directly

attending to the event structure of the stimuli, participants in both studies were still sensitive to

event boundaries.

Event segmentation is hierarchical and occurs simultaneously on multiple time scales

(Kurby and Zacks 2007). When asked to mark the smallest meaningful units and the largest


meaningful units, participants’ fine-grained divisions are nested within their coarse-grained

divisions (Zacks, Speer, and Reynolds 2009). Music is also understood to be hierarchical: notes

form motives, which form subphrases, which form phrases, etc. Lerdahl and Jackendoff (1983)

have laid out grouping well-formedness rules (GWFRs) as well as grouping preference rules

(GPRs, discussed previously) to explain how a listener creates groups as well as how these

groups are hierarchically related.37

These rules are listed in Table 4.1.

Table 4.1: Lerdahl and Jackendoff's Grouping Well-Formedness Rules

GWFR 1: Any contiguous sequence of pitch-events, drum beats, or the like can constitute a group, and only contiguous sequences can constitute a group.

GWFR 2: A piece constitutes a group.

GWFR 3: A group may contain smaller groups.

GWFR 4: If a group G1 contains part of a group G2, it must contain all of G2.

GWFR 5: If a group G1 contains a smaller group G2, then G1 must be exhaustively partitioned into smaller groups. (37–39)
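These well-formedness rules can be read as constraints on nested spans. The sketch below is my own schematic rendering, not Lerdahl and Jackendoff's notation: a grouping analysis is represented as a set of half-open index spans (start, end) over a sequence of events, so that spans are contiguous by construction (GWFR 1), and the remaining rules reduce to simple checks.

```python
def well_formed(piece_length, groups):
    """Check a set of (start, end) spans against GWFRs 1-5."""
    groups = set(groups) | {(0, piece_length)}   # GWFR 2: the whole piece is a group
    for a, b in groups:
        if not (0 <= a < b <= piece_length):     # GWFR 1: a nonempty contiguous span
            return False
        for c, d in groups:                      # GWFR 4: overlapping groups must nest
            overlaps = max(a, c) < min(b, d)
            nests = (a <= c and d <= b) or (c <= a and b <= d)
            if overlaps and not nests:
                return False
        # GWFR 3 permits smaller groups; GWFR 5 demands that if any exist,
        # the immediate children exhaustively partition the parent.
        children = [(c, d) for c, d in groups
                    if a <= c and d <= b and (c, d) != (a, b)]
        if children:
            immediate = sorted((c, d) for c, d in children
                               if not any(e <= c and d <= f and (e, f) != (c, d)
                                          for e, f in children))
            if immediate[0][0] != a or immediate[-1][1] != b:
                return False
            if any(immediate[i][1] != immediate[i + 1][0]
                   for i in range(len(immediate) - 1)):
                return False
    return True
```

For instance, an eight-note phrase divided 4+4, with the first half subdivided 2+2, passes every rule, while two partially overlapping groups violate GWFR 4, and a lone subgroup that does not exhaust its parent violates GWFR 5.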

Being able to segment an activity into events can also guide learning and understanding.

Zacks et al. (2006) demonstrate that subjects tend to parse events in a similar manner, usually

creating segments falling into nameable events, and the better a person can form large

meaningful groups, the better he or she will remember the activity. Here, elderly adults

segmented movies of everyday events; some adults did the task “well,” meaning their

segmentation followed the patterns of the group, while other adults were not as successful. In a

subsequent identification task, participants were shown still pictures, some of which were drawn

from the movies and others of which were not. Participants who segmented the movies

successfully also better identified the visual images from the movies. This research is directly

applicable to music. As discussed in Chapter 3, some ear-training curricula teach undergraduates

to listen for larger patterns in order to remember the passage for dictation, since creating discrete,

nameable units aids in remembering the music.38

37

In their chapter on grouping structure, Lerdahl and Jackendoff liken the grouping of the musical surface

to “partitioning the visual field into objects, parts of objects, and parts of parts of objects” (36). This perspective is

quite different from my more phenomenological approach, comparing the grouping of the musical surface to the

segmentation of experience.

38 Ease of segmentation might also relate to individual aesthetic preferences: musical processing would be

facilitated for a person who could easily segment a composition into meaningful events. A similar correlation

70

Both top-down knowledge structures and bottom-up sensory characteristics influence

segmentation (Zacks 2004). Knowledge structures, as defined by Zacks, are “representations that

capture recurring patterns of covariation” (2004, 980), referring to transitional probabilities

gleaned through statistical learning. Other authors refer to knowledge structures using different

terms such as event schemata (Hard, Tversky, and Lang 2006) and situation models (Zwaan and

Radvansky 1998). Knowledge structures guide the top-down processing of events, especially for

perceived goals and intentions. Research has shown that top-down knowledge of goals assists in

segmentation, and event boundaries coincide with changes in perceived intention (Baldwin and

Baird, 2001; Baldwin et al., 2008; Hard, Tversky, and Lang 2006; Zacks 2004). When the actor’s

goals are unpredictable, viewers tend to segment an activity into smaller units, suggesting that

knowledge structures particularly assist in creating larger groups.

In music, stylistic competency might also assist in creating larger groups. For instance,

the first section of Schumann’s “Widmung” (discussed earlier) contains a single phrase, although

surface discontinuities suggest several subphrases. A listener less familiar with Schumann’s style

may not be able to perceive harmonic intention towards the tonic goal over these breaks in the

sound. This is readily apparent in the undergraduate classroom, where instructors have to teach

music students not to be “tricked” by surface discontinuities and to focus on the longer,

harmonically driven phrase.

In music, knowledge structures are revealed through musical expectations, both implicit

and explicit. In light of the discussion from the previous chapter, knowledge structures can take

the form of generalized schemata (e.g., tonality), learned knowledge (e.g., formal archetypes),

and piece-specific knowledge. While an expectation for continuity can be a knowledge structure,

especially when used consciously or when it creates dynamic expectations within a composition,

the deepest expectations for continuity merely rely on sonic disjunctions and correspond with the

bottom-up changes in sensory characteristics.39

Segmentation created by bottom-up processing relies on changes in movement features or

spatial location to produce event boundaries (Magliano, Miller, and Zwaan 2001). In the visual


39 Refer to Figure 3.1 for an illustration of the different types of schematic expectations.


realm, when a person or object changes direction or speed, a boundary is perceived, and these

changes correlate with increased activity in brain regions involved with motion processing

(Zacks, Braver, et al. 2001). Even segmentation based on surface features alone can create a

nested hierarchical structure, where event boundaries with a large number of feature changes are

perceived as more important than those with fewer changes (Newtson, Engquist, and Bois 1977).

Changes in the musical surface equate to these visual discontinuities; for instance,

Lerdahl and Jackendoff’s (1983) GPRs 2, 3, and 4 (see Table 2.1) are based on sensory

characteristics, indicating that changes in attack point, register, dynamics, articulation, and duration distinguish group boundaries, while more marked changes result in higher-level

boundaries. Deliège (1987) has shown that when musicians are asked to segment a short excerpt,

most of their divisions correspond to these GPRs. When listening to recorded music, one cannot

see physical motion, unlike the visual stimuli used in studies examining event segmentation.

However, many descriptions of music employ motion words (for instance: a stepwise ascent, a

downward leap), suggesting that, even when motion is not literally present, it is still conceptually

understood.40

Research in discourse processing suggests that conceptual changes in sensory

characteristics have the same effect on event segmentation as do physical changes (Zacks, Speer,

and Reynolds 2009). Of course, the influence of musical “motion” on segmentation becomes

much more complex when combined with a visual stimulus such as a performer, conductor,

score, or dancer. While this is outside the scope of my study, it would be interesting to see the

extent to which visual input influences auditory segmentation.41

Comparable to the expectation continuum discussed in Chapter 3, in which there was no

clear boundary between cross-modal expectations and musically-derived expectations,

segmentation prompted by knowledge structures and segmentation prompted by sensory

characteristics cannot be sharply distinguished. Zacks and Swallow (2007) also characterize

elements essential to segmentation in terms of a continuum:

Further, we believe that a number of little-studied features, from purely sensory to purely

conceptual, must be important for event segmentation. Toward the sensory end are

40 Gjerdingen (1994) likens our experience of motion in music to apparent motion in visual studies: a

sequence of flashing lights can create the impression that light is moving from one place to another. Others (Clarke

2001) explain our perception of motion in music from an embodied perspective.

41 For instance, observers can acquire both structural and expressive information from musicians’ motions

(Nusseck and Wanderley 2009).


features such as sound, lighting, and contact between actors and objects. Toward the

conceptual end are features such as goals and social conventions. In the middle are

features such as sequential statistical structure—that is, the order in which events tend to

occur. (83)

Other research indicates an interaction between knowledge structures and movement

features (characteristics of the object’s movement), supporting four postulates (Zacks 2004,

983):

1. Movement features contribute to the identification of fine event segments.

2. Grouping these fine segments into larger units can be based on aspects of the activity

other than movement features. Observers rely less on movement features as the grain

of encoding becomes larger.

3. Inferences about actors’ intentions can affect how and to what extent movement

features drive the identification of event segments.

4. Inferences about actors’ intentions can be influenced by both intrinsic features of the

stimulus and by top-down information.

Sensory characteristics (e.g., movement features) contribute more to the segmentation of small

units than to larger units. The extent to which an observer can infer an actor’s intention

determines the influence of sensory characteristics on segmentation, and intention itself is

inferred both by top-down information and by sensory input. Usually fine-level events are

described with the actor’s motion path, while coarse-level events are described with the actor’s

intention. As previously mentioned, intention is not the only determinant of coarse-level events:

hierarchical event segmentation can result even without an overarching event schema. More

change in motion results in a higher-level segment, which could then be stored as a new

knowledge structure (Deliège 2006; Hard, Tversky, and Lang 2006).

Even in music, knowledge structures can influence the perception of sensory

characteristics and vice versa. Returning once again to music theory pedagogy, one of the main

goals of an aural skills curriculum is to teach students strategies for determining the structure of a

composition. Students who learn to label musical events are more likely to parse the musical

surface in a way that highlights those events. According to the tenets of statistical learning,

repetitions of musical patterns, even atypical patterns, guide the creation of new knowledge

structures. For instance, as a listener is exposed to more instances of a particular melodic

paradigm occurring at the end of a segment (as determined by movement features), the listener

will begin to expect the end of a segment when the melodic paradigm is heard (Eberlein and

Frick 1992; Huron 2006). Segmentation therefore facilitates both learning and memory. Event


Segmentation Theory, outlined in the next section, incorporates these characteristics of event

segmentation into a single cognitive model.
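The statistical-learning mechanism invoked above, in which repeated patterns accumulate into transitional probabilities that come to signal segment endings, can be sketched computationally. The Python sketch below is my own toy illustration, not code from the studies cited: it estimates first-order transitional probabilities from a symbol sequence and flags improbable transitions as candidate segment boundaries. The "melody," its letter notation, and the threshold are all invented for the example.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequence):
    """Estimate first-order transitional probabilities P(next | current)."""
    pair_counts = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        pair_counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in pair_counts.items()}

def boundary_candidates(sequence, probs, threshold=0.3):
    """Flag positions whose incoming transition is improbable: under a
    statistical-learning account, these are likely segment boundaries."""
    return [i + 1 for i, (a, b) in enumerate(zip(sequence, sequence[1:]))
            if probs.get(a, {}).get(b, 0.0) < threshold]

# Toy "melody": the pattern C-D-E recurs, each time followed by a different
# note, so within-pattern transitions are certain and pattern-final ones rare.
melody = list("CDEGCDEACDEFCDEB")
probs = transition_probabilities(melody)
print(boundary_candidates(melody, probs))  # -> [3, 7, 11, 15]
```

The flagged positions fall immediately after each recurrence of the learned C-D-E pattern, mirroring the idea that a listener comes to expect a segment boundary once a familiar paradigm is completed.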

Event Segmentation Theory

According to Kurby and Zacks (2007), “the Event Segmentation Theory (EST) proposes

that perceptual systems spontaneously segment activity into events as a side effect of trying to

anticipate upcoming information” (72). Predictions in this model are formed both by knowledge

structures and by sensory characteristics, as discussed earlier. Because “segmentation results

from the continual anticipation of future events” (77), a person can adaptively encode event

structure from a continuous stream, understand the intention of an actor (hence anticipate the

actor’s future actions), and select future actions in response to the ongoing event. “Correct”

segmentation has distinct evolutionary advantages: being able to chunk an interval of time

together as a single event saves on cognitive resources and, as seen earlier, improves

comprehension. Also, being able to group events together hierarchically assists in learning and

problem solving. The way we segment experience reflects the environment in which the human

perceptual system developed, as stated by Kurby and Zacks (2007):

None of this would be true if the structure of the world were not congenial to

segmentation. If sequential dependencies were not predictable, if activity were not

hierarchically organized, there would be no advantage to imposing chunking and

grouping on the stream of behavior. In this regard, as in many others, human perceptual

systems seem to be specialized information-processing devices that are tuned to the

structure of their environment. (78)

EST posits that perceivers form a representation of the event—an event model—in

working memory. These event models capture “what is happening now” and guide predictions

about upcoming actions. As long as predictions are accurate, the event model is maintained,

integrating the new information, but when prediction errors rise, a boundary is perceived as the

event model is updated. At this moment, the system becomes more sensitive to incoming

information. As a new event model is established, prediction errors fall and the system stabilizes

once again. Periods of stability in this system are then perceived as single events, while periods

of change create perceptual boundaries (Kurby and Zacks 2007; Zacks et al. 2007; Zacks et al.

2009).


Figure 4.1 (reproduced from Zacks et al. 2007, 274, Figure 1) is a schematic depiction of

this theory. Sensory inputs include information collected by the peripheral nervous system, such

as visual, auditory, and tactile information. These inputs are transformed by perceptual

processing, which produces “rich multimodal representations with rich semantic content” (274).

Here objects are identified, motion trajectories are determined, and intentions and goals are

inferred. The ultimate goal of the processing is to make predictions about the future state of

events. Event models bias processing since they “provide a stable representation of the current

event” (275), and they are only open to new sensory input when they are updated at event

boundaries (discussed in more detail below). While event models are “active and accessible

representations” of current events, the amount of information held in these models exceeds the

amount of information held in working memory. Zacks and colleagues (2007) suggest that the

mental capacity of the event model is extended by efficiently using previously stored knowledge

structures (275).

Figure 4.1: Schematic Depiction of the Event Segmentation Theory

(Zacks et al. 2007, 274, Figure 1). The grey arrows show the flow of information while the dashed arrow represents the signal that initiates an updating

of event models. Sensory input only feeds into the event models when the model is updated.

Event schemata represent the knowledge structures in this model. While they can be

understood as “semantic memory representations that capture shared features of previously

encountered events” (Zacks et al. 2007, 275), Hintzman’s view of schematic abstraction as


deriving from the summed activation of episodic traces can also be applied here (refer back to

Chapter 3). No matter the perspective, these event schemata “contain previously learned

information about the sequential structure of activities” (Zacks et al. 2007, 275). These

knowledge structures, which represent learned statistical regularities, interact with the current

representation of the event, influencing its shape. In turn, the content in the event model shapes

long-term memory through a learning process. So, in terms of expectations derived from

knowledge structures, when the actor achieves the perceived goal, there is a momentary rise in

prediction error, since the observer cannot predict the actor’s future intentions. A similar effect

was noted in the last chapter, where the event following an authentic cadence is less predictable

than the goal tonic chord. The event model is updated as the perceiver discerns the actor’s next

goal.

These knowledge structures correspond with longer events, whereas fine segmentation is

usually influenced by changes in motion. Being able to predict the movement of objects and

people around us is essential to being able to interact with the world, so when an actor changes

direction and speed, the perceiver has to form new predictions about where the actor will be next

in order to interact effectively. There are deep-seated expectations for continuity (e.g., in music,

pitches in a certain register are usually followed by notes in that same register), so when there is

an unexpected disruption, the event model is updated, incorporating new sensory input in order

to form a new set of expectations.

Figure 4.2 (from Kurby and Zacks 2007, 73, Figure 1) illustrates a schematic depiction of

the segmentation process, portraying event models as relatively robust representations of the current

event. This perceptual constancy allows the ongoing event to be a “single entity despite potential

disruptions in sensory input such as occlusion or distraction” (Zacks et al. 2007, 274). The first

panel shows that the event model accurately guides perceptual processing and predictions; therefore, it is not open to new sensory input. Only when the prediction error rises does the

current model become insufficient, at which point it resets and integrates new perceptual

information (Kurby and Zacks 2007). Several studies have corroborated this facet of the model:

observers had superior long-term memory for information presented at event boundaries, suggesting greater sensitivity to sensory input at those moments (Baird and Baldwin 2001; Zacks et al. 2006; Swallow,

Zacks, and Abrams 2009).


The model suggested by EST may be easily assimilated into a theory of event

segmentation in music, but with one important difference. As originally conceived, EST

considers perceptual changes of motion and intentionality, not conceptual changes; however,

the predictions made by EST can account for event segmentation in narratives, which rely on

conceptual changes. In discourse comprehension, readers are able to track multiple dimensions at

once. Despite changes in character, location, character goals, causal relationships, and time,

readers are able to follow and understand what is going on in the text. Several researchers,

including Zwaan and Radvansky (1998), have suggested that we use situation models to

comprehend the action in the discourse. When there is a discontinuity in any of these dimensions

(such as a change in location), the reader updates the situation model. According to Zwaan’s

(1995) event-indexing model, readers monitor five independent dimensions of the situation

model: time, space, protagonist, causality, and intentionality. A change in any one of these

dimensions prompts an update to the corresponding index in the situation model. Because

updating the model takes time, reading time increases as the number of discontinuous elements

increases in a narrative (Zwaan, Langston, and Graesser 1995).
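As a rough illustration of the event-indexing logic, the hypothetical Python sketch below counts which of the five indices change between two clauses and assigns a toy processing cost that grows with the number of discontinuities. The clause representation and the millisecond parameters are invented for the example, not taken from Zwaan and colleagues.

```python
DIMENSIONS = ("time", "space", "protagonist", "causality", "intentionality")

def update_cost(prev_clause, next_clause, base=300, per_change=150):
    """Count which of the five situation-model indices change between two
    clauses, and return a toy 'reading time' (ms) that grows with the
    number of discontinuities. The base and per-change figures are
    invented for illustration, not Zwaan et al.'s estimates."""
    changed = [d for d in DIMENSIONS if prev_clause[d] != next_clause[d]]
    return changed, base + per_change * len(changed)

clause1 = dict(time="morning", space="kitchen", protagonist="Anna",
               causality="none", intentionality="make coffee")
clause2 = dict(clause1, time="evening", space="garden")  # two indices shift

changed, cost = update_cost(clause1, clause2)
print(changed)  # -> ['time', 'space']
print(cost)     # -> 600
```

Each changed index prompts an update to the corresponding part of the situation model, so clauses that shift several dimensions at once incur the largest updating cost, matching the reading-time finding described above.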

Figure 4.2: Schematic Depiction of the Segmentation Process as Posited by EST

(Kurby and Zacks 2007, 73, Figure 1)

Further recent research by Zacks, Speer, and Reynolds (2009) suggests that EST both

supports and extends discourse comprehension theories (basically equating the event model and


the situation model). Because the event-indexing model addresses conceptual events, while EST

focuses on live-action events, Zacks and his colleagues designed experiments to discern whether

the predictions made by EST are applicable to conceptual events and whether the predictions

made by the event-indexing model are applicable to live-action events. In their first two

experiments, the authors presented either a set of narratives or a set of short films to two different

groups of participants, who were asked to divide the stimuli into smaller events. For both groups,

the event-indexing model predicted the location of participants’ boundaries, and a later study

confirmed that boundaries in narratives corresponded with story elements rated by readers as less

predictable. These findings indicate that both physical cues and conceptual changes enable event

segmentation.

Using a combination of empirical research and computational modeling, Pearce,

Müllensiefen, and Wiggins (2010) illustrate that expectation, based on probabilistic learning, can

also inform the segmentation of musical melodies, and they speculate that this account can be extended to

all auditory stimuli. They hypothesize that, similar to the mechanisms of segmentation described

by EST, “boundaries are perceived before events for which the unexpectedness of the outcome

and the uncertainty of the prediction are high” (1375). The first principle of segmentation, “the

unexpectedness of the outcome,” refers to discontinuities in the sound, roughly corresponding

with Lerdahl and Jackendoff’s GPRs 2 and 3 (unexpected disruptions in the musical surface).

Their results indicate that a computer model, using only probabilistic learning, can segment a

melody according to their first principle in a way that mirrors the results from expert listeners.

From a phenomenological perspective, this results in a retrospective marking of a boundary. The

second principle, “the uncertainty of a prediction,” refers to the way preexisting knowledge

structures guide segmentation. Both principles are derived from statistical learning and inform

listener expectation. EST provides a cognitive mechanism that describes how we segment not

only visual or conceptual experience, but musical experience as well.
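Pearce, Müllensiefen, and Wiggins formalize these two principles information-theoretically: the unexpectedness of an outcome as its information content (negative log probability), and the uncertainty of a prediction as the entropy of the predictive distribution. The Python sketch below is my greatly simplified stand-in for that framing, using a first-order model with add-one smoothing rather than their far richer IDyOM system; the letter-coded melody is invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_bigram(sequence):
    """First-order model of P(next | current) with add-one smoothing over
    the observed alphabet (a crude stand-in for IDyOM's richer models)."""
    counts, alphabet = defaultdict(Counter), set(sequence)
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    def prob(a, b):
        total = sum(counts[a].values()) + len(alphabet)
        return (counts[a][b] + 1) / total
    return prob, alphabet

def boundary_strength(sequence, prob, alphabet):
    """For each transition, sum the unexpectedness of the outcome
    (information content, -log2 P) and the uncertainty of the prediction
    (entropy of the distribution over possible continuations)."""
    strengths = []
    for a, b in zip(sequence, sequence[1:]):
        ic = -math.log2(prob(a, b))
        entropy = -sum(prob(a, x) * math.log2(prob(a, x)) for x in alphabet)
        strengths.append(ic + entropy)
    return strengths

melody = list("CDECDECDEF")          # toy melody; F breaks the learned pattern
prob, alphabet = train_bigram(melody)
strengths = boundary_strength(melody, prob, alphabet)
print(strengths.index(max(strengths)) + 1)  # -> 9: boundary before the final F
```

The peak in combined information content and entropy falls just before the event that breaks the learned pattern, which is where this account predicts a boundary will be perceived.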

Event Segmentation Theory and Musical Closure

Because EST is applicable to live-action events as well as to conceptual events, EST

interacts with an expectation-based model of musical closure. Considering only the auditory

experience of music (setting aside visual input from the score or a performer), according to EST,

boundaries are formed when the event model is updated because of an increased prediction error


in the music. Both sensory input from the musical surface and pre-existing knowledge structures

influence the creation of the event model and the ensuing predictions.

This theory has several implications for music cognition in general and for the perception

of closure in particular. First, EST represents an automatic and unconscious model for

segmentation; it does not require conscious attention. While listeners can focus their attention on

segmenting music, it occurs spontaneously as well, as seen in brain imaging studies. For

instance, Knösche et al. (2005) asked musicians to determine whether a melody they heard

contained any note outside of the key while their brain waves were measured with an EEG. The

authors found that between 500 and 600 ms following the end of a phrase, there was a positive

wave spike (called a closure positive shift) in the electrical output of the brain structures that

guide memory and attention processes.42

The location of this closure positive shift suggests

increased attention following the end of a phrase, rather than merely the identification of

phrase boundaries.43

An extension of this study examined the difference between musicians and

non-musicians. Although both groups experienced a closure positive shift following the end of a

tonal phrase, Neuhaus, Knösche, and Friederici (2006) note differences between the musical

features that elicited this effect. This study manipulated the last chord implied by the melody

(either I or V), the length of the last note, and the length of the silence between phrases. Both

groups responded to all three markers, but musicians were more sensitive than non-musicians to

the implied harmony. The authors speculate that musicians, who had more stylistic knowledge of

closure, used top-down knowledge, while non-musicians relied more on sensory features.

Event segmentation occurs on multiple time scales simultaneously, creating a hierarchical

construction of the musical grouping structure. Hierarchical grouping structures are explored in

detail by Lerdahl and Jackendoff (1983), and empirical work by Krumhansl (1996) supports the

perceptual validity of hierarchically construed musical segments. In Krumhansl’s study,

participants designated section endings on three different time scales. First, they segmented the

entire first movement of Mozart’s Piano Sonata in E♭ Major, K. 282, followed by the first fifteen

measures of the movement, and finally just the first eight measures. As the musical excerpt’s

42 This closure positive shift is similar to the positive shift found at the end of spoken phrases in language studies.

43 This might reflect the gating mechanism in EST. This cognitive control mechanism delegates more

processing resources (thereby increasing sensitivity to new input) at boundaries when the event model is updated.


length decreased, participants were told to decrease the grain size of the segmentation, and the

results indicated a high correlation between the responses for the large and small sections.

Boundaries at different segmentation grains imply varying degrees of continuation and closure.

EST posits that we can simultaneously hold event models on multiple time scales, which can

account for the feeling of closure at the level of a phrase even while the listener expects a

continuation of the piece: although a hypothetical event model is updated on the phrase level, a

separate event model could continue on the level of the entire composition.

Other studies specifically addressing the question of musical closure, rather than

segmentation in general, usually explore the knowledge structures guiding a listener’s experience

of closure. For instance, Rosner and Narmour (1992) played two pairs of chords and asked

participants to rate which pair seemed more closed.44

Unsurprisingly, they found that V-I was

rated as more closed than III-I, IV-I, or VI-I. Most likely due to the brief context, there was no

effect for soprano scale degree and only a weak effect for inversion. Tonic is usually described

as the goal of V; recall from Chapter 3 that this feeling of goal directedness is an artifact of first-

order probabilities. Because a tonic chord can go just about anywhere, a listener has no strong

expectations for the next event, creating a perceptual boundary.

While the findings from probe-tone studies are intended to measure melodic expectancy

and are usually used to support a hierarchical model of tonality, Aarden (2003) suggests that

these studies really examine the perception of closure. Aarden posits that due to the nature of the

design (a retrospective rating of a tone following a musical context) these studies really ask how

well a particular tone would complete that musical unit, reflecting learned schema of the

distribution of tones as the final note in a melody (ii-iii). From this perspective, probe-tone

studies support the model of closure based on expectations formed through statistical learning.

For instance, because 1̂ is statistically more likely to conclude musical units, listeners rated 1̂ as a

better fit to end the musical context compared to the other scale degrees.

44 Rosner and Narmour described closure to their participants as the degree of conclusiveness or satisfaction

of a musical ending. They also described the varying strength of closure by likening it to punctuation, “The most

strongly closed progressions says that a piece has finished. This is like the words, ‘THE END,’ at the conclusion of a

story. Less closed chord progressions in music act like a full stop (a period) at the end of a sentence, signaling that

one thought is complete and a new one will follow. Still less closed progressions behave like semicolons and tell

you that one complete thought will be followed by a closely related one. Just as a writer must use punctuation marks

correctly, a composer must get his or her signs of closure right” (390).


Hierarchical knowledge structures can also influence the perception of closure. Joichi

(2006) examines two related issues: (1) the perception of closure in small musical units in

relation to the larger hierarchical structure and (2) the variation of a particular cue’s influence on

the perception of closure among different hierarchical levels. In one study, Joichi divided binary-

form excerpts into four shorter units (usually evenly dividing the binary into fourths), which

were played either individually or grouped into longer contexts. The most robust finding was a

positive correlation between the listener’s rating of completeness and the length of the excerpt.

With a longer context, participants had more opportunity to anticipate the point at which the

excerpt would conclude, especially if the longer context had a cadential arrival occurring

halfway through. This cadence at the midpoint provided a fairly specific expectation for the

location of the eventual ending.

EST predicts that expectations for schematic knowledge structures, like the ones explored

above, will influence the perception of higher hierarchic levels, while changes in surface events

will dictate boundaries at lower levels. At the moment the schematic structure is completed, there

is a rise in uncertainty for subsequent events, initiating an update of the event model. As

discussed in the previous chapter, the more specific an expectation is for a particular ending, the

greater the change in expectancy levels after that ending occurs, and this amount of change

correlates with the strength of closure. Fulfillment of expectations for endings derived from

knowledge structures results in the feeling of anticipatory and arrival closure.

Changes in the musical surface can lead to an increase in prediction error. However, the

expectation for continuity (and other similarly deep schematic expectations) is very general in

nature and does not regularly lead to a feeling of finality at the end of a segment dictated solely

by changes in the musical surface. In EST, a musical boundary forms when the expectation for

continuity is violated; changes in sensory input initiate an updating of the event model.

Inexperienced listeners rely on these surface discontinuities more than seasoned musicians do

because they have not formed the knowledge structures necessary to segment their experience.

Nevertheless, for all listeners, a greater change in the musical surface will result in the creation

of a higher-level boundary. Retrospective closure is associated with changes in the continuity of

sound.


Experiment Overview

To explore the association between segmentation and expectation in the perception of

musical closure, I conducted a series of studies designed to test three main hypotheses of EST:

(1) musical experience is segmented unconsciously, hierarchically, and consistently among

subjects; (2) stylistic knowledge in the form of learned musical schemata influences a listener’s

perception of closure; and (3) boundaries are formed at moments of transient increases in

prediction error.

Experiment 1

Earlier empirical research has shown that listeners segment music consistently and

hierarchically (Deliège 1989; Krumhansl 1996). The purpose of this experiment is to replicate

these findings using the same methodology outlined in Zacks, Speer, and Reynolds (2009).

Listeners will be asked to indicate the end of both fine- and coarse-grained segments while

listening to string quartet movements by either Mozart or Bartók. In accordance with past

research, I hypothesize that listeners will segment the musical stream consistently and

hierarchically.

This study will also examine correlations between a set of musical features and perceived

boundaries. Along with correlations between the musical surface and event boundaries, I will

also see whether factors such as musical expertise or the order in which tasks are completed

influence listeners’ perception of event boundaries. For instance, perhaps it is easier

to make decisions about larger boundaries after hearing the movement in its entirety, or

musicians might indicate boundaries at larger formal units more consistently than do non-

musicians. Participants more familiar with Bartók’s music, and with twentieth-century music in

general, might mark boundaries more consistently. While this study does not directly ask about

closure, it does establish where listeners perceive boundaries, and it may lend support for

applying EST to music.

Experiment 2

The perception of closure is contingent on a listener’s musical expectations, especially

those that anticipate the completion of a musical schema. As explored in previous chapters, these

expectations are formed through musical experiences. Event Segmentation Theory supports a


developmental story for the formation of these knowledge structures and how they can influence

an individual’s perception of closure. When confronted with a new style, a listener relies on

changes in the musical surface to segment the composition into smaller events, with bigger

changes resulting in a higher hierarchical boundary. With repeated exposure to the style,

statistical regularities allow the listener to develop expectations for how various hierarchical

levels of the piece should unfold in this style. When events include stylized signs of endings, a

listener becomes able to predict that an ending is about to occur.

This study will test the hypothesis that learned stylistic cues influence our perception of

closure. This study is in two parts. First, participants will listen to string quartet music during a

twelve-minute exposure period and will mark endings within each excerpt by pressing a key combination on the computer. Participants will be randomly assigned to one of two conditions:

one group will listen to excerpts by Bartók and the other group will listen to excerpts by Mozart.

In the second part of the study, participants will rate the degree of closure for two blocks of short

excerpts, one drawn from the Bartók quartet and the other from the Mozart quartet.

I expect that ratings for cadential paradigms in the Mozart excerpts will be higher than

those for the Bartók excerpts, based on a presumed greater listener familiarity with Mozart’s

style. While I do not expect a strong effect, I hope to see that participants exposed to as little as

twelve minutes of Bartók’s music will interpret closure in these works differently from the group

of participants who initially listened to Mozart. Even though studies have shown that participants

pick up on statistical regularities in auditory stimuli rather quickly, this study may not produce

robust differences between the conditions because it asks a slightly different question. Other

statistical learning studies ask whether a series of sounds forms a grammatical entity based on an

exposure period. However, all of the testing excerpts in this task are grammatical entities in the

style they represent, and participants instead must make an interpretive judgment regarding the

suitability of an excerpt to end a musical unit in that style.

Experiment 3

The first two experiments examine two facets of EST: the segmentation of music and the

influence of learned musical schemata on the perception of closure. This experiment seeks to

support the theoretical claim that the perception of closure stems from being able to predict the

moment of the completion for a schematic unit, followed by a transient increase in prediction


error. In their examination of the segmentation of narratives, Zacks, Speer, and Reynolds (2009)

found that boundaries tended to occur when the activity in the narrative was rated as less

predictable. However, the subjective ratings did not account for all of the event

boundaries, especially those formed through a change in character, so the authors suggested the

need for a more objective measure of prediction performance (323–324).

In this study, instead of asking participants retrospectively to rate the predictability of a

musical segment, I will ask participants to predict the moment at which a musical unit will

conclude. I will then correlate these results with the listener’s perceived degree of closure.

Participants in this study will include both musicians and non-musicians, and all musical

excerpts will come from minuet movements of three Mozart string quartets (K. 156, K. 168, and

K. 173). After the prediction task, participants will hear excerpts from the same minuets in one

of two conditions: either in the order in which they occur in the movement or in a random order.

Participants will rate each excerpt’s strength of closure on a seven-point scale. Participants in the

ordered condition will see a schematic representation of the movement to help locate each

excerpt within the movement.

In the prediction task, I expect that musicians, who probably are more familiar with

Mozart’s compositional style, will be more successful than non-musicians at predicting phrase

endings. Since the minuets used for this study are all in binary form, participants will hear each

section of the movement at least twice, so I anticipate that all participants will make more

accurate predictions the second time through each section. Further, the predictability of a musical

unit’s conclusion may vary with the type of musical ending; for instance, a PAC may be more

predictable than an HC.

According to EST, endings that are better predicted should correlate with a higher rating

of closure. Using the data from the rating task, I will also look for a main effect of condition

(ordered vs. unordered excerpts), which might reveal that formal hierarchy can influence the

perception of closure. If ratings for the same excerpt vary widely between participants in

different conditions, my results might indicate that a schematic understanding of form

contributes to the sense of musical closure. As previously discussed, the strength of closure

informs the hierarchy of a composition; however, this study may instead demonstrate that


schematic knowledge of formal structures affects the listener’s perception of the strength of

closure.


CHAPTER 5

EXPERIMENT 1

This first study does not explicitly examine a listener’s perception of closure; rather, it

addresses what I consider a prerequisite to this larger issue by looking only at segmentation.

Event Segmentation Theory (EST), as discussed in Chapter 4, provides a model for the

perception of closure that is compatible with previous research in musical expectation and

segmentation. Experiments 1a and 1b only consider segmentation, to see whether listeners

segment music in a manner consistent with this theory. To test this, I adopted an experimental

paradigm previously used in studies to support EST.

The design for Experiment 1 is based on the segmentation task used in Zacks, Speer, and

Reynolds (2009), in which participants either read or listened to several narratives detailing the

everyday activities of a seven-year-old boy and were asked to divide the continuous narrative,

identifying points at which “one meaningful unit of activity ended and another began” (309).

Each participant read or heard every narrative twice, once to indicate the largest unit of

meaningful activity (coarse segmentation) and once to indicate the smallest unit of meaningful

activity (fine segmentation).45

The results indicated that participants segment narratives

hierarchically—with smaller units nested within larger units—and that conceptual changes in the

narratives can predict the presence of a boundary. In terms of EST, the event model updates

at a conceptual change (like a change in temporal or spatial location), since change is less

predictable than continuity, and at the end of an event defined by pre-existing event models.

In my study, participants segmented two complete string quartet movements using a

similar segmentation task. In Experiment 1a, participants segmented two movements by Béla

Bartók, and in Experiment 1b, participants segmented two movements by Wolfgang Amadeus

Mozart. Consistent with Zacks, Speer, and Reynolds, I hypothesize that listeners will segment

the music hierarchically and that their segmentation will consistently correlate with various

musical features falling into two broad categories: arrival features, marking the end of a musical

segment; and change features, epitomized by Lerdahl and Jackendoff’s Grouping Preference

45 The order of tasks was counterbalanced between subjects.


Rules (GPRs) 2 and 3. Further, based on the literature examined in Chapter 4, I predict an

increased response time for the coarse-grained segmentation task compared to the fine-grained

task and that these higher-level boundaries will correlate with an increased number of change

features.

Method

Participants

In both studies, participants were divided into three groups based on their musical

expertise: non-musicians, first-year undergraduate music majors, and graduate/professional

musicians. There were 32 participants in Experiment 1a (14 non-musicians, 9 undergraduate

musicians, 9 graduate musicians) and 33 participants in Experiment 1b (14 non-musicians, 10

undergraduate musicians, 9 graduate musicians).

Stimuli

Two sets of stimuli were used in this study, one from twentieth-century non-tonal

practice and the other from the common-practice tonal idiom. In Experiment 1a, participants

listened to the third and fifth movements from Béla Bartók’s String Quartet No. 4; in Experiment

1b, participants listened to the fourth movement from Wolfgang Amadeus Mozart’s String

Quartet No. 19 in C major (K. 465) and the second movement from Mozart’s String Quartet No.

21 in D major (K. 575). Mozart’s music unquestionably exemplifies the common-practice style,

whereas Bartók’s music represents just one of many twentieth-century styles. Bartók’s style

tends to be relatively accessible to listeners unfamiliar or uncomfortable with non-tonal music

because he incorporates phrase lengths and formal divisions familiar from the common-practice

repertoire, and he usually provides a metrical framework. His String Quartet No. 4 particularly

epitomizes these stylistic characteristics.

One inherent difficulty in using pre-composed pieces of music in an experiment is that

many elements of the stimuli cannot be controlled. For instance, given the experiment’s reliance

on existing recordings, the exact length of the movement and tempo could not easily be

manipulated. To compensate, I used strict selection criteria. First, I wanted to use a genre and

instrumental group that is well established in both the common-practice and twentieth-century

repertoires, a criterion met by the string quartet. I only considered short string quartet movements


(lasting under five minutes) that included both a clearly distinguishable melody and a

strongly articulated formal structure that could be divided into smaller phrases. Finally, I wanted

to include one fast movement and one slow movement in each study, so I sought matching fast and slow movements by both composers.

Bartók has the smaller repertoire (six string quartets compared with Mozart’s twenty-

three), so I began by selecting two movements by Bartók that fit my criteria before turning to

Mozart’s repertoire to find suitable pairs. Bartók’s slow third movement from his fourth string

quartet has characteristics of a theme and variation movement, which is then organized into a

larger ternary form. This movement significantly features the cello, which plays the opening

theme and subsequent variations, accompanied by sustained chords in the upper strings for the

first 34 measures. The matching Mozart movement (String Quartet 21, second movement) also

has a clear ternary construction and extensively features the cello (especially in mm. 38–50).

To match the sonata-like thematic construction of the fifth movement from Bartók’s

fourth quartet, I used the last movement of Mozart’s Quartet No. 19 (the “Dissonance” Quartet),

which is in sonata form.46

Although the two movements have vastly different characters, both

conclude longer works, suggesting that their respective composers considered their degree of

finality appropriately strong for the conclusion of a significant work. Presumably to enhance its

degree of closure, the last movement of Bartók’s string quartet harkens back to the first

movement, repeating its ending gesture at a much slower tempo. The last movement of Mozart’s

“Dissonance” Quartet features an extensive coda, ending with a recurring motive from the

secondary tonal area (STA). Even though the Mozart movements are not from the same quartet,

the two movements chosen are still suitable companions because both were written later in

Mozart’s life (No. 19 was composed in 1785 and No. 21 in 1789). Table 5.1 outlines the basic

characteristics of each movement.

Along with these four movements, I selected two shorter excerpts for practice before data

collection. Participants in Experiment 1a listened to another excerpt from Bartók’s fourth string

quartet, the first 49 measures of the first movement (lasting 1:43) as performed by the Emerson

String Quartet; participants in Experiment 1b listened to the third movement of Mozart’s String

Quartet No. 2 (K. 155) as performed by the Amadeus String Quartet. Both excerpts introduced

46 In order not to exceed the five-minute time limit, I chose a recording that did not repeat the exposition.


participants to the style of music used in the study and exemplified the grouping of phrases into

clearly differentiated sections.

Table 5.1: Musical Stimuli Characteristics for Experiments 1a and 1b

                     Bartók, No. 4,    Bartók, No. 4,    Mozart, No. 19,   Mozart, No. 21,
                     mvmt 3            mvmt 5            mvmt 4            mvmt 2
Experiment           1a                1a                1b                1b
Tempo marking        Non troppo lento  Allegro molto     Allegro           Andante
Time (in m:ss)       5:12              5:05              5:22              4:01
Performers47         Emerson String    Emerson String    Emerson String    Amadeus String
                     Quartet           Quartet           Quartet           Quartet
Number of Measures   71                392               419               73
Meter                4/4               2/4               2/4               3/4

Coding Procedure

The most practical way to accommodate variations in subject response time was to

examine only responses made in predetermined time “windows” in each movement.48

I analyzed

each movement and created two types of windows: Type 1 Windows, coinciding with a

meaningful arrival feature, and Type 2 Windows, determined by changes in the musical surface.

Listeners were free to indicate endings at any point, and I do not mean to suggest that these

windows are the only correct places to respond. The windows occur at locations whose musical features made them likely dividing points, allowing me to explore how a chosen set of features predicts listener responses while simplifying the data analysis.
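The window-coding step just described can be sketched as follows. The window boundaries and response timestamps here are hypothetical, for illustration only; the actual analysis used the windows defined from the scores and recordings:

```python
# Sketch of the window-coding step: a participant response (a timestamp in
# seconds) counts only if it falls inside one of the predetermined windows.
# Window boundaries and responses below are hypothetical.

def code_responses(windows, responses):
    """windows: list of non-overlapping (start, end) pairs in seconds.
    responses: list of response timestamps in seconds.
    Returns a count of responses per window; out-of-window responses are dropped."""
    counts = [0] * len(windows)
    for t in responses:
        for i, (start, end) in enumerate(windows):
            if start <= t < end:
                counts[i] += 1
                break  # windows do not overlap, so stop at the first match
    return counts

windows = [(10.0, 13.5), (24.2, 26.0), (41.7, 45.1)]
responses = [11.2, 25.0, 25.4, 30.0, 44.9]
print(code_responses(windows, responses))  # -> [1, 2, 1]; 30.0 falls outside all windows
```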

Type 1 Windows begin with the onset of the beat containing the last note of a segment

and continue until the beginning of the next musical segment, whether it is a section, phrase,

or subphrase. It would have been interesting to see if participants responded immediately to the

end of a formal unit or if they instead waited until the beginning of the next formal unit to make

a decision, but because I was unable to determine the length of time between the listener’s

47

The Bartók recordings are performed by the Emerson String Quartet on the CD Bartók: The String

Quartet (1988). Mozart’s String Quartet No. 19 is also performed by the Emerson String Quartet in their 2005

album Mozart String Quartets K. 465 “Dissonance”, 458 “The Hunt” & 421. Unfortunately, the Emerson Quartet

did not record the slow movement from Mozart’s String Quartet No. 21, so the recording used here was performed

by the Amadeus String Quartet in their 1988 recording Mozart: The String Quartets.

48 Alternatively, I could have followed Krumhansl (1996), who smoothed out responses over a two-beat window in her data analysis of a similar data set. Because I am examining more features than Krumhansl and am not working from a MIDI source, using predetermined windows is a better option.


perception of an ending and his/her subsequent response, I included both the ending and

subsequent beginning in a single window. Type 2 Windows do not correspond with an ending,

but instead occur at some sort of change in the musical surface as defined by the “change

features” described below. I created these windows to begin with the onset of the beat before the

change occurs and to continue for one to five beats, depending on the tempo (some windows

are shorter to avoid overlapping with another window).

Window beginnings always coincide with the beginning of the beat, even when the last

melodic note of the phrase begins off the beat (for instance, when Mozart includes a suspension).

Because listeners might conceivably respond to the harmonic arrival as opposed to the melodic

resolution, Type 1 Windows begin at the initiation of the goal harmony. When the melody is

presented in canon, listeners might respond to the melodic ending in the leading voice, so in

these cases the window begins with the last note of the leading voice. In all such cases, the

imitation was temporally close, and the ending of the following voice fell within the same

window.

Windows vary in size, both within a movement and between movements. Table

5.2 presents the number of windows found in each piece as well as the average duration of these

windows (measured both in seconds and in beats). This variation in window length will not

affect the data analysis. While the windows are useful for identifying which features are present,

they are also useful for determining the extent to which the data is hierarchically constructed, as

well as for measuring the consistency of the responses between listenings.

Table 5.2: Window Construction in Each Movement

Movement    Number of windows   Average duration of        Average number of beats in
                                windows (high/low)         each window* (high/low)
Bartók 3    43                  3.53 s (13.7 s / 1.39 s)   2.77 beats (10 / 2)
Bartók 5    94                  1.87 s (5.2 s / .69 s)     4.76 beats (12 / 2)
Mozart 19   77                  1.61 s (3.11 s / .60 s)    4.01 beats (7 / 2)
Mozart 21   40                  2.73 s (5.14 s / .92 s)    2.65 beats (5 / 1)

*not including the last window in each movement, which lasted until all the sound faded out
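The summary statistics in Table 5.2 are mechanical to compute from a list of window durations. A minimal sketch (the durations below are invented, and, per the table’s note, the final fade-out window is excluded):

```python
# Sketch of the Table 5.2 computation: mean, maximum, and minimum window
# duration, excluding the last window (which lasted until the sound faded out).
# The durations below are hypothetical.

def window_stats(durations):
    """durations: window lengths in seconds, in movement order.
    Returns (mean, high, low) over all windows except the final one."""
    core = durations[:-1]  # drop the final fade-out window
    mean = round(sum(core) / len(core), 2)
    return mean, max(core), min(core)

durations = [2.0, 1.5, 3.0, 0.9, 12.0]  # seconds; the 12.0 s window is the fade-out
print(window_stats(durations))  # -> (1.85, 3.0, 0.9)
```

The same routine applied to the per-beat counts yields the table’s right-hand column.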

From these movements, I chose a set of musical features that could influence listener

segmentation and catalogued the presence of these features in each window. There are two

categories of features: features that define musical endings (arrival features) and features


corresponding to a change in the musical surface (change features). Each composer has a

different set of features, defined in Tables 5.3 and 5.4.49

Once defined, most of the musical

features can simply be catalogued from the music, but some rely on analytic interpretation.

Table 5.3: Arrival and Change Features in Bartók

Arrival Features
  Intervallic Direction
    Descent; Ascent: Melodic line has at least a two-note descending or ascending figure
  Intervallic Approach
    -1/-2; -3/-4; -5/-7; +1/+2: The distance in semitones to the final melodic pitch of a segment measured from the previous melodic pitch
  Duration Change
    Compared to the preceding melodic sound, the last note of the segment is longer or shorter
  Cadences
    Falling fourth (4th); Falling third (3rd); Low-high Chord (LH); Single Chord (1); Double Chord (2): Defined by common ending gestures in these two movements50

Change Features
  Silence
    Complete Silence; Melodic Silence; Non-Melodic Silence: Silence in the entire texture, just the melody, or in at least one non-melodic instrument
  Orchestration Changes
    New Instrument; New Melodic Instrument: A new instrument joins the texture or a new instrument performs the melody
  Other Changes
    Register Change: The melodic line leaps up or down an octave
    Dynamic Change: The melody is performed louder or softer
    Ostinato Change: The underlying ostinato changes in pitch content or rhythmic figuration

Most of the arrival features are pitch-centered structural features: the approach to the last

melodic note, the scale-degree of the last melodic note and its harmonic support (Mozart only),

and the presence of a cadential paradigm (defined by the repertoire). Duration change is included

in this category (as opposed to the change features) because a change in duration can punctuate

the end of a segment (so-called “durational closure”).51 As stated in Chapter 3, listeners have learned through previous experience with music which specific features define the end of a

musical unit, and when an expectation for an ending gesture is fulfilled, the listener experiences

anticipatory or arrival closure. While this study does not explicitly ask about closure, an

expectation-based view of closure would suggest that listeners more familiar with a particular

49 These lists are not exhaustive; other features could also influence responses to a musical segmentation task.

50 The labels in parentheses are used throughout this chapter to designate specific cadential gestures in the Bartók. See the discussion regarding Examples 5.4–5.10 for more detail about these cadences.

51 See Narmour (1990) and Joichi (2006).


repertoire, or even the typical cadential gestures of a style, will consistently respond to these

features. These end-defining features roughly correspond with a “goal,” such as 1̂, the tonic triad,

or the conclusion of a cadential gesture. EST predicts that our understanding of goals and

intentions helps us segment life experience on a coarser grain. Musical “goals,” which are merely

a metaphor resulting from a misattribution of the positive emotions experienced when a listener

makes a correct prediction, still have perceptual salience.52

Since determining the “goal” of a

musical unit is highly subjective, especially in the Bartók movements, which do not conform to a

widely shared syntax, these end-defining features as a group do not necessarily signify goals, but

a listener could interpret some of them as goals.

Table 5.4: Arrival and Change Features in Mozart

Arrival Features
  Scale-Degree
    1̂; 3̂; 5̂; and 2̂ or 7̂: Scale-degree of the last melodic note of the musical segment as defined by the local tonal context
  Harmony/Harmonic Progression
    I; V; V7: Last harmony of a musical segment as defined by the local tonal context
    V-I; x-V: Motion into the last harmony of a musical segment as defined by the local tonal context
  Intervallic Direction
    Descent; Ascent: Melodic line has at least a two-note descending or ascending figure into the final note of a segment
    Leading Tone Ascent to Tonic
  Steps/Embellished Steps
    Step Descent; Step Ascent: Melodic line has three or more notes descending or ascending by diatonic step
    Embellished Step Descent; Embellished Step Ascent: Melodic line has three or more notes descending or ascending by diatonic step with surface embellishment
  Duration Change
    Compared to the preceding melodic sound, the last note of the segment is longer or shorter
  Cadences
    PAC; IAC; HC; Evaded Cadence: Defined by tonal cadential paradigms

Change Features
  Silence
    Complete Silence; Melodic Silence; Non-Melodic Silence: Silence in the entire texture, just the melody, or in at least one non-melodic instrument
  Orchestration Changes
    New Instrument; New Melodic Instrument: A new instrument joins the texture or a new instrument performs the melody
  Other Changes
    Register Change: The melodic line leaps up or down an octave
    Dynamic Change: The melody is performed louder or softer

52 As explored in Chapter 3, Huron (2006) suggests that the feeling of finality or repose associated with closure is an artifact of correctly anticipated endings. These expectations result from learned transitional (first-order) probabilities.


In both Mozart movements, the local tonal area (not the overall key of the movement)

determines scale-degree designations and harmonic and cadential labels. For instance, in the

exposition from the C-major “Dissonance” Quartet, the STA turns from G major (V) to tonicize E♭ major (♭VI in the context of G major)—see Example 5.1. Even though there is no cadence in E♭, the I-V-I progression in that key clearly implies E♭ as a local tonic in this short

passage. Therefore, the melodic G4 that ends the opening subphrase of the longer phrase is

interpreted as 3̂.

Example 5.1: Mozart, String Quartet No. 19, fourth movement, mm. 89–93
The Type 1 Window is annotated on the score with a solid box.

In both the Mozart and the Bartók analysis, the directional approach into the last melodic

note of a musical unit is easily catalogued from the musical surface—either a descent or ascent.

In the case of Example 5.1, there is a melodic descent into 3̂ from the preceding B♭. Specific

ordered pitch intervals in the Bartók movements are also determined from the musical surface.

These particular intervals were chosen because of their prevalence at endings in both movements

and roughly correspond to a downward leap, a smaller downward skip, and a stepwise ascent or

descent.

In the Mozart analysis, the step-progression feature captures only a surface stepwise ascent or descent, but often a mid-range step progression is decorated with embellishing tones. After

removing these tones, if the resulting melodic line moves three diatonic steps up or down in its

approach to the last note of a musical segment, then it is classified as an embellished step

progression. The two annotated examples below show cadential arrivals from the fourth

movement of Mozart’s String Quartet No. 19. Example 5.2 (mm. 67–70) shows an approach to a


PAC in G major, which arrives on the downbeat of m. 69. The annotations highlight members of

the underlying step-progression (which is embellished by a series of escape tones) by circling

notes involved in the stepwise descent. The final note of the phrase is approached from above,

but does not appear to include three diatonic steps leading to the cadence until a layer of

embellishment is removed, so this phrase ending only has a melodic descent and an embellished

step descent. In addition to both of those features, Example 5.3 (mm. 76–78) also has a stepwise

descent as the sixteenth notes cascade down to the cadential G4 in the first beat in m. 77. In this

case, the embellished descent connects the B4 and A4 in m. 76 with the final G4 of the phrase.

Example 5.2: Mozart, String Quartet No. 19, fourth movement, mm. 67–70
The Type 1 Window is annotated on the score with a solid box.

Example 5.3: Mozart, String Quartet No. 19, fourth movement, mm. 76–78
The Type 1 Window is annotated on the score with a solid box.

Cadential gestures vary between repertoires. In the Mozart movements, cadences are

defined by standard harmonic and melodic paradigms. A perfect authentic cadence (PAC) is

narrowly defined as a root-position V-I progression ending with 1̂ in the melody, which is


contrasted with the more broadly defined imperfect authentic cadence (IAC)—a cadential V-I

progression in which either chord (or both chords) may not be in root position or, more likely,

the melody does not conclude on 1̂. A half cadence (HC) ends a formal unit with a V chord. An

evaded cadence is not technically a cadence; rather, it is the denial of cadential expectation. Here

this term encompasses both the deceptive cadence (typically a V-vi progression) and a weakened

authentic cadence. An evaded PAC is illustrated in Example 5.4, where Mozart denies a perfect

authentic arrival in m. 16. Instead of landing conclusively on a root-position tonic harmony,

bass voice slips to a I6 chord, the cello revisits the melody from mm. 13–14, and the top voice

takes up an accompanimental texture. The cadential expectation set up by the dominant chord in

m. 15 is finally resolved in m. 19.

Example 5.4: Mozart, String Quartet No. 21, second movement, mm. 15–20
The Type 1 Windows are annotated on the score with a solid box.

The Type 2 Windows are annotated on the score with a dashed box.

To clarify, the presence of these harmonic and melodic paradigms does not necessarily

signify a cadence. I agree with Caplin’s 2004 definition of cadence in the Classical style, which

describes cadence as a syntactic ending to mid-level formal units. In Caplin’s words, “a cadence

must end something” (56), and that something is usually a phrase. However, the moment of

cadential articulation may or may not coincide with the conclusion of a phrase, an issue explored

in more detail below.

The cadential gestures in the Bartók movements are not representative of a wider

twentieth-century style or even of cadences in Bartók’s own oeuvre. So, in this context, a

cadence is a movement-specific gesture that concludes a mid-level formal unit. In the third


movement, a descending fourth gesture concludes many of the variations. This melodic figure is

usually presented with a long-short durational pattern; see m. 21 in Example 5.5 (the phrase ends

on the third beat of m. 21). A lesser-used cadence in this movement is a descending minor third

(illustrated in Example 5.6), which is used extensively in the fifth movement (Example 5.7).

Also in the fifth movement, Bartók uses a multi-voiced chord in all the instruments as a

concluding gesture, and this assumes several guises throughout the movement. At times it is

presented as a single chord, as in the cadential arrival in m. 332 (Example 5.8). More often, a

held lower note precedes the single chord. This lower note is usually presented in unison, for

instance in mm. 280–281 (not serving a cadential function in this passage), but it also takes other

forms, as in m. 75 (see Examples 5.9 and 5.10). Another variation of this cadential type is

articulating the chord twice following a long held note (mm. 283–284, also in Example 5.10).

String Quartet No. 4 by Béla Bartók

© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.

Reprinted by Permission.

Example 5.5: Bartók, String Quartet No. 4, third movement, mm. 20–23 (Falling fourth — 4th)
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.



Example 5.6: Bartók, String Quartet No. 4, third movement, mm. 40–41 (Falling third — 3rd)
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Windows are annotated on the score with a dashed box.


Example 5.7: Bartók, String Quartet No. 4, fifth movement, mm. 235–239 (Falling third — 3rd)
The Type 1 Window is annotated on the score with a solid box.



Example 5.8: Bartók, String Quartet No. 4, fifth movement, mm. 330–332 (Single chord — 1)
The Type 1 Window is annotated on the score with a solid box.


Example 5.9: Bartók, String Quartet No. 4, fifth movement, mm. 74–76
Single chord preceded by lower dyad — LH-1
The Type 1 Window is annotated on the score with a solid box.



Example 5.10: Bartók, String Quartet No. 4, fifth movement, mm. 279–284
Single chord preceded by single pitch class — LH-1 (not a cadence here);
Double chord preceded by single pitch class — LH-2
The Type 1 Window is annotated on the score with a solid box.
The Type 2 Window is annotated on the score with a dashed box.

For duration changes, the notated rhythmic value of the last note of a segment had to be

longer or shorter than the note immediately preceding it. A similar procedure was used to

determine a change in register or dynamics, where the feature is present if there is a notated

octave leap or change in dynamics. For practical reasons I did not interpret either feature,

determining both through a score-based analysis. Although the leap to the F♯ following the

cadence in Example 5.3 lands on an embellishing tone that resolves upward to the G, it is not

interpreted as a register change because on the surface of the music it is a major seventh leap, not

an octave leap. Dynamic change was determined through a score-based analysis, relying only on

the composer’s written directions for dynamics, which the performers conveyed faithfully. While

there are varying degrees of change for all three features, I decided to treat them as binary

features, indicating only whether such a change was present. The last two features, register and

dynamic change, usually do not correspond to a musical ending; instead, they belong to the

second category of features—ones that contribute to an acoustic change.

Along with register and dynamic change, the other features (silence, texture change, and

orchestration change) indicate some change in the musical surface and can possibly elicit


retrospective closure by indicating a new beginning or the space between an ending and a

subsequent new beginning. Even though listeners were instructed in this study to indicate

musical endings, they might not realize an ending had occurred prior to the onset of a new

beginning. EST, and more specifically the segmentation study by Zacks, Speer, and Reynolds

(2009), suggests that an increased number of changes in a stimulus will correlate with an

increased likelihood of segmentation, especially on a coarser grain of segmentation.

The absence of sound is one of these acoustic changes that could influence the

segmentation task. I distinguish complete silence from melodic silence and from non-melodic

silence, although certainly these categories are interrelated. Both melodic and non-melodic

silence occur in moments of complete silence, so I reserve these terms for instances in which not

all instruments are silent. Referring back to Examples 5.2 and 5.5, complete silence occurs

immediately following the cadence (the black line in the Bartók example indicates a caesura),

while only melodic silence occurs after the PAC in Example 5.4. Melodic silence can occur

simultaneously with non-melodic silence, though. Following the PACs in Examples 5.3 and 5.4,

at least one non-melodic instrument temporarily drops out, thinning the texture. In contrast to

silence, the addition of an instrument thickens the texture. This change can coincide with a

change in orchestration, where the new instrument becomes a melodic force (as in Example 5.4,

m. 16).

For each window, I recorded whether a particular feature was present. Arrival features

always occur at the beginning of a window coinciding with the last note of a musical segment

(Type 1 Window) or the last note before a change (Type 2 Window). Change features occur later

in the window; in a Type 1 Window they follow the ending and coincide with the new beginning,

while they occur before the second beat is completed in a Type 2 Window.
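The resulting data set amounts to one row of binary feature flags per window. The sketch below illustrates this coding with hypothetical window entries and a small subset of the feature labels from Tables 5.3 and 5.4:

```python
# Sketch of the feature catalogue: each window is coded as a set of binary
# flags marking which arrival and change features are present.
# The feature subset and window codings below are hypothetical.

ARRIVAL = {"descent", "ascent", "duration_change", "PAC", "IAC", "HC"}
CHANGE = {"complete_silence", "melodic_silence", "new_instrument",
          "register_change", "dynamic_change"}

windows = [
    {"type": 1, "features": {"descent", "duration_change", "PAC", "complete_silence"}},
    {"type": 2, "features": {"dynamic_change", "new_instrument"}},
]

def count_features(window):
    """Return (number of arrival features, number of change features) in a window."""
    feats = window["features"]
    return len(feats & ARRIVAL), len(feats & CHANGE)

for w in windows:
    print(count_features(w))  # -> (3, 1) then (0, 2)
```

Counts like these make it easy to test whether windows rich in change features attract more coarse-grained responses, as EST predicts.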

The presence of a musical ending was determined through my own analysis, which

reflects a more complex interaction between the various arrival features (going beyond merely

cataloguing their presence), and my analysis also represents a possible well-formed hierarchical

grouping structure for each movement. I analyzed each movement for the location of subphrase,

phrase, and section endings, which determined the placement of Type 1 Windows.53

Since I am

admittedly bringing my own musical experience and bias into this study, I will briefly outline

53 The presence of these formal endings was also coded with the other features.


how I created a three-level grouping structure by determining the end of subphrases, phrases,

and sections for each movement.

Because I decided that all grouping structures must be well-formed, as defined by

Lerdahl and Jackendoff (1983), and I was working with a limited vocabulary for data analysis

purposes, some of my analytic designations use these terms—especially “subphrase”—in non-

traditional ways.54

While the hierarchical relationships between sections, phrases, and subphrases

remain constant between composers, some of the defining features of these units vary. Also,

determining the type of hierarchical ending represents my own analytic interpretations (even

more so than evaluating the features already mentioned), and does not represent an objective

measure of the phrase structure.

For the Mozart compositions, my definition of a “phrase” conforms to current analytic

understanding of this term: a formal musical unit consisting of a beginning, middle, and end,

most of the time concluding with a cadential gesture. Following Caplin (2004), who disentangles

phrases and cadences (allowing for phrases to exist without cadential punctuation), a phrase is

not necessarily completed immediately following a cadential gesture, for a phrase also

encompasses any phrase extensions that follow the cadence. I use “subphrase” to describe formal

units smaller than a phrase, and in order to have a well-formed hierarchical analysis, a phrase

must be divided into subphrases either completely or not at all. I use this designation for both

parts of the presentation as well as for the continuation of a sentential formal structure, for legs in

a sequence, for the material between the cadential arrival and the end of a phrase (i.e., an

external phrase extension), and for introductory material preceding the beginning of a phrase

(i.e., a prefix), although the “subphrase” label might not be completely apt in every case. While I

could have expanded my analytic vocabulary, acknowledging each feature individually, grouping

these features together under “subphrase” simplifies the data analysis. A “section” describes

formal units larger than a phrase. Not every unit larger than a phrase received the designation of

a section; instead, sections unite areas of the movement that share the same formal function, the

same key, and related melodic ideas.

54. A list of Lerdahl and Jackendoff’s well-formedness rules can be found in Chapter 4, Table 4.1.

Formal designations in Bartók are more open to interpretation since there is not a widely agreed-upon definition of “phrase” in this repertoire. Given that cadential arrival, as narrowly understood from common-practice style, is absent, I decided to analyze these works using a top-down approach, starting with large sections. I first divided each movement into large sections,

grouping together music that is thematically related, shares the same pitch-collection, and

implies the same formal function. I then divided the sections into phrases, units that seemed to

have a beginning, middle, and end. In many cases these units ended with a shared musical

gesture that could arguably be described as “cadential” because of its prevalence at endings.

Finally, if there seemed to be any internal divisions in the phrase, I then further divided the

phrases into subphrases. As in the Mozart analysis, “subphrase” was also used to describe

music that functions as a prefix or suffix.

Participant Procedure

After giving informed consent, participants were assigned to one of two conditions,

which determined the starting task. In order to differentiate between the coarse and fine

segmentation tasks while avoiding unfamiliar technical vocabulary, I described the tasks to all

participants using a linguistic analogy. The directions stated:

In language, sentences group together to form paragraphs. The same is true in music,

where smaller sentence-like phrases are combined to form larger paragraph-like sections.

In this task, you will hear the same piece of music four times. The first time, you will

press the SPACEBAR every time you hear the end of a PARAGRAPH-LIKE SECTION.

Because you may change your mind about the location of the boundaries, you will repeat

this activity in the second listening. In the third and fourth listenings, you will indicate

the end of sentence-like phrases.55

After the participants read the instructions and were given the opportunity to ask

questions, they began with a practice task on a short excerpt before segmenting the actual

stimuli. Participants in Experiment 1a listened to the first 49 measures of the first movement of

Bartók’s String Quartet No. 4, and participants in Experiment 1b listened to the third movement

of Mozart’s String Quartet No. 2 (K. 155). The first time through, they performed the

segmentation task as dictated by their condition. If participants performed the task in a manner

that demonstrated understanding of the instructions (designating between 8 and 15 fine divisions

or between 3 and 7 coarse divisions) they continued on to segment the practice excerpt again

using the other grain of division. Subjects who did not achieve the needed number of responses received feedback and additional instruction before performing the same segmentation task again. After the minimum requirements were met in both tasks, subjects were given the opportunity to ask any additional questions before moving on to the actual test.

55. This order was changed in the other condition.

While listening to the entire movement, participants in the coarse segmentation condition

first indicated event boundaries delineating groups of phrases and formal sections, while

participants in the fine segmentation group first indicated boundaries for shorter events, such as a

phrase or subphrase. Because segmenting music in real time could be a difficult task, the

participants immediately repeated the task before switching conditions and performing the other

task on the same movement. Thus, all participants listened to each movement four times,

performing both the coarse and fine segmentation tasks twice on each movement. Everyone

listened to both movements through headphones and indicated event boundaries by pressing the

space bar on a computer, which recorded the times for these key presses. Participants in

Experiment 1a listened to the third and fifth movements of Bartók’s String Quartet No. 4, while

participants in Experiment 1b listened to the fourth movement of Mozart’s String Quartet No. 19

(K. 465) and the second movement of Mozart’s String Quartet No. 21 (K. 575).56

Following this

task, participants completed a questionnaire documenting musical experience and familiarity

with the compositions.

Results

Each subject had four trials with each movement, which I will identify as Fine 1, Fine 2,

Coarse 1, and Coarse 2 (the number indicates whether it was the first or second time the

participant performed that particular segmentation task on the given movement). Since there is

no limit to the number of responses an individual could make, nor can I reliably determine

listener response time (i.e., the time between a given feature and the key press), the data analysis

only examines presses that occurred within the predetermined windows. The dependent variable

is a binary variable indicating whether the participant responded within a particular window

(scored as 1) or not (scored as 0).

I used two different types of mixed-models regressions in this analysis section. Both regression types take into account the fact that individual participants may respond differently during the task, and both allow for the assessment of variables such as whether the participant was a musician or a graduate student, along with the starting tempo (fast or slow) and starting segmentation task (fine or coarse) for each participant. Most analyses used a mixed logit model (i.e., a mixed-models logistic regression) to analyze the binary response variable. This regression predicts the odds of a participant responding to a feature of the stimulus (or any other independent variable), and this information is conveyed by the odds ratio (OR). An odds ratio of 1.0 indicates that the odds of a response are the same whether a specific feature or variable is present or not. Odds ratios greater than 1.0 indicate that the odds of a response increase when the feature is present, while odds ratios less than 1.0 indicate that the odds decrease when the feature is present. Odds ratios cannot be less than zero, but they have no upper bound. The other type of mixed-models regression was used to analyze continuous dependent variables, such as response time. This regression does not produce an odds ratio, but rather a series of coefficients showing the weight of each variable on the outcome as predicted by the regression equation.

56. The order in which the movements were heard was counterbalanced between participants.
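As a concrete illustration of how an odds ratio relates to a logit coefficient, the helper below exponentiates a coefficient and its Wald interval. The function is my own sketch, not the software used in the study; the input numbers are the Bartók Fine 1 row reported later in Table 5.9.

```python
import math

# Sketch: an odds ratio is exp(coefficient), and a 95% CI comes
# from exponentiating the Wald interval on the log-odds scale.
# Numbers are the Bartók Fine 1 row of Table 5.9
# (coefficient 0.280069, standard error 0.049584).

def odds_ratio_ci(coef, se, z=1.96):
    """Exponentiate a logit coefficient and its Wald interval."""
    return (math.exp(coef),
            math.exp(coef - z * se),
            math.exp(coef + z * se))

or_, lo, hi = odds_ratio_ci(0.280069, 0.049584)
# or_ ≈ 1.323, CI ≈ (1.201, 1.458), matching the table
```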

For this study, only results significant at p < 0.05 are reported; indeed,

most of the discussion will highlight results significant at p < 0.02 to avoid over-interpreting

spurious results. Analysis of interactions will focus on the apparent influence of musical training

(specifically, whether the subjects were non-music majors, music majors, or post-graduate

musicians). For interactions significant at p < 0.02, I ran an ANOVA to determine the direction

of the interaction. In these interactions, I usually compare two groups of participants, one with a

higher level of musical training to one with a lower level of training. When I compare musicians

to non-musicians, the “musicians” group includes both graduate and undergraduate musicians,

but when I compare graduates to undergraduates, the undergraduate group includes both the

undergraduate musicians and the non-musicians (all of whom were undergraduates). I labeled the

ANOVA tables throughout the chapter with the headings “Less Musical Training” and “More

Musical Training” to distinguish between these types of groups. I interpreted the direction of the

interaction by comparing the change between the means for instances when a particular feature is

present and instances when it is not for each subject group.

For each composer, I examine how well the Fine 1 and Coarse 1 conditions predict the

Fine 2 and Coarse 2 responses respectively (i.e., within-subject consistency) and how well the


Coarse 1 and Coarse 2 responses predict the Fine 1 and Fine 2 responses (i.e., nested lower

levels). For each composer, I also observe the influence of tempo and segmentation task on

latency (i.e., the delay between the beginning of a window and the subject’s response). Then, for

each individual movement, I use a series of mixed logit regressions to explore how well the

coded arrival features, change features, and formal endings outlined earlier in this chapter predict

the responses. These data cannot conclusively indicate whether subjects are responding to these

musical features; rather, they represent the probability of a response given the presence of a set

of features.

General Results

To get an overall picture of the responses in these movements, Figures A.1 through A.4

in Appendix A tally the total number of responses associated with each beat. The red line

indicates responses in the Fine 2 trial, while the dashed purple line shows the Coarse 2 trial. The

distinct peaks and valleys in all four movements imply that listeners were responding to musical

features consistently, rather than just pressing the spacebar in a random manner. For clarity, I have labeled each peak with the measure number and the beat on which it is located. Notice that the peaks of the dashed purple line (coarse segmentation) tend to match up with the red

peaks, suggesting a nested hierarchical structure. For participants segmenting the Mozart

movements (A.3 and A.4), the peaks and valleys are more sharply articulated than they are for

the Bartók movements, suggesting more consensus among these participants. All participants

tend to take more time to indicate coarse boundaries; generally these boundaries occur slightly

later than do the boundaries in the fine condition. Overall, as expected, there are far fewer coarse

segmentation responses than fine segmentation responses.


Table 5.5: Total Number of Responses and Percentage Used in Data Analysis (Bartók)

Movement    Trial      Subject Group     Total       Average     % of responses
                                         responses   responses   in a window
Bartók,     Fine 1     Non-musicians     420         30.00       64.76%
No. 4,                 Undergraduates    186         20.67       78.49%
Mvmt. 3                Graduates         109         12.11       95.41%
                       Total             715         22.34       73.01%
            Fine 2     Non-musicians     369         26.36       65.58%
                       Undergraduates    192         21.33       71.35%
                       Graduates         104         11.56       93.27%
                       Total             665         20.78       71.28%
            Coarse 1   Non-musicians     111         7.93        88.29%
                       Undergraduates    62          6.89        96.77%
                       Graduates         48          5.33        93.75%
                       Total             221         6.91        91.86%
            Coarse 2   Non-musicians     96          6.86        93.75%
                       Undergraduates    50          5.56        96.00%
                       Graduates         44          4.89        100.00%
                       Total             190         5.94        95.79%
            Movement total               1791        55.97       77.22%
Bartók,     Fine 1     Non-musicians     453         32.36       83.00%
No. 4,                 Undergraduates    256         28.44       83.98%
Mvmt. 5                Graduates         252         28.00       88.49%
                       Total             961         30.03       84.70%
            Fine 2     Non-musicians     502         35.86       79.48%
                       Undergraduates    337         37.44       81.90%
                       Graduates         278         30.89       88.13%
                       Total             1117        34.91       82.36%
            Coarse 1   Non-musicians     146         10.43       82.88%
                       Undergraduates    92          10.22       84.78%
                       Graduates         92          10.22       89.13%
                       Total             330         10.31       85.15%
            Coarse 2   Non-musicians     121         8.64        85.95%
                       Undergraduates    71          7.89        85.92%
                       Graduates         76          8.44        80.26%
                       Total             268         8.38        84.33%
            Movement total               2676        83.63       83.74%

While the figures in Appendix A show every response made in the Fine 2 and Coarse 2

trials, I am only considering responses that fell inside one of the predetermined windows for data

analysis, discarding responses not meeting this requirement. Tables 5.5 and 5.6 show the total

number of responses in each trial, divided by subject group, and the percentage of those

responses that fell into a window. A high number of responses was retained in all four

movements. These responses indicate a general trend: as musical expertise increases, participants


make fewer responses during the segmentation task, and, a majority of the time, greater musical

expertise also correlates with a higher percentage of the responses falling in the predetermined

windows.

Table 5.6: Total Number of Responses and Percentage Used in Data Analysis (Mozart)

Movement    Trial      Subject Group     Total       Average     % of responses
                                         responses   responses   in a window
Mozart,     Fine 1     Non-musicians     713         50.93       65.08%
No. 19,                Undergraduates    496         49.60       76.01%
Mvmt. 4                Graduates         270         30.00       85.19%
                       Total             1479        44.82       72.41%
            Fine 2     Non-musicians     665         47.50       64.06%
                       Undergraduates    500         50.00       74.80%
                       Graduates         315         35.00       88.25%
                       Total             1480        44.85       72.84%
            Coarse 1   Non-musicians     218         15.57       82.11%
                       Undergraduates    184         18.40       80.43%
                       Graduates         113         12.56       69.91%
                       Total             515         15.61       78.83%
            Coarse 2   Non-musicians     121         8.64        85.95%
                       Undergraduates    197         19.70       78.17%
                       Graduates         125         13.89       82.40%
                       Total             443         13.42       81.49%
            Movement total               3917        118.70      74.44%
Mozart,     Fine 1     Non-musicians     163         11.64       76.07%
No. 21,                Undergraduates    230         23.00       88.26%
Mvmt. 2                Graduates         136         15.11       88.97%
                       Total             529         16.03       84.69%
            Fine 2     Non-musicians     316         22.57       78.80%
                       Undergraduates    230         23.00       86.96%
                       Graduates         130         14.44       94.62%
                       Total             676         20.48       84.62%
            Coarse 1   Non-musicians     128         9.14        79.69%
                       Undergraduates    82          8.20        84.15%
                       Graduates         37          4.11        78.38%
                       Total             247         7.48        80.97%
            Coarse 2   Non-musicians     126         9.00        77.78%
                       Undergraduates    73          7.30        86.30%
                       Graduates         39          4.33        79.49%
                       Total             238         7.21        80.67%
            Movement total               1690        51.21       83.55%


Using these data, the first set of mixed logit regressions demonstrates how well one set of

subject responses predicts another set of responses.57

Results indicate that participants are

consistent in their responses between trials in the same condition. Across both Bartók

movements, Fine 1 and Coarse 1 responses predict Fine 2 and Coarse 2 responses, respectively

(t(4251) = 6.19, p < 0.001, OR = 3.24 and t(4251) = 8.25, p < 0.001, OR = 20.4). The same trend

occurs across both Mozart movements (Fine: t(3757) = 5.76, p < 0.001, OR = 5.57; Coarse:

t(3757) = 12.73, p < 0.001, OR = 25.16). In both of these cases, the odds ratios suggest that

participants who respond in a particular window the first time through the piece are more likely

to respond in that window the second time through the piece. In both the Bartók and Mozart

conditions, the difference in odds ratios between the fine task and coarse task is significant,

indicating more consistency in the coarse condition, but the Bartók responses are not

significantly different from the Mozart responses.58
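The overlap criterion described in footnote 58 can be checked directly against the reported values. The helper is my own sketch; the odds ratios and confidence intervals are those given in the text.

```python
# Footnote 58's criterion, as I read it: two odds ratios differ
# significantly if neither falls inside the other's 95% confidence
# interval. Values below are the within-subject consistency results
# reported in the text.

def ors_differ(or_a, ci_a, or_b, ci_b):
    """True when neither OR lies inside the other's interval."""
    a_in_b = ci_b[0] <= or_a <= ci_b[1]
    b_in_a = ci_a[0] <= or_b <= ci_a[1]
    return not a_in_b and not b_in_a

# Bartók: fine vs. coarse consistency
bartok = ors_differ(3.24, (2.232, 4.698), 20.4, (9.969, 41.763))
# Mozart: fine vs. coarse consistency
mozart = ors_differ(5.57, (3.11, 10.00), 25.16, (15.31, 41.33))
# both True: the coarse task is the more consistent in each case
```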

Only in the Mozart condition are there two significant interactions. First, musicians in the fine condition are more likely to respond in a window within which they had responded previously. This effect is even stronger for graduates, suggesting that an increase in musical training correlates with increased consistency (see Figure 5.1). This effect appears only in the fine condition; there is no interaction in the coarse condition. Second, the starting segmentation task

influenced consistency in the coarse condition: participants who began with the fine task tended

to be more consistent in the coarse condition (Figure 5.2). This could reflect a learning effect

since the coarse condition was the third and fourth listenings for these participants, but a similar

interaction was not found in the Bartók condition. Instead, the use of consistent cadential

paradigms at the end of fine divisions in the Mozart stimuli might facilitate the formation of

larger sections, suggesting a bottom-up approach to determining formal sections.

57. In all of these mixed logit analyses, the odds ratio shows the change in the odds of a response when a given variable is present. In other words, the presence of a variable can predict the occurrence of a response. In this particular case, I am treating the presence of a response in another trial as a variable to see if one response can predict another. Used this way, “predict” is not time-sensitive: the data from a later response can predict the presence of a response in an earlier trial.

58. Odds ratios are said to be significantly different if one odds ratio does not fall within the confidence interval of the other odds ratio. In short, I am 95% confident that these ratios are different from one another. In the Bartók analysis, the confidence interval for the fine odds ratio is (2.232, 4.698) and the confidence interval for the coarse odds ratio is (9.969, 41.763); in the Mozart analysis, the confidence interval for the fine odds ratio is (3.11, 10.00) and the confidence interval for the coarse odds ratio is (15.31, 41.33).

In order for the resulting subject analysis to be hierarchically constructed, every coarse

response should correspond with a fine response, but obviously not vice versa. In both composer

conditions, Coarse 1 is a significant predictor for Fine 1, while Coarse 2 is a significant predictor

of Fine 2.59

In the Bartók analysis, Coarse 1 and 2 responses are strong predictors of Fine 1 and 2

responses (t(4251) = 7.25, p < 0.001, OR = 4.52 and t(4251) = 3.00, p = 0.003, OR = 2.54),

indicating that the fine responses are nested within the coarse responses. The odds ratio for the

first trial in each condition is slightly higher than that of the second trial, but this difference is not

significant. The Mozart analysis shows the same trend: Coarse 1 responses significantly predict

Fine 1 responses (t(28) = 4.248, p < 0.001, OR = 3.69), and Coarse 2 responses significantly

predict Fine 2 responses (t(28) = 3.04, p = 0.005, OR = 4.44). As with the Bartók analysis, the

difference between odds ratios is not significant.

For the participants in the Bartók condition, the starting segmentation task significantly

interacted with the coarse responses, where the ability of the coarse responses to predict the fine

responses varies based on the starting segmentation task. The direction of the difference is

represented by the estimated means shown in Table 5.7. In both cases, the difference between the

estimated means for subjects beginning with the coarse segmentation task is significantly higher

than for the subjects who began with the fine segmentation task, so the coarse responses are

better predictors of the fine responses when participants start with the coarse segmentation task.

These participants are therefore more likely to have their fine segmentation responses nested

within the coarse responses. Unlike in the Mozart condition where starting with the fine

segmentation task produces more consistent results, participants had an advantage in the Bartók

condition when they began with the coarse segmentation task.

The segmentation task and tempo of the movement affected participant response time

across the board. Response latency was measured from the beginning of each window to the time

at which the subject responded, and it varies significantly between different tempos and

segmentation tasks (Table 5.8). Segmentation task and tempo were coded as binary variables

where 0 represents the fine segmentation task and a fast tempo and 1 represents the coarse

segmentation task and a slow tempo. Both coefficients are positive, indicating that subjects were slower to respond in the coarse segmentation task or while listening to the movement with the slower tempo. The first result confirms the observation made previously that coarse responses tend to occur later than fine responses (refer to Appendix A, Figures A.1–A.4). The latter result is also unsurprising because the response windows are almost twice as long in the slow movements, providing the opportunity for a longer response time.

59. I did not examine whether Coarse 2 predicts Fine 1 or whether Coarse 1 predicts Fine 2 because this would compare the first trial in one condition with the second trial in the other condition.
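Because segmentation task and tempo are coded 0/1, the fixed effects in Table 5.8 yield predicted latencies by simple addition. This is only a sketch: the regression's random effects (and any participant-level terms) are omitted.

```python
# Predicted latency (ms) from the fixed effects in Table 5.8:
# intercept + coarse * seg_coef + slow * tempo_coef, where
# coarse = 1 for the coarse task and slow = 1 for the slow
# movement. Random effects are omitted from this sketch.

def predicted_latency(intercept, seg_coef, tempo_coef, coarse, slow):
    return intercept + coarse * seg_coef + slow * tempo_coef

# Bartók: fine task / fast movement vs. coarse task / slow movement
fast_fine = predicted_latency(976.716640, 1011.440535,
                              1447.353184, 0, 0)
slow_coarse = predicted_latency(976.716640, 1011.440535,
                                1447.353184, 1, 1)
# fast_fine ≈ 977 ms; slow_coarse ≈ 3436 ms
```

The roughly 2.5-second gap between the two predictions illustrates why both coefficients reach significance.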

Figure 5.1: Interactions between Subject Group and Consistency (Mozart). Each line connects the mean number of responses in the Fine 2 trial that do not occur in the same window in both trials to the mean number of responses that do occur in the same window in both trials.

Figure 5.2: Interaction between Starting Condition and Consistency (Mozart). Each line connects the mean number of responses in the Coarse 2 trial that do not occur in the same window in both trials to the mean number of responses that do occur in the same window in both trials.


Table 5.7: ANOVA Means for Interactions between Starting Task and the Nested Structure (Bartók)60

                             Start with Fine        Start with Coarse
Outcome        Feature62     Feature    Feature     Feature    Feature
variable61                   Absent     Present     Absent     Present     p-value63
Fine 1         Coarse 1      0.268      0.621       0.241      0.705       0.019
Fine 2         Coarse 2      0.324      0.687       0.232      0.745       0.008

Table 5.8: Mixed Models Regression Analysis: Latency Time

Composer   Fixed Effect    Coefficient    Standard error   t-ratio   Approx. d.f.   p-value
Bartók     Intercept64      976.716640       39.947454      24.450        27        <0.001
           Segmentation    1011.440535      170.049966       5.948      3582        <0.001
           Tempo           1447.353184      155.966148       9.280      3582        <0.001
Mozart     Intercept       1002.587521       64.547880      15.532        28        <0.001
           Segmentation     338.847945       53.545578       6.328      4500        <0.001
           Tempo            711.824574      101.598533       7.006      4500        <0.001

The next set of analyses examines whether the probability of a segmentation response

increases as the number of changes in the music increases. For this analysis, I counted the

number of changes occurring in each window. The features included in this count are: complete

silence, melodic silence, non-melodic silence, entrance of a new instrument, and a change of

register, dynamics, or ostinato.65

In the Bartók movements, the number of changes in a given

window ranges from 0–7, but windows with more than four changes are grouped together because there are relatively few windows with more than four changes, resulting in a scale from 0–4. The Mozart windows have fewer changes (ranging only from 0–4), so windows with three or more changes are grouped together, forming a scale from 0–3. The results from a mixed logit regression predicting the presence of a listener-perceived boundary from the number of changes in a window are shown in Table 5.9. A positive coefficient indicates a higher probability of a boundary, and the p-value indicates whether an increase in the number of changes is a statistically significant predictor of the observed behavior.

60. These means are not the same estimated means produced by the regression, but they still indicate the direction of the interaction.

61. Also known as the dependent variable; it is the variable that the regression predicts.

62. In this case, the means represent the number of windows in which there is a fine response but no coarse response (feature absent) and the number of windows in which there are both a fine response and a coarse response (feature present).

63. In all of the ANOVA tables in this chapter, the p-values come from the mixed-models regression, not the actual ANOVA.

64. The intercept is needed for the regression equation and does not represent anything meaningful (it is the point where the line crosses the y-axis when the other variables are not included in the equation).

65. Change of ostinato only occurred in the Bartók stimuli.
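The pooling of high change counts amounts to capping the raw count. The cap values (4 for Bartók, 3 for Mozart) are those stated above; the function itself is my own illustration.

```python
# Windows with many simultaneous changes are pooled into the top
# category: counts above 4 become 4 for the Bartók movements,
# and counts above 3 become 3 for the Mozart movements.

def cap_changes(count, cap):
    return min(count, cap)

bartok_scale = [cap_changes(n, 4) for n in range(8)]  # raw counts 0-7
mozart_scale = [cap_changes(n, 3) for n in range(5)]  # raw counts 0-4
# bartok_scale -> [0, 1, 2, 3, 4, 4, 4, 4]
# mozart_scale -> [0, 1, 2, 3, 3]
```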

Table 5.9: Mixed Logit Regression Analysis: Number of Changes

Composer  Outcome    Coefficient  Standard  t-ratio  Approx.  p-value  Odds      Confidence
          variable                error              d.f.              Ratio     Interval
Bartók    Fine 1     0.280069     0.049584  5.648    4251     <0.001   1.323221  (1.201, 1.458)
          Fine 2     0.093373     0.089227  1.046    4251     0.295    1.097871  (0.922, 1.308)
          Coarse 1   0.300257     0.081595  3.680    4251     <0.001   1.350206  (1.151, 1.584)
          Coarse 2   0.624884     0.062921  9.931    4251     <0.001   1.868029  (1.651, 2.113)
Mozart    Fine 1     0.276388     0.053886  5.129    3757     <0.001   1.318359  (1.186, 1.465)
          Fine 2     0.180791     0.109938  1.644    3757     0.100    1.198165  (0.966, 1.486)
          Coarse 1   0.440043     0.103856  4.237    3757     <0.001   1.552774  (1.267, 1.903)
          Coarse 2   0.631480     0.076264  8.280    3757     <0.001   1.880391  (1.619, 2.184)

In the Bartók responses, in three of the four trials, the amount of change significantly

predicts the subject responses. For Fine 1, Coarse 1, and Coarse 2, an increase in the number of

changes increases the chance of responding—especially for Coarse 2, whose odds ratio is

significantly greater than those of the other three trials. There is also an interaction effect for the

first three outcome variables and graduate students, illustrated in Figure 5.3, where the lines

connect the estimated means determined by an ANOVA. The slopes of the lines

representing graduates are steeper than those representing undergraduates, indicating that

graduates are more likely to respond as the number of changes increases. The two lower graphs

show that coarse responses are best predicted by four or more changes in the musical surface,

indicated by the large jump from 3 to 4 instead of the incremental rise seen in the fine responses.

The difference between the two subject groups in Coarse 2 is not significant.


Figure 5.3: Interactions between Subject Group and Number of Changes (Bartók). There is not a significant interaction between the subject groups in Coarse 2.

Now referring to the Mozart analysis, there is also a main effect for an increase in the

number of changes in three of the four trials (Fine 1, Coarse 1, and Coarse 2). The odds ratios in

both coarse trials are significantly larger than the odds ratios in the fine trials, indicating that

more musical changes are needed before a listener will respond in the coarse condition, as

compared with the fine condition. There was only one interaction between the fixed effect

(number of changes) and a subject group (graduates) occurring in the Fine 1 trial, where

graduates are less likely to respond overall. Despite the lack of this interaction effect in the other

trials, I have showed all with estimated means for all four trials for easier comparison with the

Bartók results. In the fine condition, there is an inconsistent upward slope as the number of

changes increase, while participants in the coarse condition are increasingly likely to respond as

number of changes increase. This might suggest that phrase ending analyses are less dependent


upon changes in the musical surface, and perhaps participants are paying attention to other

musical features in Mozart, while subjects are much more sensitive to changes as a demarcation

of boundaries in Bartók.

Figure 5.4: Interactions between Subject Group and Number of Changes (Mozart). The only significant difference between subject groups occurs in Fine 1.

My own grouping analysis might provide a more nuanced measure of boundary strength,

since it reflects an interaction between arrival features which I deemed meaningful for a

particular piece and change features taken from the surface of the music. To determine how well

my three-level grouping hierarchy predicts responses, I coded the windows according to the type

of ending each contained. A window containing at least a section ending was coded as 3; a

window containing a phrase ending was coded as 2; a window containing a subphrase ending

was coded as 1. This creates a rating that corresponds to the hierarchical level of the ending.
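This rating amounts to taking the highest-level ending present in each window. A minimal sketch: the numeric levels follow the text, while the assumption that a window with no ending scores 0, and the function itself, are my own illustration.

```python
# Each window is rated by the highest-level formal ending it
# contains: section = 3, phrase = 2, subphrase = 1. A window
# with no ending is assumed here to score 0.

LEVELS = {"section": 3, "phrase": 2, "subphrase": 1}

def ending_rating(endings):
    """Rate a window by its highest-level formal ending."""
    return max((LEVELS[e] for e in endings), default=0)

r1 = ending_rating(["subphrase", "phrase"])   # -> 2
r2 = ending_rating(["section"])               # -> 3
r3 = ending_rating([])                        # -> 0
```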


These designations are my own analytical interpretations, of course, but there is a main effect for

an increase in hierarchical ending level on responses in all four outcome variables in both

composer conditions (see Table 5.10). Overall, the odds ratios for the ending ratings in the coarse

condition are significantly higher, indicating a much higher probability of responding as the

hierarchical level changes from a lower level to a higher level.

In the Bartók condition, there is an interaction between graduates and the ending ratings

in the first three conditions (Figure 5.5), where graduates tend not to respond as often as

undergraduates within a window containing no ending or just a subphrase ending; conversely,

graduates are more likely to respond within a window that concludes a section. The Mozart

condition also exhibits this same interaction between the ending type and level of expertise.66

As

Figure 5.6 illustrates, graduate responses are less likely to occur without some sort of ending,

especially in the coarse condition, where graduates wait for at least a phrase ending before

responding.

Table 5.10: Mixed Logit Regression Analysis: Ending Type

Composer  Outcome    Coefficient  Standard  t-ratio  Approx.  p-value  Odds      Confidence
          variable                error              d.f.              Ratio     Interval
Bartók    Fine 1     0.583101     0.070496  8.271    4251     <0.001   1.791586  (1.560, 2.057)
          Fine 2     0.365162     0.122154  2.989    4251     0.003    1.440748  (1.134, 1.830)
          Coarse 1   0.936090     0.155891  6.005    4251     <0.001   2.549992  (1.879, 3.461)
          Coarse 2   1.160561     0.144327  8.041    4251     <0.001   3.191725  (2.405, 4.235)
Mozart    Fine 1     0.576789     0.141974  4.063    3757     <0.001   1.780312  (1.348, 2.352)
          Fine 2     0.615687     0.167034  3.686    3757     <0.001   1.850927  (1.334, 2.568)
          Coarse 1   1.050279     0.074346  14.127   3757     <0.001   2.858447  (2.471, 3.307)
          Coarse 2   1.220777     0.223261  5.468    3757     <0.001   3.389822  (2.188, 5.251)

Neither a simple count of changes nor the presence of an ending determined by a

grouping analysis can examine how particular musical features predict listener responses in each

movement. Movements were separated for these subsequent analyses because different features

may predict endings in each movement. For similar reasons, a separate analysis was run for each

of the four trials. The first set of regressions looks at the arrival features, which vary between

66 This interaction is between musicians and non-musicians in the Fine 1, Fine 2, and Coarse 1 conditions, and between graduates and undergraduates in the Fine 2, Coarse 1, and Coarse 2 conditions.

composers (refer back to Tables 5.3 and 5.4). Because some of these arrival features strongly

correlate with one another (e.g., a PAC will always occur with 1̂), they were not combined into

one large regression; instead, I ran several smaller regression analyses. The change features were

also divided into separate regressions according to their location in a window; for instance,

silence is more likely to occur following the end of a segment and before a new beginning, while

the other change features usually signify a new beginning. The final regression examines the

extent to which analytic endings (subphrase, phrase, and section) predict a perceived boundary.

In all of these analyses, the presence of a feature was coded as 1, so a positive coefficient

indicates that a musical feature predicts the segmentation responses, while a negative coefficient

means the listener is less likely to respond within a window containing the given feature.
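To make the direction of these coefficients concrete, the sketch below shows how a binary feature coded as 1 shifts the predicted response probability in a logit model (the intercept and coefficient here are hypothetical illustrative values, not fitted estimates from these data):

```python
import math

def response_probability(intercept, coef, feature_present):
    """Predicted probability of a response in a window under a
    logistic model with one binary (0/1) feature."""
    x = 1 if feature_present else 0
    return 1 / (1 + math.exp(-(intercept + coef * x)))

# Hypothetical values: baseline log-odds of -2.0 and a positive
# coefficient of 1.2 (odds ratio exp(1.2) ≈ 3.32)
p_absent = response_probability(-2.0, 1.2, False)
p_present = response_probability(-2.0, 1.2, True)
# A positive coefficient raises the response probability;
# a negative coefficient would lower it.
```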

Figure 5.5: Interactions between Subject Group and Ending Type (Bartók). There is not a significant difference between the subject groups in Coarse 2.

Figure 5.6: Interactions between Subject Group and Ending Type (Mozart). There is not a significant difference between the subject groups in Fine 1.

Experiment 1a: Bartók Results

Arrival Features: In both movements, arrival features were grouped into four separate

analyses: the interval into the last note of a segment (Type 1 Window) or the first note of a window

(Type 2 Window) and whether it was approached from below or above; change of duration; and

cadential type (which varied between movements). Tables 5.11 and 5.12 summarize the

significant results from this set of regressions. Overall, no single feature or set of features

predicts responses across both Bartók movements; instead, the features that correspond with

listener responses are movement-specific.


Table 5.11: Mixed Logit Regression Analysis: Arrival Features, Third Movement

Feature  Trial  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds ratio  Confidence interval
Intervallic Direction: Descent  Coarse 1  0.965358  0.200947   4.804  1398  <0.001  2.625727  (1.770, 3.895)
Intervallic Direction: Descent  Coarse 2  0.897002  0.130830   6.856  1398  <0.001  2.452240  (1.897, 3.170)
Intervallic Direction: Ascent   Fine 1    1.228127  0.206982   5.933  1398  <0.001  3.414828  (2.275, 5.125)
Intervallic Direction: Ascent   Coarse 1  1.136407  0.217230   5.231  1398  <0.001  3.115554  (2.034, 4.771)
Intervallic Direction: Ascent   Coarse 2  1.242533  0.216170   5.748  1398  <0.001  3.464376  (2.267, 5.294)
Intervallic Approach: -5, -7    Coarse 1  1.768610  0.243063   7.276  1392  <0.001  5.862696  (3.639, 9.445)
Intervallic Approach: -5, -7    Coarse 2  2.218797  0.210323  10.549  1392  <0.001  9.196257  (6.087, 13.894)
Intervallic Approach: +1, +2    Fine 1    1.921392  0.429727   4.471  1388  <0.001  6.830462  (2.940, 15.871)
Intervallic Approach: +1, +2    Fine 2    0.723158  0.297351   2.432  1388  0.015   2.060932  (1.150, 3.693)
Intervallic Approach: +1, +2    Coarse 1  1.363304  0.287454   4.743  1392  <0.001  3.909086  (2.224, 6.871)
Intervallic Approach: +1, +2    Coarse 2  2.108026  0.343805   6.131  1392  <0.001  8.231973  (4.193, 16.161)
Duration Change                 Coarse 1  1.049846  0.199121   5.272  1403  <0.001  2.857210  (1.933, 4.223)
Duration Change                 Coarse 2  0.875178  0.231805   3.775  1403  <0.001  2.399303  (1.523, 3.781)
Cadence: Falling 4th            Coarse 1  1.277560  0.174930   7.303  1372  <0.001  3.587874  (2.546, 5.057)
Cadence: Falling 4th            Coarse 2  1.333816  0.161271   8.271  1372  <0.001  3.795498  (2.766, 5.208)

For the third movement, most of the main effects point toward the influence of the falling

fourth cadence on listener segmentation in the coarse condition (an example of this cadence is

found back in Example 5.5). Along with a descent of five semitones, this cadential gesture also

features a duration change, where the arrival note is shorter than the preceding note. While the

main effect for an intervallic ascent, specifically by one or two semitones, is not associated with

the falling fourth cadential gesture, the melodic note in the last window of the movement is

approached by an ascending step. Almost everyone in each trial identified this window as a

boundary point. Even though this does not constitute a predefined cadential gesture, it does

illustrate how the results can be swayed by particular compositional features.

The arrival features in the fifth movement do not indicate a systematic preference for any

particular cadential gesture. There is a positive main effect for a melodic descent in two trials,

while participants are less likely to respond to a melodic ascent (note the negative coefficient).

Musicians, however, exhibit less reaction to a melodic descent; an interaction suggests that the

contour leading to the final note of a segment is not a strong indicator of coarse endings for


musicians.67

The more specific intervallic approaches illustrate a similar trend: regardless of the

interval size, a descending contour better predicts responses. Again, an interaction in the coarse

condition for the descending step suggests that this preference for downward intervals may not

hold across subject groups. While undergraduates are more likely to respond when this feature is

present, this feature does not influence graduates, supporting the conclusion that motion into the final note of a

segment becomes a weaker indicator of boundaries as musical training and segmentation grain

increase.68

Table 5.12: Mixed Logit Regression Analysis: Arrival Features, Fifth Movement

Feature  Trial  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds ratio  Confidence interval
Intervallic Direction: Descent  Fine 1     0.521713  0.150503   3.466  2806  <0.001  1.684911  (1.254, 2.263)
Intervallic Direction: Descent  Coarse 2   0.437531  0.151715   2.884  2806  0.004   1.548878  (1.150, 2.085)
Intervallic Direction: Ascent   Coarse 2  -0.887467  0.229938  -3.860  2806  <0.001  0.411697  (0.262, 0.646)
Intervallic Approach: -1, -2    Coarse 2   0.551656  0.151708   3.636  2796  <0.001  1.736125  (1.290, 2.337)
Intervallic Approach: -3, -4    Fine 1     0.587790  0.205178   2.865  2796  0.004   1.800005  (1.204, 2.691)
Intervallic Approach: -3, -4    Coarse 1   0.760485  0.228989   3.321  2796  <0.001  2.139314  (1.366, 3.351)
Intervallic Approach: -5, -7    Fine 1     1.779049  0.720338   2.470  2796  0.014   5.924219  (1.444, 24.311)
Intervallic Approach: +1, +2    Coarse 2  -1.440268  0.610440  -2.359  2796  0.018   0.236864  (0.072, 0.784)
Duration Change                 Fine 1     0.568032  0.154410   3.679  2811  <0.001  1.764791  (1.304, 2.389)
Duration Change                 Fine 2     0.622929  0.175157   3.556  2811  <0.001  1.864380  (1.323, 2.628)
Cadences: 1                     Coarse 2  -2.097373  0.804417  -2.607  2800  0.009   0.122779  (0.025, 0.594)
Cadences: 2                     Fine 2     0.755961  0.255543   2.958  2796  0.003   2.129657  (1.291, 3.514)
Cadences: L-H                   Coarse 1   1.052860  0.307220   3.427  2796  <0.001  2.865836  (1.569, 5.233)

In both fine trials, there is a main effect for duration change; this is unlike the third

movement, where duration better predicted the coarse responses. Perhaps this difference reflects

the association between the long-short rhythmic figure and the falling fourth cadential gesture in

the third movement, while the fifth movement has no consistent relationship between a particular

durational pattern and phrase endings. This is further reflected in the lack of main effects for

67 For non-musicians, the ANOVA means increased from 0.062 to 0.126 when the melody descended, compared with the smaller percent increase from 0.054 to 0.120 for musicians.

68 For undergraduates, in Coarse 1 the ANOVA means increased from 0.095 to 0.140 when the melody descended, compared with the smaller percent increase from 0.100 to 0.111 for graduates, and in Coarse 2 the ANOVA means increased from 0.067 to 0.133 when the melody descended, compared with the smaller percent increase from 0.072 to 0.093 for graduates.

cadential gestures, where the only positive main effects include the double chord gesture (Fine 2)

and the low-high succession (Coarse 2). There was an interesting interaction in the first fine trial

involving the double chord cadential gesture and musical expertise: the ANOVA means indicate

that all subjects are more likely to respond at a double chord cadence, but musicians demonstrate

a higher percent increase with the presence of this feature, and graduates show an even higher

percent increase.69

Change Features: Another set of mixed logit regressions calculated the odds ratios for

the change features in each movement. Tables 5.13 and 5.16 summarize the main effects for

these features. In both movements, participants tend to respond to silence fairly consistently,

especially in the coarse trials, but other change features vary between movements. For instance,

register, dynamic, and ostinato changes tend to influence the results more in the third movement

than in the fifth movement.

Complete silence and a thinning of the texture tend to predict responses in the third

movement, and there is an interaction between subject groups and non-melodic silence, where a

thinning of the texture influences graduates (who are overall less likely to respond) more than

undergraduates. (All interactions for this set of features are located in Table 5.14.) Another

feature that usually follows endings, melodic silence, significantly predicts an absence of a

response in the coarse condition. This feature usually occurs at subphrase divisions and in the

middle of musical segments in this movement, which is reflected in these results. For instance,

consider the cello melody from mm. 6–35 (Example 5.11). In this example all the windows are

marked in mm. 19–29 and these windows are annotated with the percentage of participants who

responded in each window (Coarse 1 trial only). These low-performing silences may be too short

to evoke a boundary, or another feature, like phrase length or melodic content, may be

influencing the participants’ performance at this grain of segmentation.

69 For non-musicians, the ANOVA means increased from 0.288 to 0.379 at a double chord cadence, compared with the larger percent increase from 0.054 to 0.120 for musicians; for undergraduates, the ANOVA means increased from 0.271 to 0.400, compared with the larger percent increase from 0.017 to 0.047 for graduates.

Table 5.13: Mixed Logit Regression Analysis: Change Features, Third Movement

Feature  Trial  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds ratio  Confidence interval
Complete Silence     Coarse 1   1.551360  0.371348   4.178  1393  <0.001  4.717880   (2.277, 9.776)
Complete Silence     Coarse 2   2.194610  0.280773   7.816  1393  <0.001  8.976497   (5.174, 15.572)
Melodic Silence      Coarse 1  -0.797169  0.227236  -3.508  1393  <0.001  0.450603   (0.289, 0.704)
Non-melodic Silence  Fine 1     2.180737  0.311925   6.991  1393  <0.001  8.852828   (4.801, 16.326)
Non-melodic Silence  Fine 2     1.485596  0.228470   6.502  1393  <0.001  4.417596   (2.822, 6.916)
Non-melodic Silence  Coarse 1   1.477900  0.425064   3.477  1393  <0.001  4.383729   (1.904, 10.093)
Non-melodic Silence  Coarse 2   2.521323  0.451601   5.583  1393  <0.001  12.445049  (5.131, 30.186)
New Instrument       Coarse 1   1.113262  0.339207   3.282  1398  0.001   3.044271   (1.565, 5.923)
New Instrument       Coarse 2   1.818096  0.128201  14.182  1398  <0.001  6.160116   (4.790, 7.922)
New Mel. Instrument  Fine 1     0.998206  0.302832   3.296  1398  0.001   2.713410   (1.498, 4.915)
New Mel. Instrument  Coarse 1   0.874641  0.366009   2.390  1398  0.017   2.398015   (1.169, 4.917)
Register             Fine 2     0.362799  0.154054   2.355  1393  0.019   1.437347   (1.062, 1.945)
Register             Coarse 1   0.868092  0.304006   2.856  1393  0.004   2.382360   (1.312, 4.326)
Register             Coarse 2   0.775337  0.297488   2.606  1393  0.009   2.171325   (1.211, 3.892)
Dynamics             Fine 1     0.973550  0.206354   4.718  1393  <0.001  2.647326   (1.766, 3.969)
Dynamics             Coarse 1   1.336720  0.204935   6.523  1393  <0.001  3.806536   (2.546, 5.691)
Dynamics             Coarse 2   2.513043  0.484093   5.191  1393  <0.001  12.342428  (4.774, 31.907)
Ostinato             Coarse 2   0.664673  0.227824   2.917  1393  0.004   1.943855   (1.243, 3.039)

Table 5.14: ANOVA Means for Interactions in Change Feature Analysis, Third Movement

Outcome variable  Feature  Group with more training  Less training (absent / present)  More training (absent / present)  p-value
Fine 1    Lose Instr.  Graduate  0.298 / 0.632  0.150 / 0.556  0.004
Fine 2    Lose Instr.  Graduate  0.280 / 0.545  0.147 / 0.525  0.003
Coarse 1  Lose Instr.  Graduate  0.075 / 0.439  0.007 / 0.434  <0.001
Coarse 2  Silence      Graduate  0.086 / 0.598  0.089 / 0.278  0.001
Fine 1    New Instr.   Graduate  0.356 / 0.446  0.158 / 0.500  0.001
Fine 2    New Instr.   Graduate  0.327 / 0.395  0.152 / 0.481  0.013
Coarse 2  New Melody   Graduate  0.112 / 0.326  0.070 / 0.472  0.013
Fine 1    Register     Graduate  0.355 / 0.478  0.185 / 0.506  0.010
Fine 2    Register     Graduate  0.316 / 0.459  0.164 / 0.543  <0.001


String Quartet No. 4 by Béla Bartók

© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.

Reprinted by Permission

Example 5.11: Bartók, String Quartet No. 4, third movement, mm. 6–35 (cello). The Type 1 Windows are annotated on the score with a solid box.

The Type 2 Windows are annotated on the score with a dashed box.

[Response percentages annotated on the score: 9%, 2%, 6%, 69%, 3%, 3%, 0%, 3%]

On the other hand, all participants strongly respond in the coarse condition to the

presence of complete silence, which usually is reserved for the ends of phrases and sections. In

Coarse 2, however, the effect is tempered for graduates, indicating that not every complete

silence indicates a boundary in this condition. Returning to the cello melody in Example 5.11,

complete silence is marked in the texture by a short, black vertical line. Table 5.15 lists all the

points of silence in this short passage followed by the percentage of undergraduate and graduate

responses in the Coarse 2 trial. Only after m. 34, which doesn’t have complete silence, does the

texture change and a new melody enters, initiating what I consider a new section.70

On their second time performing the coarse segmentation task, graduates may have learned enough not to

be “tricked” by the complete silence before the end of this section.

Table 5.15: Percentage of Responses at Complete Silence in Coarse 2, Third Movement

Window Location Undergraduates Graduates

m. 13 78% 22%

m. 21 74% 11%

m. 34 39% 89%

The last two sets of regressions examine how changes usually associated with new

beginnings affected listener responses. All of the changes have a main effect in at least one of the

coarse conditions, but the two strongest predictors in this condition are the introduction of a new

instrument and a change in dynamics. There is no main effect for the entrance of a new

instrument in the fine segmentation task, but there is an interaction between this feature and

graduates, who tend to respond more to this cue than undergraduates. In contrast, the entrance of

a new melodic instrument significantly predicts responses in the fine condition, but the

interaction remains the same: in Coarse 2, graduates are more likely to respond to this cue.71

In sum, both features, which sometimes roughly coincide, significantly predict listener responses,

especially in the coarse conditions. Among the other change features, dynamic change is

especially predictive of coarse segmentation (producing a significantly higher odds ratio in

70 Even though complete silence is not indicated in the score, the performers insert a little space between beats 2 and 3 in m. 34.

71 There is also an effect for this feature in Fine 2 (which is not included in the chart): t(1398) = 2.01, p = 0.044, OR = 2.83.

Coarse 2), and there is both a main effect and interaction for a change in register, especially for

graduates, for whom this feature has a pronounced effect in the fine condition.

For the features present between musical segments in the fifth movement, complete

silence consistently predicts listener responses, especially in the coarse condition (notice the

significantly higher odds ratios), while non-melodic silence only predicts responses in the coarse

condition (see Table 5.16). Although there is no main effect for melodic silence, this feature is

involved in a couple of interactions (Table 5.17). In Fine 1, musicians are more likely to respond

to melodic silence, whereas non-musicians show no reaction. The interaction in the Coarse 1 trial

reveals a different trend: participants are less likely to respond to melodic silence,

undergraduates more so than graduates. This suggests that, for trained musicians, melodic silence

is sufficient for a fine boundary, but not necessarily for a coarse boundary.

Table 5.16: Mixed Logit Regression Analysis: Change Features, Fifth Movement

Feature  Trial  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds ratio  Confidence interval
Complete Silence     Fine 1     0.976257  0.265263   3.680  2801  <0.001  2.654503  (1.578, 4.465)
Complete Silence     Coarse 1   1.910363  0.319696   5.976  2801  <0.001  6.755540  (3.610, 12.641)
Complete Silence     Coarse 2   2.114830  0.305034   6.933  2801  <0.001  8.288176  (4.558, 15.070)
Non-melodic Silence  Coarse 1   0.610535  0.170981   3.571  2801  <0.001  1.841417  (1.317, 2.575)
Non-melodic Silence  Coarse 2   1.190035  0.146008   8.151  2801  <0.001  3.287195  (2.469, 4.376)
New Instrument       Fine 1    -0.502388  0.187898  -2.674  2806  0.008   0.605084  (0.419, 0.874)
New Instrument       Coarse 1  -1.102216  0.214054  -5.149  2806  <0.001  0.332134  (0.218, 0.505)
New Instrument       Coarse 2  -0.665164  0.157920  -4.212  2806  <0.001  0.514189  (0.377, 0.701)
Dynamics             Coarse 2   1.024097  0.277650   3.688  2801  <0.001  2.784581  (1.616, 4.798)

During the third movement, the introduction of a new instrument predicts a listener

response, but during the fifth movement we observe the opposite effect. This could be an effect

of the more complex contrapuntal texture of the fifth movement compared with the texture of the

third movement. This more complex texture could also explain the lack of main effects for the

change features that mark the beginning of a new musical segment. Dynamic change in the

Coarse 2 trial produced the only main effect; however, an interaction reveals that this is almost

entirely attributable to graduates.


Since Bartók’s music did not necessarily conform to an established syntax, the arrival

features that listeners used to decide upon boundaries vary between pieces. Surprisingly, though,

silence is the only change feature that is consistently used as a boundary marker for both

movements. More specific feature interactions might have been concealed by this broad

overview: for instance, listeners may only respond when a certain combination of change and

arrival features is present. My future research will pursue this avenue, but my own grouping

analysis can function as a simplification of these interactions, since these features influenced my

decisions. I will return to this point after examining the influence of arrival and change features

in Mozart.

Table 5.17: ANOVA Means for Interactions in Change Feature Analysis, Fifth Movement

Outcome variable  Feature  Group with more training  Less training (absent / present)  More training (absent / present)  p-value
Fine 1    Mel. Silence  Musician  0.296 / 0.302  0.258 / 0.301  0.007
Coarse 1  Mel. Silence  Graduate  0.112 / 0.084  0.109 / 0.085  0.008
Fine 2    Dynamic       Graduate  0.314 / 0.334  0.259 / 0.368  0.007
Coarse 1  Dynamic       Graduate  0.072 / 0.146  0.044 / 0.181  0.021

Experiment 1b: Mozart Results

Arrival Features: While the features that best predicted responses in the Bartók excerpts

were fairly evenly divided between arrival features and change features (especially in the third

movement), arrival features, such as melodic scale degrees and cadential figures, become highly

predictive of responses in the Mozart stimuli. Tables 5.18 and 5.21 list the main effects for the

arrival features, which were grouped into six different regressions comparing similar features:

scale-degrees, harmony/harmonic progression, intervallic direction, step progressions, duration,

and cadences. While some of these features are also explored in the Bartók analysis, a majority

of these features are only associated with endings in the tonal style.

For Mozart’s String Quartet No. 19, melodic 1̂, 3̂, and 5̂ are all highly predictive of

listener responses, especially in the coarse condition (notice the high odds ratios). While it may

seem strange to have such a high odds ratio for 5̂, in this movement, the three largest sections—

exposition, development, and recapitulation—all end with 5̂ in the soprano. There is not a

consistent main effect for 1̂ in the fine condition because non-musicians tend not to respond to

this feature while musicians are likely to respond (see the interaction table: Table 5.19). There

are additional interactions between the other scale degrees and subject group in the fine

condition: all participants—especially musicians—are less likely to respond to 3̂ in the melody,

whereas all participants—especially non-musicians—are more likely to respond to 5̂. These

results suggest that musicians are more sensitive to scale degrees, reserving most of their

responses for cadences with 1̂ in the melody.

Table 5.18: Mixed Logit Regression Analysis: Arrival Features, No. 19

Feature  Trial  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds ratio  Confidence interval
Scale Degrees: 1̂        Fine 1     1.018028  0.409848   2.484  2492  0.013   2.767730   (1.239, 6.183)
Scale Degrees: 1̂        Coarse 1   3.680713  0.647215   5.687  2496  <0.001  39.674674  (11.151, 141.161)
Scale Degrees: 1̂        Coarse 2   2.834016  0.612873   4.624  2496  <0.001  17.013647  (5.115, 56.592)
Scale Degrees: 3̂        Coarse 1   2.762380  0.611132   4.520  2496  <0.001  15.837484  (4.778, 52.500)
Scale Degrees: 3̂        Coarse 2   1.755356  0.528397   3.322  2496  <0.001  5.785508   (2.053, 16.306)
Scale Degrees: 5̂        Fine 1     1.344350  0.422937   3.179  2492  0.001   3.835691   (1.674, 8.791)
Scale Degrees: 5̂        Coarse 1   4.561242  0.738300   6.178  2496  <0.001  95.702250  (22.498, 407.095)
Scale Degrees: 5̂        Coarse 2   3.738453  0.577161   6.477  2496  <0.001  42.032921  (13.553, 130.356)
Harmony/Harmonic Progression: I    Coarse 2  1.563465  0.638883  2.447  2496  0.014  4.775340  (1.364, 16.715)
Harmony/Harmonic Progression: V7   Coarse 2  1.848644  0.617211  2.995  2496  0.003  6.351204  (1.893, 21.306)
Harmony/Harmonic Progression: V-I  Coarse 2  1.561891  0.638213  2.447  2500  0.014  4.767829  (1.364, 16.667)
Harmony/Harmonic Progression: x-V  Coarse 2  1.464869  0.597402  2.452  2500  0.014  4.326977  (1.341, 13.962)
Intervallic Direction: Descent     Fine 2    0.967142  0.308031   3.140  2493  0.002   2.630415  (1.438, 4.812)
Intervallic Direction: Ascent      Fine 1    1.770332  0.473346   3.740  2493  <0.001  5.872802  (2.321, 14.858)
Intervallic Direction: Ascent      Fine 2    1.732640  0.432869   4.003  2493  <0.001  5.655566  (2.420, 13.217)
Intervallic Direction: Ascent      Coarse 1  1.678095  0.396861   4.228  2493  <0.001  5.355344  (2.459, 11.662)
Intervallic Direction: LT-Tonic    Fine 1   -1.576431  0.407705  -3.867  2493  <0.001  0.206711  (0.093, 0.460)
Steps/Emb. Steps: Descent          Fine 1    0.641761  0.204140   3.144  2488  0.002   1.899823  (1.273, 2.835)
Steps/Emb. Steps: Descent          Fine 2    0.663260  0.209690   3.163  2488  0.002   1.941109  (1.287, 2.928)
Steps/Emb. Steps: Descent          Coarse 2 -0.663897  0.240564  -2.760  2492  0.006   0.514841  (0.321, 0.825)
Steps/Emb. Steps: Emb. Ascent      Fine 1    0.771459  0.287521   2.683  2488  0.007   2.162920  (1.231, 3.801)
Steps/Emb. Steps: Emb. Ascent      Fine 2    0.811606  0.264175   3.072  2488  0.002   2.251521  (1.341, 3.780)
Duration Change                    Coarse 1  0.980611  0.106508   9.207  2503  <0.001  2.666084  (2.164, 3.285)
Duration Change                    Coarse 2  1.083713  0.220590   4.913  2503  <0.001  2.955633  (1.918, 4.554)
Cadences: Evaded                   Coarse 2 -1.733752  0.646037  -2.684  29    0.012   0.176621  (0.047, 0.662)


Harmonic progression only had a significant effect in Coarse 2, suggesting that harmonic

goals only influence segmentation on a coarse grain. The increased likelihood of responding

following a dominant harmony is a bit surprising, but an interaction effect shows that musicians

are less likely than non-musicians to respond when that feature is present. Structural features of

the movement, like the standard 5̂ over V at the end of the development, may also account for

this result.

The approach to the last note of a segment is also significant, but not consistent between

trials. A descending stepwise melodic line into the last note predicts responses in both fine trials,

but a more general descending melodic contour only predicts responses in Fine 2. The opposite

effect occurs in the coarse condition, where participants are less likely to respond to a descending

stepwise line—a feature that in this particular movement is associated more with cadential

articulations within the exposition and recapitulation than with the ends of these sections. A

significant interaction shows that non-musicians are mostly responsible for this effect in the

coarse condition; musicians exhibit no change based on the presence or the absence of this

feature. Particular compositional characteristics of this movement might account for this result:

the approach to 5̂ at the end of the exposition, development, and recapitulation (before the coda)

is not a stepwise descent, and the entire movement concludes with an ascending gesture, 7̂–1̂ in

the melody.

There is a strong main effect for an ascending motion into the last note of a segment, and

an interaction shows that graduates are slightly more likely than undergraduates to respond to

this feature in the fine condition. On the other hand, while an embellished ascending line also

predicts the fine responses, graduates are much less likely to respond than undergraduates when

this feature is present. This might reflect the tendency for the less-experienced musicians to

perceive a boundary in mm. 29–30 (see Example 5.16) and other similar passages, in contrast to

more experienced musicians.72

This moment of silence interrupts the ongoing phrase and is

preceded by an embellished stepwise ascent that concludes on 5̂ (supported by a dominant

harmony). Participants who are listening for the arrival of harmonic and melodic “goals”

probably would not perceive a boundary at this point, but participants who are responding to

72 For instance, within this particular window in Fine 1, 11% of graduates responded, 70% of undergraduate musicians responded, and 93% of non-musicians responded.

changes in the surface might. This trend is verified by the negative main effect for the leading

tone. While listeners do not indicate endings in Fine 1 for the motion from 7̂ to 1̂, an interaction

reveals that graduates are much more likely to respond to this feature.73

In general, increased

expertise seems to elevate arrival features that project a harmonic or melodic goal.

Table 5.19: ANOVA Means for Interactions in the Arrival Feature Analysis, No. 19

Outcome variable  Feature  Group with more training  Less training (absent / present)  More training (absent / present)  p-value
Fine 1    1̂             Musician  0.452 / 0.467  0.346 / 0.524  <0.001
Fine 2    1̂             Musician  0.364 / 0.446  0.381 / 0.549  0.019
Fine 1    3̂             Musician  0.461 / 0.440  0.438 / 0.275  0.004
Fine 1    5̂             Musician  0.430 / 0.582  0.390 / 0.504  0.001
Coarse 2  V             Musician  0.179 / 0.116  0.202 / 0.088  0.007
Coarse 2  x-V           Musician  0.165 / 0.157  0.199 / 0.137  0.018
Coarse 2  Step Descent  Musician  0.185 / 0.106  0.173 / 0.165  0.009
Fine 1    Ascent        Graduate  0.441 / 0.573  0.306 / 0.424  0.001
Fine 2    Ascent        Graduate  0.408 / 0.518  0.364 / 0.528  0.004
Fine 1    Emb. Ascent   Graduate  0.449 / 0.596  0.342 / 0.256  0.002
Fine 2    Emb. Ascent   Graduate  0.419 / 0.508  0.410 / 0.322  0.003
Fine 1    LT-Tonic      Graduate  0.469 / 0.458  0.307 / 0.506  <0.001
Fine 2    LT-Tonic      Graduate  0.430 / 0.440  0.371 / 0.605  0.002
Fine 1    Duration      Musician  0.434 / 0.503  0.344 / 0.532  0.007
Fine 1    PAC           Graduate  0.460 / 0.491  0.358 / 0.246  <0.001
Fine 2    PAC           Graduate  0.431 / 0.430  0.416 / 0.345  0.010
Coarse 1  PAC           Graduate  0.169 / 0.200  0.128 / 0.044  0.001

Surprisingly though, none of the cadences were predictive at the p < 0.02 level,74

but there are interactions involving cadences and graduates in almost every trial. While

undergraduate responses show little (Fine 2) or no (Fine 1 and Coarse 1) influence from a PAC,

graduates consistently are less likely than undergraduates to respond in a window with a PAC

(note the decreasing means at the bottom of Table 5.19 for graduates).75

73 This particular arrival feature might account for graduates responding more to an ascending motion.

74 There is a main effect for the PAC in Coarse 1: t(2465) = 2.24, p = 0.025, OR = 1.54.

75 This trend continues in the Coarse 2 trial, but the interaction is only significant at p = 0.046, meaning the difference between the two groups has a greater likelihood of occurring by chance.

This seems counterintuitive given the previous results that indicated that tonal structural features predict

endings. As will be discussed in more detail below, this movement had a large number of

external phrase extensions (i.e., extra material following the cadence but occurring before the

end of the phrase). Graduates, who are presumably more familiar with Mozart’s style, would be

more likely to wait until the end of the phrase extension (which extends the cadence) before

indicating a response.76

Furthermore, some of the PACs in this movement aren’t as strong as

other PACs. Perhaps musicians are choosier than non-musicians about which PACs indicate

a boundary, as indicated in Table 5.20. This table displays the percentage of responses for three

windows in the passage from mm. 70–87. Graduates do not respond at all to the weaker PAC in

m. 73, while everyone responds in greater numbers to the cadential arrival in m. 77. Following

this arrival is a ten-measure phrase extension prolonging the tonic harmony. At the conclusion of

this passage, everyone is much more likely to indicate a boundary.

Table 5.20: Percentage of Responses at PACs in Fine 2, No. 19

Window Location Undergraduates Graduates

m. 73 29% 0%

m. 77 50% 56%

m. 87 67% 89%

Table 5.21 reveals no main effects for scale degree at p < 0.02 in String Quartet No. 21;

however, 1̂ does significantly predict the fine endings.77

Since there was no subject group

interaction present with this feature (see Table 5.22), there might be a feature interaction, where

subjects respond to 1̂ only when another feature is present. The significant main effect in the fine

condition for the presence of a tonic chord also suggests that this may be the case. There is also a

main effect for the presence of a dominant chord in this condition, but the odds ratio for the tonic

chord is twice that of the dominant.78

Harmonic motion into a tonic harmony is only significant

in Coarse 1, but the presence of a V-I progression correlates with a much higher response rate for

76 I only noted the cadential arrivals in my coding even though the phrase extensions carry the cadential function until the end of the phrase.

77 In Fine 1, t(1201) = 2.00, p = 0.046, OR = 4.46, and in Fine 2, t(1201) = 2.06, p = 0.039, OR = 8.01. Despite a high odds ratio for this feature, the large standard error reveals the high variability in the data, increasing the p-value.

78 Due to the large standard error, these ratios aren’t significantly different.

the graduate subject group in the fine condition; this is due mostly to a higher baseline for the

undergraduates, implying that undergraduates are less discriminating in their responses (the same

pattern occurs in Coarse 2, but with the musicians subject group—see Table 5.22). While

musicians are more likely to respond to a V-I progression, they are less likely than

non-musicians to respond to a progression that terminates on the dominant in the fine condition.

However, in Coarse 2 the opposite is true: both groups again tend not to respond to an ending on

the dominant, but this effect is lessened for musicians, meaning that they are more likely than

non-musicians to respond. This may relate to the main effect of the HC in the coarse condition,

suggesting that cadential goals influenced segmentation more in the coarse condition than in the

fine condition.
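The caveat in footnote 78, that two large odds ratios with large standard errors need not differ significantly, can be checked with an approximate Wald test on the difference between the two logit coefficients (a sketch that treats the two estimates as independent, which is an approximation; the values are the tonic- and dominant-chord Fine 1 rows of Table 5.21):

```python
import math

def coef_difference_z(b1, se1, b2, se2):
    """Approximate z-statistic for the difference between two
    independently estimated logit coefficients."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Tonic chord (Fine 1) vs. dominant chord (Fine 1), Table 5.21
z = coef_difference_z(2.783041, 0.849425, 2.123094, 0.794719)
# |z| falls well below 1.96, so the two odds ratios are not
# significantly different despite one being roughly twice the other
```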

Unlike the previous movement, where there were no main effects for the presence of the

cadence, participants more consistently responded to cadential articulation in Mozart’s String

Quartet No. 21. Across the board, PACs are significant, and in both coarse conditions there was

also a main effect for the IAC and HC, although the odds ratio for the HC is significantly lower

than for the PAC. This movement has considerably fewer phrase extensions, so a majority of the

time the cadence coincides with the end of the phrase.

This is the only movement that exhibits a consistent main effect for a stepwise descent.

Although a general downward approach to an ending is not significant, a stepwise descent, even

when embellished, predicts increased responses across all four trials.79

This may reflect the

melodic construction of this piece, where clearly defined subphrases end with a stepwise descent

(m. 2) or an embellished stepwise descent (m. 4) (see Example 5.12). As in the previous

movement, a general upward contour, including the motion from the leading tone to the tonic,

significantly predicts the absence of a response in two of the trials. Again, musicians are less

likely than non-musicians to perceive an ending when the line ascends, but they are more likely

than non-musicians to perceive an ending when the leading tone ascends to tonic. A stepwise

ascent is predictive in three trials, while an embellished stepwise ascent is only significant in

Fine 2 (an example of an embellished stepwise ascent appears in mm. 5 and 6 of Example 5.12).

79 An interaction reveals that this effect for a descending stepwise line, however, is absent for graduates in the Coarse 2 trial, where this feature does not influence their responses compared with those of the undergraduates.


Table 5.21: Mixed Logit Regression Analysis: Arrival Features, No. 21

Feature (group)          Trial      Coefficient  Std. error  t-ratio  Approx. d.f.  p-value  Odds Ratio  Confidence Interval

Harmony/Harmonic Motion
  I                      Fine 1      2.783041    0.849425     3.276   1215           0.001   16.168114   (3.054, 85.594)
  I                      Fine 2      2.793356    1.131480     2.469   1217           0.014   16.335743   (1.774, 150.403)
  V                      Fine 1      2.123094    0.794719     2.672   1215           0.008    8.356952   (1.757, 39.739)
  V-I                    Coarse 1    1.257265    0.407899     3.082   1213           0.002    3.515791   (1.579, 7.827)
Direction
  Ascent                 Coarse 1   -1.720381    0.545702    -3.153   1209           0.002    0.178998   (0.061, 0.522)
  LT-Tonic               Fine 2     -2.448702    0.816184    -3.000   1206           0.003    0.086406   (0.017, 0.429)
  LT-Tonic               Coarse 2   -2.937525    0.957314    -3.069   1209           0.002    0.052997   (0.008, 0.347)
Steps/Embellished Steps
  Descent                Fine 2      1.507373    0.284820     5.292   1201          <0.001    4.514854   (2.582, 7.895)
  Descent                Coarse 1    1.058496    0.309171     3.424   1203          <0.001    2.882034   (1.571, 5.286)
  Descent                Coarse 2    1.463821    0.323258     4.528   1203          <0.001    4.322443   (2.292, 8.150)
  Ascent                 Fine 1      1.418066    0.567440     2.499   1201           0.013    4.129126   (1.356, 12.571)
  Ascent                 Fine 2      1.218496    0.394620     3.088   1201           0.002    3.382096   (1.559, 7.336)
  Ascent                 Coarse 1    1.616638    0.553836     2.919   1203           0.004    5.036132   (1.699, 14.928)
  Embellished Descent    Fine 1      1.184301    0.331546     3.572   1201          <0.001    3.268403   (1.705, 6.264)
  Embellished Descent    Fine 2      2.014063    0.294312     6.843   1201          <0.001    7.493704   (4.206, 13.350)
  Embellished Descent    Coarse 1    1.228347    0.453447     2.709   1203           0.007    3.415580   (1.403, 8.315)
  Embellished Descent    Coarse 2    0.712774    0.298505     2.388   1203           0.017    2.039641   (1.136, 3.664)
  Embellished Ascent     Fine 2      1.240258    0.399695     3.103   1201           0.002    3.456504   (1.578, 7.572)
Cadence
  PAC                    Fine 1      1.592021    0.484712     3.284   1201           0.001    4.913670   (1.898, 12.718)
  PAC                    Fine 2      2.365207    0.487669     4.850   1201          <0.001   10.646245   (4.089, 27.716)
  PAC                    Coarse 1    2.058855    0.255570     8.056   1203          <0.001    7.836993   (4.747, 12.939)
  PAC                    Coarse 2    1.955235    0.270974     7.216   1204          <0.001    7.065582   (4.152, 12.024)
  IAC                    Fine 2      2.449530    0.991296     2.471   1201           0.014   11.582905   (1.656, 81.000)
  IAC                    Coarse 1    1.848909    0.699153     2.644   1203           0.008    6.352888   (1.612, 25.044)
  IAC                    Coarse 2    1.854143    0.767237     2.417   1204           0.016    6.386224   (1.417, 28.774)
  HC                     Coarse 1    1.314523    0.414939     3.168   1203           0.002    3.722974   (1.649, 8.403)
  HC                     Coarse 2    1.419039    0.484763     2.927   1204           0.003    4.133147   (1.597, 10.699)
  Evaded Cadence         Fine 2     -1.821850    0.699290    -2.605   1201           0.009    0.161726   (0.041, 0.638)
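The odds ratios and confidence intervals reported in these tables follow directly from the logit coefficients and standard errors. As a check on the relationship described in footnote 78 (a large standard error widens the interval and inflates the p value even when the odds ratio is high), here is a minimal sketch; the critical value 1.962 is an assumption, roughly the two-tailed t value for about 1215 degrees of freedom, and it reproduces the tabled intervals to within rounding:

```python
import math

def odds_ratio(coef):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(coef)

def wald_ci(coef, se, crit=1.962):
    """Approximate 95% Wald confidence interval for the odds ratio.
    crit=1.962 is an assumed two-tailed t critical value for the
    roughly 1215 degrees of freedom in Table 5.21."""
    return math.exp(coef - crit * se), math.exp(coef + crit * se)

# First row of Table 5.21: tonic harmony (I), Fine 1 trial.
coef, se = 2.783041, 0.849425
print(round(odds_ratio(coef), 3))  # 16.168, matching the tabled 16.168114
print(wald_ci(coef, se))           # close to the tabled (3.054, 85.594)
```

Applying `odds_ratio` to the Evaded Cadence coefficient (-1.82185) likewise recovers the tabled 0.161726, illustrating that negative coefficients correspond to odds ratios below 1, that is, to features predicting the absence of a response.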


Table 5.22: ANOVA Means for Interactions in the Arrival Feature Analysis, No. 21
(Means are shown as feature absent / feature present. "Experience level" names the more-trained group in each comparison.)

Outcome    Feature        Experience level   Less Musical Training   More Musical Training   p-value
Fine 1     V-I            Graduate           0.436 / 0.607           0.236 / 0.595            0.016
Fine 2     V-I            Graduate           0.424 / 0.595           0.231 / 0.611            0.003
Coarse 2   V-I            Musician           0.143 / 0.291           0.066 / 0.237           <0.001
Fine 2     x-V            Graduate           0.580 / 0.359           0.490 / 0.208            0.011
Fine 2     x-V            Musician           0.536 / 0.366           0.569 / 0.283            0.006
Coarse 2   x-V            Musician           0.260 / 0.112           0.153 / 0.095           <0.001
Coarse 2   Step Descent   Graduate           0.110 / 0.271           0.086 / 0.091            0.006
Fine 1     Ascent         Musician           0.563 / 0.304           0.593 / 0.140            0.007
Coarse 1   LT-Tonic       Musician           0.206 / 0.024           0.153 / 0.350            0.009

Example 5.12: Mozart, String Quartet No. 21, second movement, mm. 1–8 (violin 1)
The Type 1 Windows are annotated on the score with a solid box; the Type 2 Windows with a dashed box.

Change Features: There is an overall lack of main effects for change features in the

Mozart movements, indicating that change alone is insufficient to create a sense of ending. The

amount of surface change in both movements is much less than in the Bartók movements, and

thus presumably plays a smaller role in the segmentation task. The change features were

analyzed in the same groups as in the Bartók analysis: silences, orchestration changes, and other

changes. The significant results from these analyses are shown in Tables 5.23 and 5.24. There

are fewer change features positively influencing the results compared with the Bartók analyses:

in fact, String Quartet No. 19 has positive coefficients only for complete silence and a change in

dynamic level. Surprisingly, String Quartet No. 21 does not have an effect for complete silence,

but other changes influence participants’ responses.


Table 5.23: Mixed Logit Regression Analysis: Change Features, No. 19

Feature                    Trial      Coefficient  Std. error  t-ratio  Approx. d.f.  p-value  Odds Ratio  Confidence Interval
Silence                    Fine 1      1.299193    0.232535     5.587   2493          <0.001    3.666337   (2.324, 5.785)
Silence                    Fine 2      1.110595    0.282217     3.935   2493          <0.001    3.036164   (1.746, 5.281)
Silence                    Coarse 1    1.803089    0.274909     6.559   2493          <0.001    6.068363   (3.540, 10.404)
Silence                    Coarse 2    2.011653    0.312752     6.432   2493          <0.001    7.475663   (4.048, 13.804)
Melodic Silence            Fine 1     -0.691147    0.289167    -2.390   2493           0.017    0.501001   (0.284, 0.883)
Melodic Silence            Coarse 1   -1.830045    0.532366    -3.438   2493          <0.001    0.160406   (0.056, 0.456)
Other Instrument Silence   Fine 1     -1.103671    0.177798    -6.207   2493          <0.001    0.331651   (0.234, 0.470)
Other Instrument Silence   Fine 2     -0.980578    0.191200    -5.129   2493          <0.001    0.375094   (0.258, 0.546)
Other Instrument Silence   Coarse 1   -0.907200    0.219563    -4.132   2493          <0.001    0.403653   (0.262, 0.621)
New Instrument             Fine 1     -0.975525    0.354085    -2.755   2493           0.006    0.376995   (0.188, 0.755)
New Instrument             Fine 2     -1.638651    0.426600    -3.841   2493          <0.001    0.194242   (0.084, 0.448)
Register                   Coarse 1   -0.756947    0.259056    -2.922   2498           0.004    0.469096   (0.282, 0.780)
Register                   Coarse 2   -0.674887    0.270200    -2.498   2498           0.013    0.509214   (0.300, 0.865)
Dynamics                   Fine 1      0.770116    0.179109     4.300   2493          <0.001    2.160017   (1.520, 3.069)
Dynamics                   Coarse 1    1.664430    0.293689     5.667   2498          <0.001    5.282661   (2.970, 9.397)
Dynamics                   Coarse 2    2.220171    0.317209     6.999   2498          <0.001    9.208909   (4.944, 17.154)

Table 5.24: Mixed Logit Regression Analysis: Change Features, No. 21

Feature               Trial      Coefficient  Std. error  t-ratio  Approx. d.f.  p-value  Odds Ratio  Confidence Interval
Non-melodic Silence   Coarse 1    0.861873    0.187515     4.596   1206          <0.001    2.367591   (1.639, 3.420)
New Instrument        Fine 2     -0.903296    0.293516    -3.078   1211           0.002    0.405232   (0.228, 0.721)
New Melody            Fine 2      0.610553    0.255521     2.389   1211           0.017    1.841450   (1.115, 3.040)
Register              Fine 1      0.756718    0.239491     3.160   1211           0.002    2.131269   (1.332, 3.410)
Dynamics              Coarse 1    0.971746    0.257615     3.772   1211          <0.001    2.642554   (1.594, 4.381)
Dynamics              Coarse 2    1.016433    0.248980     4.082   1211          <0.001    2.763320   (1.695, 4.504)

The negative coefficients in Mozart’s String Quartet No. 19 may reflect that participants

were using cues other than surface changes in the segmentation task, and these changes tend to

distinguish larger units. Consider, for instance, the changes of register in mm. 16 and 17 (refer to

Example 5.16): while the first follows the end of a phrase, the second does not. Accordingly,

participants are much more likely to respond in m. 16 than in m. 17 (in the Coarse 1 trial, 45% of

the participants respond in m. 16 compared with only 9% in m. 17). However, a pair of subject


group interactions show that participants with formal musical training are more likely than non-

musicians to indicate endings when the register changes.80

This probably reflects the tendency

for musicians to be more consistent than non-musicians in their responses, but does not change

the overarching trend that register change by itself probably does not influence segmentation,

especially in the coarse condition. Also, as seen in the third movement of Bartók’s quartet, a

thinning of the texture is not sufficient for listeners to perceive a boundary. An analysis examining feature

interactions would probably reveal that when these change features are combined with a

cadential gesture, participants are more likely to respond to them.

Complete silence, on the other hand, is highly predictive in both conditions—especially

in the coarse condition, where the odds ratio is significantly higher. Besides silence, only

dynamic change elicits a positive main effect, and this occurs only in the coarse condition. This

reflects a feature interaction between cadential arrival followed by the introduction of a new

theme and a change in dynamics: Mozart is much more likely to change the dynamics than to

change the register at these points. Further, in Fine 2, despite not having a main effect for

dynamics, musicians are much more likely to respond to a change in dynamics than non-

musicians, while the opposite is true in Coarse 1, when musicians are less likely to respond.81

This suggests that musicians may be more influenced by these sorts of change features at a finer

grain of segmentation, while arrival features project coarse boundaries.

The second movement of String Quartet No. 21 is the first movement examined here

where complete silence does not predict a response. This may be a result of the compositional

design of this movement. Complete silence never occurs at cadential points; instead, it only

articulates subphrase divisions in mm. 2 and 10 as well as in the corresponding points in the

return of the opening section: mm. 44 and 52. As in the previous movements, melodic silence is

still not a strong predictor of the results, but a thinning of the texture with a non-melodic silence

is significant in the Coarse 1 trial (Table 5.24). An interaction shows that graduates are more

80 For non-musicians, in Fine 2 the ANOVA means increased from 0.383 to 0.433 when the register changed, compared with the larger percent increase from 0.428 to 0.492 for musicians, and in Coarse 1 the ANOVA means increased from 0.168 to 0.206 when the register changed, compared with the larger percent increase from 0.089 to 0.170 for musicians.

81 For non-musicians, in Fine 2 the ANOVA means increased from 0.367 to 0.433 when the dynamics

changed, compared with the larger percent increase from 0.346 to 0.577 for musicians, and in Coarse 1 the ANOVA

means increased from 0.080 to 0.433 when the dynamics changed, compared with the smaller percent increase from

0.062 to 0.170 for graduates.
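The "percent increase" comparisons in footnotes 80 and 81 are relative rather than absolute changes, which is why a move from 0.428 to 0.492 counts as a larger increase than a move from 0.383 to 0.433. A minimal sketch (the helper function is hypothetical, not from the dissertation's analysis code):

```python
def pct_increase(absent, present):
    """Relative (percent) increase in mean response rate when a
    feature is present versus absent; hypothetical helper for
    reading the footnote comparisons."""
    return 100 * (present - absent) / absent

# Footnote 80, Fine 2, register change:
non_musicians = pct_increase(0.383, 0.433)  # about 13%
musicians = pct_increase(0.428, 0.492)      # about 15%, the larger increase
```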


likely to respond to this same feature in the fine condition than are undergraduates, who are less

likely to change their behavior when the feature is present.82

This is opposite from the effect

found in the previous Mozart movement, perhaps reflecting a different compositional structure

where a thinning of the texture co-varies with new sections.

The other change features that predict a segmentation response include a new melodic

instrument and changes of register or dynamics, but none of these features is consistently

significant across trials. Generally, just the introduction of a new instrument predicts the absence

of a response for all participants in the Fine 2 trial (as in the previous movement), but when the

orchestration of the melodic line changes in this same trial, participants respond more often. This

suggests that the entrance of a new instrument alone is not important for segmentation, unlike a

change of timbre in the melodic line. In this movement, like the third movement of Bartók’s

Fourth String Quartet, the violin and cello exchange the melodic role several times (usually at the

beginning of a new phrase), perhaps accounting for this result. A registral change increases

responses in the fine condition, while a dynamic change only increases responses in the coarse

condition. An interaction reveals that graduates in Fine 2 and musicians in Coarse 2 are more

likely than other subjects to respond to dynamic changes.83

As before, Mozart tends to shift

dynamics at important structural points. In this movement, the second phrase begins much louder

than the sotto voce first phrase; the contrasting B section suddenly begins softer following the

loud cadential gesture; and so forth.

In both movements, arrival features best predicted listener responses, especially for

participants with increased musical experience. As expected, many of the features associated

with tonal closure influenced segmentation. The change features, on the other hand, do not

consistently predict responses, suggesting one of two things: (1) because listeners have a familiar

musical syntax on which to base their segmentation, they are less swayed by surface changes; or

(2) listeners are influenced by surface changes, but only when these changes are combined with

82 For undergraduates, the ANOVA means increased from 0.497 to 0.504 in Fine 1 and from 0.483 to 0.496 in Fine 2 when the texture thinned, compared with the larger percent increase from 0.325 to 0.475 in Fine 1 and 0.333 to 0.465 in Fine 2 for graduates.

83 For undergraduates, in Fine 2 the ANOVA means increased from 0.478 to 0.524 when the dynamics

changed, compared with the larger percent increase from 0.319 to 0.603 for graduates. For non-musicians, in

Coarse 2 the ANOVA means increased from 0.175 to 0.296 when the dynamics changed, compared with the larger

percent increase from 0.080 to 0.346 for musicians.


another feature. These feature interactions are not captured by the current data analysis. My own

grouping analysis can represent how these features may interact in a segmentation task. As

already noted, the data corroborate the hierarchical levels identified by my grouping analysis, but

this next set of analyses explores how well one of these “ending types” (subphrase, phrase, and

section) predicts responses. Before looking at the data, I will briefly describe which musical

features contributed to my own grouping analysis.

Grouping Analysis

Bartók, String Quartet No. 4, Third Movement: As discussed previously, because the

notion of “phrase” and “subphrase” are less objective in twentieth-century repertoire than in

common-practice repertoire, I first divided the movement into sections, defined mainly by

changes in texture, melodic content, and melodic instrument. Formally, I hear this movement as

a series of varied repetitions of a theme that are organized into a larger ternary structure: the

opening A section (mm. 1–34), a contrasting B section (mm. 34–55), a return to the A material

(mm. 55–63), plus a coda that incorporates elements from both previous sections.84

The analysis of phrases, and especially subphrases, in this style is quite open to

interpretation. After dividing the movement into sections, I further divided it into eight phrase-

like units, mostly informed by changes in the sustained chord and silence in the entire texture. At

the beginning of this movement, a diatonic chord leads into the cello’s first melodic entrance in

m. 6. I interpret the first five measures as a prefix to the beginning of the phrase, and in order to

create a well-formed hierarchical structure, I coded it as a subphrase within the larger phrase.

This analysis does not capture the introductory character of the opening five measures, nor does

it convey the feeling of a beginning at m. 6. Alternatively, I could have separated the first five

measures from the material starting in m. 6 by creating two phrases; however, the first five

measures do not contain a sense of a beginning, middle, and end. This analytical choice of

interpreting the opening five measures as part of a larger phrase is consistent with my subsequent

choices to designate phrase beginnings at the chord changes throughout this movement.
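The notion of a "well-formed" hierarchical structure invoked here can be stated concretely: at each level, the subunits must tile their parent unit, with no gaps and no overlaps, from the parent's beginning to its end. A small sketch of that constraint (the function and the span encoding are illustrative, not the coding scheme actually used for the data analysis):

```python
def is_well_formed(parent, children):
    """True if the child spans tile the parent span exactly: sorted
    children are contiguous, non-overlapping, and run from the
    parent's start to its end. Spans are (start, end) pairs in
    measures, with each unit ending where the next begins."""
    if not children:
        return True  # a phrase need not divide into subphrases
    spans = sorted(children)
    if spans[0][0] != parent[0] or spans[-1][1] != parent[1]:
        return False
    return all(a[1] == b[0] for a, b in zip(spans, spans[1:]))

# Coding the five-measure prefix as a subphrase keeps the opening
# phrase (mm. 1-13) well formed:
assert is_well_formed((1, 13), [(1, 6), (6, 13)])
# Leaving mm. 1-5 outside any subphrase would create a gap:
assert not is_well_formed((1, 13), [(6, 13)])
```

The same check explains the later analytical dilemmas: material left "between phrases" fails the tiling condition, so it must be absorbed into a neighboring unit as a prefix or suffix.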

84 Other formal designations are also possible, such as sonata form and strophic construction (Bayley, 2000, 363–4). While a decision about the formal designation is not essential to this study, it is important to note the movement's range of possible interpretations.


The body of this eight-measure phrase (mm. 6–13) further divides into 4+4 subphrases.

Measure 10 introduces a new melodic gesture following an inversion of the falling fourth

cadential figure used throughout this movement to mark endings in both A sections. Other

analytic writings support this sense that m. 10 marks a division within a larger unit. Bayley notes

a division here, and recognizes that “the background continuity of a sustained harmony” is

retained “in order that the melody is perceived as an eight-bar entity” (2000, 375).85

Refer to

Example 5.11, which shows the cello melody from mm. 6–35.

The beginning of the second phrase (mm. 14–20) poses new questions regarding the

interpretation of the sustained harmony, which changes on the fourth beat of m. 13, preceding the

cello entrance on the third beat of m. 14. I could interpret this music between the presentation of

the chord and the cello melody as a prefix to the ensuing melodic entrance of the cello,

comparable to mm. 1–5. However, unlike the beginning, these three beats seem too short to

constitute a subphrase. Another alternative is to consider these three beats as belonging to neither

phrase (i.e., merely existing between phrases), but the resulting phrase analysis will not be well-

formed unless this unit is a phrase itself, which would be inconsistent with the analysis of the

opening five measures. The remaining alternative is not to divide the subphrase (mm. 13–17)

into smaller units, recognizing that the beginning of the phrase may not exactly coincide with the

melodic beginning. In this case, and for similar instances in this movement, I chose this third

alternative, which does not explicitly mark the melodic entrance. Similar questions regarding

beginnings and endings arise in the analysis of the fifth movement of this quartet.

The division of the second phrase into subphrases creates a structure similar to the first

phrase, where an inversion of the material from m. 10 motivates a subphrase division in m. 17,

beat 3. Dividing the last phrase of this section (mm. 21–35) is more difficult than dividing the

first two phrases. Bayley (2000) places her only subphrase division at m. 31, drawing a

connection between the cadential material found in m. 19, beat 4 through m. 21, beat 3 and the

material in mm. 29–30. I agree that mm. 29–30 are cadential, so I also placed a new subphrase division at m. 31. However, I interpreted the material in mm. 31–34 as a cadential extension, not as a consequent

response to the material in mm. 21–31. The unsettled character of mm. 31–34 transitions to the B

85 Bayley uses different terminology at this point. From her perspective, each of the phrases I note is in fact a period containing its own antecedent and consequent phrases (which I describe as subphrases). We agree about the boundaries of these entities, but we interpret the music on slightly different hierarchical levels.


section, which begins in m. 35. I also heard another subphrase division at the end of m. 25,

occurring around the same point in the phrase as the subphrase divisions in the first two phrases.

Even though m. 26 does not pick up the same melodic content as the subphrases that began in

mm. 10 and 17, it does repeat the opening gesture from m. 22, beat 3, clearly articulating a new

beginning. I did not further divide the phrase, despite the melodic silence in m. 27 (which, in my

opinion, just articulates the motivic repetitions as the melody becomes more fragmented).

The B section begins on the third beat of m. 34 and features less homogeneity between

the phrases than did the preceding A section. This section also has three phrases: mm. 34–41,

mm. 42–47, and mm. 47–55. The first phrase has a subphrase division on the second beat of

m. 37, when the rhythmic kernel introduced in m. 35 morphs into a more melodic rendition. Also

at this point, the texture of the harmonic accompaniment changes to a tremolo figure, further

setting this material apart. Following a strong falling third cadence, the second violin takes up

the melody in the short second phrase, with a subphrase division coinciding with the first

melodic rests in this phrase, at m. 44. The last phrase of this section (mm. 47–55) presents two

contrasting ideas, organizing the phrase into three subphrases with a loose aba construction. The

canonic presentation of a new rambunctious melodic idea interrupts the “circling around C”

melody in m. 50, thus initiating a new subphrase. This subphrase is short-lived; the phrase

returns to its previous melodic content on the third beat of m. 51.

The third section (A' section) is a single phrase in sentential structure. A variation of the

opening melody is played in inverted canon between the cello and the first violin. Each

subphrase ending is marked by the descending fourth cadential gesture in the cello. Measure 64

begins the last section of the piece, a coda that incorporates ideas from the previous sections. I

did not divide this coda into subphrases: although a listener could segment this section at one of

the abundant rests, I believe there is insufficient contrast to divide this phrase. Instead, the

rhythmic kernel from the B section spins out over the A diatonic harmony from the A section.

Most of my interpretation of the grouping structure was based on differentiation of

melodic material, a variable not included in my data analysis. These points are usually articulated

by some sort of change in the musical surface or by an arrival feature, which I included in my

data analysis. A similar interaction between change and arrival features influenced my analysis

of the grouping structure in the fifth movement.


Bartók, String Quartet No. 4, Fifth Movement: I divided this movement into eight

large sections, outlined in Table 5.25. The thematic content of this driving movement has some

similarities to sonata form, where the material in Sections 2 and 3 returns in Sections 6 and 7 as a

quasi-recapitulation after a developmental Section 5. Sections are demarcated by the introduction

of a new melodic idea, new texture, or new ostinato figure. Much of the movement, in fact,

employs some ostinato, which changes throughout the course of the piece. This ostinato, which

continues past the end of melodic gestures, sounds like a backdrop upon which the melodic

phrases are presented. Although the continuous ostinato sounding between phrase presentations

doesn’t clearly belong to the phrases on either side, I had to interpret the music between melodic

gestures either as a prefix or as a suffix in order to use only well-formed hierarchical structures

(as discussed in the context of the third movement). A pictorial representation of these

possibilities is presented in Figure 5.7.

Usually I interpreted the earliest element (whether the ostinato or the

melody) as the beginning of the phrase. For instance, the first phrase in the second section

(mm. 11–18, reproduced in Example 5.13) began with an ostinato figure, so the ostinato is a

prefix, like the hypothetical phrase diagram in Figure 5.2b. The first phrase in the third section

(Example 5.14) begins with the melody, so the accompanimental material is a suffix, like

Figure 5.2a.

Even though Section 6 recapitulates melodic material from Section 2, my interpretation

of the phrase structure shifts with the introduction of the strong descending third cadence in the

fifth section. In Section 2, phrases concluded with the end of the movement’s primary melodic

theme (Example 5.13); however, in Section 6, repeated chords from the beginning, concluding with a descending third cadence, follow the melodic theme (Example 5.7). This cadence sounds like a stronger ending than did the conclusion of the melodic theme (Example 5.15); my phrase

analysis of this section therefore looks quite different from the expositional presentation in

Section 2. Because the development marked the unison descending third as a strong cadential

gesture, its presence elevates the repeated chords to a subphrase within the phrase proper rather

than a suffix (although the limited vocabulary I used for the sake of data analysis does not

preserve the distinction between an internal subphrase and a suffix).


Table 5.25: Section divisions in Bartók, String Quartet No. 4, fifth movement

Section   Formal Function                 Measures
1         Introduction                    1–11
2         Exposition: First Theme         11–101
3         Exposition: Second Theme        102–121
4         Closing material                121–151
5         Development                     152–238
6         Recapitulation: First Theme     238–343
7         Recapitulation: Second Theme    344–374
8         Coda                            374–392

Figure 5.7: Possible Phrase Structure Analyses (panels a and b)
A complete arc in the phrase diagram represents the main content of the phrase, while the incomplete arc leaning on it represents either a prefix (if it comes before) or a suffix (if it comes after). In order to create a well-formed grouping analysis, connected arcs collectively form a single phrase.


String Quartet No. 4 by Béla Bartók

© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.

Reprinted by Permission

Example 5.13: Bartók, String Quartet No. 4, fifth movement, mm. 11–18

String Quartet No. 4 by Béla Bartók

© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.

Reprinted by Permission

Example 5.14: Bartók, String Quartet No. 4, fifth movement, mm. 102–108

My analysis of this movement was based on differentiation (as in the third movement),

but context influenced my categorizing of the prevailing ostinato as a beginning or ending. The

ways in which various musical features shaped my interpretation of the context are difficult to

capture using only the arrival and change features from the data analysis, but features that

particularly influenced my segmentation included cadential gestures, changes of ostinato, and

melodic content and repetition.


String Quartet No. 4 by Béla Bartók

© Copyright 1929 by Boosey & Hawkes, Inc. Copyright Renewed.

Reprinted by Permission

Example 5.15: Bartók, String Quartet No. 4, fifth movement, mm. 238–249

Mozart, String Quartet No. 19 in C Major (“Dissonance”), K. 465, Fourth

Movement: In my analysis of both Mozart movements, arrival features, especially those

associated with tonal paradigms, principally shaped my analysis, while formal schema and

phrase length also contributed to my interpretation. The primary tonal area (PTA) of this sonata-

form movement resembles a rounded sectional binary form (without the repeats), where the first

sixteen measures present a periodic structure followed by an eight-measure phrase group leading

into a return of the main theme (see Example 5.16). Each phrase in the opening period, an

antecedent phrase ending on the dominant and the parallel consequent concluding on tonic, can

be divided into a pair of clearly articulated subphrases, where the first three subphrases conclude

on a dominant harmony. Following the PAC in m. 16, the next eight measures form a parallel

phrase group, where each phrase begins on the tonic but quickly moves to prolong the dominant

for the remainder of the phrase. These phrases lack true cadential articulation, but the melodic

repetition suggests two phrase-like units grouped together under a higher hierarchical umbrella.

This emphasis on the dominant and the shorter phrase lengths rhetorically signify the beginning

of the binary form’s second reprise. In m. 24, the opening melody returns, but instead of

immediately concluding with a PAC, the phrase is expanded by repeating the rising fourth

gesture up a step in the first subphrase (mm. 28–29) followed by two beats of complete silence

before initiating the cadential formula.


Following the PTA, a typical independent transition begins, with a cadential arrival on

the dominant of V (a D-major chord) in m. 49, followed by a six-measure phrase extension

prolonging the cadential harmony. While the goal of this harmonic motion is achieved in m. 49, I

could not consider m. 49 to be the end of the phrase because it would have left an unacceptable

gap in the hierarchical analysis: mm. 49–54 do not constitute a phrase (there is no harmonic

motion) and the next phrase does not begin until after the medial caesura with the initiation of

the STA. Because mm. 49–54 follow the structural ending, they are not technically a subphrase,

but in order to form a nested hierarchical analysis that acknowledges the cadential arrival in

m. 49 and the formal phrase ending in m. 54, it seemed reasonable to designate both mm. 34–49

and mm. 49–54 as subphrases. Both subphrases are nested within a single larger phrase, forming

the entire transition section.

I divided the STA into three sections. The first, mm. 54–88, remains in G major (dominant of the original key). The second, mm. 88–103, tonicizes E♭ major (♭VI of G) before returning to

G for the cadence in m. 103, and this section elides into the third, which continues until the end

of the exposition (refer to Figure 5.8, illustrating the division of the entire exposition into

sections, phrases, and subphrases). In this well-formed analysis, all music must be included in a

phrase and in a section, but it need not belong to a subphrase because not every phrase is divided

into subphrases. For instance, the phrase that begins the STA (mm. 54–61) does not easily divide

into smaller units, and so the subphrase level is absent from my analysis at this point.

To this point, I have used subphrases to account for external phrase extensions, but the

subphrase level is also used to account for internal phrase expansions, including those caused by

an evaded cadence. Measures 118–135 present one such use of subphrases (see Example 5.17).

The cadential arrival occurring in m. 125 follows a deceptive motion in m. 122. This evaded

cadence initiates an internal expansion, dividing the phrase into two shorter subphrases. Three

subphrases follow the PAC in m. 125: the first two (mm. 125–129 and 129–131) extend the

phrase by repeating the cadential formula, and the third detonicizes G major with the

introduction of F♮, in preparation for the repeat of the exposition (which is not taken in the

recording I used for this study) or for the C-minor beginning of the development section. Along

with illustrating the division of the exposition into smaller groups, Figure 5.8 also shows the


location of cadential arrivals in relation to phrase endings.86

My analysis of the recapitulation

followed the model established in the exposition since the recapitulation essentially replicates the

formal construction of the exposition (except that STA2 is expanded with a deceptive motion

leading to a tonicization of ♭II in m. 306). Both remaining parts of the movement, the short

development and coda, consist of a single larger section divided into three phrases.

Example 5.16: Mozart, String Quartet No. 19, fourth movement, mm. 1–34

(violin 1 and cello)

86 I could have included additional grouping levels between the phrase level and section level since not all phrase endings are equivalent. One such example occurs in mm. 70–77, where the PAC in m. 73 is weaker than the PAC in m. 77, creating a longer eight-measure unit (not counting the phrase extension that follows). While such an addition would better reflect the grouping structure of the movement, it was omitted in order to simplify the data analysis.


Figure 5.8: Mozart, String Quartet No. 19, Fourth Movement: Grouping Analysis of the Exposition
The top line shows phrase boundaries; the bottom line shows subphrase boundaries.

Example 5.17: Mozart, String Quartet No. 19, fourth movement, mm. 118–135 (violin 1 and cello)

1 4 8 12 16 20 24 29 34 49 54 61 69 73 77 87 91 95 103 117 121 125 129 131 135 HC PAC PAC HC HC PAC (PAC) PAC PAC PAC PAC

PTA TR STA1 STA2 K


Formal schema (including typical phrase length) and arrival features (such as cadential

articulation and harmonic goals) influenced my segmentation in this movement. In my analysis,

endings were marked by typical cadential paradigms and beginnings were marked by new

melodic material or by a repetition of previous melodic material. While surface changes in

dynamics, articulations, and register may coincide with a beginning, they did not play a major

role in my grouping analysis. Given that the second movement of the String Quartet No. 21

shares the same tonal syntax and is written in the same style, I used a similar procedure for my

analysis of this movement.

Mozart, String Quartet No. 21 in D Major, K. 575, Second Movement: I divided this

andante movement into five large sections: the primary theme (mm. 1–19), transition (mm. 20–

33), secondary theme and retransition (mm. 34–42), return of the primary theme (mm. 43–61),

and closing material with a codetta (mm. 62–73). The A-major primary theme is a two-phrase

parallel period where both phrases exemplify sentential structure and the second phrase evades

an expected cadence in m. 16 before arriving on the tonic in m. 19 (refer back to Example 5.2).

As in the previous movement, I analyzed the evaded cadence as completing a subphrase (from

mm. 12–16) and initiating a new subphrase (from mm. 16–19). Each part of the sentential

structure—both parts of the presentation and the entire continuation—is also analyzed as a

subphrase.

The transition begins with a sequence that passes the melodic material between the

instruments, eventually making its way to the dominant (E Major). Despite the complete I-V-I

progression in mm. 20–23 (A major) and in mm. 24–27 (F♯ minor), I interpret every two

measures as a subphrase, synchronized with the sequential repetition of the melodic line. The

cadential arrival occurs in m. 31 on a HC in E major, but it is immediately extended by two

external phrase expansions (analyzed as subphrases) until the end of the formal phrase in m. 33.

The secondary theme in E major, which is only two phrases long, follows this transition. The

first violin plays the melody in the first phrase, ending with an IAC, and the cello picks up the

melody in m. 38, reaching a PAC in m. 41. E major is then detonicized with the introduction of a

D♮, setting up the return of the primary theme (see Example 5.18). Because this retransition is so

brief, I heard it as an extension of the phrase begun in m. 38, yet another example of the cadence

occurring before the end of the phrase. The return of the primary theme mirrors the formal


structure from the beginning, complete with an evaded cadence, and m. 62 initiates the closing

section of the piece. The second of the two phrases in this section is extended past the cadential

arrival in m. 69. The two subphrases that follow repeat the cadential pattern, rhetorically serving

as a codetta to this movement.

Example 5.18: Mozart, String Quartet No. 21, second movement, mm. 40–44

Data Analysis: While my decisions were based on many listenings and conscious

reflection, participants in this study had limited experience with these compositions and did not

have the opportunity to reflect on their segmentation decisions. Despite these differences, the

participants’ segmentation decisions significantly mirrored my own, especially as musical

expertise increased. This set of analyses investigates the extent to which analytical ending types,

defined by my grouping analysis, predict subject responses. The results from these regressions

are in Table 5.26.

Among all participants, phrase endings consistently predict responses. Sometimes this

effect is particularly strong; for instance, the coarse trials in Bartók’s third movement have

exceedingly high odds ratios. Subject group interactions reveal that the group with more musical

training is more likely to respond at phrase endings (with the exception of Bartók’s fifth

movement).43 Most of these results are driven by the less experienced musicians having a

relatively higher baseline mean for when a phrase ending is absent. This suggests that non-musicians are in less agreement with my analysis, and looking at the results as a whole, they tend to be less discriminating overall.

43 This result is inexplicable, especially since all section endings corresponded with a phrase ending and the main effect for section ending in this movement indicates that participants respond consistently to that feature.
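The odds ratios and confidence intervals in Table 5.26 follow directly from the logit-scale estimates: the odds ratio is exp(coefficient), and an approximate 95% interval is exp(coefficient ± c × standard error), where c is a critical value. A minimal sketch in Python; I assume the normal critical value 1.96 here, so the endpoints differ slightly from the table's, which presumably use a t critical value:

```python
import math

def odds_ratio_ci(coef: float, se: float, crit: float = 1.96):
    """Convert a logit-scale coefficient and standard error into an
    odds ratio with an approximate 95% confidence interval."""
    return (math.exp(coef),
            math.exp(coef - crit * se),
            math.exp(coef + crit * se))

# First row of Table 5.26 (Bartok mvmt. 3, Section, Fine 1):
# the table reports an odds ratio of 17.024731 with CI (2.604, 111.310).
or_, lower, upper = odds_ratio_ci(2.834667, 0.957010)
print(or_, lower, upper)
```

This also makes clear why some of the odds ratios in the table are so large: a coefficient of 6.25 on the logit scale exponentiates to an odds ratio over 500.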

Table 5.26: Mixed Logit Regression Analysis: Grouping Analysis

Movement (Endings)  Predictor  Trial     Coefficient  Std. error  t-ratio  Approx. d.f.  p-value  Odds ratio   95% CI
Bartók Mvmt. 3      Section    Fine 1     2.834667    0.957010     2.962   1393          0.003     17.024731   (2.604, 111.310)
Bartók Mvmt. 3      Phrase     Fine 1     1.701294    0.502734     3.384   1393          <0.001     5.481033   (2.044, 14.697)
Bartók Mvmt. 3      Phrase     Fine 2     1.917259    0.547971     3.499   1393          <0.001     6.802289   (2.321, 19.933)
Bartók Mvmt. 3      Phrase     Coarse 1   4.000502    0.690529     5.793   1393          <0.001    54.625563   (14.093, 211.732)
Bartók Mvmt. 3      Phrase     Coarse 2   6.247745    1.037728     6.021   1393          <0.001   516.845955   (67.473, 3959.08)
Bartók Mvmt. 3      Subphrase  Fine 2    -0.545557    0.214859    -2.539   1393          0.011      0.579519   (0.380, 0.883)
Bartók Mvmt. 3      Subphrase  Coarse 1  -1.310557    0.464784    -2.820   1393          0.005      0.269670   (0.108, 0.671)
Bartók Mvmt. 3      Subphrase  Coarse 2  -2.955109    0.911855    -3.241   1393          0.001      0.052073   (0.009, 0.312)
Bartók Mvmt. 5      Section    Coarse 1   2.190514    0.265271     8.258   2801          <0.001     8.939810   (5.315, 15.036)
Bartók Mvmt. 5      Section    Coarse 2   1.588419    0.248499     6.392   2801          <0.001     4.896003   (3.008, 7.968)
Bartók Mvmt. 5      Phrase     Fine 1     0.939456    0.161433     5.819   2801          <0.001     2.558589   (1.865, 3.511)
Bartók Mvmt. 5      Phrase     Fine 2     0.684197    0.187449     3.650   2801          <0.001     1.982180   (1.373, 2.862)
Bartók Mvmt. 5      Phrase     Coarse 1   0.870921    0.214951     4.052   2801          <0.001     2.389110   (1.568, 3.641)
Bartók Mvmt. 5      Phrase     Coarse 2   1.310339    0.201581     6.500   2801          <0.001     3.707429   (2.497, 5.504)
Bartók Mvmt. 5      Subphrase  Fine 2     0.315503    0.095895     3.290   2801          0.001      1.370949   (1.136, 1.654)
Bartók Mvmt. 5      Subphrase  Coarse 2  -0.604209    0.207700    -2.909   2801          0.004      0.546507   (0.364, 0.821)
Mozart No. 19       Section    Coarse 1   0.642937    0.256859     2.503   2493          0.012      1.902059   (1.149, 3.148)
Mozart No. 19       Section    Coarse 2   1.202767    0.405722     2.965   2493          0.003      3.329315   (1.503, 7.377)
Mozart No. 19       Phrase     Fine 1     1.209364    0.254545     4.751   2493          <0.001     3.351351   (2.034, 5.521)
Mozart No. 19       Phrase     Fine 2     0.907300    0.356961     2.542   2493          0.011      2.477624   (1.230, 4.989)
Mozart No. 19       Phrase     Coarse 1   1.545690    0.152140    10.160   2493          <0.001     4.691206   (3.481, 6.322)
Mozart No. 19       Phrase     Coarse 2   1.722955    0.248001     6.947   2493          <0.001     5.601058   (3.444, 9.109)
Mozart No. 19       Subphrase  Coarse 1   0.624000    0.200877     3.106   2493          0.002      1.866379   (1.259, 2.767)
Mozart No. 21       Phrase     Fine 1     1.647495    0.615075     2.679   1206          0.007      5.193953   (1.554, 17.362)
Mozart No. 21       Phrase     Fine 2     2.249056    0.595786     3.775   1206          <0.001     9.478788   (2.945, 30.508)
Mozart No. 21       Phrase     Coarse 1   2.410880    0.425501     5.666   1206          <0.001    11.143764   (4.836, 25.680)
Mozart No. 21       Phrase     Coarse 2   2.840864    0.509186     5.579   1209          <0.001    17.130561   (6.308, 46.520)
Mozart No. 21       Subphrase  Fine 1     0.920706    0.317333     2.901   1206          0.004      2.511062   (1.347, 4.680)
Mozart No. 21       Subphrase  Coarse 1   1.997244    0.378694     5.274   1206          <0.001     7.368718   (3.505, 15.491)
Mozart No. 21       Subphrase  Coarse 2   1.284160    0.476020     2.698   1209          0.007      3.611634   (1.419, 9.190)


The results for section endings depend somewhat on the tempo of the movement and the

grain of segmentation. There are not as many main effects for the end of a section in the slower

movements because these movements have fewer sectional divisions that are spread out over a

longer period of time. Participants with less musical training tend to indicate boundaries more

often (seen in the total counts of all responses in Tables 5.5 and 5.6) and this would lessen the

effect of section on the results, especially in the slow movements. The section-ending feature is

notably absent from the main effects for the slow Mozart movement, but there is an interaction

with this feature in the Coarse 1 trial. As seen in Table 5.27, both graduates and undergraduates

respond to this feature, but graduates have a higher percent change (notice the high baseline

again for the undergraduates). This interaction is replicated in the other three movements.

Table 5.27: ANOVA Means for Interactions in the Grouping Analysis
(The comparison column gives the level of musical experience for those with more training: "Musician" compares musicians with non-musicians; "Graduate" compares graduates with undergraduates.)

Movement         Outcome variable  Feature    Comparison  Less training        More training        p-value
                                                          absent    present    absent    present
Bartók Mvmt. 3   Fine 1            Section    Musician    0.350     0.893      0.247     0.847      0.01
Bartók Mvmt. 3   Fine 1            Section    Graduate    0.336     0.826      0.179     0.972      0.011
Bartók Mvmt. 3   Coarse 2          Phrase     Graduate    0.076     0.696      0.027     0.917      <0.001
Bartók Mvmt. 3   Coarse 1          Section    Graduate    0.107     0.750      0.030     0.944      <0.001
Bartók Mvmt. 3   Fine 1            Subphrase  Musician    0.354     0.464      0.191     0.463      <0.001
Bartók Mvmt. 3   Fine 2            Subphrase  Graduate    0.288     0.430      0.115     0.426      0.005
Bartók Mvmt. 5   Coarse 1          Section    Graduate    0.072     0.429      0.056     0.569      0.003
Bartók Mvmt. 5   Coarse 2          Phrase     Graduate    0.020     0.146      0.340     0.121      <0.001
Mozart No. 19    Coarse 1          Section    Musician    0.122     0.411      0.089     0.482      0.013
Mozart No. 19    Fine 1            Phrase     Musician    0.370     0.580      0.233     0.660      <0.001
Mozart No. 19    Fine 1            Phrase     Graduate    0.351     0.633      0.131     0.609      <0.001
Mozart No. 19    Fine 2            Phrase     Musician    0.306     0.518      0.266     0.690      <0.001
Mozart No. 19    Fine 2            Subphrase  Musician    0.393     0.395      0.421     0.449      0.001
Mozart No. 21    Coarse 1          Section    Graduate    0.138     0.558      0.024     0.511      0.02
Mozart No. 21    Coarse 2          Phrase     Musician    0.126     0.429      0.027     0.456      0.02
Mozart No. 21    Fine 1            Subphrase  Musician    0.363     0.536      0.184     0.573      0.003

The effect of subphrase is less clear over all four movements. Looking first at the Bartók

responses, participants rarely respond within the windows containing a subphrase division

(observe the negative coefficient), especially in the coarse condition. This suggests that listeners


tend to reject lower-level endings when asked to note higher-level endings. In the third

movement, despite the negative coefficient for the main effect of subphrase in Fine 2, musicians

are more likely than non-musicians to indicate a boundary at a subphrase division in the fine

condition. Table 5.27 illustrates that when a subphrase ending is present in the third movement

of the Bartók quartet, there is little difference between the means of the two subject groups. In

the fine condition, when a subphrase ending is not present, the group with more musical

experience has a much lower mean, resulting in a higher percent change for when the feature is

present. This indicates that the effect of subphrase is less pronounced on non-musicians than it is

on musicians, reflecting once again that non-musicians may be less discriminating in their

responses and don’t agree with my analysis to the same extent as do the musicians.

While subphrases are significant predictors only in Coarse 1 in the fast Mozart movement, there are more main effects for subphrase in the slow Mozart

movement. For both movements, there is an interaction for this feature: although the presence of

a subphrase ending does not change the non-musicians’ responses much, musicians have a higher

percent change when a subphrase ending is present. Again, this reflects a more consistent

segmentation strategy for participants with more formal training.

Discussion

The results from this experiment support the hypotheses posited by EST: (1) participants will segment a given stimulus consistently between trials; (2) different participants will segment a given stimulus consistently; (3) the resulting segmentation will form a nested hierarchical

structure; and (4) pre-existing knowledge structures will influence segmentation. The

expectations both for continuity and for goal-directed musical successions derive from these

knowledge structures. Differences between the subject groups further support this last

hypothesis. Non-musicians tend to perceive more boundaries that aren’t paired consistently with

musical features, but as musical expertise increases, participants are more likely to respond

consistently to particular features. More musical experience—both listening and performing—

would create knowledge structures that could assist in this segmentation task.44

44 For future data analysis, it might be helpful to divide subjects according to how often they indicated a

boundary, given that participants may have been segmenting on different hierarchical levels (meaning that one

listener’s phrase might be another listener’s section).


Overall, participants segmented the compositions consistently between trials, forming a

nested hierarchical structure, and participants somewhat agreed about the location of these

boundaries. There is no significant difference between the odds ratios in the Bartók and Mozart

conditions for these main effects, pointing to a shared cognitive process that is not style-specific,

although the musical features that mark boundaries differ between the styles represented by these

composers. These different features may have resulted in different segmentation strategies,

reflected in the interactions between the starting segmentation task and consistency for the

Mozart subjects and between the starting segmentation task and hierarchy for the Bartók

subjects. In the Mozart condition, participants who began with the fine segmentation task tended

to be more consistent in their coarse responses. The opposite is true for participants who

segmented the Bartók movements: these participants produced a better nested analysis when they

began with the coarse segmentation task. The aural analysis of the Mozart movements improved

when participants were instructed to listen first for fine-grained boundaries, which usually

coincide with a cadential paradigm. This type of musical syntax is missing in Bartók’s style, so

change in the musical surface apparently informed the participants’ segmentation decisions.

Participants in the Bartók condition who first divided the movements into large sections seem to

have formed a more stable representation of the movement into which their fine responses were

nested. Since Mozart’s style does not have as many surface changes differentiating larger

sections, it may have been easier for participants to group together already determined shorter

musical segments into longer sections, as opposed to dividing a longer section into shorter

phrases.

Both of these experiences of the musical structure are supported by the cognitive mechanism that, as posited by EST, guides event segmentation. First, coarse segments

are usually marked with the culmination of some sort of “goal” or a more drastic change in the

musical surface compared to the surrounding input. The Bartók analysis was influenced more

than the Mozart analysis by the number of changes in the musical surface, especially the

perception of coarse divisions. Returning to Figure 5.5 (the estimated mean response in a given

trial for the number of changes), the coarse condition shows a large increase in response rate

when there are between three and four musical changes, suggesting that the change features

highlighted in this analysis strongly influenced segmentation when present in sufficiently large


quantities. The creation of a nested structure is facilitated by first identifying these points in a

style where standard cadential gestures are not the norm. Further research suggests that memory

tends to be better at perceptual boundaries because these are the moments at which event models

update, incorporating new input from the ongoing perceptual stream (Swallow, Zacks, and

Abrams 2009). Perhaps participants in my study remembered these points better than other

moments in the music, and this assisted in the fine segmentation task.

The feature analysis reveals that when a movement has goal-directed features (such as

consistent cadential progressions), participants tend to rely on them more than on surface

changes, especially when segmenting on a coarse grain. When goal-directed features are absent,

however, an increased number of changes creates a hierarchical structure. In the Mozart results,

participants relied mainly on arrival features to mark both fine and coarse boundaries. Most of

the arrival features reflect transitional probabilities (an ending is highly likely to follow certain harmonic and melodic successions, though those successions do not always lead to an ending), which would allow participants to anticipate an ending in this repertoire

more so than in the Bartók movements. Since sections weren’t delineated by drastic changes in

the musical surface, participants had to group together shorter segments to build longer sections.

Even though these arrival features, which resemble a musical goal, predict responses in both

conditions, they are far more likely to predict responses in the coarse condition. This effect

increases with musical training, suggesting that musicians use these arrival cues consistently to

segment musical experience.

Even though the absence of standard cadential paradigms in the Bartók examples makes

goal achievement difficult to quantify, features associated with movement-specific cadential

paradigms tend to predict coarse boundaries in the third movement. In this movement, the falling

fourth cadence and other arrival features associated with this cadence (a specific duration pattern

and intervallic succession) consistently predict coarse responses.45 This can be contrasted with

Bartók’s fifth movement, where a change in duration predicts fine responses because it is not

consistently paired with a cadential goal. Although, for both composers, not every feature that

predicts a coarse response is coupled with a cadential paradigm, and cadential paradigms can also predict fine segmentation, a general trend that associates goal-directed motion with coarse divisions emerges from these data.

45 It is difficult to distinguish whether listeners were responding to these cadential gestures or to the large amount of surface change that usually followed these gestures. A follow-up study that controls for these variables in the segmented stimuli might be able to distinguish the extent to which a listener depends on arrival and change features.

EST also predicts that fine divisions occur at a change in motion, which in music would

presumably involve a change in the acoustical stream. Although a durational change predicts

fine responses in Bartók’s fifth movement, as just mentioned, no consistent effects otherwise

support this hypothesis. Because these analytical features are very general, some of the fine

division predictors may have been lost in an over-generalized picture, explaining this lack of

effect. Coding more specific feature interactions (for instance, adding features that acknowledge

a particular change in duration or register) or coding in the degree of change might yield

different results.

For all four movements, the formal divisions I previously identified tended to be the best

predictor of listeners’ responses. Of course, my own analysis did not solely rely on the presence

of a change in the musical surface; I also considered motivic and melodic repetition, conformity

to formal schema, and phrase length, among other things. This evaluation was analytically

complex rather than simplistic, and it depended more on my prior musical experience than on

individual surface features. Despite disparities in musical training and experience, listeners

evidently agreed to a remarkable extent with my formal analysis. In the fast movements, subjects

tended to corroborate my higher-level endings, while in slow movements they were more likely

to confirm my lower-level endings, perhaps revealing a general feeling of phrase length.

Participants with more musical training were even more likely to respond at my formal divisions,

presumably reflecting similarities in our musical experience and training.

This experiment demonstrates that pre-existing knowledge structures can influence

segmentation. Both expectations for continuity and arrival features derive from learned

transitional probabilities, but, as discussed in Chapter 3, the degree of finality would correlate

with the degree to which the arrival of the ending was anticipated and with the subsequent rise in

prediction error. The next study uses a learning task to explore how arrival features may

influence the perception of closure.


CHAPTER 6

EXPERIMENT 2

Experiment 1 found that listeners, despite varying familiarity with the musical style,

could segment a musical stream consistently based on features in the music; as previously

discussed, determining meaningful musical segments is a necessary precursor to the actual

perception of closure. Previous research has suggested that the ability to segment a given

stimulus, as well as the associated perception of closure at the end of a segmented unit, derives

from unconsciously learning the statistical structure of music. Probabilistic learning, which

creates expectations that guide listener segmentation, falls into one of two categories: inclusional

probabilities and transitional probabilities. Transitional probabilities are the more closely associated with

closure, because listeners who are able to anticipate endings will experience a stronger feeling of

finality at the end of a unit. In the previous study, transitional probabilities were associated both

with goal completion and with a change in the acoustic landscape, the latter revealing an

expectation for continuity. Broadly speaking, coarse segmentation responses correspond with

arrival features (equivalent to achieving a musical goal), while fine segmentation responses

reflect a change in musical motion; however, this correspondence is not consistent and varies

according to musical training. Presumably, the boundaries marked by coarse segmentation

elicited a stronger feeling of finality in the participants than did the boundaries marked by fine

segmentation alone. While the specific feeling of finality (anticipatory, arrival, or retrospective)

wasn’t explored, participants consistently use cues to make decisions about segmentation and

their ability to make these decisions consistently depends upon musical experience.

Transitional probabilities broadly stemming from a listener’s previous musical

experiences as well as transitional probabilities learned as a particular composition unfolds guide

a listener’s segmentation of an unfamiliar style. If a consistent feature concludes segments,

listeners can pick up on these first-order probabilities, resulting in a greater feeling of finality

when this feature occurs. Experiment 2 examines the process by which listeners learn the

musical markers of endings in two styles: a more familiar common-practice style exemplified by

the composer Wolfgang Amadeus Mozart and a twentieth-century style represented by the


composer Béla Bartók. As discussed in the previous chapter, Bartók tends to provide a metrical

framework and to employ phrase lengths and formal divisions familiar from common-practice

style. More important for my purposes, consistent gestures conclude phrases in the representative

composition by Bartók used in this experiment. While the first-order probabilities for Bartók’s

cadential gestures may not be as high as for Mozart’s cadential gestures, the goal of this study is

to see whether listeners can extract these cadential cues.
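First-order transitional probabilities of the kind described here can be estimated by simple counting: for a stream of events, P(ending | feature) is the proportion of occurrences of a feature that are immediately followed by a segment ending. A toy sketch — the event labels are invented placeholders, not the stimuli used in the experiment:

```python
def transitional_prob(events, feature, following):
    """Estimate P(next == following | current == feature) from a
    sequence by counting first-order transitions."""
    pairs = list(zip(events, events[1:]))
    feature_count = sum(1 for a, _ in pairs if a == feature)
    both_count = sum(1 for a, b in pairs if a == feature and b == following)
    return both_count / feature_count if feature_count else 0.0

# Hypothetical stream: a "cadence" gesture is usually, but not always,
# immediately followed by a phrase "end".
stream = ["note", "note", "cadence", "end",
          "note", "cadence", "end",
          "note", "cadence", "note", "end"]
print(transitional_prob(stream, "cadence", "end"))  # 2 of 3 cadences
```

A listener tracking this statistic would come to expect an ending after the cadential gesture roughly two-thirds of the time, which is the intuition behind "a greater feeling of finality when this feature occurs."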

While many statistical learning tasks have shown that listeners are sensitive to both

inclusional and transitional probabilities, this study determines whether exposure to an

unfamiliar style can change the way a listener interprets a “satisfactory ending.” Any effects found in this study are unlikely to be robust, for several reasons. First, the statistical learning task

outlined in this chapter has an additional layer of interpretation: rather than asking participants

whether a test item is a grammatical entity within a style, this study asks participants to

determine whether an ending sounds “complete” or “satisfying.” Also, this task uses music from

the existing repertoire, in which the transitional probabilities between sound elements could not all be controlled. Finally, while Bartók is more consistent than many other twentieth-century

composers in his ending gestures, learning any first-order probabilities from a twelve-minute

exposure could be a difficult task.

Method

Participants

The participants were divided into three subject groups determined by their levels of

formal musical training. Data from 21 undergraduate non-musicians (who received psychology

credit for participating in this study), 22 undergraduate music majors (who received extra credit

in their freshman Music Theory class), and 10 graduate music majors (who received a $10 gift

card for their participation) were included in this study. Data from 10 additional participants

were discarded due to technology problems.

Stimuli

Stimuli consisted of clips from Mozart’s String Quartet No. 19 in C Major

(“Dissonance”), movements 1, 2, and 4; and Bartók’s String Quartet No. 4, movements 1, 3, and


5.87 To create the exposure period, I selected excerpts from each movement that conclude with a

cadential gesture. Table 6.1 lists the excerpts used in the exposure period, all of which were

created in Audacity by segmenting the original digital file. All excerpts for each composer were

then combined into a single audio file in the order in which they occur in the composition. Each

clip was heard twice in succession with successive clips separated by three seconds of silence.

The resulting Mozart and Bartók sound files were each slightly over twelve minutes long.

Table 6.1: Exposure Excerpts

Composer Movement Measure Numbers Time (s)

Bartók 1 1–49 (beat 1) 104.34

Bartók 1 148 (beat 4)–161 26.66

Bartók 3 1–5 22.93

Bartók 3 13 (beat 4)–21 34.00

Bartók 3 47–55 (beat 1) 37.28

Bartók 5 15–57 45.00

Bartók 5 121–148 22.86

Bartók 5 238–284 (beat 2) 35.88

Mozart 1 23–44 (beat 2) 38.55

Mozart 1 176–211 62.54

Mozart 2 1–13 (beat 2) 46.56

Mozart 2 26–39 (beat 1) 52.74

Mozart 2 101–109 (beat 2) 33.02

Mozart 4 1–34 26.85

Mozart 4 258–291 26.92

Mozart 4 326–348 18.87

Mozart 4 371–419 40.21

From these same movements, I selected 115 test clips representing each composer (a total

of 230 clips). These clips were catalogued either as cadential (target stimuli) or as non-cadential

fillers. For the Bartók stimuli, the average time for the 55 cadential excerpts is 3.06 seconds (SD = 1.28), while the average time for the 51 cadential Mozart stimuli is 3.59 seconds (SD = 1.36). Some of the same cadential points were heard more than once, each one having a different length context preceding the cadential point. For the Mozart stimuli, the sound clips began at the initiation of the pre-dominant area (long context) or the dominant area (short context). For the Bartók stimuli, the sound clips began between 2 and 10 beats prior to cadential arrival. Sixty percent of the cadential excerpts were present in the exposure; participants heard these clips in a semi-random order. While the clips were randomized within each movement, the movements were heard in the order in which they occurred in the composition.

87 The Emerson String Quartet performed both of the recordings used in this study, from the albums Bartók: The String Quartets (1988) and Mozart String Quartets K. 465 “Dissonance,” 458 “The Hunt” & 421 (2005).
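The semi-random order described above — clips shuffled within each movement, movements kept in their original order — can be sketched as follows (the clip names are hypothetical placeholders, not the actual stimulus labels):

```python
import random

def semi_random_order(blocks, seed=None):
    """Shuffle clips within each movement block while keeping the
    blocks themselves in their original (movement) order."""
    rng = random.Random(seed)
    ordered = []
    for block in blocks:
        clips = list(block)   # copy so the input blocks are untouched
        rng.shuffle(clips)
        ordered.extend(clips)
    return ordered

# Hypothetical clip labels grouped by movement:
movements = [["m1_clip1", "m1_clip2"], ["m3_clip1"], ["m5_clip1", "m5_clip2"]]
print(semi_random_order(movements, seed=0))
```

The design choice this illustrates: randomization removes order effects within a movement, while the fixed block order preserves the large-scale context in which each clip originally occurred.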

Cadential gestures in Mozart’s string quartet include authentic cadences (both the PAC

and the IAC) and half cadences, as defined in Chapter 5. Cadential gestures in Bartók’s string

quartet include those defined in Chapter 5: the descending fourth and the descending third

gestures as well as the multi-voiced chord, which can be presented by itself, immediately

repeated, or following a lower note. To supplement this list of cadences, the ends of motives x

and y and the diatonic chord that concludes the third movement were also treated as cadential.

Motive x is a chromatic gesture that occurs both in its prime form (Example 6.1) and inverted in

pitch-space. This motive is introduced in the first movement in m. 7 and reappears about halfway

through the fifth movement. Motive y first occurs in the fifth movement in m. 16. Like motive x,

the rhythmic gesture is the primary identifier for this motive (see Example 6.2).

Example 6.1: Motive x from Bartók’s String Quartet No. 4, first movement, m. 7

Example 6.2: Motive y from Bartók’s String Quartet No. 4, fifth movement, mm. 16–18

The target clips were constructed to include as much of the final cadential event as

possible, cutting the clip right before the beginning of the next event. Unlike the first experiment

where I could not differentiate between the feelings of anticipatory, arrival, or retrospective


closure, the construction of the stimuli in this experiment allowed me to examine specifically the

feeling of closure based solely on arrival features rather than discontinuities in the musical

surface. In some cases, the construction of the music resulted in a “clipped” ending in the sound

file, especially for cadences that weren’t followed by a rest. Because participants might be

sensitive to this sound, I cataloged the presence of silence in the music, along with other features

such as the type of cadential gesture and the length of the excerpt for data analysis.

Procedure

After giving informed consent, participants were assigned to one of two listening

conditions that determined the exposure content: participants heard the excerpts either by Mozart

or by Bartók. While listening to the exposure track, participants were asked to indicate on the

computer every time they heard an ending to ensure they were paying attention. Following the

exposure, they listened to the test clips from both composers (order of presentation was

counterbalanced between participants) and rated how complete each ending sounded on a seven-

point scale.88 All of their responses were recorded on a computer. Upon completion of the rating

task, participants filled out a brief questionnaire documenting their familiarity with the

compositions and their musical experience.

Results

The results only examine data from the target stimuli: ratings from the clips that conclude

with a cadential gesture. Using a mixed-model regression, I examine the influence of several

independent variables on the dependent rating variable: within-subject variables, which record

characteristics of the stimuli (the composer of the excerpt, whether a rest followed the cadential

arrival, the length of the excerpt, and the order in which the clips were presented) and between-

subject variables (exposure composer, composer first rated, and subject group, as determined by

musical experience). Composer and silence are binary variables (Bartók = 0 and Mozart = 1; no

silence = 0 and silence = 1), while group is a three-level variable (Non-musicians = 0,

Undergraduate Musicians = 1, and Graduate Musicians = 2). The question order is a number from 1–115, corresponding with the order of the excerpts within each block (determined by composer), and time is the length of each excerpt measured in seconds.

88 The instructions read to the participants were: “In this part of the study you will use a seven-point scale to rate how well the short musical clip would end a musical idea. In music, some endings sound more conclusive than others. If the musical clip does not sound at all like an ending, then press 1. Use higher numbers to indicate stronger endings, with 7 representing the strongest possible ending. Since this is a matter of opinion, don’t worry that there is a right or wrong answer. Feel free to use the entire range of the scale.”

Table 6.2: Mixed Models Regression Analysis: Rating

Row Number

Fixed Effect89

Coefficient Standard

error t-ratio

Approx. d.f.

p-value

1 Intercept 3.072478 0.164794 18.644 51 <0.001

2 Exposure -0.561667 0.217472 -2.583 51 0.013

3 First Rated 0.065684 0.218071 0.301 51 0.764

4 Group 0.593148 0.125842 4.713 51 <0.001

5 Rated Composer 0.790687 0.126829 6.234 5159 <0.001

6 Exposure 0.069363 0.197554 0.351 5159 0.726

7 First Rated 0.141623 0.195685 0.724 5159 0.469

8 Group 0.307577 0.111427 2.760 5159 0.006

9 Silence 1.845760 0.139089 13.270 5159 <0.001

10 Exposure 0.013615 0.177474 0.077 5159 0.939

11 First Rated -0.157893 0.177682 -0.889 5159 0.374

12 Group 0.040004 0.116624 0.343 5159 0.732

13 Excerpt Length 0.274982 0.039850 6.900 5159 <0.001

14 Exposure 0.079446 0.051066 1.556 5159 0.120

15 First Rated -0.065781 0.051092 -1.287 5159 0.198

16 Group -0.066598 0.029681 -2.244 5159 0.025

17 Question Order -0.007765 0.000878 -8.847 5159 <0.001

18 Exposure 0.002446 0.001237 1.976 5159 0.048

19 First Rated 0.001444 0.001246 1.159 5159 0.247

20 Group -0.001042 0.000807 -1.291 5159 0.197
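To make the table's structure concrete, the fixed-effect part of the regression can be evaluated for a hypothetical trial. The sketch below (not the author's code) keeps only the intercept, the between-subject main effects, the within-subject main effects, and the significant group interactions from Table 6.2; random effects and the nonsignificant exposure and first-rated terms are omitted, and the function name is illustrative.

```python
# Coefficients copied from Table 6.2 (significant terms only).
COEF = {
    "intercept": 3.072478,
    "exposure": -0.561667,            # exposure composer: Bartók = 0, Mozart = 1
    "group": 0.593148,                # 0 = non-musician, 1 = undergrad, 2 = graduate
    "rated_composer": 0.790687,       # rated composer: Bartók = 0, Mozart = 1
    "rated_composer:group": 0.307577,
    "silence": 1.845760,
    "length": 0.274982,               # per second of excerpt
    "length:group": -0.066598,
    "order": -0.007765,               # per question (1–115)
}

def predicted_rating(exposure, group, composer, silence, length_s, order):
    """Fixed-effects linear predictor for the closure rating."""
    return (COEF["intercept"]
            + COEF["exposure"] * exposure
            + COEF["group"] * group
            + (COEF["rated_composer"] + COEF["rated_composer:group"] * group) * composer
            + COEF["silence"] * silence
            + (COEF["length"] + COEF["length:group"] * group) * length_s
            + COEF["order"] * order)

# A non-musician exposed to Bartók, rating a 3-second Mozart excerpt
# followed by silence, at question 60:
r = predicted_rating(exposure=0, group=0, composer=1, silence=1,
                     length_s=3.0, order=60)   # ≈ 6.07 on the 7-point scale
```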

An interaction between the variables that document the composer of a rated excerpt and

the composer a participant heard during the exposure period would indicate that the exposure

period influenced the listener’s perception of closure. As seen in row 6 of Table 6.2, the

interaction between exposure and composer is not significant. Despite the lack of direct support

89 A few notes on this table: the first row is the coefficient needed for the regression equation. The between-

subject variables underneath it show the influence of being in any one of these groups on the rating variable. The

between-subject variables underneath a given within-subject variable show the interaction between these subject

groups and a given independent variable.


for the main hypothesis of this experiment, other variables significantly influenced the rating

data. For instance, rows 2 and 4 show a main effect for two of the between-subject variables—

exposure and group. The negative coefficient for exposure indicates that the ratings made by

participants who listened to Bartók during the exposure period are generally higher than the

ratings made by participants who listened to Mozart, while the positive coefficient for group

indicates that participants with more musical experience also tend to rate all the excerpts higher.

A separate ANOVA reveals an interaction between exposure and subject group on the

mean rating of the excerpts. This interaction reveals that the graduate musicians who were

exposed to Bartók drive the main effect of higher ratings for the Bartók excerpts (see Figure 6.1).

In fact, a significant three-way interaction shows that the ratings for cadential gestures in Bartók

made by graduate students who listened to Bartók during the exposure period are much higher

than the ratings made by any of the other subject groups for the same cadential gestures

(Figure 6.2).

Figure 6.1: Two-way Interaction between Subject Group and the Exposure Composer


Figure 6.2: Three-way Interaction between Subject Group, the Exposure Composer, and the Rated Composer. The first graph shows the mean ratings for the Bartók stimuli, and the second graph shows the mean ratings for the Mozart stimuli.


The main effect for rated composer shows that all participants tended to rate the excerpts

composed by Mozart higher than those composed by Bartók (row 5), and the significant

interaction for group (row 8) suggests that the three subject groups evaluated the composers

differently. While all three groups exhibit the general trend of rating Mozart excerpts higher than

Bartók excerpts, graduate musicians rate both composers higher, suggesting more familiarity

with both composers (see Figure 6.3). Undergraduates seem to be more familiar with the Mozart

stimuli than with the Bartók stimuli; their estimated mean rating for the Bartók stimuli is closer

to that of the non-musicians, whereas their estimated mean rating for the Mozart stimuli is closer

to that of the graduate musicians. This effect is accentuated in the data set that includes the non-

cadential filler ratings (see Figure 6.4).

Figure 6.3: Two-way Interaction between Participant Group and the Ratings for Composer. Only the cadential excerpts are included in this data set.


Figure 6.4: Two-way Interaction between Participant Group and the Ratings for Composer. All the excerpts (both cadential and non-cadential) are included in this data set.

The remaining within-subject variables are all significant. The presence of silence has the

greatest effect on the ratings, suggesting that participants attended more to the acoustical properties of the final note of the excerpt than to the approach to that note. The positive coefficient for the length indicates that as the length of the excerpt

increases, the rating increases, but the negative coefficient in the interaction suggests that

participants with more musical training are not as influenced by the length of the stimulus. The

final variable, question order, shows that as the experiment progressed all participants tended to rate the excerpts lower. While this could suggest that participants became slightly more

discerning between degrees of cadential articulation as the experiment progressed, a significant,

but weak negative correlation between all the ratings (including the ratings for non-cadential

fillers) and the question order reveals that all the ratings gradually decreased as the experiment

progressed (r = -0.055, p = 0.01).

Discussion

While there is not a significant effect for the interaction between the exposure composer

and the rated composer (exposure match), these data suggest other trends that are supportive of


EST. Increased familiarity with musical style increases the ratings for closure. Musicians, who

are presumably more familiar with common-practice repertoire, rate Mozart’s excerpts higher

than the non-musicians do; graduates, who are presumably more familiar with twentieth-century

repertoire, rate Bartók’s excerpts higher than the other two groups do. Further, graduate

musicians seem to be more affected by the exposure period. While both groups of graduates rate

the Mozart excerpts higher than the Bartók excerpts, graduates who were exposed to Bartók rate

the Bartók excerpts higher than did the participants who were exposed to Mozart. Given their

more extensive musical experience, graduates could assimilate new stylistic markers of closure

better than participants with less musical experience could.

Exposure match does not have a significant main effect on the rating of the target stimuli,

which may be a product of the design of this experiment. First, the twelve-minute exposure period

was probably too short for participants to acquire stylistic cues for endings. In their statistical

learning study, Jonaitis and Saffran (2009) found that participants only learned a novel harmonic

syntax after a two-day exposure period. Further, the present study did not ask whether the stimuli

formed a grammatical entity, as most statistical learning tasks do; rather, it asked for an aesthetic

judgment of completeness. Such opinions may require even more experience with a style,

consistent with the increased ratings by the more experienced musicians. Second, the strong

main effect for silence suggests that acoustical properties of the last pitch influenced ratings

more than the feeling of finality experienced at the arrival of the last pitch. An alternatively

designed study could control for this variable by prematurely clipping the last note of every

cadence, which would shift attention towards the predictability of the last pitch.

Both Experiments 1 and 2 show the importance of pre-existing knowledge structures in

musical segmentation and the perception of closure. While the exposure period in this study

failed to create new knowledge structures, the main effect for musical expertise (measured by

formal training) suggests that previous knowledge did, in fact, influence the results. Research

reviewed in Chapters 3 and 4 suggests that both segmentation and the perception of closure

depend upon a listener’s expectations. Increased experience with a particular style would

presumably create more accurate expectations, which could be reflected by increased ratings for

more familiar music because the feeling of finality results from accurate expectations followed

by a rise in uncertainty for subsequent events. While Experiment 2 does not specifically examine


the influence of expectation on closure, Experiment 3 explicitly asks participants to predict when

musical phrases will end. Endings that are more predictable should correlate with an increased

feeling of finality.


CHAPTER 7

EXPERIMENT 3

This last experiment tests the third hypothesis of EST: anticipated endings followed by a

rise in uncertainty for subsequent events correspond with a feeling of finality. Zacks, Speer, and

Reynolds (2009) tested this hypothesis by asking participants to rate retrospectively the

predictability of clauses from a longer narrative. The authors found that the perceived

predictability decreased as the number of changes in the narrative increased. Participants also

tended to read less predictable clauses more slowly than predictable clauses, suggesting an

update of the event model during unpredictable clauses. While there was a correlation between

reading speed and predictability, Zacks, Speer, and Reynolds note that a retrospective rating of

predictability may not be the most reliable measure, and they suggest that a real time measure of

predictability might provide more support for EST.

In Experiment 3, I measure predictability by asking participants to anticipate the endings

of musical phrases while listening to three complete movements by W.A. Mozart. Following this

prediction task, participants rated the degree of completeness for short clips from these

movements. In this rating task, one group of participants heard the clips in order of the

movement, while the other group of participants heard a random-order presentation of the clips,

allowing an examination of the relationship between the perception of closure and the formal

structure of a composition. Meyer (1973) suggests that the formal structure of a composition is

articulated through a hierarchy of closes. His bottom-up construction of structure suggests that

stronger endings result from more parameters projecting closure; however, a listener’s previous

experience with typical formal structures could also influence an ending’s perceived strength.

This top-down view suggests that knowledge structures represent yet another parameter that can

then “project” closure, hence influencing the perceived strength of closure.


Method

Participants

The participants were divided into three subject groups determined by their levels of

formal musical training. Data from 24 undergraduate non-musicians (who received psychology

credit for participating in this study), 27 undergraduate musicians (who received extra credit in

their freshman Music Theory class), and 23 graduate student musicians (who received a $10 gift

card for their participation) were included in this study. Each subject group was divided into two

conditions that determined the nature of the rating task. In the random condition, participants

heard the clips in random blocks by movement, while in the visual condition participants not

only heard the clips in the order they occurred in the movement but also saw a visual

representation of the movement, an example of which is replicated as Figure 7.1.

Figure 7.1: Excerpt No. 6 from Mozart’s String Quartet in G Major (K. 156), third movement. Participants were instructed that the blue box represents the relative length and location of the clip.

Stimuli

Stimuli in this study used the minuet and trio movements from three string quartets by

W.A. Mozart: String Quartet No. 3 in G major (K. 156), third movement; String Quartet No. 8 in

F major (K. 168), third movement; and String Quartet No. 13 in D minor (K. 173), third

movement. Participants listened to all three movements as performed by the Amadeus String

Quartet. (The scores of these movements are located in Appendix B.) I chose these particular

movements because Mozart varies two musical features that may influence the predictability of phrase endings: the consistency of four-bar hypermeter and the clarity of cadential arrivals. First, while K. 168


maintains a consistent four-bar hypermeter throughout, the other two movements contain phrase

expansions and extensions that disrupt the established four-measure groupings. This variability

in the length of phrases forces the listeners not to rely exclusively on predictable metrical cycles

to anticipate endings. Second, the three movements contain a variety of cadential paradigms,

including all three significant cadence types (PAC, HC, and IAC).90 For experimental purposes,

there is also a practical advantage that the majority of cadences in these movements do not

involve a melodic suspension (i.e., the harmonic and melodic arrivals coincide).

For the prediction task, I combined all three movements into a single audio file, inserting

fifteen seconds of silence between successive movements (the order of the movements was

counterbalanced between participants). Participants listened to this file through the digital audio

software Audacity, recording their predictions for endings on a separate track by pressing a key

on the computer keyboard. I then converted this label track into a text file that listed every

participant response according to how much time had progressed from the beginning of the file

for subsequent data analysis.
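Assuming Audacity's usual tab-separated label export (start time, end time, label text per line), the conversion might look like the following sketch; the function name is mine, and the actual conversion script is not shown in the text.

```python
def read_label_times(path):
    """Read an Audacity label-track export and return each response's
    time in seconds from the start of the audio file."""
    times = []
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if fields and fields[0].strip():
                times.append(float(fields[0]))
    return times
```

Because each key press creates a point label, a label's start and end times coincide, so only the first field is needed.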

I followed the model used in Experiment 2 to create the excerpts for the rating task. In

each movement, I selected excerpts that concluded different formal units in each composition,

representing subphrase, phrase, and section endings. As in Experiment 2, I created these clips by

splicing the original audio file using Audacity. The clips varied in length from two to six

measures, where each clip began with the onset of a formal unit (subphrase or phrase) and

concluded with the release of the last sound of that formal unit. These clips were drawn from the

first iteration of a passage on the recording, with the exception of clips from the return of the

Minuet section. In the visual condition, the clips were paired with a visual representation of the movement and heard in the order in which they occur in the movement; in the random condition, they were heard in random blocks.

Procedure

After the participants gave informed consent, they read and listened to instructions for the

first part of the study before completing two practice excerpts:

In the first part of the study you will listen to several pieces. While listening, try to predict the moment at which a musical phrase ends. Your goal is to press the control-m at

90 My hypermetrical interpretations and cadence analyses are annotated on the scores in Appendix B.


the exact moment the phrase is completed. The composer could surprise you, so it’s okay if you press prematurely. Just keep on listening and try to anticipate the next ending. You will hear two practice excerpts—complete the practice and compare your answers with the ones provided.

The two practice excerpts were chosen because they illustrated the nature of the task and trained

the participants to predict endings actively rather than react retrospectively to a phrase ending.

The first example, the first reprise from Mozart’s String Quintet No. 4 in G minor, third

movement (K. 516), is a modulating contrasting period with phrases of different lengths. The

second phrase is longer due in part to an internal expansion caused by a deceptive cadence in m.

10. Most listeners who did the task correctly were initially tricked by this deceptive cadence,

although only some were also tricked the second time. Listeners heard this excerpt through

Audacity (the actual sound wave was hidden from view) and indicated their predictions on a

separate label track. After completing this first practice excerpt, participants compared their label

tracks to ones that indicated the cadences. Participants who did poorly or did not understand the

task repeated the task on this excerpt.

Example 7.1: Mozart String Quintet No. 4 in G Minor (K. 516), third movement, mm. 1–13


The quick tempo and the possibility for multiple interpretations made the second excerpt,

Mozart’s Sonata for Piano and Violin in B♭ major, third movement, mm. 1–16 (K. 454), a bit

more difficult than the previous practice excerpt. While I hear a parallel period with a HC in m. 8

and a PAC in m. 16, it is also possible to interpret HCs in both mm. 4 and 12, creating a double

period. For this reason, this excerpt was chosen to demonstrate to participants that there could be multiple “right” predictions in this task, and also that their interpretation of phrase endings

might change over time.91 Because this excerpt also has a suspension, subjects were instructed to

predict when the goal harmony would arrive, not when the dissonant melodic tone(s) would

resolve. As before, after completing the practice task, participants compared their results with my

responses; if they did poorly, they repeated the task. Any remaining questions were addressed

before the participants began the prediction task with the test stimuli.

Following a successful completion of the practice tasks, participants began the prediction

task on the three minuet and trio movements. Afterwards, participants rated clips from these

movements on a seven-point scale to indicate how complete the end of the clip sounded.92 The

presentation mode of this task, random or visual, varied according to the subject’s assigned

condition. Following this rating task, participants filled out a questionnaire documenting their

familiarity with the compositions and their musical training.

91 As a side note, many participants only predicted endings at mm. 8, 12, and 16, pointing towards this

change. The first half of the second phrase (mm. 9–12) is the same as the previously heard first phrase, except for a

change in orchestration. If more predictable cadences correspond to stronger closes, this suggests that the V chords

in mm. 4 and 12 are less closed than the one in m. 8, which was only predictable in the second listening.

92 This task used the same directions as Experiment 2.


Example 7.2: Mozart’s Sonata for Piano and Violin in B♭ Major (K. 454), third movement, mm. 1–16

Results

This results section is divided into two parts. The first examines the results only from the

prediction task, specifically assessing the musical features that led listeners to predict the end of

a phrase successfully. As in Experiment 1, I created windows around the endings for this part of

the analysis. Some of these endings coincided with a cadential arrival, while others merely

demarcated subphrases (all windows are marked on the scores in Appendix B). These windows

began 500 ms before the arrival of the ending and lasted for 1500 ms after this point.93 None of

the windows overlapped. I measured response time from the onset of the last note, so responses

occurring before the last note received a negative response time while responses occurring after

93 In the case of suspensions, the window began 500 ms before the beginning of the goal harmony.


the last note received a positive response time. The results show that listeners are more sensitive

to cadential cues than they are to hypermetric regularities: in general, listeners best predict the

tonic arrival in a PAC. The second section examines correlations between the participants’

ratings for closure and their responses from the prediction task, along with the correlation

between a listener’s response time in the prediction task and his/her rating of closure. The data indicate that both predictability and response time influence the ratings, as do other independent variables such as cadential closure within the clip and the overall length of the clip.
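The window construction and response-time coding described above can be sketched as follows (an illustration of the stated 500 ms / 1500 ms window, not the author's analysis code; names are mine):

```python
WINDOW_BEFORE = 0.5  # seconds before the ending's arrival
WINDOW_AFTER = 1.5   # seconds after the arrival

def score_response(press_times, ending_time):
    """Return the response time (press minus onset of the last note) for
    the first key press inside the two-second window around an ending,
    or None if no press falls in the window. Negative values mean the
    participant anticipated the ending."""
    for t in press_times:
        if ending_time - WINDOW_BEFORE <= t <= ending_time + WINDOW_AFTER:
            return t - ending_time
    return None

# A press 200 ms before a cadential arrival at 10.5 s counts as a hit:
rt = score_response([10.3], ending_time=10.5)   # ≈ -0.2 s
```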

The first regression, which analyzes the data from the prediction task, examines whether

formal units that conclude either with a cadence or at a temporal distance of four or eight

measures from the end of the previous phrase are associated with an increased probability that a

participant will predict the end of a formal unit (see Table 7.1). The significant value for group

(in the second row) indicates that as musical expertise increases, so does the participant’s ability

to predict the ends of phrases. Of the two independent feature variables, only the presence of a

cadence significantly predicts participant responses (participants are 1.8 times more likely to

respond when a cadence occurs). There is a significant interaction between musical expertise and

the presence of a cadence, where participants with more musical expertise are more likely to

respond at a cadential gesture (see Figure 7.2). The hypermeter variable examined whether

phrases that exhibit a regular 4-bar hypermeter are more predictable than phrases that have some

sort of phrase expansion. As seen in the bottom third of Table 7.1, there is no main effect for

ending points occurring 4 or 8 measures after the end of the previous formal unit. Evidently these

participants were able to predict endings based on the presence of a cadence and were not

necessarily influenced by hypermetrical regularities or irregularities.

Looking more closely at the main effect for cadence, I separated this variable into three

categories based on the type of cadence (PAC, HC and IAC) and ran an additional analysis with

these variables (the results from the mixed logit regression analysis are located in Table 7.2).

While there is not a significant main effect for the HC, there are significant main effects for both

the PAC and the IAC. Listeners are more than twice as likely to predict an ending when a tonic

chord concludes the phrase. The tonic arrival in both the PAC and IAC always follows a

dominant harmony, so listeners’ expectations for the ending are presumably influenced by the

high transitional probability between V and I. In contrast, because the dominant harmony of the


HC is not always preceded by the same harmony, a listener may not be able to accurately predict

its arrival.

Table 7.1: Mixed Logit Regression Analysis: Cadence and Hypermeter

Fixed Effect94  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds Ratio  Confidence Interval

Intercept -1.176341 0.274569 -4.284 9023 <0.001 0.308405 (0.180,0.528)

Group 0.462805 0.114176 4.053 72 <0.001 1.588524 (1.265,1.995)

Cadence 0.607246 0.168288 3.608 9023 <0.001 1.835371 (1.320,2.553)

Group 0.801740 0.081178 9.876 9023 <0.001 2.229417 (1.901,2.614)

Hypermeter -0.114407 0.161683 -0.708 9023 0.479 0.891895 (0.650,1.224)

Group -0.094700 0.077748 -1.218 9023 0.223 0.909646 (0.781,1.059)
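The odds ratios and confidence intervals in Table 7.1 follow directly from the logit coefficients; the sketch below reproduces them, assuming a standard 95% Wald interval, exp(b ± 1.96·SE):

```python
import math

def odds_ratio(coef, se, z=1.96):
    """Convert a logit coefficient and its standard error into an odds
    ratio with a 95% Wald confidence interval."""
    return math.exp(coef), (math.exp(coef - z * se), math.exp(coef + z * se))

# Cadence row of Table 7.1:
or_cadence, ci = odds_ratio(0.607246, 0.168288)
# or_cadence ≈ 1.835; ci ≈ (1.32, 2.55), matching the table to rounding
```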

Figure 7.2: Interactions between Subject Group and the Presence of a Cadence

94 As in the analyses in Chapter 5, the odds ratio in each row for every within-subject variable shows the

odds of a participant’s response if the feature is present (with the exception of the first row, which is only needed for

the regression equation). The “Group” row for each within-subject variable shows the interaction between musical

expertise and the independent variable.


Table 7.2: Mixed Logit Regression Analysis: Cadence Types

Fixed Effect  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value  Odds Ratio  Confidence Interval

Intercept -1.298018 0.323561 -4.012 9022 <0.001 0.273072 (0.145,0.515)

Group 0.551098 0.127846 4.311 72 <0.001 1.735157 (1.345,2.239)

PAC 0.784639 0.197140 3.980 9022 <0.001 2.191616 (1.489,3.225)

Group 1.260553 0.103436 12.187 9022 <0.001 3.527371 (2.880,4.320)

HC 0.109291 0.182932 0.597 9022 0.550 1.115487 (0.779,1.597)

Group 0.708266 0.085718 8.263 9022 <0.001 2.030468 (1.716,2.402)

IAC 0.913797 0.221879 4.118 9022 <0.001 2.493774 (1.614,3.852)

Group 0.420757 0.104090 4.042 9022 <0.001 1.523114 (1.242,1.868)

I am defining these cadences by their traditional harmonic paradigms, as explained in

Chapter 5; however, these movements challenge some of these traditional markers of “cadence”

and “phrase.” Phrases are traditionally defined as having some sort of harmonic motion, with the

cadence representing the culmination of this motion. Several times in these movements, there is

no harmonic motion leading into the point of ending. One such example occurs in m. 16 in the

F-major String Quartet (K. 168). Here, the B section ends on a V chord that arrives in m. 15;

however, the end of the phrase is not until m. 16, so I coded the HC as occurring at that point

(refer to the annotated scores in Appendix B). Also in this movement, a HC that arrives in m. 36

is followed by a post-cadential extension that repeats the cadential gesture. In this case, the I-V

gesture in m. 40 is not coded as a HC, despite the hypermetric four-measure groupings, because

it merely extends the ending that arrived in m. 36. Four measures later, in m. 44, the trio

concludes with the same type of cadential gesture from m. 15, but this one extends the tonic for

two measures. Even though the “goal-directed motion” concludes in m. 43, hypermetrical

expectations project an ending at m. 44, which is where the trio concludes.

Most of the cadential types are clear in these movements, but there are a few moments of

possible cadential ambiguity. In his 2010 talk at the Annual Meeting of the Society for Music

Theory, Burstein effectively demonstrated that distinguishing between a HC and an elided PAC

could be difficult, especially when there is continuous motion from the dominant chord of the

HC to the tonic beginning of the next phrase. Measures 54–55 of the G-major Quartet represent

one such case of this type of cadential ambiguity: despite the convincing arrival on the dominant


in m. 54, listeners could interpret the cello’s downward motion into the tonic pitch on the

downbeat of m. 55 as the ending instead. A similar situation occurs in mm. 24–25 of the same

quartet. Here an arrival on the dominant in m. 24 marks the end of the B section, but at this point

the second violin initiates a gesture that leads into the return of the A section. It is possible

that without a break in the sound, listeners would not experience arrival closure in m. 24, but

rather retrospective closure when the opening theme recurs. This ambiguity surrounding the HC

may further explain the lack of a main effect for this cadence type.

Along with the main effect for the PAC and the IAC, musical expertise also affects the

results of the prediction task. Overall, as musical expertise increases, participants are more likely

to make their predictions within the two-second windows around the cadence points.

Specifically, for all three cadence types there is an interaction between the cadence type and

musical expertise (see Figure 7.3, which graphs the ANOVA estimated means for each

participant group). Compared to the non-musicians, the musician groups have a larger change in

their responses when there is a PAC. For the HC, only the graduate musicians are more likely to

respond; all participants respond to the IAC, but to different extents.

Response time data for the points at which listeners predicted an ending were also

analyzed. A response time less than zero signifies that the participant responded prior to the onset of the last note of the formal unit, while a response time greater than zero signifies that the participant pressed after the onset of the last note. Most response times are greater than zero, which

could reflect the time it takes for a participant physically to respond to a prediction. It could also

reflect the difficulty of the prediction task, where participants may be responding retrospectively

to an ending point despite instructions to predict endings. Even so, response times can still

measure the fulfillment of expectations, given that faster response times should correlate with

expected musical events.


Figure 7.3: Interactions between Subject Group and Cadence Type


Figure 7.3 (continued): Interactions between Subject Group and Cadence Type

Since there is no main effect for hypermetric regularity, these analyses will only consider

the influence of cadences on the timing of the prediction. As seen in Table 7.3, as musical

expertise increases, the response time decreases (note the negative coefficient for group in the

third row). This result suggests that more experienced musicians better predict the approach of

an ending. The positive coefficient for the presence of a cadence is surprising, because cadences

represent highly predictable patterns in music. The more predictable a pattern, the faster listeners

should respond to it. The interaction reveals that more experienced musicians respond faster to

cadential patterns.


Table 7.3: Mixed Models Regression Analysis: Response Time and Cadences

Fixed Effect95  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value

Intercept 0.821911 0.072698 11.306 3988 <0.001

Group -0.126699 0.031078 -4.077 72 <0.001

Cadence 0.114577 0.038407 2.983 3988 0.003

Group -0.048349 0.017685 -2.734 3988 0.006

Looking at specific cadence types (Table 7.4), there is a main effect for all three cadence

types on the response time. A smaller coefficient corresponds with a smaller increase in the

response time for that cadence. Participants respond faster to a PAC than to either a HC or an

IAC. For windows in which participants responded to a HC, their response time was faster than

for an IAC. However, it is important to remember that this analysis uses a slightly different data

set, using only the points where subjects responded. This may have removed the more

ambiguous cadences, leaving those that were especially predictable. For both the PAC and the

IAC, there is a subject group interaction indicating that participants with more musical

experience responded more quickly.

Table 7.4: Mixed Models Regression Analysis: Response Time and Cadence Types

Fixed Effect  Coefficient  Standard error  t-ratio  Approx. d.f.  p-value
Intercept 0.821587 0.072173 11.384 3986 <0.001
Group -0.125998 0.031354 -4.019 72 <0.001
PAC 0.104156 0.043245 2.408 3986 0.016
Group -0.071626 0.019831 -3.612 3986 <0.001
HC 0.135055 0.046483 2.905 3986 0.004
Group -0.035412 0.021012 -1.685 3986 0.092
IAC 0.264896 0.052113 5.083 3986 <0.001
Group -0.095702 0.023666 -4.044 3986 <0.001
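The faster responses to the PAC can be read directly off Table 7.4. The sketch below (not the author's code) evaluates the fixed effects for a graduate musician, assuming the group coding of footnote 95 (1 = non-musicians, 2 = undergraduate musicians, 3 = graduate musicians) also applies here; names are illustrative.

```python
# Coefficients from Table 7.4: (main effect, group interaction) per cadence type.
CADENCE_TERMS = {
    "PAC": (0.104156, -0.071626),
    "HC":  (0.135055, -0.035412),
    "IAC": (0.264896, -0.095702),
}
INTERCEPT, GROUP = 0.821587, -0.125998

def predicted_rt(group, cadence):
    """Fixed-effects predicted response time (seconds) at a cadence."""
    main, interaction = CADENCE_TERMS[cadence]
    return INTERCEPT + GROUP * group + main + interaction * group

pac = predicted_rt(3, "PAC")   # graduate musicians, ≈ 0.33 s
iac = predicted_rt(3, "IAC")   # ≈ 0.42 s: a slower response at the IAC
```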

95 This table (and Table 7.4) is similar to the corresponding one found in Chapter 6 (Table 6.2). In Table

7.3, there is a significant main effect for cadence: when a participant is predicting a cadential arrival, the coefficient

for that variable is factored into the regression equation. The first group variable (a three-level variable where

1 = non-musicians, 2 = undergraduate musicians, and 3 = graduate musicians) is always present in the equation

whether or not a cadence is present, but the group variable under cadence is only factored into the equation when a

cadence occurs.


Data from the second half of the study reveal no main effect for the rating condition, whether visual or random, on the ratings of closure, nor did the condition factor into any

interaction. There are several possible interpretations: 1) participants in the visual condition may

have disregarded the visual information; 2) participants in the random condition may have been

able to place the clip correctly within the formal hierarchy, given that they heard each movement

in its entirety prior to the rating task (which seems improbable due to memory constraints); or 3)

the visual information may have corroborated the rating that would have occurred even without

it. Both the first and last possibilities support Meyer’s statement that the form of a piece emerges

from its hierarchy of closes (1973). Because condition did not influence the rating results, it was

not included in the data analysis.

The remaining independent variables (whether the participant anticipated a particular end

in the prediction task, the length of the excerpt, and the presence of a cadence) all significantly

influence the rating task. Both the “predicted” and the cadence variables are binary variables

(1 = participant predicted that particular ending in the previous task and 1 = presence of a

cadence), while the length variable is coded in seconds. For each of these three variables, an

increase corresponds to a significant increase in the rating of closure, with the presence of a

cadence having the largest effect. Before taking into account any of the independent variables,

there is no significant difference between the ratings made by subjects with different levels of

expertise, but there are interactions between subject group and predicted ends as well as subject

group and the presence of a cadence. Participants with more musical experience consistently rate

the clips higher when these variables are present.

Table 7.5: Mixed Models Regression Analysis: Ratings

Fixed Effect     Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept          1.851092      0.310883         5.954       3643        <0.001
Group              0.054959      0.142029         0.387         72         0.700
Predicted          0.418119      0.157657         2.652       3643         0.008
    Group          0.270692      0.076986         3.516       3643        <0.001
Length             0.288156      0.031036         9.285       3643        <0.001
    Group         -0.022383      0.014364        -1.558       3643         0.119
Cadence            1.296993      0.162001         8.006       3643        <0.001
    Group          0.191809      0.078961         2.429       3643         0.015


The final analysis uses only the rating data from clips in which the participant

successfully predicted the ending to see if there is a correlation between ratings and response

time. A negative coefficient for the response time variable in Table 7.6 indicates that as response

times increase, the ratings of closure decrease. While there is no main effect for response time, there is an interaction: as musical expertise increases, subjects are more likely to give higher ratings to the clips that elicited faster response times in the prediction task.

Table 7.6: Mixed Models Regression Analysis: Ratings and Response Time

Fixed Effect     Coefficient   Standard error   t-ratio   Approx. d.f.   p-value
Intercept          5.044658      0.297488        16.958       1875        <0.001
Group              0.186685      0.128796         1.449         72         0.152
Response Time     -0.282416      0.311667        -0.906       1875         0.365
    Group         -0.405425      0.152425        -2.660       1875         0.008

Discussion

Overall, the data support the hypothesis that anticipated musical endings evoke a feeling

of closure. The data illustrate a correlation between a listener’s ability to predict an ending as the

composition unfolds and that listener’s subsequent rating of closure for that particular ending.

Further, cadences that are traditionally considered more closed were better predicted in the first

task and had faster response times (in other words, participants responded more consistently and

quickly to an anticipated PAC than to the other cadences).

As seen in the previous studies, musical experience influenced the participants’ results.

Here, the main effects were magnified for the participants with more musical experience:

participants with more experience successfully predicted more endings, and for these endings

experienced participants predicted all the cadence types better than participants with less

experience did. Their ability to anticipate endings more quickly and accurately in the prediction

task suggests that the participants with increased musical experience drew from knowledge

structures supported by many more exemplars of common ending paradigms in this style. These

experienced participants showed a stronger correlation between their ratings and their data from

the prediction task: both the endings they predicted and their faster response times correlate with

higher ratings for closure.


Surprisingly, the rating condition had no influence on the rating task: participants in both

the visual and random rating conditions showed no difference in their clip ratings. These data lend more support to Meyer’s assertion that form emerges from a hierarchy of closes than to the claim that top-down knowledge of formal structure shapes the perception of closure. Appendix B shows windows

for each movement, the percentage of participants who indicated an ending in that window, and

the mean rating in each window. While there is not always an exact match between the formal

structure and the data, many times the endings that demarcate the conclusion of a formal section

were better predicted, and subsequently were given a higher rating.

Event Segmentation Theory posits that an increase in the transient prediction error creates

a perceptual boundary. While I was unable to measure directly any increase in this transient

prediction error following a cadence, it is safe to assume that the ability of a listener to predict

upcoming musical material following a cadence is lower than their ability to predict the cadential

arrival. The correlation between the prediction and rating tasks further suggests that larger increases in prediction error result in a hierarchically significant musical boundary, eliciting a stronger feeling of closure.
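The boundary mechanism described here can be sketched computationally: flag a boundary wherever the current prediction error rises well above its recent baseline. This is a generic illustration under assumed inputs (a per-beat prediction-error series and arbitrary window and threshold values), not a model fitted to these data:

```python
def detect_boundaries(errors, window=8, threshold=2.0):
    """Mark beats where transient prediction error spikes above baseline.

    errors: per-beat prediction-error values (assumed input). A boundary is
    flagged when the current error exceeds the mean of the previous `window`
    values by `threshold` times their standard deviation.
    """
    import statistics
    boundaries = []
    for i in range(window, len(errors)):
        recent = errors[i - window:i]
        mu = statistics.mean(recent)
        sigma = statistics.pstdev(recent) or 1e-9  # avoid zero division
        if errors[i] > mu + threshold * sigma:
            boundaries.append(i)
    return boundaries
```

On this view, a well-predicted cadence keeps error low up to the arrival; the spike comes just after it, where the upcoming material is least predictable.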


CHAPTER 8

CLOSURE

Four characteristics of closure inferred from the musicological literature form the

foundation for my definition of closure and my cognitive model for the perception of closure

outlined in this dissertation:

1) closure segments a continuous musical stream into discrete events
2) closure is stylistically dependent
3) closure is a completion of a goal-directed process resulting in an arrival of relative stability or rest
4) the strength of closure depends on many musical variables and plays an integral role in the hierarchic construction of a composition

While my own definition of closure, the anticipated end to a musical segment, responds to the

concept of “closure” as used in musical analysis, my methodology steps away from the music itself as an object of study, focusing instead on the perception of closure. In other words,

instead of examining closural processes in a particular musical style or a specific composer’s

corpus, the cognitive model for the perception of closure (developed in Chapters 3 and 4 and

supported by the three experiments in Chapters 5, 6, and 7) uses recent research in event

segmentation and musical expectation to explore how and why a listener perceives closure.

Corroborating previous studies examining event segmentation, Experiment 1 established

the possibility of a shared cognitive process in musical segmentation. Specifically, the results

from this study showed that subjects were highly consistent in their segmentation responses, both

within an individual subject and between subjects, and that subjects perceived event structure

hierarchically, where smaller musical segments combine to form larger segments. This study also

revealed that specific musical features can predict a listener’s perception of an ending, and this

correlation grows stronger with musical training. Many times, a perceptual boundary occurred at

the end of a schematic unit or following a discontinuity in the musical surface, corresponding

with the arrival and change features in Experiment 1. Segments that terminate with an arrival

feature would presumably sound more closed (resulting in anticipatory or arrival closure) than

would segments ending with a change feature (resulting in retrospective closure).


Results for segmentation consistency and nested lower levels did not significantly vary

for the participants who segmented Mozart and the participants who segmented Bartók, but there

was a difference in the types of features that signaled an end in these two styles. This suggests

that the cognitive mechanism of segmentation remains constant between styles, while the

specific features that signal closure may change. Subjects who segmented Mozart tended to rely

on arrival features, while subjects who segmented Bartók tended to rely on change features. The

arrival features in Mozart represent well-learned endings from the common-practice style, while

the arrival features that predicted endings in Bartók tended to be peculiar to these particular

compositions. Because participants who segmented Bartók could not rely on previously learned

endings representing a wide variety of compositions, they tended to rely more on surface

changes during their task.

Learning ending gestures for a style, or even for a specific piece, is an unconscious

process dependent on listener experience. According to Hintzman’s multiple trace theory (1986),

every encounter with a stimulus creates a trace in long-term memory. Of course, the quality of

the information stored in the trace is contingent on listener attention to the stimulus and the type

of encounter with the stimulus. The number of traces in LTM that anticipate the conclusion of a

particular ending gesture determines whether a listener experiences closure. For instance, while a

listener may have veridical expectations for an ending in a particular composition, a listener may

also have many more traces in LTM for continuation at that point, lessening the sense of closure.
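Hintzman's multiple-trace account can be sketched in a few lines of code: each encounter leaves a feature-vector trace, and a probe's overall activation (its "echo intensity") is a similarity-weighted sum over all stored traces, so an ending gesture with many stored traces evokes a stronger response. This is a generic MINERVA 2-style sketch with invented feature vectors, not anything drawn from the experiments:

```python
def echo_intensity(probe, traces):
    """MINERVA 2-style echo intensity: sum of cubed trace similarities.

    probe and each trace are equal-length feature vectors of -1/0/+1 values
    (zeros model imperfect encoding, per the attention caveat above).
    """
    def similarity(a, b):
        # normalized dot product over features where either vector is nonzero
        n = sum(1 for x, y in zip(a, b) if x != 0 or y != 0)
        return sum(x * y for x, y in zip(a, b)) / n if n else 0.0
    # cubing preserves sign while emphasizing strongly matching traces
    return sum(similarity(probe, t) ** 3 for t in traces)
```

Under this sketch, an ending gesture backed by many matching traces yields a high echo intensity, while competing continuation traces pull the intensity back down, lessening the sense of closure.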

Experiment 2 used a learning task to explore whether a listener can associate the arrival

features of a particular compositional style with the feeling of closure. The aim was to see

whether a brief period of listening to excerpts by either Mozart or Bartók would influence

subsequent ratings of closure of similar endings. While there was no interaction between the composer heard during the exposure period and a listener’s ratings of that composer (perhaps because the exposure period was too brief, or because I was unable to control the transitional probabilities between sound elements), participants with more musical training rated all cadential excerpts

higher than did those participants with less training. Given that more experienced musicians

presumably have had more exposure to cadential gestures in both styles, these higher ratings

support the learned association between closure and a particular musical gesture. Further,

graduate music students who were exposed to Bartók tended to be more sensitive than other


subject groups to the learning task, suggesting that their increased training allowed them to

assimilate cadential cues more quickly from a less-familiar style.

The perception of goal direction towards an ending is also supported by our ability to pick out transitional probabilities. The feeling of moving towards a musical goal is an artifact of being

able to predict with increasing certainty subsequent events in a phrase. Studies have revealed

expectations for acoustic continuity as well as expectations for learned musical patterns. Event

segmentation theory is an expectation-based model of event segmentation where an unexpected

event (e.g., a discontinuity in the musical surface) or the expected completion of an event causes

a perceptual boundary. According to my definition of closure, not every end of every segment

produces a feeling of finality. Anticipatory and arrival closure capture the experience of an

anticipated ending, while retrospective closure represents a failure to predict the moment of

closure. In retrospective closure, a discontinuity signals a beginning, but the preceding ending

was not accurately predicted or recognized at the moment it occurred.

Experiment 3 asked listeners to predict endings in three Mozart minuet movements. Most

of their responses coincided with cadences—especially authentic cadences, which represent

highly predictable endings in the common-practice style. Data from a subsequent rating task

showed that listeners rated the endings they predicted in the previous task as more closed than

other endings from the same composition. These data suggest that the strength of closure is

directly related to the predictability of an ending; larger structural boundaries were generally

more predictable and received higher ratings, supporting Meyer’s argument that form emerges

from a hierarchy of closes.

The degree to which closure permeates the musicological discourse is a testament to its

analytical and aesthetic importance and speaks to an essential characteristic of the music

listening experience. Despite stylistically varied markers of closure, I posit that an innate

cognitive mechanism engenders the perception of closure. EST provides a model for musical

segmentation and closure that transcends stylistic boundaries and captures some of our musical

intuitions about closure: that closure is contingent upon musical expectation and prompts a

hierarchical understanding of a composition. The perception of closure is thus a product of an

ongoing cognitive process that segments our continuous life experiences into discrete events.


Two different agents of closure can be inferred from the language used in musicological

discourse: the music (referring to a compositional process) or the mind (referring to a

psychological experience). While these perspectives may seem irreconcilable, the four

characteristics of closure can serve as a point of intersection, and only the language differences

remain. Because the perception of closure (or at least the evocation of closure) shapes musical

analyses of all kinds, we should look past the differences in language and recognize the

underlying role of expectation (whether musical or disciplinary) in musical analysis. By

considering closure as a result of musical expectation, we can better reevaluate how we use the

concept of closure to shape our analysis of music.

While this project builds a strong case for the role of expectation in both the creation of

musical segments and the perception of closure, work on this topic remains to be done. My own

studies were large in scope, deriving their stimuli from actual musical compositions, and there

was no limit on the number of segmentation/prediction responses a subject could make. I plan to

reanalyze some of the data collected because there are additional ways to examine it that I did not pursue in this dissertation. For instance, participants in Experiment 1 could be divided into

subject groups based on how often they indicated a boundary during the segmentation task. From

a pragmatic perspective, this would ensure that subjects segmenting on the same hierarchical

level would be grouped together, and preferred segment length might distinguish musically

experienced listeners better than did degree programs. In Experiment 2, I did not discuss the data

collected during the exposure period (while listeners listened to music in the exposure period,

they indicated endings using a computer keyboard). While there was no interaction effect between exposure composer and rated composer, there might be a correlation between the endings identified by a participant in the exposure period and that participant’s ratings of closure in the subsequent task. In all of the studies, additional musical features could be examined for their

effect on participants’ segmentation/prediction responses.

Additionally, I plan to perform more focused and controlled studies that I hope will

produce more robust results in support of these theories. For instance, to replicate the failed

learning task in Experiment 2, I could create a statistical learning task where I compose the

material in the exposure period, controlling for the transitional probabilities between pitches

(using a non-tonal style). Every pitch could be presented at a steady rate, but cadential figures


(which would be a composed pitch pattern particular to this task) would be followed by rests. I

would then compare the listener’s ratings of these cadential pitch-patterns to the listener’s ratings

of pitch-patterns from the beginning and middle of segments.
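The proposed exposure material can be sketched as a first-order Markov process over a small non-tonal pitch set, with a designated cadential figure followed by a rest. The pitch alphabet, transition table, and cadential pattern below are invented placeholders for illustration only:

```python
import random

# Hypothetical non-tonal pitch alphabet and a made-up cadential figure
PITCHES = ["C", "Db", "E", "F#", "G", "Bb"]
CADENCE = ["F#", "Bb", "C"]  # placeholder ending pattern, followed by a rest

def make_segment(transitions, length, rng):
    """Generate one exposure segment: a Markov walk with controlled
    transition probabilities, terminated by the cadential figure and a rest.

    transitions: dict mapping each pitch to {next_pitch: weight}.
    """
    pitch = rng.choice(PITCHES)
    segment = [pitch]
    for _ in range(length - 1):
        nexts, weights = zip(*transitions[pitch].items())
        pitch = rng.choices(nexts, weights=weights)[0]
        segment.append(pitch)
    return segment + CADENCE + ["REST"]
```

Because every pitch would sound at a steady rate, only the composed cadential pattern and its trailing rest would mark segment ends, letting ratings of the cadential pitch-patterns be compared against patterns drawn from segment beginnings and middles.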

Two avenues for future research include examining the influence of previous knowledge

structures on the formation of new closural expectations and the influence of non-compositional

features on the perception of closure. This expectation-based theory of closure posits that

through statistical learning listeners associate ending gestures with closure even in an unfamiliar

style, but the extent to which already learned closural gestures may influence this process is

unknown. While a learning task similar to Experiment 2 could explore this issue, such an

experiment could also be expanded into a cross-cultural study that could specifically examine the

influence that learned closural gestures in one style may have on the perception of closure in a

different style. With regard to non-compositional features influencing the perception of closure, the

data from Experiment 2 showed that the acoustic properties of the final pitch of a segment might

influence the perception of closure. A study that specifically manipulated the final sound of a

segment could reveal how performance aspects may shape the perceived structure of a

composition. Another avenue for further research is the influence of bodily gestures, including

those that necessarily accompany a live performance, on the perception of formal structure and

closure.

This project differs from previous theoretical and cognitive studies regarding closure by

casting both the creation of discrete segments from an ongoing musical stream and the

perception of closure at the end of some of these segments as contingent upon a listener’s

musical expectations. While much work remains to be done, the theoretical literature, previous

cognitive studies in both music and event segmentation, and the three experiments presented in

this dissertation support the connection between segmentation and expectation and their

influence on the perception of closure.


APPENDIX A

SEGMENTATION RESPONSES IN EXPERIMENT 1

The figures in this appendix plot all of the responses made over the course of each

movement in the Fine 2 and the Coarse 2 trials. Points on the solid red line illustrate the total

number of segmentation responses made during each beat in the Fine 2 trial; points on the dashed

purple line represent the total number of segmentation responses made during each beat in the

Coarse 2 trial. These counts represent the responses made by all three subject groups: non-

musicians, undergraduate musicians, and graduate musicians. Relatively high numbers of responses are labeled on the figures with the measure and beat number of their occurrence (the first number in each label is the measure number and the second is the beat within that

measure). Figures do not illustrate the same number of measures because they are divided by

formal sections within each movement. Figures A.1 and A.2 represent the responses made by the

participants in Experiment 1a (n = 32), and Figures A.3 and A.4 represent the responses made by

participants in Experiment 1b (n = 33).

Figure A.1: Bartók, String Quartet No. 4, third movement

Figure A.2: Bartók, String Quartet No. 4, fifth movement

Figure A.3: Mozart, String Quartet No. 19, fourth movement

Figure A.4: Mozart, String Quartet No. 21, second movement

APPENDIX B

ANNOTATED SCORES FOR EXPERIMENT 3

This appendix includes all three movements used in Experiment 3, including some

annotations: cadences are marked above each system, the hypermeter is notated between the

violin 2 and viola parts, and vertical lines through the score signify the points I used in data

analysis for the prediction task. Under each system, I included the percentage of participants who

predicted an ending at various points (on the first listening) as well as the mean rating. Not every

point from the prediction task was included in the rating task due to time constraints.

Example B.1: Mozart, Quartet No. 3 in G Major, K. 156, third movement


Example B.2: Mozart, String Quartet No. 8 in F Major, K. 168, third movement


Example B.3: Mozart, String Quartet No. 13 in D Minor, K. 173, third movement

96 There is an incorrect note in m. 8: the cello performs a B♭ instead of the notated C3.


APPENDIX C

COPYRIGHT PERMISSION LETTERS


APPENDIX D

IRB APPROVAL LETTER AND

INFORMED CONSENT LETTER

Office of the Vice President For Research

Human Subjects Committee

Tallahassee, Florida 32306-2742

(850) 644-8673 · FAX (850) 644-4392

APPROVAL MEMORANDUM

Date: 6/23/2010

To: Crystal Peebles

Address:

Dept.: MUSIC SCHOOL

From: Thomas L. Jacobson, Chair

Re: Use of Human Subjects in Research

Listener perception of segmentation and closure in music

The application that you submitted to this office in regard to the use of human subjects in the proposal referenced above has been reviewed by the Secretary, the Chair, and two members of the Human Subjects Committee. Your project is determined to be Expedited per 45 CFR § 46.110(7) and has been approved by an expedited review process.

The Human Subjects Committee has not evaluated your proposal for scientific merit, except to

weigh the risk to the human participants and the aspects of the proposal related to potential risk

and benefit. This approval does not replace any departmental or other approvals, which may be

required.

If you submitted a proposed consent form with your application, the approved stamped consent

form is attached to this approval notice. Only the stamped version of the consent form may be

used in recruiting research subjects.

If the project has not been completed by 6/22/2011 you must request a renewal of approval for

continuation of the project. As a courtesy, a renewal notice will be sent to you prior to your

expiration date; however, it is your responsibility as the Principal Investigator to timely request

renewal of your approval from the Committee.

You are advised that any change in protocol for this project must be reviewed and approved by


the Committee prior to implementation of the proposed change in the protocol. A protocol

change/amendment form is required to be submitted for approval by the Committee. In addition,

federal regulations require that the Principal Investigator promptly report, in writing, any unanticipated problems or adverse events involving risks to research subjects or others.

By copy of this memorandum, the Chair of your department and/or your major professor is

reminded that he/she is responsible for being informed concerning research projects involving

human subjects in the department, and should review protocols as often as needed to insure that

the project is being conducted in compliance with our institution and with DHHS regulations.

This institution has an Assurance on file with the Office for Human Research Protection. The

Assurance Number is IRB00000446.

Cc: Nancy Rogers, Advisor

HSC No. 2010.4328


Office of the Vice President For Research

Human Subjects Committee

Tallahassee, Florida 32306-2742

(850) 644-8673 · FAX (850) 644-4392

RE-APPROVAL MEMORANDUM

Date: 5/4/2011

To: Crystal Peebles

Address:

Dept.: MUSIC SCHOOL

From: Thomas L. Jacobson, Chair

Re: Re-approval of Use of Human subjects in Research

Listener perception of segmentation and closure in music

Your request to continue the research project listed above involving human subjects has been

approved by the Human Subjects Committee. If your project has not been completed by

5/1/2012, you must request a renewal of approval for continuation of the project. As a courtesy, a

renewal notice will be sent to you prior to your expiration date; however, it is your responsibility

as the Principal Investigator to timely request renewal of your approval from the committee.

If you submitted a proposed consent form with your renewal request, the approved stamped

consent form is attached to this re-approval notice. Only the stamped version of the consent

form may be used in recruiting of research subjects. You are reminded that any change in

protocol for this project must be reviewed and approved by the Committee prior to

implementation of the proposed change in the protocol. A protocol change/amendment form is

required to be submitted for approval by the Committee. In addition, federal regulations require

that the Principal Investigator promptly report, in writing, any unanticipated problems or adverse

events involving risks to research subjects or others.

By copy of this memorandum, the Chair of your department and/or your major professor are

reminded of their responsibility for being informed concerning research projects involving

human subjects in their department. They are advised to review the protocols as often as

necessary to insure that the project is being conducted in compliance with our institution and

with DHHS regulations.

Cc: Nancy Rogers, Advisor

HSC No. 2011.6316


FSU Behavioral Consent Form
Listener perception of segmentation and closure in music

You are invited to be in a research study on how listeners segment musical experience. Please read this form and ask any questions you may have before agreeing to be in the study.

This study is being conducted by Crystal Peebles from the College of Music.

In this study, you will listen to several musical excerpts and make evaluative decisions regarding how the music can be divided into smaller parts and how well certain musical features end musical passages. You will indicate your responses on a computer keyboard and fill out a survey afterwards. The entire study will last about an hour.

There are no foreseeable risks or discomforts if you decide to participate in this study. You will be able to adjust the music to a comfortable volume level. While you will not receive any personal benefits from this study, you will be contributing to the field of music cognition. Participants who are currently enrolled in Introduction to Psychology and participate in this experiment through the Psychology Department will receive the appropriate amount of course credit.

The records of this study will be kept private and confidential to the extent permitted by law. In any sort of report I might publish, I will not include any information that will make it possible to identify a subject. Research records will be stored securely.

Participation in this study is voluntary. Your decision whether or not to participate will not affect your current or future relations with the University. You may terminate your participation in this study at any time without penalty. You will still receive the research credits that you have earned with your participation to that point in the study.

The researcher conducting this study is Crystal Peebles. You may ask any questions you have now. If you have a question later, you are encouraged to contact the researcher at … The adviser for this study is Nancy Rogers, 644-4142, [email protected].

If you have any questions or concerns regarding this study and would like to talk to someone other than the researcher, you are encouraged to contact the FSU IRB at 2010 Levy Street, Research Building B, Suite 276, Tallahassee, FL 32306-2742, or 850-644-8633, or by email at [email protected].

You will be given a copy of this information to keep for your records.

Statement of Consent:

I have read the above information. I have had the opportunity to ask questions and have received answers. I consent to participate in the study.

Signature                Date

FSU Human Subjects Committee Approved 6/23/10. Void after 6/22/11 HSC# 2010.4328


REFERENCES

Aarden, Bret. 2003. “Dynamic Melodic Expectancy.” Ph.D. Diss., Ohio State University.

Agawu, Victor Kofi. 1987. “Concepts of Closure and Chopin’s Opus 28.” Music Theory

Spectrum 9: 1–17.

Allanbrook, Wye Jamison. 1994. “Mozart’s Tunes and the Comedy of Closure.” In On Mozart,

edited by James M. Morris, 169–89. Cambridge: Woodrow Wilson Center Press and the

Press Syndicate of the University of Cambridge.

Anson–Cartwright, Mark. 2007. “Concepts of Closure in Tonal Music: A Critical Study.” Theory

and Practice 32: 1–17.

Baird, Jodie A. and Dare A. Baldwin. 2001. “Making Sense of Human Behavior: Action Parsing and Intentional Inference.” In Intentions and Intentionality: Foundations of Social Cognition, edited by Bertram F. Malle, Louis J. Moses, and Dare A. Baldwin, 193–206. Cambridge: The MIT Press.

Baker, Dorothy Zayatz. 2003. “Aaron Copland’s Twelve Poems of Emily Dickinson: A Reading

of Dissonance and Harmony.” The Emily Dickinson Journal 12: 1–24.

Baldwin, Dare, Annika Andersson, Jenny Saffran, and Meredith Meyer. 2008. “Segmenting

Dynamic Human Action via Statistical Structure.” Cognition 106 (3): 1382–407.

Bharucha, Jamshed Jay and Carol L. Krumhansl. 1983. “The Representation of Harmonic Structure in Music: Hierarchies of Stability as a Function of Context.” Cognition 13: 63–102.

Bharucha, Jamshed Jay and Keiko Stoeckig. 1986. “Reaction Time and Musical Expectancy:

Priming Chords.” Journal of Experimental Psychology: Human Perception and

Performance 12 (4): 403–10.

Brower, Candace. 2000. “A Cognitive Theory of Musical Meaning.” Journal of Music Theory

44 (2): 323–79.

Bryden, Kristy A. 2001. “Musical Conclusions: Exploring Closural Processes in Five Late Twentieth-Century Chamber Works.” Ph.D. Diss., University of Nebraska.

Burstein, Poundie. 2010. “Half, Full or In Between? Distinguishing Between Half and Authentic Cadences.” Paper presented at the Annual Meeting of the Society for Music Theory, Indianapolis, 4–7 November.

Byros, Vasileios (Vasili). 2009. “Foundations of Tonality as Situated Cognition, 1730–1830: An Enquiry into the Culture and Cognition of Eighteenth-Century Tonality with Beethoven’s Eroica Symphony as a Case Study.” Ph.D. diss., Yale University.

Cadwallader, Allen and David Gagné. 2006. Analysis of Tonal Music: A Schenkerian Approach, 2nd ed. New York: Oxford University Press.

Caplin, William E. 2004. “The Classical Cadence: Conceptions and Misconceptions.” Journal of the American Musicological Society 57 (1): 51–117.


Carlsen, James C. 1981. “Some Factors which Influence Melodic Expectancy.” Psychomusicology 1: 12–29.

Cherlin, Michael. 1991. “Thoughts on Poetry and Music, on Rhythms in Emily Dickinson’s ‘The World Feels Dusty’ and Aaron Copland’s Setting of It.” Intégral 5: 55–75.

Clarke, Eric F. 2001. “Meaning and the Specification of Motion in Music.” Musicæ Scientiæ: The Journal of the European Society for the Cognitive Sciences of Music 5 (2): 213–34.

Clendinning, Jane Piper and Elizabeth West Marvin. The Musician’s Guide to Theory and Analysis. New York: W.W. Norton & Company.

Clifford, Robert. 2005. “Perennial Questions: Atonal Closure: Process, Completion, and Balance.” Tempo: A Quarterly Review of Modern Music 59 (234): 29–33.

Cook, Nicholas. 1987. “The Perception of Large-Scale Tonal Closure.” Music Perception 5 (2): 197–206.

Cuddy, Lola L. and Carol A. Lunney. 1995. “Expectancies Generated by Melodic Intervals: Perceptual Judgments of Continuity.” Perception and Psychophysics 57: 451–62.

Deliège, Irene. 1987. “Grouping Conditions in Listening to Music: An Approach to Lerdahl and Jackendoff’s Grouping Preference Rules.” Music Perception 4 (4): 325–60.

———. 2006. “Emergence, Anticipation, and Schematization Processes in Listening to a Piece of Music: A Re-Reading of the Cue Abstraction Model.” In New Directions in Aesthetics, Creativity, and the Arts, edited by Paul Locher, Colin Martindale, and Leonid Dorfman, 153–73. Amityville, NY: Baywood Publishing Company.

Deliège, Irene, Marc Mélen, Diana Stammers, and Ian Cross. “Musical Schemata in Real-Time Listening to a Piece of Music.” Music Perception 14 (2): 117–59.

Deutsch, Diana. 1991. “Pitch Proximity in the Grouping of Simultaneous Tones.” Music Perception 9 (2): 185–98.

Eberlein, Roland and Jobst Peter Fricke. 1992. Kadenzwahrnehmung und Kadenzgeschichte: Ein Beitrag zu einer Grammatik der Musik. Frankfurt am Main: Peter Lang.

Edwards, George. 1991. “The Nonsense of an Ending: Closure in Haydn’s String Quartets.” The Musical Quarterly 75 (3): 227–54.

Forrest, David. 2010. “Prolongation in the Choral Music of Benjamin Britten.” Music Theory Spectrum 32 (1): 1–25.

Gauldin, Robert. 1985. A Practical Approach to Sixteenth-Century Counterpoint. Long Grove, Illinois: Waveland Press.

———. 1988. A Practical Approach to Eighteenth-Century Counterpoint. Prospect Heights, Illinois: Waveland Press.

———. 2004. Harmonic Practice in Tonal Music, 2nd ed. New York: W.W. Norton & Company.

Gjerdingen, Robert O. 1988. A Classic Turn of Phrase: Music and the Psychology of Convention. Philadelphia: University of Pennsylvania Press.

———. 1994. “Apparent Motion in Music?” Music Perception 11 (4).


Graubart, Michael. 2003. “Perennial Questions: What Are Twelve-Note Rows Really For?” Tempo: A Quarterly Review of Modern Music 57 (225): 32–36.

Grave, Floyd K. 2009. “Freakish Variations on a ‘Grand Cadence’ Prototype in Haydn’s String Quartets.” Journal of Musicological Research 28 (2–3): 119–45.

Hanninen, Dora A. 2001. “Orientations, Criteria, Segments: A General Theory of Segmentation for Music Analysis.” Journal of Music Theory 45 (2): 345–433.

Hard, Bridgette M., Barbara Tversky and David Lang. 2006. “Making Sense of Abstract Events: Building Event Schemas.” Memory & Cognition 34 (6): 1221–35.

Hasty, Christopher. 1981. “Segmentation and Process in Post-Tonal Music.” Music Theory Spectrum 3: 54–73.

———. 1984. “Phrase Formation in Post-Tonal Music.” Journal of Music Theory 28 (2): 167–90.

Hébert, Sylvie, Isabelle Peretz, and Lise Gagnon. 1995. “Perceiving the Tonal Ending of Tune Excerpts: The Roles of Pre-existing Representation and Musical Expertise.” Canadian Journal of Experimental Psychology 49 (2): 193–209.

Hepokoski, James and Warren Darcy. 2006. Elements of Sonata Theory: Norms, Types, and Deformations in the Late-Eighteenth-Century Sonata. Oxford: Oxford University Press.

Hintzman, Douglas L. 1986. “‘Schema Abstraction’ in a Multiple-Trace Memory Model.” Psychological Review 93 (4): 411–28.

———. 1988. “Judgments of Frequency and Recognition Memory in a Multiple-Trace Memory Model.” Psychological Review 95 (4): 528–51.

———. 2010. “How Does Repetition Affect Memory? Evidence from Judgments of Recency.” Memory & Cognition 38 (1): 102–15.

Hopkins, Robert G. 1990. Closure and Mahler’s Music: The Role of Secondary Parameters. Philadelphia: University of Pennsylvania Press.

Huron, David. 2006. Sweet Anticipation: Music and the Psychology of Expectation. Cambridge: MIT Press.

Hyland, Anne M. 2009. “Rhetorical Closure in the First Movement of Schubert’s Quartet in C Major, D. 46: A Dialogue with Deformation.” Music Analysis 28 (i): 111–42.

Joichi, Janet M. 2006. “Closure, Context, and Hierarchical Grouping in Music: A Theoretical and Empirical Investigation.” Ph.D. diss., Northwestern University.

Jonaitis, Erin McMullen and Jenny R. Saffran. 2009. “Learning Harmony: The Role of Serial Statistics.” Cognitive Science 33: 951–68.

Jones, Evan and Matthew Shaftel. 2009. A Critical Approach to Sight Singing and Musical Style, preliminary ed. Plymouth: Hayden-McNeil Publishing.

Kessler, Edward J., Christa Hansen, and Roger N. Shepard. 1984. “Tonal Schemata in the Perception of Music in Bali and in the West.” Music Perception 2 (2): 131–65.


Knösche, Thomas R., Christiane Neuhaus, Jens Haueisen, Kai Alter, Burkhard Maess, Otto W. Witte, and Angela D. Friederici. 2005. “Perception of Phrase Structure in Music.” Human Brain Mapping 24: 259–73.

Krumhansl, Carol. 1990. “Tonal Hierarchies and Rare Intervals in Music Cognition.” Music Perception 7 (3): 309–24.

———. 1996. “A Perceptual Analysis of Mozart’s Piano Sonata K. 282: Segmentation, Tension, and Musical Ideas.” Music Perception 13 (3): 401–32.

Krumhansl, Carol L., Jukka Louhivuori, Petri Toiviainen, Topi Järvinen, and Tuomas Eerola. 1999. “Melodic Expectation in Finnish Spiritual Hymns: Convergence of Statistical, Behavioral and Computational Approaches.” Music Perception 17: 151–95.

Krumhansl, Carol L., Pekka Toivanen, Tuomas Eerola, Petri Toiviainen, Topi Järvinen, and Jukka Louhivuori. 2000. “Cross-Cultural Music Cognition: Cognitive Methodology Applied to North Sami Yoiks.” Cognition 76: 13–58.

Kurby, Christopher A. and Jeffrey M. Zacks. 2007. “Segmentation in the Perception and Memory of Events.” Trends in Cognitive Sciences 12: 72–79.

Kurth, Richard B. 2000. “Moments of Closure: Thoughts on the Suspension of Tonality in Schoenberg’s Fourth Quartet and Trio.” In Music of My Future: The Schoenberg Quartets and Trio, edited by Reinhold Brinkmann and Christoph Wolff, 139–60. Cambridge: Harvard University Press.

Lerdahl, Fred and Ray Jackendoff. 1977. “Toward a Formal Theory of Tonal Music.” Journal of Music Theory 21 (1): 111–71.

———. 1983. A Generative Theory of Tonal Music. Cambridge: MIT Press.

Margulis, Elizabeth Hellmuth. 2005. “A Model of Melodic Expectation.” Music Perception 22 (4): 663–713.

———. 2007. “Silences in Music are Musical not Silent: An Exploratory Study of Context Effects on the Experience of Musical Pauses.” Music Perception 24 (5): 485–506.

Marvin, Elizabeth West and Alexander R. Brinkman. 1999. “The Effect of Modulation and Formal Manipulation on Perception of Tonic Closure by Expert Listeners.” Music Perception 16 (4): 389–408.

McCreless, Patrick. 1991. “The Hermeneutic Sentence and Other Literary Models for Tonal Closure.” Indiana Theory Review 12: 35–73.

Magliano, Joseph P., Jason Miller, and Rolf A. Zwaan. 2001. “Indexing Space and Time in Film Understanding.” Applied Cognitive Psychology 15: 533–45.

Meyer, Leonard B. 1956. Emotion and Meaning in Music. Chicago: University of Chicago Press.

———. 1973. Explaining Music. Chicago: University of Chicago Press.

Monahan, Seth. 2011. “Success and Failure in Mahler’s Sonata Recapitulations.” Music Theory Spectrum 33 (1): 37–58.

Narmour, Eugene. 1990. The Analysis and Cognition of Basic Melodic Structures: The Implication–Realization Model. Chicago: The University of Chicago Press.


Neuhaus, Christiane, Thomas R. Knösche, and Angela D. Friederici. 2006. “Effects of Musical Expertise and Boundary Markers on Phrase Perception in Music.” Journal of Cognitive Neuroscience 18 (3): 472–93.

Newtson, Darren, Gretchen Engquist, and Joyce Bois. 1977. “The Objective Basis of Behavior Units.” Journal of Personality and Social Psychology 35: 847–62.

Nusseck, Manfred and Marcel M. Wanderley. 2009. “Music and Motion—How Music-Related Ancillary Body Movements Contribute to the Experience of Music.” Music Perception 26 (4): 335–53.

Ockelford, Adam. 2006. “Implication and Expectation in Music: A Zygonic Model.” Psychology of Music 34: 81–142.

Pearce, Marcus T. and Geraint A. Wiggins. 2006. “Expectation in Melody: The Influence of Context and Learning.” Music Perception 23 (5): 377–405.

Pearce, Marcus T., Daniel Müllensiefen, and Geraint A. Wiggins. 2010. “The Role of Expectation and Probabilistic Learning in Auditory Boundary Perception: A Model Comparison.” Perception 39: 1367–91.

Pearsall, Edward. 1999. “Mind and Music: On Intentionality, Music Theory, and Analysis.” Journal of Music Theory 43 (2): 231–55.

Pellegrino, Catherine. 2002. “Aspects of Closure in the Music of John Adams.” Perspectives of New Music 40 (1): 147–75.

Reber, Rolf, Piotr Winkielman, and Norbert Schwarz. 1998. “Effects of Perceptual Fluency on Affective Judgments.” Psychological Science 9 (1): 45–48.

Reinhard, Thilo. 1989. The Singer’s Schumann. New York: Pelion Press.

Reti, Rudolph. 1951. The Thematic Process in Music. Westport, CT: Greenwood Press. Reprinted 1978.

Roeder, John. 2010. “Superposition in Saariaho’s ‘The claw of the magnolia . . .’.” Paper presented at the Annual Meeting of the Society for Music Theory, Indianapolis, 4–7 November.

Rogers, Michael R. 1984. Teaching Approaches in Music Theory: An Overview of Pedagogical Philosophies. Carbondale and Edwardsville: Southern Illinois University Press.

Rosner, Burton S. and Leonard B. Meyer. 1982. “Melodic Processes and the Perception of Music.” In The Psychology of Music, edited by Diana Deutsch, 317–41. New York: Academic Press.

———. 1986. “The Perceptual Roles of Melodic Process, Contour, and Form.” Music Perception 4: 1–40.

Saffran, Jenny R. 2001. “Constraints on Statistical Language Learning.” Journal of Memory and Language 47 (1): 172–96.

Saffran, Jenny R., Richard N. Aslin, and Elissa L. Newport. 1996. “Statistical Learning by 8-month-old Infants.” Science 274 (5294): 1926–28.


Saffran, Jenny R., Elizabeth K. Johnson, Richard N. Aslin, and Elissa L. Newport. 1999. “Statistical Learning of Tone Sequences by Human Infants and Adults.” Cognition 70 (1): 27–52.

Sarver, Sarah. 2010. “Embedded and Parenthetical Chromaticism: A Study of Their Structural and Dramatic Implications in Selected Works by Richard Strauss.” Ph.D. diss., Florida State University.

Satyendra, Ramon. 1997. “Liszt’s Open Structures and the Romantic Fragment.” Music Theory Spectrum 19 (2): 184–205.

Schellenberg, Glenn E. 1996. “Expectancy in Melody: Tests of the Implication–Realization Model.” Cognition 58: 75–125.

———. 1997. “Simplifying the Implication–Realization Model of Melodic Expectancy.” Music Perception 14: 295–318.

Schmuckler, Mark A. 1989. “Expectation in Music: Investigation of Melodic and Harmonic Processes.” Music Perception 7 (2): 109–49.

Serafine, Mary Louise. 1988. Music as Cognition: The Development of Thought in Sound. New York: Columbia University Press.

Shepard, Roger. 1964. “Circularity in Judgments of Relative Pitch.” Journal of the Acoustical Society of America 36: 2346–53.

Snyder, Bob. 2000. Music and Memory. Cambridge: MIT Press.

Soll, Beverly and Ann Dorr. 1992. “Cyclical Implications in Aaron Copland’s Twelve Poems of Emily Dickinson.” College Music Symposium 32: 99–128.

Sridharan, Devarajan, Daniel J. Levitin, Chris H. Chafe, Jonathan Berger, and Vinod Menon. 2007. “Neural Dynamics of Event Segmentation in Music: Converging Evidence for Dissociable Ventral and Dorsal Networks.” Neuron 55 (3): 521–32.

Sutcliffe, W. Dean. 2010. “Ambivalence in Haydn’s Symphonic Slow Movements of the 1770s.” Journal of Musicology 27 (1): 84–134.

Swallow, Khena M., Jeffrey M. Zacks, and Richard A. Abrams. 2009. “Event Boundaries in Perception Affect Memory Encoding and Updating.” Journal of Experimental Psychology: General 138 (2): 236–57.

von Hippel, Paul. 2000. “Questioning a Melodic Archetype: Do Listeners Use Gap-Fill to Classify Melodies?” Music Perception 18 (2): 139–53.

Wheelock, Gretchen A. 1991. “Engaging Strategies in Haydn’s Opus 33 String Quartets.” Eighteenth-Century Studies 25 (1): 1–30.

Zacks, Jeffrey. 2004. “Using Movement and Intentions to Understand Simple Events.” Cognitive Science 28: 979–1008.

Zacks, Jeffrey M. and Khena M. Swallow. 2007. “Event Segmentation.” Current Directions in Psychological Science 16: 80–84.


Zacks, Jeffrey M., Barbara Tversky, and Gowri Iyer. 2001. “Perceiving, Remembering, and Communicating Structure in Events.” Journal of Experimental Psychology: General 130 (1): 29–58.

Zacks, Jeffrey M., Nicole K. Speer, and Jeremy R. Reynolds. 2009. “Segmentation in Reading and Film Comprehension.” Journal of Experimental Psychology: General 138 (2): 307–27.

Zacks, Jeffrey M., Nicole K. Speer, Jean M. Vettel, and Larry L. Jacoby. 2006. “Event Understanding and Memory in Healthy Aging and Dementia of the Alzheimer Type.” Psychology & Aging 21: 466–82.

Zacks, Jeffrey M., Nicole K. Speer, Khena M. Swallow, Todd S. Braver, and Jeremy R. Reynolds. 2007. “Event Perception: A Mind–Brain Perspective.” Psychological Bulletin 133 (2): 273–93.

Zacks, Jeffrey M., Shawn Kumar, Richard A. Abrams, and Ritesh Mehta. 2009. “Using Movement and Intentions to Understand Human Activity.” Cognition 112: 201–16.

Zacks, Jeffrey M., Todd S. Braver, Margaret A. Sheridan, David I. Donaldson, Abraham Z. Snyder, John M. Ollinger, et al. 2001. “Human Brain Activity Time-Locked to Perceptual Event Boundaries.” Nature Neuroscience 4 (6): 651–55.

Zwaan, Rolf A. and Gabriel A. Radvansky. 1998. “Situation Models in Language Comprehension and Memory.” Psychological Bulletin 123 (2): 162–85.

Zwaan, Rolf A., Mark C. Langston, and Arthur C. Graesser. 1995. “The Construction of Situation Models in Narrative Comprehension: An Event-Indexing Model.” Psychological Science 6 (5): 292–97.


BIOGRAPHICAL SKETCH

Crystal Peebles received a B.M. in Music Education from East Carolina University and an M.M. and Ph.D. in Music Theory from The Florida State University. Crystal has presented research at a variety of conferences, including the International Conference for Music Perception and Cognition, the Annual Meeting of the Society for Music Theory, and numerous regional conferences. She currently teaches music theory at Northern Arizona University.