Ph.D. Thesis - Singing as one: community in synchrony

Singing as one: community in synchrony Guy Hayward Trinity College This dissertation is submitted to the Faculty of Music for the degree of Doctor of Philosophy. December 2014

Declaration

This thesis is the result of my own work and does not include work done in collaboration.

No part of this thesis has been submitted for another degree or qualification.

The thesis does not exceed 80,000 words, including tables, but excluding table of contents, appendix, bibliography, and figures.

Acknowledgements

I wish to thank my supervisor, Professor Ian Cross, for guiding me through this process gracefully and masterfully, with such a unique breadth of knowledge to call upon. He has suggested various papers in fields that, although seemingly unconnected, have turned out to be very useful in my inter-disciplinary navigation, yet his lightness of touch has allowed me to go my own way too.

I am grateful to my colleagues at the Centre for Music and Science, Faculty of Music, Cambridge, for their generosity with both time and expertise, and for the spirited debates we have enjoyed. In particular, I am grateful to David Greatrex for reading through my thesis, and for his insightful and probing questions. Thanks also to those colleagues who have taken charge of getting us together to debate, eat, and have fun, resulting in a supportive, happy community.

Thanks are due to Trinity, my college, for supporting me in so many ways. My Trinity-sponsored Domestic Research Studentship, the Projects Fund, Graduate Fund, and M.Phil. funding combined enabled me to undertake this work. Further thanks also to my supportive tutor, Professor Grae Worster, and Lynn Clift and Sian Gardner of the Tutorial Office who helped me through this process with equal helpings of practicality and kindness.

Thanks are also due to Jill Purce for introducing me to the practice of chanting for personal transformation, for demonstrating the importance of chanting within a ceremonial context, and for supporting and guiding me. Thanks also to Rupert Sheldrake for introducing me to new ways of thinking about collective organisation, and for his support and guidance. And to Chris Watson and my fellow choristers in Trieste who gave me the rare opportunity of being a participant observer singing Gregorian Chant in five days of Holy Week services, thus enabling me to perform the field study in this thesis.

Finally, I must thank my parents Julie and Charles, my brother Hugh, and the rest of my family, for making this possible and supporting me through everything in life.

Abstract

This thesis investigates the process termed ‘entrainment’ by which a group of singers can become ‘in time’ or ‘in synchrony’ with each other. Group entrainment is examined through the lenses of a number of disciplines throughout the thesis: ethnography, ethnomusicology, sociology, animal behaviour, complex systems theory, and psychology. Ch. 1 defines concepts such as entrainment, chant, ritual, and participatory and presentational musics, together with the ethnographical method that will be used in this thesis. Ch. 2 reveals, through an ethnographical survey of the function of group singing in various traditions around the world, that a central purpose of group singing is to form community. Ch. 3 examines the form of group singing throughout the world, and finds that a general theory of metre, the perceptual basis of entrainment, can be applied to geographically distinct traditions, but also that some chanting traditions do not need metre for entrainment to be possible. Ch. 4 contextualises Turner and Durkheim’s community-forming concepts of ‘communitas’ and ‘collective effervescence’ within a particular music anthropology study of the Suyán Indian ‘Mouse Ceremony’ in Amazonia, with reference to the concept of ‘entrainment’. After the initial ethnographic survey (Chs. 2-4), the rest of the thesis (Chs. 5-8) explores ways of thinking scientifically about the process of group entrainment. Ch. 5 discusses bottom-up and top-down approaches to understanding group behaviour, with reference to flocking and schooling behaviour in starlings and fish, and chanting and free jazz improvisation in human musical behaviour. Ch. 6 puts the bottom-up and top-down dialectic in the context of the current entrainment literature, in order to suggest new approaches for investigating entrainment in group contexts. Ch. 7 reviews the literature surrounding bottom-up aspects of group interaction such as gesture and visual communication, and the top-down influence of group hierarchy, and also inter-group and joint speech forms of entrainment. The various disciplinary perspectives discussed in previous chapters then provide the basis for an empirical field study in Ch. 8 of group entrainment in Gregorian chant, involving semi-structured interviews with choristers and systematic video analysis of their chanting. The study investigates how these choristers are able to start in synchrony with each other, looking particularly at the role of gesture and metrical perception. I find that no single mode of communication or perception offers an explanation for the precision of onset synchrony on its own, and therefore conclude that group onset synchrony must depend on a combination of many factors. This thesis moves the study of entrainment further by discussing ways in which we can think about and investigate group, as opposed to dyadic (joint), entrainment. It also shows the importance of combining the insights of both anthropological and scientific approaches when investigating group entrainment.

Contents

Abstract
Preface

Chapter 1 - Main concepts
1.1 Introduction
1.2 What is entrainment?
1.3 Speech, chant and song
1.3.1 What is the difference between speech and song?
1.3.2 What is the difference between chant and song?
1.4 My ethnographical method
1.5 When is music participatory or presentational?
1.5.1 Participation
1.5.2 Presentation
1.5.3 Is Gregorian chant participatory or presentational?
1.6 What is ritual?
1.7 Summary

Chapter 2 - Functions of singing and chanting around the world
2.1 The functions of music
2.2 Music as mediator between individual and group
2.2.1 The time-structuring of oneness
2.3 Music in worship
2.3.1 Music, nature, spirit, and survival
2.4 Chant for the individual
2.5 Text in religious chant
2.6 The offensive and defensive uses of chant
2.7 Summary

Chapter 3 - One metre, one communality
3.1 Actions in common
3.1.1 Repetition
3.1.2 Participation and entrainment
3.1.3 Cultural difference in body movement entrainment
3.2 The metre vs. rhythm distinction
3.2.1 Cross-cultural problem examples
3.2.2 Is rhythm free from metre possible?
3.2.3 Clayton’s general theory of metre and rhythm
3.3 Summary

Chapter 4 - Singing, communitas and effervescence
4.1 Introduction
4.2 Turner and Durkheim
4.2.1 Turner’s ‘communitas’
4.2.2 Durkheim’s ‘collective effervescence’
4.2.3 The similarity of communitas and effervescence
4.3 Talking about experience
4.4 Communal singing, communitas and effervescence
4.5 The Suyán Indian Mouse Ceremony
4.5.1 The Mouse Ceremony songs
4.6 The role of musical entrainment in social process
4.6.1 Entrainment for good or ill
4.7 The making of society through song
4.7.1 The force that changes social structure
4.7.2 How is this force created by communal singing?
4.8 The remaking of society through song
4.9 Individuality and collectivity in musical ritual
4.9.1 Individuality and collectivity in the Mouse Ceremony
4.10 Summary

Chapter 5 - The Unity of the Community: more than the sum of its parts?
5.1 Introduction
5.2 Group Interaction
5.2.1 The top-down vs. bottom-up dynamic in group interaction
5.3 Luhmann's systems theory
5.4 Complex systems theory
5.5 Complex systems and collective animal behaviour
5.5.1 Potts' 'Chorus Line Hypothesis'
5.5.2 Criticism of current explanations of flocking behaviour
5.5.3 Schooling behaviour in fish
5.6 Top-down vs. bottom-up processes in musical group interaction
5.6.1 Stability vs. instability in musical interaction
5.6.2 Is musical improvisation inherently unstable?
5.6.3 Emergent structure in musical interaction
5.7 Summary

Chapter 6 - Group Entrainment: is it planned or does it emerge?
6.1 Introduction
6.2 Joint action
6.2.1 Entrainment as the basis for joint action
6.2.2 The collective ‘we’
6.3 Entrainment in two forms
6.3.1 Planned coordination
6.3.2 Emergent coordination
6.3.3 The integration of emergent and planned coordination
6.4 Empirical challenges in investigating group entrainment processes
6.5 Summary

Chapter 7 - Group Entrainment: gestures, sensory communication, and group hierarchy
7.1 Introduction
7.2 Gesture
7.2.1 Gesture in Music
7.2.2 A conductor’s gestures
7.2.3 Emergent multi-level pulse hierarchies in gesture
7.2.4 Categories of musical gesture
7.3 Sensory channels
7.3.1 Do we need visual contact with another to entrain?
7.3.2 Gazing behaviour at boundary points in musical performance
7.3.3 Which cues do performers use most for entrainment?
7.3.4 The effect of group hierarchy on visual communication
7.4 Intra-group hierarchy in entrainment
7.4.1 Group hierarchy in performer-led small ensembles
7.4.2 Group hierarchy in conductor-led large ensembles
7.4.3 The effect of the leader-follower relationship on entrainment
7.5 Inter-group entrainment
7.6 Joint speech entrainment
7.7 Summary

Chapter 8 - Gregorian Psalmody Study
8.1 Introduction to Gregorian Chant
8.2 Introduction to empirical study
8.2.1 Summary of interviews with choristers
8.3 Analysis of Gregorian Psalmody performance
8.4 Gesture Analysis
8.4.1 Individual Gesture Analysis
8.4.2 Leadership Analysis
8.4.3 Collective Gestural vs. Vocal Onset Synchrony
8.5 Metrical perception
8.5.1 Method
8.5.2 Pilot test
8.5.3 Main results
8.6 General Discussion
8.7 Future Directions
8.8 Summary

Chapter 9 - Conclusions
9.1 How does Gregorian chant fit within this thesis?
9.2 Future Directions
9.3 Concluding thoughts

Bibliography

Appendix 1 - Videos of various ethnographic examples

Appendix 2 - Interview summary for Gregorian Psalmody Study
2.1 Introduction
2.2 How does psalmody differ from other forms of singing in terms of keeping in time with each other?
2.2.1 Do you feel Gregorian psalmody is best done in speech rhythm or 'even syllable rhythm'?
2.2.2 Which is easier: a fast or slow tempo?
2.3 Are visual or aural cues more useful for getting 'in time' with other people?
2.3.1 Which bench formation is best?
2.3.2 Is reading the notation and words correctly more important than being aware of the other people?
2.3.3 Do you look out of the corner of your eye?
2.3.4 Are you aware of other people's breathing influencing when you come in?
2.4 Are you aware of the signals you give to others?
2.5 Do you focus on certain people? Are they part of a hierarchy?
2.5.1 How important is the conductor for keeping in time?
2.6 How would performing psalmody differ in groups of 20 vs. 8 vs. 2 people?
2.7 Has the chanting changed during the course of the week?

Appendix 3 - Online access to accompanying digital media

Preface

Humans sing. We don’t always know why we are singing, but we do it anyway, either out of simple desire or, in many cases, need. One can talk universally in this way because singing, whether done individually or as part of a group, seems to occur in almost all human societies on this planet (Lomax, 1968:3; see also Nettl, 2000). Indeed, Lomax’s ‘Cantometrics’ project has found that ‘the geography of song styles traces the main paths of human migration and maps the known historical distributions of culture’ (Ibid. 3).

This thesis is primarily concerned with communal singing, rather than individual ‘solo’ singing, and, in particular, will explore how group singing can inspire a sense of community by use of the concept of ‘entrainment’, defined as ‘the process by which independent rhythmical systems interact with each other’ (Clayton, 2012:49). In the context of dance, McNeill (1995:8) describes how ‘a blurring of self-awareness and the heightening of fellow-feeling with all who share in the dance…[is] the characteristic alteration of consciousness that sets in as the rhythm of muscular movement takes hold’. Thus, the project of this thesis relates to Birdwhistell’s notion that ‘at least 90 percent of the exchange along all the channels in any moment of human interconnectedness consists of signals that maintain the communication context and secure the base line of the conversation’ (see Lomax, 1968:173). I will argue that the process of entrainment in singing and chanting contexts comes to form a rhythmic ‘base line’ that allows for the maintenance of group interconnectedness.

If one accepts that most forms of group singing require at least a modicum of interactional unity or synchrony between the group of singers, then it follows that some degree of entrainment is almost certainly operational, even if the goal of the music is not perfect synchrony. Perhaps the most extreme example of group vocalisation in our species is football chanting, where, on a weekly basis, crowds of thousands of musically untrained individuals chant in synchrony with each other with no obvious ‘leader’ (see Appendices 1.1 & 1.2). It is astonishing that this number of football supporters can chant in synchrony, and, moreover, that they don’t need to learn formally how to do it (see also Kauffman, 1995, on complex systems; and Winkler et al., 2009, on entrainment in infants). Chanting and singing in general also seem possible for all humans; they can be practised by both males and females of all ages.

Chanting, a form of singing that is a focus in this thesis, is clearly an important human activity (see Ch. 1.3 for definition). As Ch. 2 shows, chanting is globally widespread, and is observed in some of the remotest places on earth (e.g. the Hawaiian islands). Chanting also seems to fulfil important social functions in the annual calendar for many societies. Some religious traditions (Sufi, Buddhist, Hindu, Catholic, Anglican, and Islamic) use chanting as a means for accessing altered states of consciousness that draw both individuals and the group into experiencing ‘oneness’, the ultimate aim of many spiritual traditions. In Polynesia, chanting is a means of securing autonomy and power for the community (Moulin, 1994), whereas in other cultures it can express violence, as in war chants. In Bulgaria, the women use it to pass the time with each other as they perform their daily chores (Rice, 1994:115-26; see also McNeill, 1995:51), whereas British members of Nichiren Shoshu Buddhism practise their chanting in order to create good karma (Wilson & Dobbelaere, 1994), and in some Benedictine orders Gregorian chant is said to maintain good health amongst the monastic community as well as perform its liturgical function (Tomatis, 2005).

These examples outline various differences in what chanting is used for, but perhaps it is most widely used to achieve what Sloboda (2004:358) calls ‘communality’. Indeed, in relation to this thesis’s focus on entrainment, Durkheim (1995:390) adds that in communal rituals: ‘what matters most is that individuals are assembled and that feelings in common are expressed through actions in common’.

Over the course of Chs. 2-4, I hope to go into more detail regarding the form and function of various singing and chanting traditions around the world. A few questions will guide this ethnographical survey, e.g. ‘What is song and chant for?’, ‘What aspects of singing and chanting are shared in cultures and traditions across the world?’, ‘How does the social function of singing and chanting manifest itself in various cultures around the world?’, and ‘How is song and chant structured musically in such a way that group members can feel unified in their act of chanting?’. Thus, my aim in this survey is to develop a meta-anthropological account of singing and chanting traditions that looks at how chanting is a part of, and even creates, many aspects of social life.

The rest of the thesis, Chs. 5-9, focuses on how groups of singers and musicians become entrained, and includes an empirical field study of Gregorian Chant. But now, in Ch. 1, I first need to introduce concepts that are central to the thesis.

Chapter 1 - Main concepts

1.1 Introduction

This chapter introduces various concepts that will be discussed throughout this thesis. I will start by defining ‘entrainment’ because this is the primary concept by which I will explore group singing. Next I will define speech, chant and song, given that these are terms that will distinguish between different kinds of vocalisation throughout the thesis. The broad ethnographical survey in Chs. 2-4 involves many different kinds of ‘musical cultures’, and therefore, for the sake of clarifying the boundaries between these various kinds of culture, I then split the broad term ‘musical culture’ into three categories—local, regional, and transregional. The level of active participation varies between different music-making contexts and therefore I next define the distinction between participatory and presentational music-making, exploring the tension inherent in the use of these terms with reference to Gregorian chant, the subject of the empirical field study in Ch. 8. Finally, I define what I mean by ritual, given that Gregorian chant, and communal singing in general, often occurs within the context of ritual.

1.2 What is entrainment?

Synchronisation can be defined as ‘doing the same thing at the same time’ (Cummins, 2012b). This limits synchronisation to a process of matching one pulse with another, whereas musical metre can be understood as a kind of hierarchical ‘grid’ with multiple pulse levels, so that two people can perform actions at different times and yet still be considered to be entraining to the same metre. The term ‘synchronisation’ is also not appropriate for discussing activities such as tango dancing where pairs of individuals move at different times from each other, even though the timing of each person’s movements is dependent on the other. Furthermore, even though unison group singing, for example, can be characterised as ‘doing the same thing at the same time’, there are clearly many degrees to which the singers can be said to be temporally coordinated (see Ch. 8.5). Synchrony is thus a limit case of temporal coordination, and therefore it needs to be placed in the broader context of entrainment.

Entrainment is perhaps a more useful concept than synchronisation for thinking about behaviours that demonstrate temporal coupling to varying degrees, although a settled definition does not yet exist, partly because the study of entrainment is relatively young and partly because the boundaries between different degrees of entrainment are often blurred. The following discussion attempts to define entrainment for the purposes of this thesis. ‘Entrainment’ is ‘the process by which independent rhythmical systems interact with each other. ‘Independent rhythmical systems’ can be of many types: what they have in common is some form of oscillatory activity (usually periodic or quasi-periodic in nature [in music, a ‘period’ is equivalent to one musical ‘beat’]); they must be independent in the sense of ‘self-sustaining’, i.e. able to be sustained whether or not they are entrained to other rhythmical systems (thus sympathetic vibration, as when a violin’s soundboard vibrates at the same frequency as one of its strings, is not an example of entrainment). In order for interaction to take place some form of coupling must exist between the rhythmical systems, and this too can take many forms. This process of interaction may result in those systems synchronising, in the most common sense of aligning in both phase and period, but in fact entrainment can lead to a wide variety of behaviours’ (Clayton, 2012:49). As the quote explains, the two or more independent rhythmical systems must be able to mutually influence each other for entrainment to occur, which is achieved by coupling between the systems (Himberg, 2013:30). The claim for entrainment regarding coupling is that ‘just as two clocks hanging on the same wall tend to synchronize because they are mechanically coupled (Huygens, 1673/1986), individuals may become automatically coupled through perceiving the same visual, auditory, or haptic information’ (Knoblich et al., 2011:66). However, the precise nature of the coupling between independent rhythmical systems is not defined by entrainment theory (Will, 2011:180). The kind of coupling observed has to be identified on a case-by-case basis, and precise identification may involve inter-disciplinary collaboration (e.g. ethnography and biology).
Indeed, the ‘informational’ coupling between humans and other biological organisms, via sensory channels like hearing and vision, can be thought of as weaker than the ‘mechanical’ coupling that exists between physical, non-intentional, entities such as clocks hanging on the same wall. Having said that, both types of coupling are governed by the same dynamic principles ‘in spite of the vastly different media for the interaction of the two rhythmic units [e.g. mechanical clocks vs. humans]’ (Schmidt et al., 2011). Therefore, in line with Clayton’s definition, communal singing is a bona fide example of entrainment, because each person can sustain their periodic actions even
if everyone else in the group stops singing, and they are coupled with other singers through a number of sensory channels of communication (e.g. sound, vision, and sometimes touch). Entrainment is ‘an abstraction describing a process common to many different phenomena occurring at different scales of time and space, in both biological and mechanical systems’, and is therefore not exclusively associated with human musical behaviour (Knoblich et al., 2011:66). For example, musical entrainment in singing occurs at timescales such as the millisecond level (vocal cord entrainment), sub-second level (body movements and pulse), multi-second level (the breath and metrical bar), hourly/weekly/seasonal/yearly levels, and the historical levels of decades, centuries, and millennia (see Ch. 9.2). However, it might be confusing to discuss entrainment at these different temporal levels, even though they are arguably relevant, and therefore this thesis will focus mainly on the level of the pulse (100-2000 msec) (Van Noorden, pers. comm.; see also Clayton, 2012:52; 2000:87). An important concept related to entrainment is ‘phase’. Whilst rhythmical processes are continuous they often have some sort of reference point; e.g. in walking, the reference point would be the moment the foot strikes the ground. When examining phase relationships between two people walking next to each other, for example, the footstrike of one of the pair would constitute a possible reference point. The time delay at which a particular footstrike occurs in relation to the other person’s footstrikes can then be represented by a ‘relative phase angle’ (in degrees, °), i.e. a specific point on the circumference of the unit circle corresponding to the cycle of the time interval between one footstrike and the next (the period). One fundamental property of ‘circular’ data is that the beginning and end of the scale coincide, that is, 0° = 360°; e.g. the mean of 30° and 350° is 10°, not 190°.
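The phase arithmetic described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not part of the method of this thesis: the function names `relative_phase_deg` and `circular_mean_deg` are hypothetical, and the 0.3 s delay within a 0.6 s cycle is simply a worked value.

```python
import cmath
import math


def relative_phase_deg(delay_s, period_s):
    """Relative phase angle in degrees, in [0, 360), of an event that occurs
    delay_s seconds after the reference event within a cycle of period_s seconds."""
    return (delay_s / period_s * 360.0) % 360.0


def circular_mean_deg(angles_deg):
    """Mean direction of a set of angles (degrees), computed on the unit circle.

    Each angle becomes a unit vector; the direction of the summed vector is
    the circular mean. (The result is undefined if the vectors cancel out.)"""
    total = sum(cmath.exp(1j * math.radians(a)) for a in angles_deg)
    return math.degrees(cmath.phase(total)) % 360.0


# A footstrike 0.3 s after the reference footstrike, in a 0.6 s cycle,
# lies half-way round the circle.
half_cycle = relative_phase_deg(0.3, 0.6)

# The arithmetic mean of 30 and 350 would be 190; the circular mean is close to 10.
mean_angle = circular_mean_deg([30.0, 350.0])

print(half_cycle, mean_angle)
```

The conversion from time delay to angle is the same whatever the length of the cycle, which is what allows phase relationships at different tempi to be compared.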
Therefore, if the whole cycle of footfall was 0.6 s, then one person’s footfall occurring 0.3 s after the other person’s footfall would result in a point at 180° on the circumference, which shows entrainment that is maximally out-of-phase. In music, a person singing at a 180° phase would be singing on the ‘off-beat’, i.e. in syncopation with the beat at 0°. Although a phase analysis between two oscillating rhythms is relatively straightforward, it is more difficult in group contexts to decide the reference point against which each individual’s actions are measured, because any one individual could
serve as the reference point that all the others are judged against. One statistical approach is to calculate an ‘order parameter’ for the whole group, which is a kind of collective reference point, allowing for continuous analysis of group entrainment (see Huang et al., 1998, for the Hilbert-Huang transform and Acebrón et al., 2005, for the ‘Kuramoto model’). However, this kind of statistical method requires objective ‘stationarity’: a quality of a process in which the statistical parameters (mean and standard deviation) of the process do not change with time (see Shumway, 1988). An example of a process that exhibits perfect stationarity is the ticking of an unchanging timekeeper such as a metronome. Apart from forms of music-making that exhibit highly regular periodicity, most real-life human interaction contexts lack stationarity and therefore cannot be analysed by these statistical methods. Moreover, gait synchronisation between pairs of walkers, or any other kind of bio-mechanical activity, is also prone to temporal variability in both phase and period, and without means to correct these errors the walkers will fall out of entrainment with each other (Repp & Keller, 2004:499; see also Himberg, 2013). A process of ‘error correction’ is thus a fundamental aspect of entrainment behaviour in all its forms, and again, as more individuals are added to the group, analysing error correction becomes exponentially more complex because there are more sources of potential error to accommodate. However, relationships between independent rhythmic sources that are error-prone still count as genuine entrainment behaviour if they satisfy two conditions: [i] the relative phase relationship must be stable, and [ii] if the relationship is disrupted then re-stabilisation of the previous phase relationship occurs (Clayton, 2012:50). Also, rhythms do not have to be either perfectly in-phase (0°) or out-of-phase (180°) to be entrained.
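The idea of an order parameter as a collective reference point can be illustrated with the standard Kuramoto-style definition: the mean of the unit vectors for each individual’s phase. The sketch below is an assumption-laden illustration rather than the analysis used in this thesis. The function names are hypothetical, the phase values are invented, and the calculation presupposes the idealised, stationary phases that real performances often lack.

```python
import cmath
import math


def order_parameter(phases_deg):
    """Kuramoto-style order parameter for a group of phases (degrees).

    Returns (r, psi): r in [0, 1] is the coherence of the group (1 means all
    phases coincide, values near 0 mean they are spread round the circle), and
    psi is the mean phase in degrees, the group's collective reference point."""
    z = sum(cmath.exp(1j * math.radians(p)) for p in phases_deg) / len(phases_deg)
    return abs(z), math.degrees(cmath.phase(z)) % 360.0


def relative_phase_stability(phases_a_deg, phases_b_deg):
    """Coherence and mean of the relative phase (b minus a) over successive
    cycles; a coherence near 1 means the phase relationship is stable."""
    diffs = [(b - a) % 360.0 for a, b in zip(phases_a_deg, phases_b_deg)]
    return order_parameter(diffs)


# Hypothetical group of three singers clustered around 180 degrees.
r, psi = order_parameter([170.0, 180.0, 190.0])

# Hypothetical pair whose relative phase stays at 223 degrees on every cycle.
r_rel, psi_rel = relative_phase_stability(
    [0.0, 10.0, 20.0, 30.0], [223.0, 233.0, 243.0, 253.0]
)

print(r, psi, r_rel, psi_rel)
```

In this sketch a constant relative phase yields a coherence close to 1 whatever its value, so the measure registers stability rather than alignment at 0° or 180°.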
For example, if a 223° phase relationship is stable, and robust when perturbed, e.g. if other people are able to maintain the previous (stable) tempo even when a singer in the group starts to speed up (a perturbation), then the relationship counts as entrainment behaviour. Therefore, entrainment does not require people to ‘do the same thing at the same time’, and this versatility makes entrainment a more useful concept than synchronisation. The first general way in which entrainment manifests is that entrained rhythms do not just exist in a 1:1 ratio, but can also be in hierarchical relationships such as 2:1, 4:2:1, or 6:3:1, which ‘are so common in music as to be
trivial’ (Ibid.), and also polyrhythmic relationships between parts (3:2, 4:3). To illustrate, a 1:1 ratio would refer to two melodies that are identical in terms of note durations, whereas a 2:1 ratio would mean the note durations of one melody would be twice as long as those of the other melody (Clayton, 2012:51). However, the more complex the polyrhythm, the less stable the entrainment will be (Himberg, 2013:32). The second aspect of entrainment is that metrical percepts can emerge from auditory stimuli (Clayton, 2012:51; see Chs. 3.2 & 5.6). Thirdly, entrainment can involve independent rhythms that have matching periods, but which are out-of-phase; e.g. syncopated ‘off-beats’, possibly including ‘off-beats’ that are not 180° out-of-phase (Ibid.). Fourthly, entrainment can be symmetrical (i.e. there is an equal reciprocal influence between rhythmical inputs, e.g. mechanical clocks) or asymmetrical (i.e. there is a power imbalance between the rhythmical inputs, e.g. circadian rhythms such as night and day) (Ibid. 52). In relation to the last point, musical entrainment can either ‘be symmetrical in an ensemble made up of peers, asymmetrical when people play or dance along with pre-recorded music they cannot influence…[or] relatively symmetrical…where some people are more likely to have influence than others (e.g. conductors, section leaders, soloists, senior musicians). Music may then be a particularly good forum for investigating the interdependence between timing coordination and
social power relationships’ (Ibid. 52). Thus, musical entrainment can fall anywhere on the symmetrical-asymmetrical continuum, and we will look more closely at the influence of social power relationships on the symmetry of entrainment in Chs. 7.4, 7.5 & 8.4.2. To complete this general definition of entrainment, Clayton (Ibid. 51) describes three basic global forms of entrainment behaviour: [i] intra-individual entrainment, which takes place within a particular human being, from the oscillations of neurons to the coordination between limbs etc. (Large & Kolen, 1994; Large, 2000, 2010; Large & Jones, 1999; London, 2012); [ii] inter-individual/intra-group entrainment, which takes place between individuals within a group and will feature most prominently in this thesis; and [iii] inter-group entrainment, which concerns the coordination between different groups of individuals, such as the two teams of supporters chanting different chants at a football match (see Chs. 2.6, 4.5.1, 4.6.1, 7.5 & 8). These different forms of musical entrainment exist within a nested hierarchy, with inter-group entrainment being dependent on each group’s intra-group coordination, which is in turn dependent on each individual’s coordination of their limbs and voices, which is in turn dependent
on their ability to perceive metrical structure in external stimuli. All three forms of entrainment behaviour are relevant in the context of this thesis, but the main focus will be on intra-group entrainment.

1.3 Speech, chant and song

Singing and chanting are the main forms of music-making that will be examined in this thesis because both, particularly chanting, offer many instances of collective performance for exploration. The case study in Ch. 8 will investigate Gregorian ‘Chant’, and to define chanting it is necessary first to look at the difference between speaking and singing, because, in a general sense, chant is intermediate between speech and song. These three categories (speech, chant, and song) would seem to be distinguished around the world, and are therefore worth examination (see also List, 1963).

1.3.1 What is the difference between speech and song?

Both speech and song display shared basic characteristics in that they are vocally produced, linguistically meaningful (apart from nonsense and vocalise forms), melodic, and rhythmic (List, 1963:1; see also Feld & Fox, 1994). Given the focus on entrainment in this thesis, I will focus mainly on how speech and song compare on the ‘rhythmic’ level. Conversational speech rarely displays ‘periodicity’—i.e. sound events occurring at regular time intervals—and is therefore not ‘rhythmic’ in the way music is (Cummins, 2012a; Dauer, 1983). Nevertheless, Turk & Shattuck-Hufnagel (2013) argue that even though there is a lack of evidence for periodicity in the actual sounds of speech, ‘it is still very much an open question whether or not speech is (1) controlled using periodic [motor] control structures, and/or (2) perceived as periodic’ (see also Ghitza & Greenberg, 2009; Barry et al., 2009). Furthermore, whilst conversational speech is, broadly-speaking, not periodic, other speech registers can show periodicity (see Knight, 2013). For example, ‘dramatic representations, the delivery of sermons, or the telling of jokes and tales [and] auctioneering’ are more stylised and therefore likely to have a more uniform rhythmic profile than everyday speech (see also List, 1963:3). Intermediate forms of communication in between speech and song also exist in ostensibly musical contexts; for example, operatic recitative (see Aroui, 2009:4; App. 1.3).

Although speech is, on the whole, not periodic, Patel (2008:97) argues that native fluency in a language requires not just an understanding of its phonemes, vocabulary, and grammar but statistical learning of the ‘prominence’ patterns of timing and accentuation that characterise a particular spoken language (Ibid.; Taylor, 1981; Faber, 1986; Chela-Flores, 1994; Niebuhr, 2009). Furthermore, empirical findings suggest that spoken prosody influences musical culture (Patel et al., 2006), and that listeners can classify songs according to language of origin from rhythmic information alone (i.e. no words, no melody) (see Hannon, 2009).

Patel (2008:96) defines rhythm as ‘the systematic patterning of sound in terms of timing, accent, and grouping’, and not only do both music and speech share these characteristics in a general way, but the findings just described suggest that music and language may mutually influence each other on the rhythmic level. Having said that, whilst individual speech sound segments can have predictable durations when the context and speech style are known (Klatt, 1976; Kohler, 2009; Arvaniti, 2009; Nolan & Asu, 2009), speech rhythm over longer domains such as sentences is typically less regular than the rhythm of musical phrases.

Syllable durations in singing are arguably longer than syllable durations in speech, owing to the need to sustain vowel sounds, although a song with a very fast tempo may produce shorter vowel sounds than speech does. The measurement of syllable duration is therefore not in itself adequate to distinguish between song and speech, but the hierarchical relationships between syllable durations may be.

Metre in music has usually been conceptualised in terms of the hierarchical organisation of its beats (strong vs. weak), which is also how speech rhythm is often described (Patel, 2008:97; see Ch. 3.2 for the distinction between metre and rhythm). Strong beats are ‘perceptually accented’ points in music in the same way that words can be accented in speech, i.e. by being louder and more percussive. However, accented musical ‘beats’ exist within the context of a periodic framework, in contrast with accented words or syllables in the majority of speech contexts. One conceptualisation of this periodic framework is Lerdahl & Jackendoff’s (1983) hierarchical ‘metrical grid’. Clayton (2005:31) describes this grid ‘in terms of the interaction of two or more concurrent levels of pulsation, in such a way as to generate ‘beats’ which are relatively strong or weak (in an abstract structural sense, not necessarily louder or otherwise more stressed than the ‘weaker’ beats). A time point which is perceived as a beat on two different levels of pulsation is ‘structurally stronger’ than a point which is felt as a beat on only one level. For music to have metre, therefore, it must be perceived to have at least two such pulse levels: often there will be three or more.’
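Clayton’s description of the metrical grid can be sketched as a small calculation. This is a simplified illustration under stated assumptions, not Lerdahl & Jackendoff’s formal model: time is divided into equal positions, each pulse level is represented only by its period in positions, all levels are assumed to start together at position 0, and the structural strength of a position is taken to be the number of levels on which it counts as a beat. The function name and the chosen periods are hypothetical.

```python
def metrical_strengths(n_positions, periods):
    """Count, for each time position, how many pulse levels have a beat there.

    `periods` lists the period (in positions) of each concurrent pulse level;
    all levels are assumed to be aligned at position 0. A position that is a
    beat on more levels is 'structurally stronger' in Clayton's sense.
    """
    return [
        sum(1 for period in periods if position % period == 0)
        for position in range(n_positions)
    ]


# Three concurrent pulse levels: every position, every 2nd, every 4th.
pulse_periods = [1, 2, 4]
strengths = metrical_strengths(8, pulse_periods)
# strengths == [3, 1, 2, 1, 3, 1, 2, 1]

# On Clayton's criterion, metre requires at least two pulse levels.
has_metre = len(pulse_periods) >= 2
```

The resulting pattern of strong and weak positions is abstract: as the quotation stresses, a position with a higher count need not be louder or otherwise more stressed in performance.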

A musical grid is usually much stricter and more fixed than a linguistic grid, as one might expect given the requirements of periodicity (Jackendoff, 2002). The practice of setting linguistic texts to music in songs, chants etc., is, put simply, an attempt to align the musical and linguistic grids (Jackendoff, 2002:115; cf. also poetry in Lerdahl, 2001). However, because the linguistic grid is typically subordinate to the musical grid, this process of alignment often changes the natural speech rhythm considerably (Ibid.). Having said that, some chanting traditions prioritise the speech rhythm, which can lead to an irregular pulse (see Ch. 3.2.1).

The above discussion concerns how music and speech are related to each other on the level of ‘rhythm’. Rhythm was just one of the characteristics—‘vocally produced, linguistically meaningful, melodic, and rhythmic’—that List (1963:1) argues are shared by music and speech. It is evident that both singing and speech are vocally produced, so I will not go into further analysis here, even though the vocal production associated with each form of vocalisation differs dramatically (Titze & Worley, 2008).

However, not all vocables—words, terms, names, designations—in song are ‘linguistically meaningful’; for example, meaningless syllables are almost always present in American Indian songs, but are seldom found in their speech (List, 1963:1; see also Frisbie, 1980; App. 1.4). Meaningfulness relates to building new utterances ‘out of parts that have occurred in previous utterances, putting them together by patterns familiar from previous utterances. [Therefore,] [s]ince the audience, as well as the speaker, has had previous experience with the parts and patterns, the new combination is understood’ (Hockett, 1959:34).

Whilst the ‘meaningless vocables’ combine to form new utterances in American Indian songs, for example, these utterances are not understood to relate to previous familiar patterns of vocables that have meaning. Similarly, the Kalapalo people of Amazonia do not know the meaning of the stories in their songs, even though they may know the words off by heart; i.e. the words do not mean in any sense analogous to that evident in the everyday function of language (Basso, 1981). In a similar vein, many Western ritual chants are performed in archaic languages which are not always intelligible to performers or listeners, such as Gregorian Chant, which is performed in Latin (Purce, pers. comm.; App. 1.5). Of course, on the other hand, many singing or chanting traditions involve text that is based on conventional patterns of vocables understood to be meaningful. The linguistic content of song can thus range from nonsense to meaningful, and therefore this criterion is not reliable for the purposes of defining song and speech.

It is also difficult to distinguish between speech, chant, and song in terms of melody, given that speech can have a distinct melodic component (e.g. in tone languages) and chant can exhibit song-like melodies.

From the above discussion, we can see that there are multiple forms of vocalising in between speech and song; song is sometimes linguistically meaningful, sometimes not; and it can exhibit longer duration of vocables in comparison with speech, but often does not; and that different forms of vocalising are more or less ‘melodic’, but not categorically so. In short, it is difficult to find characteristics that distinguish between speech and song reliably and universally for all cultures. Regular periodicity is, however, a feature of many forms of singing and may perhaps be the characteristic that most reliably distinguishes singing from speech, but even this is not a totally reliable criterion for categorisation (e.g., see Ch. 3.2.1). Nevertheless, I will distinguish chant or song from speech by the presence of regular periodicity within its performance.

1.3.2 What is the difference between chant and song?

The above conclusion still leaves the problem of distinguishing chant from song. Chant is like song; indeed, the English word ‘chant’ often meant ‘sing’, coming from the French ‘chanter’ (to sing) (Dresher, 2008:47). However, if one takes the Western and Eastern examples of Gregorian Chant and Buddhist Mantric Chant, then chanting can be characterised as differing from song in a few main respects. Even though the following polythetic definition of chant is not sufficient to make a clear distinction between song and chant across all cultures, it will nonetheless suffice for the purposes of this thesis.

Chant, as opposed to song, exhibits a high degree of repetition in either melodic, rhythmic, or textual parameters, or a combination of each. Chanting can often consist of a single phrase repeated over and over again. It can be metrically simple, sometimes isosyllabic (i.e. regularly periodic but not perceived as metrical), and sometimes influenced by speech rhythm. It is [i] often melodically simple, possibly even as simple as being primarily monotonic with occasional auxiliary tones; [ii] it may also exhibit melodic contours that are related to speech intonation; or [iii] it may be performed in languages that are either archaic or nonsense, and are therefore not linguistically comprehensible. Having said that, none of [i]-[iii] are defining characteristics of chant.

Aside from the sacred contexts in which chant features, chanting also occurs in many different social contexts such as school and playground games, taunting, sports events, political rallies or protests, marches and other occasions when a group of people want to sing the same words at the same time (Dresher, 2008:47; Liberman, 1975; App. 1.1, 1.6, 1.7, 1.8, 1.9). As these examples show, chants can be quite varied melodically or rhythmically—ranging from the songs of football matches to the call-and-response recitations of political rallies—but all contain strong degrees of repetition (Dresher, 2008:47). Therefore, the overall defining characteristic of chant as opposed to song is that it exhibits a particularly high degree of repetition within its structure, either melodically, rhythmically, or textually. However, the exact degree of repetition required to distinguish it from song is not clear, and therefore each particular instance would also need to be contextualised within its wider cultural context.

1.4 My ethnographical method

I will be referring to different kinds of ‘musical cultures’ in Chs. 2-4, and therefore the term ‘musical culture’ needs to be split into categories, because Slobin (1992:2) argues that due to the ‘cultural counterpoint’ between the domains of the individual, community, ‘small-group’, state, industry, and global industry, the boundaries between cultures are blurred (Ibid. 4). Slobin describes how ‘local musics’ are known by ‘certain small-scale bounded audiences’, and only by them; ‘regional musics’ refer to a ‘flexible sense of region, partly as a result of the spread of broadcasting and recordings’; and ‘transregional musics’ have a ‘very high energy which spills across regional boundaries, perhaps even becoming global’ (Ibid. 7-9).

This ethnographic survey is focused on chanting and other communal singing traditions and the various examples used will fall into different categories of musical cultures; e.g. local musics, regional musics, and transregional musics. For example, the musical traditions of the Mbendjele and Suyá peoples (both examples of ‘local’ musics) are clearly functioning on a different scale to Gregorian chant or stadium football chants, which are widespread examples of transregional practices. However, exploration on both the levels of the local and transregional can contribute to a cross-cultural discussion about the form and function of chanting/singing/music-making.

Indeed, Slobin has argued that cultural analysis has often ignored the reciprocity between the various types of musics, possibly because doing so may blur the categories (Ibid. 72). For example, ethnomusicologists often use interviews with prominent performing musicians within a culture, where bands are ‘style-carrying small groups’, and then take these individual examples and ‘jump from these micro-worlds to the ‘group’ as a whole’ (Ibid. 21). This common methodology inevitably blurs the ‘lines between single activists and whole traditions, between ensembles and institutions’. Having said that, in the interest of gaining as full and rounded a picture as possible, I will use quotes from prominent performing musicians that result from wide-ranging surveys of a regional or transregional culture; e.g. quoting a Corsican singer to make claims about Corsican singing. However, interview material will only be used when it seems to hint at something more widely-applicable than personal experience, both to the local/regional/global culture in which the performer is embedded, and to music-making in general.

Although the boundaries are most likely to be fuzzy with transregional musics, the boundaries of local and regional musics are also difficult to demarcate due to the fact that they are shaped by ‘society, polity, economy, geography, interactional fields, collective identities, ethnicity, cultural practice, linguistic codes, communicability and comprehension, and regional networks’ (Brightman 1995:519; quoted in Bashkow, 2004:451). This is particularly relevant to the present survey when one considers a group of people making music, because they, their music, and the ritual which often frames their activity are all shaped by a complex interaction of these various components of culture. For example, even though chanting is often associated with highly ritualised or religious forms of performance that are resistant to outside influence (Moore, 1978:41), any form of musical practice is subject to the dynamic web of cultural interactions that Slobin and Bashkow highlight. These extra-musical cultural interactions can often provide further insight into the nature and purpose of the musical culture itself (see Ch. 2).

In ethnomusicology, form (pitch structure, rhythms etc.) has typically been the aspect most studied, but the difficulty here is that formal features can be simultaneously both variable and homogeneous, both within and across cultures.

Exploring the function of an activity—i.e. the purpose it is deemed to fulfil—provides a context within which to locate and explore formal similarities and differences, enabling form and function in music to be correlated or distinguished. Furthermore, exploring function allows interpretations to be situated within both wider cultural practices and generic biologically-based, or even evolutionary, sets of constraints.

The ethnographical approach I am taking here is perhaps unusual in that it is ultimately searching for regularities rather than variation. Whilst there may be legitimate cautions against such an endeavour—most obviously that there may be no such thing as ‘universal’ forms of behaviour—this thesis hopes to widen the narrow focus of much ethnomusicological investigation which has largely focused ‘on specific ethnographic examples rather than seeking to develop generalisable frameworks that may be applied across cultures’ (Cross, 2013:6).

Some studies represent exceptions to this general approach of ethnomusicology, by making universal distinctions; e.g. Turino’s (2008) study which distinguishes between participatory and presentational music-making, and Lomax’s ‘Cantometrics’ project which, in a similar way, distinguishes ‘groupy and integrated’ music-making from that which is ‘individualized and little integrated’ (Lomax, 1968:22). Turino and Lomax could be said to be distinguishing between two poles of a continuum that applies to all cultures. We will now discuss their work in more detail.

1.5 When is music participatory or presentational?

Thomas Turino, in his book Music as Social Life: The Politics of Participation, has created a theory that separates music-making across the globe into two main forms: participatory and presentational. Turino (2008:26) describes participatory performance as ‘a special type of artistic practice in which there are no artist-audience distinctions, only participants and potential participants performing different roles, and the primary goal is to involve the maximum number of people in some performance role’. By contrast, presentational performance refers to ‘situations where one group of people, the artists, prepare and provide music for another group, the audience, who do not participate in making the music or dancing’.

1.5.1 Participation

Turino uses the idea of participation ‘in the restricted sense of actively contributing to the sound and motion of a musical event through dancing, singing, clapping, and playing musical instruments when each of these activities is considered integral to the performance’, and describes how ‘In participatory music making, one’s primary attention is on the activity, on the doing, and on the other participants, rather than on an end product that results from the activity’ (Turino, 2008:28, original emphasis). For him, the crucial point to be made is that ‘it is not that people do not make qualitative judgements about other participants’ performance inwardly or that everyone is happy about problematic contributions to a performance—overall, people have a better time when the music and dance are going well. It is simply that in participatory traditions a priority is placed on encouraging people to join in regardless of the quality of their contributions’ (Ibid. 34).

Lomax’s concept of ‘group-involving’ performance is similar to Turino’s concept of ‘participatory’ performance in that Lomax defines a ‘group-involving’ performance as one where ‘all present can join in easily because of the relative simplicity and repetitiousness of the patterns’ (Lomax, 1968:16), and the music is usually ‘choral, with repetitious text, metrically simple, melodically simple, with no ornamentation, usually clear [i.e. no vibrato], and slurred enunciation’. These formal musical characteristics should be read only as a general guideline, however, because the music-making of some participatory traditions may involve greater complexity.

What may come as a surprise to those of us who are steeped in the customs of Western music is that certain social groups around the world do not regard participatory musical activities as inferior to professional concerts—instead they can be ‘the centre of social life’ (Turino, 2008:35). For example, in Prespa weddings, there are established procedures for the more experienced singers to help the less experienced, so that people of all ages and abilities can perform, as they are expected to (Ibid. 49; cf. Sugarman, 1997). If a less experienced singer sings a solo but it is clear that she needs help then more experienced singers will accompany her, and this is fine because the tradition stipulates that everyone sings a solo at some point during the proceedings, which reduces any potential embarrassment (Ibid.). Fürniss (2006:5) describes how for the Aka people in the Congo, ‘there are no professional musicians, every interested person can join in singing or learn to play an instrument…all members of the community have an equivalent status, i.e. nobody earns his/her living from music making and nobody is excluded from a performance, although certain singers are more competent or virtuoso than others’. An example of participatory behaviour much closer to home is that of Karaoke; however, whereas American or European individuals can choose whether or not to sing on their own, in Japanese karaoke individuals have no choice but to perform (Ibid.). Turino refers to this form of compulsory participation which involves every individual in a group taking part one by one, rather than simultaneously, as sequential participation.

A particularly common aspect of participatory music-making is its ‘cloaking’ function. Lomax (1968:15) describes how song performances in indigenous cultures frequently consist of ‘the whole of the society (or a large sector of it) vocalising in unison in one giant harsh voice’ (see App. 3, track 1). Given that there are no exposed melodic lines due to the ‘dense, loud heterophony’, performers are spared the potential humiliation of being heard as individuals (Turino, 2008:46). For these reasons Turino (Ibid.) argues that the noisy music-making typical of ‘participatory’ practices ‘provide a crucial cloaking function that helps inspire musical participation’.

One other common aspect of ‘participatory’ performances is the use of what Turino terms ‘feathered beginnings and endings’. These ensure that the start and end of the piece are not clearly defined; i.e. that ‘one or two people may begin pieces and others join in gradually as they recognise it and find their place, and at the end people may just drop out’ (Ibid. 38; see App. 3, track 5). Feathered beginnings and endings have the potential to equalise the social organisation by rendering it ‘leaderless’, and are particularly common in the singing of societies where the leaders have little real authority; for example, North American Indians, Australian Aborigines, and New Guinea Highlanders (Lomax, 1968:156). Feathered beginnings and endings also enable performances to be ‘open form’; i.e. that the music can stop in its own natural way and ‘can be repeated for as long as the [group of] participants and situation requires’, made possible by short forms of a minute or less which are repeated ‘over and over’, e.g. short sung refrains or mantras (Turino, 2008:37).

1.5.2 Presentation

Presentational music-making is the opposite of participatory music-making, because it ‘refers to situations where one group of people, the artists, prepare and provide music for another group, the audience, who do not participate in making the music or dancing’ (Turino, 2008:26); i.e. there is a separation within the performance context of performer and listener. In a similar sense, Lomax’s concept of ‘group-dominating’ performance refers to a performer ‘command[ing] the communication space by presenting a pattern that is too complex for participation’ (Lomax, 1968:16). Again, the formal constraint of ‘complex patterning’ should be read only as a general guideline, however, because some of the music-making of presentational performers may be simple enough for audience members to join in with. However, joining in is usually not appropriate in a presentational context, and it is this convention that tends to define presentational performance.

Even music in the West that is called ‘popular’ is presentational in that it has a clearly defined audience, with little audience participation, and people often pay money to see their favourite act. Popular music displays a clear artist-audience distinction, but in some instances presentational music-making can be ‘exclusionary’ even amongst the artists; for example, in a Qawwali performance, a performer (always exclusively male) may be replaced by one of several performers with whom he is in direct competition (Qureshi, 2005:137; App. 1.10). However, a Qawwali singer chooses songs ‘in accordance with the spiritual needs of his audience’ and therefore one might say that Qawwali music-making is a weaker form of presentational music-making, because although there is a clear artist-audience distinction, the performers include the real-time reactions of the audience in their decision-making with regard to choosing songs.

1.5.3 Is Gregorian chant participatory or presentational?

The principal case study of this thesis is Gregorian psalmody, which is a form of Gregorian chant where the biblical psalm texts are set to fixed melodic formulas known as psalm tones (Chen, 1983:87; see also Ch. 8.1). According to Westermeyer (2005:35), psalm tones—tones being simple melodies—are ‘ancient and essentially congregational in nature’, and until chant was written down in notation, it existed in the church’s ‘oral memory’, and thus everyone was able to take part. Having said that, Westermeyer also states that ‘it is very hard if not impossible to nail down with precision who sang what and how it was sung’ (Ibid.). Nowadays, the congregational forms of Gregorian chant sung by everyone tend to be simpler, and feature less frequently in services than those sung by the choir and soloists. From now on, I will use the term ‘Gregorian chant’ throughout this thesis as a collective term for all the various forms of Gregorian chant, such as single-note intonations of prayers and lessons, psalmody, antiphons, tracts, responsories, graduals, introits, offertories, alleluias, and congregational responses etc.

From Gregorian chant’s earliest beginnings the chant has been performed by a select choir, to whom the congregation has ‘merely listened—perhaps not even responding at the end’ (Crocker, 2000:93). Indeed, the music produced by the choir may not even be directed at the congregation; one chorister whom I interviewed said that ‘in a monastery monks sing the psalms for themselves and God rather than for eavesdropping listeners’. The majority of the performance of Gregorian chant thus represents a typical presentational configuration—a select group performing, with others only listening. Indeed, even in monastic worship, where the whole monastic community participate in most of the singing, ‘the most complex chant has often been sung by a select group of practised singers’ (Ibid. 4).

Jeffery (1994:82; quoting Hucke 1966:72) has argued that at the heart of the debate about Gregorian chant in Catholic worship is a conflict between those who advocate a ‘pastoralist’ ideological agenda, ‘permitting the people to sing and…having the liturgical texts sung in the vernacular’, and those with a ‘sacred’ agenda who want to ‘safeguard the heritage of church music that has come down to us from the past, as well as to keep watch over the artistic character of church music, choral singing, and the Latin language’, which usually means that the congregation sing less. In 1903, Pope Pius X set forth new regulations in his motu proprio, Tra le sollecitudini, for the nature of musical participation in the Roman Catholic Church. Some of these are relevant to our discussion about the interaction between participatory and presentational values:

2. ‘It must…never produce a bad impression on the mind of any stranger who may hear it’. ‘It must really be an art, since in no other way can it have on the mind of those who hear it that effect which the Church desires in using in her liturgy the art of sound’.

3. ‘…Especially should this chant be restored to the use of the people, so that they may take a more active part in the services, as they did in former times’.

9. ‘The liturgical text must be sung…so that it can be understood by the people who hear it.’ (quoted in Copeman, 1989:302-3; my emphasis).

Regulation (3) sets out a participatory or ‘pastoral’ agenda. However, in order for the text to be heard and understood, as per (9), everyone participating would need to chant the text at the same time, and this would remove people’s freedom to make mistakes without fear of exposure, thus creating performance conditions that are not conducive to participation. Furthermore, the fact that there are prescribed rules as per (2) about creating ‘art’ and not creating a ‘bad impression’ puts further restriction on uninhibited self-expression. By extreme contrast, self-expression would not be restricted in the kinds of participatory situations characterised by Turino that involve loud cacophony and non-synchronised text, because an individual can be much freer given that their voice would be harder to single out (e.g. listen to track 1 on CD).

In the Catholic services I am familiar with, the choir are usually highly trained and, as mentioned above, do most of the singing in a service; whereas the congregation occasionally sing short and simple melodic phrases in Latin as responses to the priest as a very small percentage of the total music-making in the service. This is still the case in spite of the regulations above that promote more active participation. However, as Jeffery (1994:82) points out, research relating to this tension between participation and presentation in the history of Christian liturgical music is lacking.

In conclusion, I would suggest that some forms of Gregorian chant can be regarded as participatory (e.g. congregational responses), and others presentational (e.g. an eavesdropping audience listening to choral chants). At some moments in a service there may also be a mix of participatory and presentational elements (e.g. call-and-response). Gregorian chanting is thus a confusing mix of both participatory and presentational elements; although on balance it is perhaps presentational, given the minimal congregational participation. Perhaps, therefore, participation and presentation should be seen as two poles on a continuum, with different genres of music-making, even within the same musical tradition, occupying different positions along that continuum (e.g. see Ch. 4.9.1).

I will use the participatory vs. presentational distinction over the following chapters in order to clarify the social function of the communal singing traditions around the world that will be discussed.

1.6 What is ritual?

The communal singing activities I refer to in Chs. 2-4 usually occur within a ritual context and so it is necessary to attempt to define what rituals are. Merker (2008:45) describes how ‘a ritual culture is one in which certain behaviours, whatever their purpose, goal or function might be, have a ‘correct’ form, in the sense that one particular acquired mode of execution among many possible alternatives is an obligatory part of its performance, without necessarily being superior to its alternatives in an instrumental or practical sense’. As Turner (1987) has said ‘The work of ritual (and ritual does ‘work’ as many tribal and post-tribal etymologies indicate) is partly attributable to its morphological characteristics. Its medium is part of its message. It can ‘contain’ almost anything, for any aspect of social life, any aspect of behaviour or ideology [or mythology], may lend itself to reutilisation [quoting Nadel, 1954:99]’.

Similarly, Moore & Myerhoff (1977:3) describe collective ritual ‘as an especially dramatic attempt to bring some particular part of life firmly and definitely into orderly control’, and argue that once this has been achieved it has ‘a tradition-like effect’ whether ‘performed for the first or thousandth time’ (Ibid. 8). But to achieve this in performance, ‘people must not only agree on a particular script of what actions must be performed when, but also they must collaborate in the performance and display this cooperation. In this way, rituals may reinforce not just cohesion but public commitment to cohesion’ (Lienard & Boyer, 2006:818). Thus, communities can create their continuity by resonating with the memory of their collective past through public commitment to highly-ordered ritual performance.

Consequently, ‘the criterial measure of adequacy is adherence to the socially approved form of the ritual itself’ and not the instrumental outcome or utility for the individual—even though, of course, individuals may benefit instrumentally in various ways (Merker, 2008:46). Similarly, in the case of rituals which involve communal singing, the singing itself is often governed by its own complex set of formal musical and linguistic properties that need to be adhered to, nested within the complexity of the ritual as a whole.

However, although the form of ritual is a defining feature I will be focusing on the way that the functional aspects of the ritual relate to the group singing itself. Indeed, participants in most rituals are likely to ‘believe or insist on [their] efficacy with regard to external goals or functions’ (Ibid.). For example, in the case of rituals such as a wedding ceremony, having an agreed upon ‘proper way’ of doing it means that ‘by its correct performance, one simply is married’ (Ibid.). Thus, in the words of Ronald Grimes, ritual is ‘a transformative performance revealing major classifications, categories, and contradictions of cultural processes’ (Turner, 1987:5).

Rituals are usually based around a common activity through time, and often involve actions that are performed at specific moments in the ritual; an aspect that is particularly relevant for this thesis, which is concerned with the structuring of time. Furthermore, rituals can involve communal singing or chanting, which intensifies their relationship with time because in music-making people are often required to act and sound in time with each other to a high degree of precision (or at least more precise than non-musical activities in the majority of cases). Communal singing in many ritual cultures can also be accompanied by dance and drumming, and is usually maintained by performing to a common beat (Merker, 2008:53).

Thus, perhaps the primary function of rituals is to ‘set collectivity in motion’ (Olaveson, 2001), and rituals consequently almost always involve two or more (often many more) participants; although, of course, some rituals can be performed in solitude, such as private prayer or meditation (Merker, 2008:52). Rituals recreate society by reaffirming communally-shared ways of interacting; ritual may also create society through its powerful revitalising force, which may lead to changes to those shared ways of interacting (see Ch. 4). The expectations of behaviour created by ritual may also create a ‘safe’ space for confrontation that is witnessed by the whole community. Ritual may thus also change relationships between individuals, as well as affirm the community as a whole.

Another key function of rituals or ceremonies is to gather communities to witness an event, especially in rituals where members of the community are going from one state to another, e.g. unmarried to married (Jill Purce, pers. comm.; see also Lienard & Boyer, 2006:825). By witnessing the couple becoming married, their extended community can affirm them in their new state of being married, which can often be difficult to maintain due to force of habit associated with their previous state of being unmarried (Purce, Ibid.).

Another fundamental aspect of ritual is to create its ‘centre’, i.e., the place where the central events of the ritual happen, to which all the gathered witnesses attend, but which is set apart from the rest of the space; for example, a wedding ceremony’s ‘centre’ would be the couple and the priest in front of the altar (Ibid.; Lienard & Boyer, 2006:816). Demarcating the centre of a ritual is usually achieved by significant persons moving towards the centre and then coming to rest at the centre, or by aspiring to the centre by spiralling or circling around it; e.g. the circumambulation of the Kaaba in Mecca during the Hajj pilgrimage (Purce, 1974:31; see also Lienard & Boyer, 2006:815). However, this thesis is more concerned with temporal, rather than spatial aspects of ritual (but see Widdess, 2012, and Ch. 4.9.1 for pieces on music and space in Nepali and Suyán rituals).

Taken at face value, the discussion of ritual so far presents it as fixed and ‘Platonic’; however, any ritual is bound to evolve once its participants reach a sufficient level of mastery in performing it and start to improvise and embellish particular elements of the ritual, thus ‘[pushing] its development in the direction of differentiation and complexity’ (Merker, 2008:54). A ritual is thus an evolving organism even though, in order to keep its identity, ritual performance tends to be more conservative than creative (see Ch. 4 for further discussion).

In general, ritualised behaviour is associated with ‘high control, high attentional focus, and explicit emphasis on proper performance’, and therefore differs from routinised action, which is associated with ‘possible automaticity, low attentional demands, and lesser emphasis on proper performance’ (Lienard & Boyer, 2006:824). A ritual can therefore be summarised as ‘a specific way of organising the flow of behaviour, characterised by compulsion (one must perform the particular sequence), rigidity (it must be performed the right way), redundancy (the same actions are often repeated inside the ritual), and goal demotion (the actions are divorced from their usual goals)’ (Lienard & Boyer, 2006:815; see also Bloch, 1974; Humphrey & Laidlaw, 1993; Rappaport, 1999). Therefore, one of the structural aspects of musical behaviour that lends itself to ritual is its high degree of repetition (see Ch. 3.1.1).

Collective rituals are powerful because they can serve to maintain any changes to social structure as a consequence of the community continuing to affirm whatever is witnessed in the ritual (Purce, pers. comm.; see also Durkheim, 1995; Turner, 1987; Moore & Myerhoff, 1977). This is of critical importance to the maintenance of a community as it moves through the cycle of life—for example, members joining (birth) and leaving (death)—and all of the changes and destabilisation which such events bring. Rituals may also allow for the community to relate to the wider environment and cosmos, and anything that it holds sacred.

1.7 Summary

This chapter has introduced various concepts that will appear in this thesis. First, ‘entrainment’ was defined as the process by which independent rhythmical systems interact with each other, as long as their phase relationship is stable, and robust when perturbed. The term accounts for a wider variety of instances of rhythmic interaction than synchrony, defined as doing ‘the same thing at the same time’, and comes in three forms: intra-individual, intra-group, and inter-group. All three forms will be discussed in this thesis, but with a particular focus on intra-group, and sometimes inter-group, forms. In defining entrainment, concepts such as coupling, phase, period, and symmetrical power relationships were also defined because they will be discussed in the following chapters.

Second, I attempted to draw a distinction between song and speech, but found it difficult to distinguish characteristics that work reliably and universally for all cultures. The most reliable distinguishing characteristic is regular periodicity, which occurs significantly more in song than in speech, but not in every case. I then attempted to draw a distinction between song and chant, concluding that chant’s distinguishing characteristic was a particularly high degree of melodic, rhythmic or textual repetition. However, it is difficult to define the precise degree of repetition required for a style of singing to be called chant, because each particular instance would need to be contextualised within its cultural context.

Third, I split the broad term ‘musical culture’ into Slobin’s three categories of local, regional, and transregional ‘musics’ in order that over the next few chapters as I introduce various singing traditions it is clear which level of culture is being referred to. Each category of ‘music’ is also the result of a complex interaction of various extra-musical components of culture, which are argued to provide further insight into the form and function of every singing tradition. In the next few chapters I reveal aspects of function and form in singing traditions that seem to apply to many different cultures. Ch. 2 will involve a cross-cultural exploration of the function of communal singing, and, due to the focus on entrainment in this thesis, Ch. 3 will explore rhythmic features across cultures, and Ch. 4 will comprise a case study that integrates both the functional and formal elements of the singing of the Suyá Indians in Amazonia.

Fourth, I explored Turino’s distinction between participatory and presentational music-making because this is a key distinction to make with regard to the social function of a musical tradition. The primary goal of participatory performance is to involve as many people as possible in a performing role, and presentational performance usually involves a select performing group providing music for a non-performing audience. The participatory concepts of ‘cloaking function’ and ‘feathered beginnings and endings’ were introduced because these will be of relevance later. An exploration of the various ways in which Gregorian chant is performed showed that it may be more appropriate to think of any one single musical tradition as being made up of various forms of music-making that at different moments express both participatory and presentational aspects to varying degrees. Therefore, it may not always be possible to definitively categorise a tradition as either participatory or presentational.

Finally, ritual was defined as an (often communal) activity that conforms to, and maintains a shared code of ordered behaviour, and can often involve a community witnessing a transformative or affirmative event, e.g. marriage, that can be thought of as having been achieved by the ritual itself. Rituals are seen to be of critical importance to maintaining order and creating a community’s continuity as it moves through the cycle of life, and as a means of relating the community to its wider environment and anything that it holds sacred.

In the next chapter, I turn to an exploration of the various social, religious, and natural functions of many traditions across the world that involve communal singing.


Chapter 2 - Functions of singing and chanting around the world

2.1 The functions of music

In this chapter, I will look at the various functions of chanting and singing traditions from around the world, which are drawn from different levels of culture and are associated with various worldviews. I am interested in what is common to all of them, and what makes them different. Aspects of how these functions manifest will be grouped around certain themes: collective musical interaction as a means of exploring the relationship between an individual and the group; the time-structuring function of music that allows a group of individuals to act together and feel ‘as one’; the function of music in religious ritual; the relationship between various dimensions of human life, such as spirituality, nature and survival; the function of chant for the individual; the function of text in religious chant; and, finally, the offensive and defensive uses of chant.

2.2 Music as mediator between individual and group

Music brings people together by creating the conditions for forming social relationships. Communal music-making allows people to appreciate the ‘sameness’ of each other, even with those people they might have differences with, who are not normally part of ‘the group’ (see Turino, 2008:18). Kapferer (1986:190) refers to this phenomenon as the ‘together the one’ experience, which can be understood as a group of individuals feeling as though they are ‘one’ (see also McNeill, 1995:8).

Sloboda has argued that in group music-making individuals ‘contribute to a larger whole, so that our small individual contribution becomes more significant’ (Sloboda, 2004:358). From a musical point of view, this has two effects. First, an individual’s singing is amplified acoustically when synchronised with the singing of others. Second, new musical aspects emerge, such as when a melody is put in counterpoint with another melody and the harmonic and rhythmic structure changes (Ibid.); i.e. by changing a part you are simultaneously changing the whole.

Victor Zuckerkandl (quoted in Basso, 1981:289) describes how singing can create ‘an enlargement, an enhancement of the self, a breaking down of the barrier separating the self from things, subject from object, agent from action, contemplator from what is contemplated; it is a transcending of this separation, its transformation into a togetherness’. For example, in the ritual choral singing of the local Kalapalo culture, ‘the roles of singer and listener are combined, resulting in a person’s feeling at one with the group through what he/they are producing and which he/they listen to’ (Ibid.).

The phenomenon of music-making itself therefore provides a strong analogy for certain core aspects of worshipful devotion (Sloboda, 2004:358). In most forms of worship, there is some form of ‘surrender’ of one’s own will to a higher source of power, such as God (Ibid.). For example, one chorister that I interviewed thought that Gregorian chant (a ‘trans-regional music’) is ‘about adoring God…you’ve got to get rid of the self part’. Similarly, Crocker (2000:25) describes how the early Christians thought that people should sing together ‘as if with one voice (quasi una voce)’. In the trans-regional Sufi music tradition, the performer knows his job is to merely act as ‘a mouthpiece’ for Allah and ‘consistently denies having any personal share in the impact generated by his performance’ (Qureshi, 2005:137).

Bahuchet (1995:64-5,59; quoted in Fürniss, 2006:4) describes Aka singing in Central Africa (a ‘local music’), as ‘a reflection of the community as well as a communion of religious essence…[which] present[s] an extreme example where religion is nearly exclusively expressed through music and dance, without officiant, without prayer and without offerings, that is without any perceptible religious gesture’ (see Appendix 1.11). Thus, for the Aka, the singing is the worship and although this is an extreme example, most religious traditions involve communal singing to different degrees. Collective singing requires the effort, attention, and voice of each individual to join in common purpose (Fürniss, 2006:4). For example, in the context of the congregational singing in Christian churches of the West, Wren (2000:84) describes how ‘…as we sing together we belong to one another in the song. We agree…to compromise with each other, join our voice as if joining hands, listen to each other, keep the same tempo, and thus love each other in the act of singing’. A conductor I interviewed described a choir as ‘like an army of ants…it’s almost as if everyone’s singing is just one body’, and added that ‘the music allows them to have that sort of togetherness, or it provides a framework for them to be together’.

Joining in common purpose is also evidenced in Suyán rituals in Amazonia, where individuals sing ‘unison songs’ ‘not as a brother, lover etc. but as a member of the group whose identity was established through the song’ (Seeger, 1987:83; a ‘local music’; see also App. 3). Seeger (Ibid. 140) has argued that Suyán group singing enables the Suyá to establish their evolving selves within the context of an evolving, but workable, sense of the community itself.

Music can also function as a way of getting to know others and of belonging to a group; for example, a Corsican singer describes how ‘through the song, in five minutes I know who I’m dealing with’ (Bithell, 2007:75, in a 2004 interview with a singer called Turchini; App. 1.12). The following description of Corsican singing (a ‘regional’ music) could equally be applied to Suyán singing: individuals share a ‘common code that only initiates can know…We are all members of the same brotherhood, the brotherhood of song’ (Ibid.).

There can also be a specifically musical aspect to the merging of the individual with the group. For example, ‘regional’ Aymara music in Conima, Peru, requires the contributions of an individual to ‘merge with, and not stand out from, the overall sound of the flute or panpipe ensemble’ (Turino, 2008:47; App. 1.13). The musicians achieve this by means of a virtuosic technique (‘requinteando’) which allows one group member to fill in another member’s melody in order that the interaction can continue. This is a form of musical interaction which Turino terms ‘interlocking’ (Ibid. 135; see also Lomax, 1968, who also uses the term ‘interlocking’). This need to blend in musically with the collective or merge the individual ‘self’ within the group ‘self’ is common to many kinds of ensemble performance.

In many different contexts, music-making in a group can therefore smooth the transition from acting for oneself to acting on behalf of others. As Lomax (1968:171) has said, ‘teamwork of any sort demands that idiosyncracies and personal conflicts be subordinated to the requisites of a common goal’. Thus, music-making that prioritises participation seems to involve a tradeoff: the loss of individual creative freedom is accompanied by the benefits of being part of a community. As we have seen in several geographically-distinct musical contexts, collective participation is prioritised over individual expression (even though individual virtuosity can be a marker of group membership too).

2.2.1 The time-structuring of oneness

The fundamental basis of collective musical interaction is entrained action (see Ch. 1.2), and therefore individuals must act as part of a larger coordinated whole. This means that each individual’s different experience of the same performance is able to ‘float’ above the entrained action of the group; Cross (2003a, 2005) has termed this property of musical interaction ‘floating intentionality’ (see also Ch. 6.2.2).

Indeed, Kapferer (1986:199) argues that: ‘The time-structure of music and dance, and their internal coherence in performance, contain the potential for creating…an experience for the [individual] and for extending this experience to the members of the [group]. Music and dance, through their structuring capacity, can render as copresent and mutually consistent those dimensions of experience that might appear as distinct, opposed, even contradictory, from the rational perspective of the everyday world.’

A recent model of social cognition—the ‘social cognition’ model (SC model)—also describes what Kapferer & Cross are saying by conceptualising social cognition as a ‘[non-representational] emergent product of jointly recruited and time-locked processes rather than individual ones’ (Semin & Cacioppo, 2008:121, emphasis added); which is reminiscent of Cross’s ‘floating intentionality’ and Kapferer’s ‘together the one’ experience. In the context of music, Semin & Cacioppo’s concept of an ‘emergent product’ may partly relate to the musical metre by which everyone can synchronise their actions (see Ch. 3.2 for more on ‘metre’, and Ch. 5.6 for more on ‘emergence’). The idea here is that if all participants are aligned with the same temporal structure then all of their individual actions relate to a cohesive whole.

From another perspective, Sloboda (2004:358) observes that ‘Music has a tendency to coordinate at least the shape of the rise and fall of emotional response to music in a body of people. We are all likely to feel most strongly at the same point, even if the precise colour of our feelings differ from one to another.’ However, a coordinated emotional response is not equivalent to the ‘together the one’ experience because the latter, a group feeling as one entity, is strictly a subjective perception rather than an emotion. Nevertheless, there is likely to be an interaction between emotional response and subjective perception (e.g. see Ch. 4.7.2).

For Turino (2008:34) the sounds of music-making continually let those present know whether they are achieving commonality and communality. Therefore, music’s ‘time-structuring capacity’ can also be used, when appropriate, for the purpose of determining the extent to which a group of performers are ‘at one’ by measuring how entrained performers are with each other. This measurement can also be a reliable marker of how much performers and audience (if there is one) are enjoying the experience, or rate the performance as ‘successful’.
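As a purely illustrative aside, the idea of ‘measuring how entrained performers are with each other’ can be sketched computationally. The sketch below is not the method used in this thesis; it assumes that each performer’s note onsets are available as an ascending list of times in seconds, and it uses a standard circular-statistics index (the mean resultant length of relative phase). The function names `relative_phases` and `phase_stability` are hypothetical and introduced only for this example.

```python
import math
from bisect import bisect_right


def relative_phases(ref_onsets, other_onsets):
    """Phase (radians, 0 to 2*pi) of each onset in other_onsets, located
    within the surrounding inter-onset interval of ref_onsets.

    Assumption: both lists hold onset times in seconds, sorted ascending.
    """
    phases = []
    for t in other_onsets:
        i = bisect_right(ref_onsets, t) - 1
        if i < 0 or i + 1 >= len(ref_onsets):
            continue  # onset lies outside the reference performer's cycles
        start, end = ref_onsets[i], ref_onsets[i + 1]
        phases.append(2 * math.pi * (t - start) / (end - start))
    return phases


def phase_stability(phases):
    """Mean resultant length R of a set of phases.

    R close to 1 means the relative phase is nearly constant (a stable
    phase relationship); R close to 0 means no consistent relationship.
    """
    if not phases:
        raise ValueError("no onsets fall inside the reference cycles")
    c = sum(math.cos(p) for p in phases) / len(phases)
    s = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(c, s)


# Hypothetical data: singer B is consistently 100 ms behind singer A's beat.
singer_a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
singer_b = [0.1, 1.1, 2.1, 3.1, 4.1]
stability = phase_stability(relative_phases(singer_a, singer_b))
```

Note that this index captures only the stability of the phase relationship. It says nothing about robustness to perturbation, which the definition of entrainment in Ch. 1 also requires, so a high value should be read as suggestive rather than conclusive.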

I am aware that there are other aspects of music-making that facilitate the experience of ‘oneness’ or communality (e.g. pitch, timbre, harmony, form, texture), and therefore my focus on timing in this thesis does not provide the complete picture about how groups feel as one when making music. Yet, having said that, without structuring sounds through time, aspects such as melody, harmony and form would not be possible, and thus time is a fundamental element of music-making.

2.3 Music in worship

The creation of ‘oneness’ is a significant function of worship, and worship is probably the most common and widespread collective context in which music occurs (Sloboda, 2004:347). The following quote from Durkheim’s ‘Elementary Forms of Religious Life’ (1995:386) refers to worship rituals from a sociological perspective:

‘It is through [rites] that the group affirms and maintains itself, and we know how indispensable the group is to the individual…Once we have fulfilled our ritual duties, we return to profane life with more energy and enthusiasm, not only because we have placed ourselves in contact with a higher source of energy but also because our own capacities have been replenished through living, for a few moments, a life that is less tense, more at ease, and freer.’

Sloboda also describes how rituals involving music are well placed to provide direct, yet ineffable, contact with the ‘higher source of energy’ that Durkheim speaks of:

‘At the heart of much worship is the sense of being in the presence of that which is beyond capture by human concepts. In approaching the object of worship we are approaching that which is at the limits of our apprehension. And yet, neither the object of worship nor the activity of worship is alien to us.’ (Sloboda, 2004:351). ‘Where worship seems to be afforded a particularly strong foothold is precisely at the boundaries of what can be said. Music makes us aware of the ineffable very directly’ (Ibid. 355).

Because it is the subject of the case study of Ch. 8, I turn to Gregorian chant as an illustration of the place of music in worship (see Apps. 1.5 & 3). For example, Hiley (2009:2) describes how Gregorian chant makes us aware of the ineffable by ‘[adding] a dimension to the religious experience commensurate with all those other things beyond the Latin text that enhance worship…Music is one of many non-verbal elements in worship, none the less essential for being difficult to describe in words’. One Gregorian chanter I interviewed described how ‘after a while, it’s a bit like a Catholic saying the rosary where you have a ‘Pater noster’, ten ‘Hail Mary’s’, and then the ‘Glory be to the Father’: that is just a backdrop really to the devotion which you are making, and therefore I think singing the psalms…[makes me] feel meditative…[and] in communion with my maker’. Of course, officially, Gregorian chant functions as the ‘ceremonious declaration of sacred Latin texts’. However, it is also perceived that chanting the texts in a ‘measured, disciplined manner is a good way for the group of worshippers to act together; the more harmonious the singing, the more inspiring the communal act’ (Hiley, 2009:2).

The ‘other things beyond the Latin text’ that enhance worship are the ceremonial actions themselves, as well as the ‘church architecture and stained-glass windows, images and church furniture, the dress of the participants and the objects they hold and use, the bells and the incense. It is fair to say that these things have a stronger cumulative impact than the Latin texts being recited’ (Hiley, 2009:4). It is also fair to assume that this particular set of features associated with Catholic ceremonies can be observed to varying degrees in many other ceremonial traditions discussed in this chapter.

Therefore, it would seem that musical worship, and particularly singing, is perceived to be a means of finding communion with the ineffable, with God or any other higher being, and with fellow worshippers; or, in the words of Sloboda (2004:357), a means of ‘[letting] God speak’. Gregorian chant is a Western example, but, as the next section shows, similar conclusions can be drawn about musical worship in Africa and Amazonia. Furthermore, McNeill (1995:92) refers to how in the Indian yoga traditions, the most common way to seek communion with the ineffable—more common than other traditional methods, such as fasting, breath control or drugs like hashish—is to chant for hours on end.

2.3.1 Music, nature, spirit, and survival

The ineffability of text in some chanting traditions might also make sense when we think about how music-making is used by these traditions to interact beyond the human boundaries of the worshipping community. Scholars have argued that, in these cultures, music is thought to form connections with the natural world (see also Bateson, 1972a). A living example of the way social groups conduct a form of ‘natural world sociality’ is the ‘local’ Mbendjele tribe’s use of language and music in Congo Brazzaville. According to Jerome Lewis (2009:240), the Mbendjele language is ‘an open, expansive communicative tool that imitates any other languages or meaningful sounds and actions that enable Mbendjele to interact with agents with whom they wish to maintain social relations…[such as] other Mbendjele, villager neighbours, crocodiles, duikers, monkeys and other animals, and...the forest’. And the language is combined with music: ‘When people really want to charm the forest they turn their part of this conversation into a song, a song which involves their whole bodies, and mimics the forest back to itself. This is done using percussion, polyphonic singing and dancing…As their bodies intertwine, so too do their voices - singing out different melodic lines that overlay each other to constitute the polyphonic song. Like each creature of the forest, each melodic line is different, has its own period, and combines itself with other melodic lines, some with different periods. Typically each gender has its characteristic melodic lines, though they may also sing each others’ lines from time to time. Singers switch melodies when they hear too many singers singing the same one, and people seem to improvise freely’ (Ibid. 249, 251; App. 1.14). The same fundamental relationship with nature through song is instanced in the Amazonian ‘Suyá’ people’s singing, which is described as having ‘transcended the purely human, it participated simultaneously in the social and the animal realms’ (Seeger, 1987:60).

These examples show that through collective singing humans believe they are able to communicate with both themselves and other dimensions and aspects of their environment. This extends to dimensions such as the ‘spiritworld’ too; for example, ‘When the group [of Mbendjele] achieves the synergistic harmony familiar to good choirs or orchestras, the forest shows its pleasure by allowing the mysterious forest spirit-creatures called mokondi, sometimes embodied as leafy dancers, sometimes simply experienced as an ambience, to enchant the participants and further deepen the profound communitas they experience…’ (Lewis, 2009:250; see Ch. 4.2.1 for definition of ‘communitas’). The Aka peoples also live in the Congo, like the Mbendjele, and for them collective singing and dancing is their offering ‘to the forest spirits and to the spirits of the ancestors that are supposed to take care of their children’, and is also fundamental to their way of life (Fürniss, 2006:2,4,28).

Songs have a spiritual function in the Suyán community too, as Seeger describes: ‘Songs were obtained from dangerous beings through an intermediary [‘teacher of song’] who had lost his or her spirit [or had it stolen by] the actions of a witch [often making the person ill], or who had confronted foreigners and learned from them’ (Seeger, 1987:54&61). A person whose spirit was with the bees could only teach bee songs, with the same going for birds, fish, plants etc. (Ibid. 55). For the Suyá, whereas the public speaking of community elders had ‘human’ authority, singing had ‘spiritual’ authority (Ibid.). Indeed, the Amazonian Kalapalo people believe that by singing they become spiritual authorities themselves ‘not by singing to a Powerful Being, but by singing it into being; highly focused mental images of the Powerful Being are created in the minds of the performers by means of the performance itself…there is consequently a merging of the self with what is sung about; just as in myth Powerful Beings participate in human speech, so in ritual humans participate in itseke musicality, and thereby temporarily achieve some of the power of these Powerful Beings’ (Basso, 1981:288).

A belief in music as a tool for connecting with other dimensions of reality is not just confined to the Suyá, Mbendjele, Aka, and Kalapalo peoples; shamanistic traditions across the world believe that music and its accompanying poetry are ‘a mode of transport, not just beyond the mundane and into the sacred, but into other realities and
dimensions where capricious spirits may possibly—nothing guaranteed—be harnessed to help humans out of very ordinary and concrete problems such as sickness and hunger’ (Balzer, 1997:317).

The idea that singing and chanting have ‘higher’ authority might suggest that song or chant can effect change more potently than speech. For example, Suyán invocations (similar to chant) are supposed to have a healing effect on another person’s body through the agency of spirits, and this is common to many shamanistic healing traditions (Vitebsky, 1995). Suyán invocations also had many other uses, such as keeping away bad weather, punishing ‘disdainful lovers’, and praying for good fortune (Seeger, 1987:32). Hence, the belief in music as a way of asking for help from spirits extends the range of potential human needs that music-making can fulfil.

For example, Seeger (Ibid. 132) notes that in Suyán culture ‘The ritual [of singing] made possible the mobilisation of men, women, and children fundamental to the economic system itself. The Suyá said ‘When we sing, we eat’. In some cases the corollary was ‘When we do not sing, we go hungry’ (Ibid.). An English-speaking female member of a Maasai community near Amboseli National Park, Kenya, told me that modern Maasai men and women will still perform ‘rainsongs’ to pray for rain in dry periods (a ‘regional music’; App. 1.15). It is common across the world for fundamental elements such as water and earth to be the subject of genres of songs composed and performed specifically for receiving rain and good harvest, and for protection against storms and droughts.

Similarly, the ‘regional’ Rangda/Barong ritual of Bali is performed when ‘some misfortune believed to result from the imbalance of cosmic forces befalls a village, such as crop failure, pestilence or too-frequent cases of mental illness’ (Becker, 1994:43). The ritual is designed to ‘restore the balance between the world of people and the world of the deities, spirits and demons of the ‘other world’ (Ibid.). In the ritual many young male volunteers enter into trance through the unison singing of ‘long, slow vocal lines of classical poetry’ by a chorus of women, and, accompanied by a furious and continuous two-note gamelan ostinato pattern, violently fight with a witch in order to resolve the imbalance between the cosmic forces that may have caused the village’s misfortune (Ibid.; App. 1.16).

Therefore, for some societies, music perhaps fulfils the most fundamental need of all: survival.


2.4 Chant for the individual

As well as allowing communication with other people, the community, the wider environment and transcendent unity, singing and chanting can also serve functions that are explicitly ‘individual’. For example, communal singing can increase positive affect and reduce stress (Beck et al. 2000; Kreutz et al. 2004; Vickhoff et al. 2013), reduce chronic pain (Kenny & Faunce, 2004), improve trust and cooperation (Anshel & Kipper, 1998), improve quality of life for cancer patients (Gale et al., 2012), improve general psychological well-being in individuals (Clift et al. 2007, 2010; Bungay et al., 2010; Cohen, 2009; Stewart & Lonsdale, 2013; Sanal & Gorsev, 2013; Bailey & Davidson, 2005), and even increase life expectancy (Glass et al., 1999).

There are many traditions to choose from in order to explore individual functions of singing and chanting, but I have limited the discussion here to Christian rosary chanting and Buddhist mantra chanting because in both religions chanting practice can be an intensely personal pursuit.

Chanting, as opposed to other forms of music, is often melodically and rhythmically simple, and repetitive enough for any individual to perform themselves. Furthermore, most people have the use of their voice for their whole lives and therefore chant is a readily accessible means of music-making. To illustrate, Purce argues that chanting in ‘trans-regional’ traditions, such as the mantric chanting and overtone chanting found in Tibetan Buddhism, ‘is less socially inhibiting than singing since the emphasis is less on ‘right’ and ‘wrong’ notes than on individual pitch’ (Franks, 1996; see also Bailey & Davidson, 2005; App. 1.17 & 1.18). The resulting collective sound made from various individual pitches functions much like Turino’s ‘cloaking function’ (see Ch. 1.5.1), with the implication being that it is more important for an individual to make a sound at all, rather than a particular kind of sound. Therefore one of chant’s functions is to engender a sense of belonging by allowing an individual to contribute to the overall sound of the group with their own voice, because this is often an uplifting and fulfilling experience (Purce, pers. comm.).

Buddhist chanting also has a more explicit individual purpose; Wilson & Dobbelaere (1994:214) describe how ‘though it is claimed that ultimately chanting benefits the world at large, it is not initially so much concerned with the condition of the wider public, but rather with the circumstances of the one who chants’. Their work focuses on the trans-regional Nichiren Buddhist tradition in Britain and they find that members of this tradition would hardly be likely to carry out the ‘time-taking,…highly repetitive, and exacting’ task of chanting if they did not believe that it had an important, higher purpose for themselves (Ibid. 195; App. 1.19). This higher purpose may refer to either conspicuous or inconspicuous personal benefits. The authors interviewed members of a trans-regional Buddhist tradition, Soka Gakkai, and found that they believed chant could bring about ‘inner change’ that occurs gradually by raising self-awareness. In addition, by ‘changing the force of karma’ they thought that chant could in specific circumstances bring about ‘external change’, referring to outcomes that are externally verifiable, such as having more fulfilling relationships and greater wealth (Ibid. 23).

The authors also describe how in Nichiren Buddhism ‘chanting is believed to release the individual’s buddhahood, a higher state of consciousness, which puts him in harmony with the laws governing the universe and the rhythms of life’ (Ibid. 8). For example, the tara puja Buddhist chant ‘aims to ensure a liberation from suffering [i.e. enlightenment]…and [is] stronger than medicine itself’ (Thram, 2002:134; App. 1.17). Therefore, the ultimate function of chant for an individual Buddhist is to achieve ‘buddhahood’ or enlightenment; however, according to Buddhist thinking, spiritual practice exists ‘for the benefit of all living beings’, and therefore there may be no such thing as a purely ‘individual’ purpose.

Nichiren Buddhist chanting is similar to the chanted prayer and rituals of Christianity (Wilson & Dobbelaere, 1994:213). Having said that, Christian chant may not be perceived to have the same automatic ‘cause and effect’ action that Buddhist chanting is believed to have. In Buddhist chanting there is also a greater focus on the meditative practice of being ‘present’ by listening to the chant as one produces the sound; whereas in Gregorian chant, for example, the official focus is more on the doctrine that is being chanted even if the chant happens to have a meditative effect. Interestingly, there seems to be an historical link between these two geographically and culturally distinct practices; i.e. the rosary was introduced to Europe by the crusaders, who took it from the Arabs, who in turn took it from Tibetan monks and the yoga masters of India (Lehmann, 1976). Buddhist and Christian chanting may also have healing and calming effects on those individuals who practise it; for example, it has been shown that reciting yoga mantras (Buddhism) and rosary prayers (Christianity) reduced respiratory rate, and the conclusion drawn was that chanting the rosary or mantras ‘might be viewed as a health practice as well as a religious practice’ (Bernardi et al. 2001). In summary, individuals gain different benefits from chanting and with it can meet diverse needs, such as emotional, physiological, social, existential, and spiritual needs (see Chong, 2010).

2.5 Text in religious chant

In the context of religious chanting the most mysterious part of Sloboda’s quote (see 2.3 above) is his claim that ‘Music makes us aware of the ineffable very directly’. So why is it that many forms of sacred chanting not only involve an effable text of some sort but often prioritise the text in the act of worship? Indeed, the primary importance of text is common to many chanting traditions around the world. The most extreme example is the trans-regional Islamic tradition of the ‘recitation’ of the Qur’an, in which Muslims are not allowed to refer to the recitation as music, nor to refer to the reciter as singer, which is ‘to avoid any identification of the holy text with songs created by human beings’ (Graham & Kermani, 2007:131, 118; App. 1.20). And in contrast with Sloboda, Nettl (2005:252) interprets Islamic religious ‘music’—chanting the Qur’an, calls to prayer, etc.—not as direct communication with God but rather as ‘devices to remind humans of their religious duty’. In Indian temple singing the music is also subordinate to the text, and is a vehicle for the text’s expression (Snell, 1983; Clayton, 2005:123). Similarly, Augustine of Hippo (2008/397) found it difficult to decide music’s place within Christian religious practice, arguing that when sacred words are sung they ‘stir his mind to greater religious fervour and kindle in me a more ardent flame of piety than they would if they were not sung’ but ‘When I find the singing itself more moving than the truth which it conveys, I confess that this is a grievous sin, and at those times I would prefer not to hear the singer’ (Wren, 2000:69).

Similarly, Qureshi writes about the way that the trans-regional religious chanting of Urdu poetry (‘tarannum’) ‘does not exist apart from the poetry it supports’ (Qureshi, 1969:444). Qureshi describes tarannum as essentially a linguistic communication, and the performance of tarannum is more like spoken recitation than music; for example, chanting would never stop for the sake of the tune, only for the words (Ibid. 434; App. 1.21). For Muslims music has always been associated with ‘emotional excesses’, ‘the wrong kind of pleasures’, and is even considered dangerous or unlawful (Ibid. 443). What is clear is that whether tarannum is musical or not, the politics surrounding discussion of this genre of chanting make it difficult to draw conclusions one way or another.

In a similar way, but with less extreme implications for performance, many other religious traditions believe that their sacred texts come from a divine source. Gregorian chant is another tradition that prioritises text over music, but to varying degrees depending on which texts are being chanted. According to Hiley (2009:3) there are four basic categories of Catholic chant. The first two categories—‘Readings from the Bible and other chosen literature…and prayers addressed to the Almighty’—and the third category, psalmody (see Ch. 8), are all performed by intoning texts on a single note, for the most part syllabically—i.e. one note per syllable—with slight inflections at the ends of clauses or verses.

In the first three categories, the same small set of melodic formulae are used for a large variety of texts which means that the melody is ‘depersonalised’; i.e. two texts may be respectively joyful and sorrowful and yet they would share the same melody. This has the effect of making chant sacred, by setting it apart from the drama of everyday speech (Ibid.). Among the choristers I interviewed there was overall consensus that Gregorian chant is non-emotional, regardless of the specific text being sung, with one describing the music as the ‘merest veil’ between you and the text.

The fourth category refers to verses for more elaborate singing, and these have much more complex and varied melodic formulae than the first three categories, in which several notes can be assigned to one syllable (known as the ‘melismatic’, as opposed to syllabic, style). Due to their variability, melodic formulae within this category can be associated exclusively with specific portions of liturgical text, as opposed to, for example, a psalm tone—a single melodic formula—which is used for multiple psalm texts (for more detail, see Ch. 8.1).

The complex melodic elaboration is to enhance the text, in a religious ‘doctrinal’ way, not in a secular ‘expressive’ way (Ibid.). However, Hiley (Ibid. 4) argues that while ‘Gregorian chant is undoubtedly a means of making the sacred Latin text audible, it does so in ways whereby the text sometimes seems almost secondary. The sacred sound is more important than the sense…It is important to understand that the Latin texts are not being presented to an audience as a story-teller might address a group of listeners. They are more like a reference point for a religious musical experience, for a reaching out to the deity, who is no more to be comprehended in words than is music itself’.

Perhaps one of the reasons why many traditions are strict about the way the sacred text is performed is because performed text is believed to have the power to inflict both beneficial and disastrous outcomes on those who are chanting. For example, while learning Maori chant (a regional tradition), New Zealand anthropologist Jim Ritchie practiced his chanting by filling his mouth with pebbles in order that his chanting was not ‘real’ and therefore mistakes made during the learning process did not inflict disaster upon himself and others (Moulin, 1994:2). In the trans-regional Samavedic tradition, Howard (1977:4) describes how priests have to recite voluminous texts, yet what is interesting is that ‘even the smallest error is abhorred and believed sufficient to produce catastrophe’ (Ibid.; App. 1.22).

Merker (2008:48) describes the full Agnichayana sacrifice, which is part of traditional Indian Vedic culture, as ‘a formally structured 12-day progression of complex interwoven chanting and ritual performances requiring months of preparation and rehearsal. It involves 17 priests, each specialised in the recitation of particular branches of the massive Vedic corpus of hymns and sacrificial formulas on which the rite draws (Staal, 1989 & 1993). For millennia, a considerable portion of the Brahman caste of the Indian subcontinent has devoted its intellectual resources to the syllable-perfect memorisation and correct recitation of this textual corpus and the preservation of the many rituals in the course of which it is recited. Such practices are not confined to civilisations such as the Vedic: hunter-gatherer cultures such as those of the Australian aborigines feature the memorised transmission of a corpus of sacred songs, rituals and associated objects and myths. Acquiring these ritual vehicles of tradition demands a substantial investment of time and energy on the part of initiates, and may involve undergoing severe bodily torture to prove worthiness for becoming a carrier of the sacred lore (Elkin 1945; Strehlow 1947)’.

The fact that ritual cultures around the world place so much importance on text, and go to such lengths to memorise these texts would suggest that the text of a chant, not just the music, is clearly central to its function. However, even in many codified religious traditions it is not always clear what function the text of a chant serves. Perhaps the function is that listeners can gain teachings from comprehending the text, or that communities can restate their collective identities and traditions; on the other hand, it is also common for chanting traditions to use a language that most participants cannot understand.

Many singing communities around the world sing either in nonsense syllables or in archaic languages. For example, in various cases of Suyá singing, no Suyá could explain to Seeger what their texts meant, yet they always identified songs by their texts instead of their tunes (Seeger, 1987:40). Put another way, Seeger ‘never heard a man say ‘let’s sing the one that goes like this’ and hum a tune’ (Ibid.). In regional Apatani culture, the nyibu (shaman performer) is defined ‘primarily by his ability to use ritual speech’, but the ritual texts he chants represent a form of special speech that distinguishes them from texts that use more ordinary, spoken speech (Blackburn, 2010:9; App. 1.23).

Similarly, in the context of Gregorian chant, Hiley (2009:4-5) describes how ‘It might be objected that ordinary people in the Middle Ages did not know the Latin psalter by heart, that in fact they did not understand Latin at all. But a religious community performed the liturgy in the manner (including the language) established as the right way for praise and commemoration. The religious community did this both for itself and on behalf of the rest of mankind, for those who had mundane occupations and no time for praise and commemoration but who needed to know that the religious were acting for them, in the proper manner.’ It is often seen as ‘proper’ that archaic and incomprehensible language should be used for worship (e.g. Sanskrit in Buddhist chant). Indeed, one chorister I interviewed said that, in comparison with Anglican chants in English, Latin chant had ‘depth’ and a ‘timeless’ quality. Purce has suggested to me that the incomprehensibility of the text also has the effect of allowing practitioners to avoid focusing on comprehending the texts, thus making them better able to receive their deeper teaching; i.e. the words carry a power of their own (Purce, pers. comm.). Indeed, there also tends to be more of a ‘proper’ way to perform texts in song/chant due to the restrictions placed on texts by setting them to music, as compared with other types of vocalisation, e.g. public speech/myth-telling, which often display more variation, and no exact fixed form.

In summary, it is difficult to say exactly how chanters and listeners actually experience the texts associated with the chanting traditions described in this section. I suspect that in each instance, no matter what the religious interpretation is concerning the importance of text, performers and listeners respond to chant on many levels, some on a textual or semantic level and some on an experiential level. Indeed, sometimes these various aspects may be integrated; for example, a couple of choristers I interviewed thought that different aspects of Gregorian chant, such as textual meaning and the experience of spiritual communion, were interwoven.

2.6 The offensive and defensive uses of chant

No survey of chanting would be complete without considering forms of chanting that are not ‘sacred’. Indeed, many are far from sacred, and can be used for aggressive goals. Most of us will be familiar with the trans-regional phenomenon of football chants, whose purposes range from unifying at best to war-like at worst. Schiering (2008:221) argues that:

‘the football rite is structured by a strict succession of ritualised, verbal and non-verbal practices, such as football cheers and chants. Collective fan utterances fulfil the same function as other forms of ritual communication: they establish and strengthen a bond of unity within a group’.

Football chants are thus useful for ‘keeping the little-known traditional linguistic genre of ‘blason populaire’ alive in England today’ (Luhrs, 2008:233; see also Mac Coinnigh, 2013; App. 1.9). ‘Blason populaire’ can be described as ‘An expression of one’s group outlook and self-image, often involving the implied simultaneous detraction and/or detriment of another (rival) group’ (Green & Widdowson, 2003:9; quoted in Luhrs, 2008:233). A football chant, therefore, is well-placed to unite a group of supporters against the opposing side (both the opposing team and its supporters). Indeed, the footballchants.org website goes so far as to describe a modern football match as a ‘regulated war’, and how a group of supporters want any new song ‘that could be turned into a verbal weapon’.

The following example demonstrates just how important it is for a group to understand what another group’s war chant is intended to communicate. In 1642, the European explorer Tasman and his ship came across Maoris on a distant shore who were making music at them. The Dutch responded with their own music in a sustained exchange and proceeded to row out unarmed to meet these friendly, music-making peoples. Unfortunately for them, this particular form of Maori music-making was actually an invitation to fight, and four of the seven unarmed Dutch rowers were killed—the other three swam back to the mother ship (Lodge, 2009:625-7; App. 1.26; see 4.6.1 for further discussion of the military function of music-making).

Chant can also be used defensively to give power to those that chant; for example, some Maori karakias can be used to ‘drive away unwanted flocks of birds’ (List, 1963:5; App. 1.25). Chant also gave the local delegation from the Valley of Atuona of the Island of Hiva ‘the power to overcome any attempts to dislodge them from their position of artistic and cultural strength’ (Moulin, 1994:1; App. 1.24). Chant is thus sometimes used with the explicit aim of trying to make a delegation’s culture resistant to the hegemonic or natural forces that threaten it. But what is particularly striking about the Hiva example is the ‘perceived causal relationship between the art form and a non-artistic outcome’ (Moulin, 1994:1); i.e. the chant itself has a defensive power.


Singing can also be responsible for resolving legal disputes. For example, if two men have a disagreement and begin physical fighting in regional Greenland Eskimo culture, ‘an older man intervenes and stops them, at the same time making an appointment for the fighters to have what the natives call a ‘drum dance’. The entire community attends this meeting, and the two adversaries take turns in singing derisive songs at each other, accompanied by mocking gestures. The man whose songs best drive home his point, according to the consensus of public opinion, wins the fight. There are rarely any further antagonistic incidents’ (Nettl, 1956:13). One stipulation is that the songs are improvised, although partial preparation may occur. These song contests are real, and can seriously affect the lives of the individuals concerned. They are ultimately beneficial, however, given that ‘No forms of physical combat or war are customarily present in Eskimo culture; they are not sanctioned nor is their existence recognised by the tribes’ (Ibid.). In summary, therefore, singing and chanting can be used offensively and defensively in groups, and also as a means of legal mediation between individuals within a group context.

2.7 Summary

This chapter represents a survey of a selection of singing traditions from around the world, with a range of examples from ‘local’ indigenous communities to ‘trans-regional’ religious traditions (see Ch. 1.4). Two phenomena associated with almost all the singing traditions were [i] the ‘together the one’ experience—the experience of feeling ‘sameness’ with a group of people, even if some of those people are outsiders to the group—and [ii] that the singing was in some way doing ‘work’ (in either a mental, physical, and/or spiritual sense). Related to the ‘together the one’ experience is the fact that in any collective music-making individuals are part of a larger whole, and therefore any contributions they make have to merge and entrain with the contributions of others. This experience is usually most intense in participatory contexts, due to the fact that as many individuals as possible that are present are encouraged to make a personal contribution to the music-making of the group.

The capacity for an individual to ‘merge and entrain’ with the contributions of others relates to the concept of entrainment introduced in Ch. 1.2. By coordinating their actions to be ‘in time’ with the actions of others in the group, individuals can experience an embodied ‘sameness’ with those other people. Music-making was also argued to coordinate the ‘rise and fall’ of emotional response in a body of people, coordinating what the majority of people in a group feel at any one time.


This relates to most forms of worship in which there is some form of ‘surrendering’ of one’s individuality to dwell within something larger than oneself, which could be anything from a worshipping community to a transcendent unity. Certain ritual traditions also employ group singing to communicate with the natural world around them, the spiritworld, and the cosmos—sometimes in a polyphonic way to mirror the multiple voices of the forest, for example—or to make appeals to spirits in other dimensions to heal sickness, feed the hungry, and even to ensure survival.

As well as being nested within a larger ecology, an individual can also chant specifically to bring health and psychological benefits to themselves as individuals—both through practice on their own, and chanting with others. The simplicity associated with chant can allow individuals who are less confident with more elaborate singing to make their own sounds, which can be a positive experience for that individual, regardless of whether they are singing in a group or not—although singing in a group can often facilitate a positive sense of belonging to a group.

Singing, chanting, and music-making in general, can also make people aware of the ineffable—in particular, when archaic, non-vernacular, and nonsense languages are being sung or chanted. However, this does not necessarily mean that the sacred text of a chant is less important than the music. Indeed, some traditions believe that the power of the chant cannot be separated from the words, which is even the case for chant in archaic languages. In many cases, it was not always clear—despite the amount of space given in the literature on chanting traditions to their texts—what function the text of a chant serves, even with established, codified religious traditions such as Christianity and Islam. This was because most of the literature reviewed was not concerned directly with identifying the general functions of chant. Nevertheless, I concluded that the function of a sacred chant text might be a combination of participants gaining teachings from what the text means and that communities can restate their collective beliefs, creeds, identities, and traditions through ritual texts. More generally, it is likely that listeners respond subjectively to chant in many ways, either on textual, semantic, or experiential levels or combinations of these. Of course, even though individuals may respond to chant in individual ways, in group chanting the fact that they still have to coordinate their movements and sounds with each other means that on some level they are taken beyond their own subjective experience.


In terms of the aggressive and defensive forms of chanting it would seem that chant has power to unify a group, thus increasing a group’s violent or defensive potential against other groups of people, or even animals. It can also be used to resolve conflict between members of a community. Therefore, chanting is not an inherently good or bad practice—what makes it one or the other is the intention of those performing, or listening to, the chant.

This chapter looked at the function of chant; the next will look more at its form, or rather, its temporal form.


Chapter 3 - One metre, one communality

3.1 Actions in common

‘What matters most [in rites] is that individuals are assembled and that feelings in common are expressed through actions in common’ (Durkheim, 1995:390).

‘Movements [of the ritual group] are stereotyped; everybody performs the same ones in the same circumstances, and this conformity of conduct only translates the conformity of thought. Every mind being drawn into the same eddy, the individual type nearly confounds itself with that of the race’ (Ibid. 18).

This chapter is split into two related parts. The first part explores how entrained actions lead to entrained feelings in groups of people. I do this by reviewing experimental evidence regarding the relationship between synchronous performance and social behaviour. I then examine the role that repetition plays in making synchronous performance possible. Following that, I discuss the differing impact of participatory and presentational music-making on the process of entrainment, and how cultures entrain in different ways, and display different rhythmic profiles of their body movements.

The second part will attempt to define the cross-cultural similarities and differences between metre and rhythm. First, I give a basic definition of what metre and rhythm are. Second, I describe how various cultural examples complicate this distinction. Third, I ask whether rhythm free from metrical or periodic organisation is possible, and then finish by exploring Clayton’s proposal for a cross-culturally general theory of metre and rhythm.

Durkheim suggests in the above quote that common action is related to social conformity. I argue that the more entrained an action is with others, the more ‘common’ it can be said to be, and therefore even more likely to create social conformity. Music is thus a suitable activity for exploring Durkheim’s claim because it requires a high degree of entrainment between participants (see Ch. 1.2). There are, of course, other ways to make actions ‘common’ apart from entrainment (e.g. blending the timbral quality of sound), but timing is fundamental to interaction, and therefore entrainment plays a central role in acting and feeling in common. For example, ‘by moving together as a unit, participants think and value themselves as a unit, which enriches their subsequent cooperation’ (Fischer et al. 2013:116). (For research on the effects of synchronised behaviour on social interaction in groups and dyads, see also Lakens, 2010; Lakens & Stel, 2011; Miles et al., 2009, 2010, 2011; Valdesolo & DeSteno, 2011; Valdesolo, Ouyang & DeSteno, 2010; Hove & Risen, 2009; Cohen et al., 2010; Launay et al., 2013; Lumsden et al., 2012; Koudenburg et al., 2011; Kirschner & Tomasello, 2010; Paladino et al., 2010; Kokal et al., 2011; Gill, 2011; McGarva & Warner, 2003; Chartrand & Bargh, 1999; Connor et al., 2006; and for research on the link between empathy and entrainment, Himberg & Spiro, 2012; Rabinowitch, Cross, & Burnard, 2012; Spiro & Himberg, 2012.)

Fischer et al. (2013) designed an experiment to test Durkheim’s suggestion that synchronous rituals are associated with higher levels of cooperation. Participants took part in either ‘exact synchrony’ rituals (synchronous vocalisations or movements, e.g. yoga, Buddhist chanting, kirtan devotional singing); ‘complementary synchrony’ rituals (not exact synchrony, but shared ritual goal, e.g. capoeira, Brazilian drumming group, choir, Christian church service); or ‘no synchrony’ rituals (no shared ritual goal, e.g. poker, cross-country running group) (Ibid. 117). To measure the effects of rituals on self-reported prosociality they applied standard psychometric scales both before and after ritual activities, and to measure observable prosocial effects of rituals on cooperative behaviours participants took part in a public goods game after ritual activities (Ibid.). They also measured by Likert scale how participants perceived each type of ritual in terms of ‘sacred values’; for example, one ‘sacred value’ question might be “This activity concerns things or values that are untouchable and should never be violated.”

Fischer et al. found support for their first hypothesis that ‘Synchronous movements are associated with higher levels of prosociality’, and for their second hypothesis that ‘sacred rituals are associated with increased levels of cooperation’ and that ‘[perceived] sacred values [of a ritual activity] predicted cooperative behavior in the economic game’. More specifically, they found that the highest ratings of entitativity (merging self within group), trust, sacred values, and prosocial behaviour (i.e. donating differing amounts of the $5 that they had been given to a common pool) were measured for exact synchrony rituals, with lower ratings for complementary synchrony rituals, and even lower ratings for no-synchrony rituals (Ibid. 119-20). The mechanism by which they explain these findings is that ‘(a) synchronous actions (b) enhance feelings of oneness (entitativity) which (c) intensify sacred values, thus (d) increasing prosocial behaviours’ (Ibid. 123). Of course, one must bear in mind that coordination does not always go together with positive affect, given that ‘A highly-charged verbal fight can sometimes demonstrate a good degree of coordination without the corresponding affect being positive’ (De Jaegher & Di Paolo, 2007:496).

A different laboratory study by Reddish et al. (2013) investigated the effect of synchrony on generalised prosocial behaviour, and found that synchrony-induced prosocial behaviour was directed not only to fellow synchronous performers, but to non-performers too. They also found that it did not matter if the ‘prosocial target’ was a non-performing group, rather than a non-performing individual as per the previous finding. What is interesting here is that prosociality is not only associated with those with whom one has been acting in synchrony; it also affects other people that one comes into contact with after the activity has finished, who had nothing to do with the synchronous activity.

3.1.1 Repetition

Repetition may provide the fundamental basis for groups of people to participate for extended periods in synchrony with each other. For example, Nettl (1983) describes how repetition is present in the music of every known human culture (Margulis, 2013:1). By contrast, on the whole, speech exhibits much less repetition than music (Ibid.; see also Praeger 1882–1883, and Ch. 1.3). Of course, different registers of speech are repetitive to varying degrees—e.g. story-telling and public oratory are likely to be more repetitive than conversational speech (see Knight, 2013); however, music is likely to show more repetition than even these forms of speech, and across more parameters too, such as rhythm, pitch, structure, form, and content. An intriguing demonstration of the ‘musical’ effect of repetition on speech is the ‘speech-to-song’ illusion, which is that one can start to hear a musical melody if one repeats a spoken phrase and puts it on loop (Deutsch et al., 2011).

One of the most common features of chant across cultures is its repetitive nature. In a given ritual the same tune is often used for large portions of text, split up into manageable chunks that roughly fit the tune, and sometimes the same text and melody are repeated over and over (e.g. Buddhist mantric chanting; App. 1.17). Gregorian psalmody is one example of a tradition which splits up large swathes of text into bite-sized chunks accompanied by the same simple melody; however, its rhythm is variable due to the different text and varying phrase length associated with each psalm verse (see Ch. 8.1; see App. 3 video clips). Nevertheless, Margulis (2013:2) hypothesises that ‘for repertoires with less rich hierarchic structuring repeated exposures might push attention down to attributes like microtiming and microdynamics’, and therefore although Gregorian psalmody is less repetitive than mantric chanting examples in a textual sense, choirs performing it can still achieve amazingly precise temporal synchronisation due to repeated exposure to the same formulaic melodic structure (see Ch. 8.4.3).

In participatory forms of music-making, Turino (2008:41) argues that ‘highly repetitive forms and rhythms [can] actually add to the intensity of participatory performance because more people can join in and interact—through synchronised, interlocked sound and motion—and it is this stylised social interaction that is the basis of artistic and spiritual pleasure and experience’. In the context of Gregorian psalmody, one might assume that long bouts of repetition are a struggle for many worshippers; however, Whitehouse argues that ‘ritual habituation provides reassurance and perhaps even a mildly hypnotic euphoria in the face of an uncertain, stressful, and fast-moving world’ (Whitehouse, 2004:98). Thus, rather than leading to boredom, participating in repetitive chanting practice can also lead to special forms of pleasure and experience, such as the examples of transcendent experience that were mentioned in Ch. 2.3, or ‘trance’ (see Becker, 2004).

Becker (1994) defines ‘trance’ as a ‘state of mind characterised by intense focus, the loss of the strong sense of self, [and] access to types of knowledge and experience that are inaccessible in non-trance states’, which is ‘fairly common even among middle-class, well-educated Westerners’ (Ibid. 41, 43; see also Oohashi et al., 2002, for EEG evidence that trance is associated with higher power in the theta and alpha frequency ranges of neural activity). Indeed, the trance experience can manifest in various musical contexts, such as ‘the performer who feels herself to be one with the music she plays; the mild trance of the listener whose whole attention becomes focused on the music; possession trance, in which one’s self appears to be displaced and one’s body is taken over by [or unified with] a deity or a spirit [e.g. Suyá and Kalapalo culture, Sufi mysticism, Vajrayana Buddhism etc., see Ch. 2]’
(Ibid.). For example, one Gregorian chorister described to me how when ‘on a real roll with the chant, my mind empties and I just sort of…it is a different experience’.

One aspect of musical performance associated with many of these examples is the presence of repetitive rhythmic patterns. For example, in the context of the Balinese ritual described in Ch. 2.3.1, Becker (Ibid.) describes how ‘fast, loud and short temporal cycles’ with no melodic elaboration indicate the presence of demons and fighting in all Balinese and Javanese gamelan music (see App. 1.16). This kind of
musical repetition can be associated with parts of the ritual in which participants are in an intense trance and stab themselves, whereas ‘long, slow temporal cycles with much melodic elaboration indicate refined characters and peaceful scenes’ (Ibid. 49). Becker is keen to make the point that it is not necessarily the music that directly causes the trance or self-stabbing, given that many community members hear the music but do not go into trance (Ibid.); hence, Becker argues that any given trance experience ‘nearly always bears the imprint of a particular society’s beliefs about it’ (Ibid. 41). However, it would seem that the ‘cyclic’ and repetitive rhythmic patterns she describes are strongly associated with particular kinds of trance in Balinese and Javanese culture (see App. 1.27).

In more general terms, Kramer (1988:63) describes how, when one listens to or participates in music which is constantly changing, there is no guarantee that what one hears now has any relevance to what one might hear later; one is therefore required to have prior knowledge of the music, which would make joining in spontaneously very difficult for many people. To illustrate, if the tempo is in constant flux (e.g. rubato) it can be difficult to predict when the next beat might occur, which makes entrainment difficult. Of course, strictly speaking, one could also argue that however stable a tempo might be, each beat is executed by humans who exhibit ever-changing degrees of variability from beat to beat, and therefore there is no such thing as an absolutely constant tempo (see London, 2012, on expressive timing).
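The point about predictability can be illustrated with a minimal sketch. It assumes a deliberately naive listener who predicts the next beat by extending the average of the most recent inter-onset intervals; the function names and the onset times are hypothetical and are not drawn from any recording or from the sources cited above.

```python
from statistics import mean


def predict_next_onset(onsets, window=4):
    """Predict the next beat time by adding the mean of the most recent
    inter-onset intervals to the last observed onset."""
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    return onsets[-1] + mean(intervals[-window:])


def mean_prediction_error(onsets, window=4):
    """Average absolute error (in seconds) when each beat from the third
    onwards is predicted from the beats that preceded it."""
    errors = []
    for i in range(2, len(onsets)):
        predicted = predict_next_onset(onsets[:i], window)
        errors.append(abs(predicted - onsets[i]))
    return mean(errors)


# Hypothetical onset times in seconds (illustrative only).
steady = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]   # stable tempo
rubato = [0.0, 0.5, 1.1, 1.5, 2.3, 2.6, 3.4]   # tempo in flux

print("steady:", mean_prediction_error(steady))
print("rubato:", mean_prediction_error(rubato))
```

Under these assumptions the prediction error is zero for the stable sequence and clearly larger for the fluctuating one, which is one simple way of expressing why a tempo in flux makes entrainment harder.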

In contrast to indeterminacy, Kramer describes how ‘stasis, persistence, and consistency’ all help facilitate feelings of ‘social comfort, belonging, and identity’ and also function to create a predictable framework (e.g. metre) for performers to use in order to plan their next musical move at any given moment, thus providing appropriate conditions for participatory music-making. In the context of worship, constant repetition can also serve to create the feeling of eternity in music, which may contribute to music’s ability to create transcendent experience. For example, in mantric chanting one can be drawn to the ‘eternal’ present moment because each repetition of the mantra is musically and textually the same, i.e. the music is ‘going nowhere’, whereas through-composed songs that develop from ‘here to there’ draw one more into a ‘non-eternal’ sense of linear time (Purce, pers. comm.).

From a cognitive perspective, Margulis (2013:3) argues that repetition engages, intensifies and makes possible the mental process of ‘chunking’ information into automated sequences, and that it is this ease of mental processing and the activation of motor regions that such automation enables which facilitates the strong emotional responses often associated with repetitive activity. This would make sense with regard to why it is that familiarity and recognition are often associated with pleasure. Of course, ‘ease of processing’ and ‘activating motor regions’ are not by themselves sufficient explanations for the often transcendent experiences that participants can have when listening to, or performing, music or chant, but they do illustrate the link between action and feeling that is central to this chapter. Furthermore, in addition to singing, chanting and music-making, ritual can elicit strong emotional responses, perhaps because it features unusually-high degrees of repetition, albeit on a different time scale to music (Ibid. 2).

3.1.2 Participation and entrainment

Lomax (1968:14) argues that, for ease of entrainment and participation in group singing, musical parameters such as melody, rhythm, form, and structure must be simple, and specificity must be low (i.e. there must be freedom in what counts as musically acceptable). The opposite conditions, namely complexity of musical parameters and high specificity (i.e. ‘it has to go like this’), are likely to correlate with presentational performance. These are, of course, general guidelines that apply in the majority of cases, but not necessarily in each particular case: some instances of participatory music-making might be complex and highly prescribed, and some instances of presentational music-making might be simple and improvisatory.

To achieve general blend in group singing, choristers have to match ‘pronunciation of vowels and consonants, use of pitch, rhythmic attack, and certainly many other aspects of phonation [e.g. releases, levels of emphasis etc.], to a model shared by the group’ (Ibid. 170-1). On the rhythmic level, Lomax states that ‘if all attacks and releases are precisely coordinated, the rhythmic blend is considered maximal’, and if, on the other hand, ‘individual voices [are] discernible on all attacks and the movements from note to note [are] invariably ragged, the rhythmic blend is considered minimal’ (Ibid. 45). Indeed, choristers have to attend ‘on a very subtle level to many qualitative aspects of the act of phonation’; for example, in addition to entraining to the rhythm of the group, an individual might also match, to a certain extent, their vocal timbre with the rest of the group (Ibid. 117).

It is somewhat remarkable that choristers are able to match these various levels of phonation in order to produce a blended choral sound (Ibid. 171). Nevertheless, Lomax (Ibid.) remarks how ‘singing with blended voice, like marching in step, is found on all six continents, some individuals and societies being more given to it than others’. This process of ‘matching’ and entraining may be partly attributable to the well-documented phenomenon of involuntary mimicry in affiliative face-to-face interaction, with perfectly synchronous behaviour being an example of an extreme form of mimicry (see Ch. 6.3.2 for more detail on ‘emergent coordination’, and also Iacoboni, 2009 on the relationship between mimicry and empathy).

A group acting in common in all these different ways is likely to be perceived as a unity, both by observers and the participants themselves (see Lakens, 2010, for empirical evidence supporting this claim; and Hagen & Bryant, 2003, and Loersch & Arbuckle, 2013, for an evolutionary perspective). Turino (2008:42) suggests that our experience of how entrained we are with others in social interaction can lead to a ‘vague feeling’ of either pleasure or displeasure:

‘The subtle feelings of comfort or discomfort we experience in given social interactions are typically based in these signs [of social synchrony], which we often only vaguely feel rather than directly attend to…Being in or out of sync with others results more in what we feel than in what we can verbalise about a given situation.’

By contrast, in ‘participatory’ music-making Turino (2008:43) argues that participants pay direct attention to being closely entrained with others and that, if they become entrained, participants are seen as members of the group:

‘It is in participatory settings, however, that focal attention to synchrony becomes the most pronounced and important. Because the music and dance of participatory performances are not scripted in advance, participants have to pay special attention to the sounds and motions of others on a moment-to-moment basis…[for example] In a Shona ceremony, singers and dancers try to interlock their parts with the parts of those around them. As people introduce new formulaic or improvised melodies or dance movements, which are then repeated for sometime, others may change their sung, hand-clapped, or dance parts to fit with these new contributions on an ongoing basis…When the performance is cooking, the synchronicity in sound and motion is a confirmation, direct and unspoken, that the participant has been seen, heard, understood, and is a valued member of the group’ (Ibid. 136).

The relevant point here is that because participatory music-making is not scripted or notated, participants need to pay more attention than usual to what their fellow music-makers are doing in order to synchronise. This might explain on one level why participatory music is so bond-forming. Another interesting observation is that after improvised formulas take hold, they get repeated for a while before the
next improvised melody is introduced. This process of moving between periods of stability and instability is a characteristic feature of entrainment. We will return in later chapters to the themes of paying attention to others, and moving between stability and instability in entrainment.

3.1.3 Cultural difference in body movement entrainment

The precise form that entrained body movements take may depend on whether those interacting are ‘locals’ or not. Hall (1983) documented the fact that when people walk down a crowded street, for example, people will move with each other ‘in a culturally appropriate pace and rhythm—in sync’ (Turino, 2008:42; emphasis added). Hall (1976:75) argues that ‘each culture has its own characteristic manner of locomotion, sitting, standing, reclining and gesturing’ (see also Tannen, 1984). For example, he claims that ‘Northern European people have a single beat that [our bodies] dance to, whereas the Tiv of Nigeria have four drums, one for each part of the body…’; talented dancers from the Tiv culture move to all four beats (Hall, 1976:78; App. 1.28). Having said that, Toiviainen et al. (2010) found that when Westerners moved to a 12-bar blues programme different parts of their body were associated with different metrical levels (see further discussion in Ch. 7.2.3). Nevertheless, Hall’s research may offer one explanation for the intuition that interaction with others differs depending on which culture they are from.

One of the means by which entrainment is achieved is through the use of specific bodily gestures, both stylised and ‘natural’, which often differ cross-culturally. For example, Clayton (2005:209) describes how gestures are central to Indian classical music performance, which is ‘a rich and many-layered gestural dance, in which musicians make statements, appeal, instruct, or plead, relate to others, express their physical and emotional beings and describe their environments. For music to make sense as music these gestures (and so on) have to be experienced in time, to mark duration and to impart a sense of regularity and recurrence’
(see App. 1.29). Many of the gestures in Indian classical music are discrete, and the same is true for many other musical cultures (for example, see Ch. 8.4.1 for an analysis of discrete gestures in Gregorian chant).

By contrast, in other musical cultures visual gestures are needed less, because performers have continuous physical contact with each other through touch. For example, in Corsican polyphonic singing, Bithell (2007:69) describes how ‘close interaction between the singers […] is necessary for the successful execution of a paghjella…[and] is in turn responsible for a range of affective qualities that further
enhance the sense of togetherness’. This ‘close interaction’ comes from the two outer male singers placing their hands on the central singer’s back (the three singers are arranged in a horseshoe formation, often with their eyes closed, see App. 1.12).

In summary, both the rhythm of people’s body movements and the methods they use to communicate when making music differ from culture to culture. The rest of this chapter will look at rhythmic and metrical organisation in various cultures from around the world. My aim is to acknowledge cultural difference, but also to show that any differences that do exist are measured by degree, rather than kind, and that basic principles of metrical and temporal organisation are shared by the music and chant of all cultures. By showing this, I hope to be able to talk about entrainment and its social consequences as something that can be generalised across all cultures.

3.2 The metre vs. rhythm distinction

What follows is a discussion of the means by which acting and feeling in common in group singing and chanting are made possible, i.e. the musical aspects that underpin the ‘base line’ of human interaction (see Preface). In this section, I show that this base line can be understood as a kind of mental and sonic framework that repeats cyclically and with which performers entrain.

In participatory traditions it is important to have a regular framework of ‘beats’ that performers can refer to because otherwise there is no shared basis to the group activity that members can rely upon if they get out of time with each other. Keeping this framework regular and cyclic is so important that it is often up to expert performers within the group to be time-keepers. For example, Turino describes how in the Murehwa musical tradition in Zimbabwe ‘…the core specialists were not the stars of the situation singled out from secondary participants or audience but the ones responsible for maintaining a solid rhythmic groove and a melodic-harmonic foundation that made fuller participation both possible and enjoyable’ (Turino, 2008:133; App. 1.30).

Lomax (1968:49) describes how in most musical styles, ‘the performer or performers employ a single, overall rhythmic scheme or ‘ground plan’ which serves as a point of reference for the infinite variety of rhythmic detail possible within the scheme’. This ‘ground plan’ can be referred to as ‘metre’, which, in simple terms, ‘help[s] you play the rhythms properly’ (London, 2012:3). So, on the one hand, we
have a framework which is based on the temporal regularity of its sounded units, which we can call ‘metre’, and, on the other hand, a kind of temporal counterpoint that plays off this regularity is provided by more ‘surface’ sonic interjections (which we can call ‘rhythm’). In other words, metre might refer to ‘periodicity, regularity, and recurrence…[and] cyclicity’ and rhythm might refer to ‘gestural, figural, and (in principle) unpredictable’ aspects of temporal organisation in music (Clayton, 2005:23). London (2012:4) provides another definition more grounded in cognitive processes, arguing that ‘patterns of duration that are phenomenally [i.e. perceptually] present in the music…are [often] referred to as rhythmic groups…[by contrast] metre involves our initial perception as well as subsequent anticipation of a series of beats that we abstract from the rhythmic surface of the music as it unfolds in time’.

The perception of metre requires the presence of a periodic pulse, but ‘a tactus
[i.e. pulse] in and of itself, is insufficient for a sense of metre’ (London, 2012:15). Tempo is the rate of the pulse, i.e. the number of pulses per minute. The pulse is a single periodicity that gives the listener the expectancy that ‘something should happen on the next beat’ (Ibid.). Taken by itself, it would represent a series of one-beat ‘bars’: 1,1,1,1,1,1 (Ibid. 16). However, this is not what is meant by metre. In a simple duple bar, with beats numbered 1,2/1,2/1,2…, on beat 1 we expect something on beat 2, as one would with a normal pulse; however, in this case ‘our expectations are even greater that a musically significant event will occur on the following downbeat [beat 1 again]’ (Ibid.). Therefore, the metrical hierarchy in this case has two levels, the pulse and the bar. Consequently, London stipulates that ‘At minimum, a metrical pattern requires a [pulse] coordinated with one other [usually superior] level of organisation [e.g. the bar]’, but ‘more typically metre involves three or more levels’ (Ibid. 16, 24). Western music notation reflects this, given that in 4/4 metre, the tactus level is the crotchet, subdivisions exist at a level below the tactus (quavers), and bars are groups of four crotchets (Ibid. 24). There may also be additional levels above the measure (e.g. groups of bars) and below the first level of subdivision (e.g. semi-quavers) (Ibid.).
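The hierarchy described above can be made concrete with a short sketch. It assumes an idealised, perfectly regular 4/4 bar with the crotchet as tactus and quavers as the first subdivision; the function names and the default values are illustrative choices of mine rather than part of London’s account.

```python
def metrical_periods(tempo_bpm, beats_per_bar=4, subdivisions_per_beat=2):
    """Return the period (in seconds) of three metrical levels, given a
    tempo expressed as pulses (tactus beats) per minute."""
    tactus = 60.0 / tempo_bpm
    return {
        "subdivision": tactus / subdivisions_per_beat,  # e.g. quaver
        "tactus": tactus,                               # e.g. crotchet
        "bar": tactus * beats_per_bar,                  # e.g. four crotchets
    }


def beat_labels(n_beats, beats_per_bar):
    """Number successive pulses by their position within the bar."""
    return [(i % beats_per_bar) + 1 for i in range(n_beats)]


# A pulse taken by itself: a series of one-beat 'bars'.
print(beat_labels(6, beats_per_bar=1))   # [1, 1, 1, 1, 1, 1]

# A simple duple bar: pulse plus one superior level.
print(beat_labels(6, beats_per_bar=2))   # [1, 2, 1, 2, 1, 2]

# Periods of each level in 4/4 at 120 pulses per minute.
print(metrical_periods(120))
```

At 120 pulses per minute the tactus period is 0.5 seconds, the quaver subdivision 0.25 seconds, and the bar 2 seconds. The one-beat labelling has only a single level, whereas the duple labelling adds the recurring downbeat that London treats as the minimum for metre.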

Like London, Himberg (2013:27) describes metre as a hierarchical cognitive concept that allows us to ‘entrain in a flexible way to different metrical levels and can switch the level at will. For example, we can clap in time with music but shift from clapping on every beat to clapping on every other beat, which corresponds to shifting to synchronising with a higher metrical level…[therefore, metre] is not just an abstract psychological concept but something that we also act and represent in our body movements. A simple example of this is tapping the beat with a foot while playing a piece of music on an instrument’ (Ibid. 27-28; see also Lerdahl & Jackendoff’s grid in Ch. 1.3).

Hence, when a jazz singer sings ‘I’ve got rhythm’, they probably mean that ‘they have got an excellent sense of metre and they are able to align the rhythms of their note onsets with the underlying metrical grid in such a way that it generates a sense of movement in the listener and a pleasurable, positive affect in the audience, in response to their performance’ (Gill, 2011:121). Metre therefore refers to patterns of temporal invariance that manifest in our minds (e.g. counting the beats) and bodies (e.g. tapping to the beat). Rhythm relates to the variable patterns of phenomenal aspects of sound (i.e. the physical aspects of sounds perceived by the senses through time).

Metre is inferred by the listener from the rhythmic surface, and can often be ambiguous when the rhythmic surface offers two or more possible metres, even though in practice we only ever ‘hear’ one metre at a time (London, 2012:67). This suggests a complex mechanism in musical cognition, whereby metre is ‘inferred subjectively from the rhythmic surface, which is itself then interpreted with reference to this very metrical framework’ (Clayton, 2005:30; see also Kolinski, 1973). Thus it might be argued that setting up a boundary between metre and rhythm is misleading, because ‘metre tends to direct rhythm, and even to suggest or to generate rhythm’ (Clayton, 2005:35). The question of whether rhythm drives metre or metre drives rhythm is therefore a ‘chicken or egg’ question (see Ch. 5.6.1 for further discussion).

Metre is an important concept in the context of this thesis, which looks at group
singing, because London (2012:4) describes how metre is a ‘musically particular’ form of the more general process of entrainment (see Ch. 1.2 for definition of entrainment). London describes how ‘metrical entrainment allows listeners to synchronise their perception and cognition with musical rhythms as they occur in time…by latching onto temporal invariants, that is similar events that occur at regular intervals’ (Ibid. 5, 24). Indeed, most people have this more general capacity, which is evident whenever we attend to the gallop of a horse, or the drip of a tap, or when we walk, run, and sing a simple tune (Ibid. 5, 6). As we will see in the next section, it is this ability to latch onto similar events that occur at regular intervals that makes group singing possible, at least for the majority of singing traditions.

3.2.1 Cross-cultural problem examples

The following discussion investigates problems in using the metre versus rhythm
distinction in various musical traditions. First, Central African musical traditions are suggested to exhibit isoperiodic organisation, which means that only one metrical level exists, the isochronous pulse or tactus. One hierarchical level is not enough to satisfy London’s definition of metre, which requires two or more (see 3.2 above). In the case of the Aka people, Fürniss describes how ‘each dance can be identified by its own polyrhythmic formula, which is a combination of different rhythms repeated together in a cyclic, or periodic structure’ (Fürniss, 2006:4; see also Lewis, 2009; Apps. 2.1 & 2.5).

Arom (1991) suggests that the term ‘metre’ is less appropriate in the context of Aka polyrhythm because there are no strong or weak beats as such, and therefore the term ‘isoperiodicity’ should be used because whilst the complex web of rhythmic patterns does not imply an ‘accentual matrix’, it is based around an accentless isochronous pulse (see Clayton, 2005:31). By contrast, London (2012:139) describes how some African rhythmic patterns involve ‘a series of non-isochronous [time] intervals and [that] those intervals are not symmetrically distributed within the N cycle [comprising the total number of time intervals that the series involves]’; however, he argues that because of this scholars can ‘plausibly speak of accentless metre’ (see also Locke, 1998). From a different perspective, Agawu (2003:73) has criticised the notion of an African ‘accentless’ isoperiodicity on the grounds that the standard ostinato patterns are always heard in relationship with the movement of the performers’ feet, which ‘move according to the main beats of regular, four-beat metre’ (London, 2012:138-9). Agawu’s observation would suggest that in order to understand metrical perception in African music, and perhaps more generally too, sounds have to be evaluated alongside the body movements they are associated with.

Chanting is another example of a musical activity that shows ambiguous temporal organisation. It is often described by scholars either in terms of some basic form of metrical organisation, or in terms of speech rhythm, which is usually not periodic. However, speech rhythm itself varies from person to person, and, similarly, the same text can often be performed in a number of rhythmic styles (Frigyesi, 1993:66). Furthermore, chanting is rarely done in relation to speech rhythm alone without any reference to some form of stylistic convention (Ibid.). We might
therefore think of many forms of chanting as a borderline case, because it can often be both free from, and constrained by, periodicity (see Ch. 1.3 for distinction between rhythm in speech and song). Even so, Cummins (2002, 2009) has shown that joint speech entrainment between two people speaking the same text at the same time is possible even when the periodicity of their speech may not be consistent, and therefore one might presume the same is possible for chanting too (see Ch. 7.6 for further discussion). I will now illustrate this periodic/non-periodic tension by exploring Gregorian chant performance from musicological and historical perspectives (see also Chs. 8.1 & 8.5).

The entry for ‘Gregorian Chant’ in the Harvard Dictionary of Music outlines three widely held interpretations of rhythm in Gregorian Chant (Randel, 2003:364). The first is known as the ‘mensuralist’ interpretation, which holds that there are ‘a variety of different note lengths with a precise relationship to one another, most often two values in the ratio of 2 to 1’ (Ibid.). For example, the Commemoratio brevis, an important source c. 900 AD, attached ‘proportional value to long and short notes’, saying that ‘The longer values consist of the shorter, and the shorter subsist in the longer, and in such a fashion that one has always twice the duration of the other, neither more nor less’; i.e. Gregorian chant performance was organised by a simple binary metre (Bailey, 1979:103; quoted in Hiley, 2001). In addition to the two basic mensural units of the 2:1 ratio, other ‘mensuralists’ propose ratios of three or more units (e.g. 3:2:1 and 4:2:1) (Hiley, 2001). The ratios between the different syllable lengths in the text are reminiscent of Lerdahl & Jackendoff’s metrical ‘grid’, and imply a regular periodic pulse (Hiley, 1993:282; see Ch. 1.3).
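To see why strictly proportional note values imply a regular pulse, consider the following minimal sketch. It assumes that every note is either ‘long’ (L) or ‘short’ (S) in an exact 2:1 ratio; the pattern of longs and shorts is hypothetical and is not transcribed from any chant source, and the function names are my own.

```python
from fractions import Fraction


def mensural_onsets(pattern, short=Fraction(1), ratio=2):
    """Convert a string of 'L' (long) and 'S' (short) notes into onset
    times, measured in units of the short value. Returns the list of
    onsets and the total duration of the pattern."""
    onsets = []
    time = Fraction(0)
    for symbol in pattern:
        onsets.append(time)
        time += short * ratio if symbol == "L" else short
    return onsets, time


def on_pulse_grid(onsets, pulse):
    """True if every onset falls on a whole-number multiple of the pulse."""
    return all((Fraction(t) / pulse).denominator == 1 for t in onsets)


# Hypothetical sequence of long and short notes (illustrative only).
onsets, total = mensural_onsets("LSSLS")

print([int(t) for t in onsets], int(total))   # [0, 2, 3, 4, 6] 7
print(on_pulse_grid(onsets, Fraction(1)))     # True: all onsets sit on the short-note pulse
print(on_pulse_grid(onsets, Fraction(2)))     # False: the long value alone is too coarse a grid
```

Because each long is an exact multiple of the short, every onset coincides with a pulse equal to the short value, which is the sense in which the mensuralist ratios resemble a metrical grid. The accentualist and grouping interpretations discussed below do not guarantee this property.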

However, Hiley (1993:280; see also Crocker, 1958) argues that a linguistic distinction ‘has to be borne in mind between metre…and stress [i.e. word-accent]’, which forms the reasoning behind the ‘accentualist’ interpretation (formulated by the Solesmes monk Joseph Pothier in the 19th century), which holds that ‘all of the notes in chant are of essentially equal value [i.e. duration], with word-accent determining the nature of the generally free and speechlike rhythm’ (Randel, 2003:364). In this view, a relationship between an irregular pulse and a regular pulse is held in tension, because even though some syllables are longer than others due to ‘word-accent’, Crocker argues that there is still a ‘more or less equal succession of durations’, and that this leads to the perception of a ‘pulse’, but one that is ‘weak’, ‘can vary at any moment’ and is not like a regular beat (Crocker, 2000:44,53). Crocker’s ‘accentualist’
claims thus give mixed messages about the stability of metrical regularity in Gregorian chant.

Thirdly, the ‘grouping’ interpretation is a relative of the accentualist interpretation, formulated by André Mocquereau, student of and successor to Joseph Pothier. It holds that the rhythm of Gregorian chant is ‘essentially free…employing for the most part notes of equal value, but not as deriving primarily from word-accent, even though rhythm and other features of the chant are said to derive from the nature of the Latin language…rhythm is characterised instead by an alternation of rising (arsis or élan) and falling (thesis
or repos) and the free succession of groups of two and three notes’ (Randel, 2003:364). This interpretation is less rhythmically regular than the mensuralist approach. Nevertheless, a variety of notational markings in modern chant books that embody this interpretation guide the specific interpretation of the speech rhythm (Ibid.). However, in terms of exact duration, I would argue that these markings act more as a guide than a prescription, thus allowing chant performance to be idiosyncratic and faithful to the speech rhythm of each different Latin verse.

The ‘standard’ volume of Gregorian chant notation in use today, the Liber Usualis, argues for a mixture of accentualist and grouping approaches, saying that ‘neither [Gregorian chant’s] rhythm nor its melody can be rightly appreciated or sung apart from the meaning of the text, the correct pronunciation of the words, and their proper grouping into phrases…Nor is a knowledge of music sufficient; one must somehow understand the Latin text and its liturgical context and cultivate a kindred spirit in order to interpret aright the accompanying melody…For good diction we must also cultivate a rhythmic sense; verbal rhythm and accent are of first-rate
importance.’ (Liber usualis, 1950:xxxv; quoted in Copeman, 1990:306). In this vein, the rhythm of Gregorian chant is open to interpretation in light of the meaning of the text, in contrast with the rules of strict proportional duration laid down by the ‘mensuralist’ interpretation of the Commemoratio brevis (see above). As a performer, I would say that the ‘accentualist’ and ‘grouping’ interpretations are most representative of modern performance (although practices vary from choir to choir), i.e. that the syllables in chant seem to be of roughly equal, but occasionally variable, duration. None of the scholars involved in this debate gives anything more than a vague description of the regularity or irregularity of pulse in the chant, although it is still possible for a group to sing chant in synchrony even with an irregular pulse (see Chs. 7.6 & 8). The case study of Ch. 8 aims to take this debate further by empirically measuring the pulse in Gregorian psalmody.

Intriguingly, Kaufmann (1975:11) states that ‘With a few minor modifications taking into account the characteristics and vague accents of the Tibetan language, the description of plainsong rhythm as given in the Liber Usualis can be applied to the Buddhist chant of Tibet as well’ (see Apps. 1.5 & 1.17). The following description of Tibetan religious songs is certainly reminiscent of Gregorian chant: ‘A vague sense of
rhythm can be observed because a large group of monks would be unable to chant freely in unison, but in many instances the rhythm [read ‘metre’] is so indistinct that it is hardly more than a sequence of accents that the spoken language provides’ (Ibid.).

Kaufmann describes how in songs with texts that have a peaceful character, or prayers which are chanted for blessings and prosperity, ‘metrical and rhythmical strictness disappear’ (Ibid.). He also describes how the drum beats that accompany the Gyantse form of Tibetan Buddhist chanting are not of a regular pulse. By contrast, songs with texts relating to power and fear are often chanted in strict metre and rhythm. Nevertheless, a lack of strictness seems to be the default. This is illustrated by the fact that the notation for all songs prescribes a number of introductory drum beats to occur before the singing begins in order to establish a beat, yet after a few notes (and syllables) in strict tempo the melody often ‘assumes a rhythmically much less distinct form’ (Ibid. 13).

Another example of a singing tradition which is not metrically regular is Corsican paghjella singing (App. 1.12). To illustrate, singer Francis Marcantei describes how ‘[paghjella] songs are not measured but they have an interior beat…an interior rhythm…at a certain moment we step outside the measure in order to enter…into the song’s ‘internal truth’’ (Bithell, 2007:66). This beat is unlikely to be metrical in the strict sense given that the songs are unmeasured, and it is also unclear to what degree the beat is periodic given the desire for rhythmic dynamism in Corsican singing (Ibid.).

The requirement for sung rhythm to at least approximate the irregularity of periodicity in speech seems to be an aspect of other singing traditions too. For example, Spanish cante jondo, Indian ragas, and Japanese geisha songs are ‘frequently heavily embellished and metrically free’ (Lomax, 1968:150; see London, 2012, for a cross-cultural analysis of non-isochronous metre; Apps. 1.31, 1.32, 1.33, 1.34). This freedom from metre is also characteristic of the chanting of religious texts across Christian, Jewish, Islamic, Hindu, Buddhist, Shinto, and shamanic traditions, and art music in Arab countries from North Africa to Central Asia, and India, China and Japan, and the folk music of Romania, Mongolia, South Africa, Java, and many
others (Clayton, 1996:324; see also Frigyesi, 1993, for an analysis of free rhythm in Jewish cantillation). It is worth noting that the majority of these genres refer to solo performance (Clayton, 1996:324). This is relevant because solo singers can afford to be more variable in their timing, as they do not have to coordinate with others, whereas group performance places greater demands on periodicity because of the need for coordination.

Therefore, in summary, there seems to be confusion as to how much periodicity is exhibited in the above examples of singing and chanting traditions, and also, consequently, whether they are metrical or not. Indeed, Frigyesi (1993:64) observes that the ‘intermediate ground between metric and nonmetric rhythm has received so little scholarly attention that we hardly acknowledge that it exists at all’.

This intermediate ground is observed particularly in chant traditions in which chanting to the rhythm of speech is encouraged, which in religious traditions might be due to the need to honour sacred texts. This non-metrical requirement poses its own dilemmas in group chanting because, as Kaufmann states above, some ‘vague sense of rhythm’ is needed for monks to chant the texts in unison. In this case and others, on the one hand the text has its own ‘internal’ rhythm which may be irregular, and, on the other, a group of chanters need a regular periodicity to chant synchronously in order that the text can be heard clearly; of course, one must also consider the possibility that chanters are entraining with each other rather than with an isochronous pulse.

The majority of the studies referred to in this section make claims that are not the result of empirical research on durational patterning in chant performance, and therefore in Ch. 8 I perform an empirical study on periodicity in Gregorian psalmody.

3.2.2 Is rhythm free from metre possible?

In practice, ‘metric regularity’ and ‘rhythmic freedom’ are poles of a continuum. Most ‘metrical’ musical performance is not normally metronomic and, vice versa, ‘free’ rhythmic variability in music can still often be perceived as metrically organised. Frigyesi (1993:64) makes the point that the intermediate ground between these two poles ‘is basic to the understanding of many music cultures outside of the Western tradition’; indeed, a preliminary survey of the ethnomusicological literature has found a list of 70 genres that are non-periodic, both religious and secular, with examples from every continent, but with around half coming from Asia (Clayton, 1996:323-4). Of course, there may be more genres in the literature that have not been explicitly identified as non-periodic (Ibid.). Furthermore, this intermediate ground ‘is often a determining, idiosyncratic characteristic of a given musical genre or style’, and rhythmic style may differ fundamentally between two such genres or traditions, or even within the same genre (Frigyesi, 1993:64; Clayton, 1996:324).

But is it possible for music to be rhythmically ‘free’, without the constraint of periodicity?

Clayton (2005:204) describes how ‘Free rhythm may or may not have a simple pulse, but whenever this pulse is organised periodically, rhythm cannot be described as ‘free’’. Clayton argues that studies on rhythm perception have shown that we often subjectively impose a pulse on a piece of music, even if it is ‘unpulsed’, and, consequently, that it is ‘actually rather hard to play or sing in such a way that no pulse is perceived’ (Ibid.). Thus, determining ‘free-rhythm’ would depend on being able to objectively measure periodicity from sound information alone. At the other extreme, Widdess (1994) has demonstrated that music can be founded on a consistent pulse, and yet nevertheless appear to be completely unpulsed (Clayton, 2005:98).
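As a purely illustrative aside, the idea of measuring periodicity ‘from sound information alone’ can be made concrete with a minimal sketch. It assumes that note onset times have already been extracted from a recording, and it uses the coefficient of variation of inter-onset intervals as one crude index of regularity. The function name `ioi_cv` and the example timings are hypothetical and are not drawn from any of the studies cited here. A value near zero indicates near-isochrony, but the measure says nothing about whether a listener would perceive a pulse, which is exactly the difficulty Clayton identifies.

```python
from statistics import mean, pstdev


def ioi_cv(onsets_ms):
    """Coefficient of variation of inter-onset intervals (0 = isochronous)."""
    if len(onsets_ms) < 3:
        raise ValueError("need at least three onsets to compare intervals")
    # Durations between successive onsets.
    iois = [later - earlier for earlier, later in zip(onsets_ms, onsets_ms[1:])]
    return pstdev(iois) / mean(iois)


# Hypothetical onset times in milliseconds (illustrative only, not real data).
strict = [0, 500, 1000, 1500, 2000]
speech_like = [0, 400, 1100, 1500, 2300]

print(ioi_cv(strict))
print(round(ioi_cv(speech_like), 2))
```

The strictly timed sequence yields zero variation, whereas the speech-like sequence yields a clearly positive value; where one would draw the line between ‘periodic’ and ‘free’ on such a scale remains a matter of judgement.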

These findings would suggest that the presence of either an objective or subjective pulse is crucial to making and engaging with music. Given that so many other aspects of our behaviour are governed by beat-based mechanisms (e.g. neural oscillation, Fujioka et al. 2012) it is perhaps not surprising that ‘most music is, at some level, organised around a pulse’ (Clayton, 2005:98). From another perspective, Condon (1985:131) has shown that ‘speech and body motion are precisely synchronised across multiple levels in the normal speaker, suggesting that they are the product of a unitary neuroelectric process. This speech/body motion hierarchical organisation can also be interpreted as wave-like, since it exhibits characteristic periodicities’ (quoted in Clayton, 2005:99).

To summarise, I quote Clayton (1996:329): ‘All music has ‘rhythm’; some, but not all, has a perceived pulse; of this ‘pulsed’ music some but not all has this pulse organized periodically; and some, but not all forms of periodic organization may be described as ‘metre’’. It is not clear exactly what is meant by ‘pulse’, or how easy it is to perceive the pulse, but Clayton at least outlines a provisional framework for understanding free rhythm (which he defines as being perceived without periodicity), even if the boundaries between these categories may be unclear. For these reasons, and the fact that ‘a factor common to most free rhythm forms is that there appears to be no conscious organization of rhythm’, ‘free rhythm’ has so far been neglected by musicologists, music psychologists and ethnomusicologists as a concept requiring study (Ibid. 331).

At this stage, an apocryphal anecdote is pertinent: a listener, unacquainted with jazz, once asked a famous performer, ‘What is rhythm, really?’, and received the answer ‘Lady, if you gotta ask, you’ll never know’. Foolishly or not, this discussion is about trying to know what we can, and I now turn to one particular theory that attempts to explain the difference between metre and rhythm in general terms.

3.2.3 Clayton’s general theory of metre and rhythm

The preceding discussion around the distinction between metre and rhythm in different cultures paints a confusing picture. As an attempt to clarify this picture, I now refer to Clayton’s six general points about the relationship between metre and rhythm that he argues can be applied cross-culturally, which arose from his own case study on North Indian classical music.

1. Much music (but not all) is organised with respect to a periodic and hierarchical temporal framework, in such a way that a cognitive representation of this framework may be generated in the mind of the listener. This organisation and its representation are termed ‘metre’.

2. Metre can be said to exist when two or more continuous streams of pulsation are perceived to interact; these streams are composed of time points (beats) separated by durations definable as multiples of a basic unit. Time points which are perceived as beats on more than one level are ‘stronger’ than those which are beats on only one level; metre can thus be regarded as necessarily hierarchical.

3. Beats may be differentiated by stress and/or duration (i.e. they can be perceived as strong and weak, and/or long and short).

4. The relationship between metre and rhythm has two complementary aspects: metre is inferred (largely subjectively) on the basis of evidence presented by rhythm, while rhythm is interpreted in terms of its relationship to that metre.

5. The inference of metre is a complex phenomenon which is influenced by the musical experience and training of the listener, and more indirectly perhaps by his or her general experience and cultural background. Consequently both metric theory and practice are culturally determined to a great extent, although they are ultimately founded on the same psycho-physiological universals.

6. The cognition of metre appears to be dependent on one or more of the following factors: the extent of the perceptual present (determining that pulses are unlikely to be separated by more than 2-3 secs); the function of short-term memory; and the ability to comprehend recurring patterns as single Gestalts which combine notions of stress and duration. (Clayton, 2005:41-42).

Points 1-3 are related to the nature of the metrical framework and how this framework might be represented in our minds. Point 4 looks at the mutual relationship between metre and rhythm. Points 5 & 6 define limits (cultural and cognitive) that bound the workings of metre in practice; for example, in sections 3.1.3 & 3.2.1 we saw that the way we infer metre depends on our specific cultural background.
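Point 2 lends itself to a small worked illustration. The following is a minimal sketch, under the simplifying assumption that every stream of pulsation is strictly isochronous and expressed as a whole-number multiple of a basic unit; the function name `metrical_strengths` and the choice of levels are hypothetical and are not taken from Clayton. The ‘strength’ of a time point is simply the number of levels on which it counts as a beat.

```python
def metrical_strengths(n_units, level_periods):
    """For each time point, count the pulse levels on which it is a beat.

    n_units: number of basic time units to examine.
    level_periods: period of each pulse stream, in basic units.
    """
    return [
        sum(1 for period in level_periods if t % period == 0)
        for t in range(n_units)
    ]


# Three interacting streams: every unit, every 2 units, every 4 units.
print(metrical_strengths(8, [1, 2, 4]))  # [3, 1, 2, 1, 3, 1, 2, 1]
```

Time points 0 and 4 are beats on all three levels and so are ‘strongest’, which gives the hierarchical profile that point 2 describes. The sketch models only the abstract grid; it does not address how a listener infers that grid from the rhythmic surface (points 4 and 5).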

The Indian concept of ‘Tal’ can be tested here in the light of Clayton’s general theory of metre and rhythm. Tal is basically equivalent to the Western concept of metre in that it is a quantitative metric hierarchy often with as many as 16 beats per cycle split up into smaller groups of beats, underpinning syllabically-conceived rhythm, and is often defined by qualitative factors such as accentual patterns, pitch, and timbre variation (Ibid. 66; App. 1.35). In other words, Tal has two main aspects: ‘one as an abstract temporal scheme manifested through clap patterns [satisfying Clayton’s point 1 above], the other as a repeated rhythmic pattern represented by a theka (a set of tabla strokes)’ (Ibid. 199).

Tal acts as a temporal framework for rhythmic exploration, just as we have defined metre above, and tal has metric ‘levels’ in the sense of Lerdahl and Jackendoff’s grid, satisfying point 2 (Ibid.). And tal may simply be ‘a repeated rhythmic pattern with parameters of stress, timing, and timbre’, satisfying point 3, and thus rhythm and metre in tal are complementary, satisfying point 4 (Ibid. 200). Tal’s one difference with conventional descriptions of Western metre is that it allows for the possibility of the ‘middle pulse level’ being irregular, a challenge to point 2 which requires durations to be ‘multiples of a basic unit’ (Ibid.). However, London’s (1995:68) definition—‘metre minimally consists of two levels: B [beat] and M [measure/bar] (where M = some modular ordering of Bs)’—gives space for an irregular ordering of beats within the bar (Clayton, 2005:200). Clayton concludes that metre ‘is an important aspect of tal (i.e. that tal includes metre), but that tal is a broader concept, involving dimensions not encountered in other metric systems [such as pitch and timbre]’, satisfying point 5, that metrical systems vary between cultures (Ibid. 201).
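London’s ‘modular ordering’ of beats within a measure can be illustrated with a minimal sketch. The assumption here is that a cycle is fully described by a list of group sizes; the function name `locate_beat` and the two example groupings (a 16-beat cycle divided 4+4+4+4 and a 10-beat cycle divided 2+3+2+3) are chosen for illustration and are not taken from Clayton’s analysis. All indices are zero-based.

```python
from itertools import accumulate


def locate_beat(beat_index, group_sizes):
    """Return (cycle, group, position in group) for a running beat count."""
    cycle_length = sum(group_sizes)
    # Modular arithmetic folds the running count back into one cycle.
    cycle, pos = divmod(beat_index, cycle_length)
    # Starting position of each group within the cycle.
    starts = [0] + list(accumulate(group_sizes))[:-1]
    for group, (start, size) in enumerate(zip(starts, group_sizes)):
        if pos < start + size:
            return cycle, group, pos - start


EVEN_16 = [4, 4, 4, 4]     # regular middle level
UNEVEN_10 = [2, 3, 2, 3]   # irregular middle level

print(locate_beat(17, EVEN_16))    # (1, 0, 1)
print(locate_beat(12, UNEVEN_10))  # (1, 1, 0)
```

The same function handles both cycles, which reflects the point at issue: the cycle as a whole recurs periodically even when the groups within it are of unequal length.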

In Indian music there is a fundamental assumption ‘that music should be organised by an explicit metric structure (conceptually distinct from rhythm) that this should be done accurately and unambiguously, and that all nibaddh (metrically bound) music is organised by the same system’ (Ibid. 45). In terms of our focus on chanting, it is interesting that in one of the most popular forms of Hindustani vocal music today, the vilambit (slow) or ati-vilambit (very slow) khyāl, the relationship between the tal (metre) and surface rhythm does not seem to be defined; i.e. it is possible for Indian musicians not only to conceptualise metre and rhythm as separate from each other, but also to perform music in a way that can demonstrate their separation, which seems to contradict point 4 (Ibid. 50; App. 1.29 & 1.33). Clayton writes that the melodic style is highly melismatic, and individual articulation points within the melody or text are ‘not always clearly defined temporally’ (Ibid.). By using a very slow tempo this style of singing creates ‘space’, i.e. longer time spans, in which singers can ‘develop arguably the most emotionally expressive form of classical singing heard in North India’, i.e. perhaps over-stretching the cognitive limits of metrical perception described in point 6 (Ibid.).

Nevertheless, Clayton concludes: ‘there are enough points in common between the Western concept of metre and the North Indian concept of tal for us to be able to extract those points and use them to help build a more general theory of metre’ (Ibid. 202). This is a welcome conclusion because it demonstrates that whilst conceptions of metre may differ from culture to culture, there are many aspects which seem to be fundamental to most, if not all, metrical systems. However, chanting traditions based on speech rhythm may yet prove that not all musical group entrainment is dependent on metre or periodicity. But in the majority of cases, metre would seem to be a fundamental organising feature of musical performance across cultures, and therefore plays an important role in any psychological theory of group entrainment.

Clayton’s conclusion is perhaps surprising given that, from a Western perspective, the time dimension of Indian music might be perceived as ‘unknowably complex’. However, Clayton argues that in the same way that a Westerner finds metre and rhythm in Indian music difficult to master, an educated Indian listener might perceive metrical ambiguity in 20th-century Western classical music, but none in Indian music (Ibid. 5, 6). Nevertheless, due to the conventions of Western notation, it is often difficult to notate the metre and rhythm of non-Western musical traditions (Ibid. 29-30; see also Kolinski, 1973; Pantaleoni, 1987; Arom, 1991:206-11; Lomax, 1982).

Clayton (Ibid.) argues that whilst time signatures may be appropriate for non-Western musics, that cannot be assumed to be the case, and suggests that if we are to use Western notation for non-Western music we need to be clear on four issues. First, which time unit is the ‘beat’ and how do we notate it? Second, is a grouping of 2, 3, 4 enough or does one need a higher level grouping (e.g. 6, 8, 12)? Third, where does the measure begin and end? Fourth, which pulse is a beat and which is an off-beat? (Ibid. 30). The test case that Clayton describes is Gamelan music (App. 1.27), in which the last beat of a time unit is the most important structurally-speaking. He asks whether the transcriber should use this beat as the first of his measure, and says that if the answer is ‘Yes’ then the structure of the piece is unclear, and if ‘No’ then the notation might be misread. This demonstrates that transcribing into notation may not always be a reliable means of recording a non-Western music, but it does show the importance of understanding a metrical system from an ethnographical perspective.

In this section, we have seen how musical metre provides a psychological framework around which participants can structure their actions. The relevance of metre for entrainment as a hierarchical framework (as distinct from variable rhythm) is that it allows participants to perform at many different levels of rhythmic complexity and yet still be aligned with a central framework. However, we have also seen that metre is not always obviously present in all cultures; for example, an unaccented, non-metrical pulse governs some African music, and many chanting traditions are driven by speech rhythm, rather than regular metre. But what is clear is that structural regularity of time in musical performance is useful for collective musical participation.

3.3 Summary

This chapter has explored the time-organising architecture behind Durkheim’s statement: ‘feelings in common are expressed through actions in common’, if by ‘actions in common’ one means entrained action. We have seen how entrained action contributes to prosocial behaviour, and one possible explanation for how it allows one to ‘feel in common’ is that moving in time with others requires one to pay closer attention to other performers than one would in non-entrainment-focused activities. Another explanation might be that entraining with others embodies a blurring of the distinction between self and other; i.e. if one acts at the same time as someone else, one can feel some kind of similarity with that person.

Music-making often exhibits a high degree of entrained activity. However, in order that as many people as possible can fully participate, the music being performed needs to be relatively simple, repetitive, and easily memorisable. Many participatory singing and chanting traditions are repetitive, and Margulis hypothesised that a high degree of repetition in musical activity may have a positive impact on a performer’s ability to focus on microtiming and microdynamics, thus making entrainment simpler for those participating. This suggestion, coupled with the activation of motor regions that repetition facilitates, might offer an explanation for how the highly-precise temporal synchronisation demonstrated by choirs in chant is possible, even when that chant is based on speech rhythm.

The aim of this chapter was to show that basic repetitive principles of metrical and temporal organisation are relevant cross-culturally. Metre has been defined in terms of a grid of time relationships that participants subjectively infer from the rhythmic surface; i.e. metre is a psychological construct, as opposed to being objectively ‘in the notes’. From a survey of musical traditions it appears that periodicity and ‘metre’ provide a structuring influence on the dynamic mental process that helps facilitate entrainment.

In terms of singing and chant, which usually involves text, it has been argued that linguistic metre and musical metre have proportional relationships of durations and patterns of stress that apply to their respective syllables and notes. One characteristic that seemed to set apart some chanting traditions (e.g. Gregorian and Tibetan Buddhist chanting traditions) is that the ‘internal’ speech rhythm of the text, which can be non-periodic and non-metrical, needs to be reflected in the sung chant. This goal is then in tension with the need for a regular pulse if a group of chanters want to sing in unison (although ‘singing in time’ is less important in Tibetan Buddhist chanting). However, the ‘objective’ degree to which each of these traditions is ‘non-metrical’ or ‘non-periodic’ is unclear.

The challenge in objectively determining whether musical traditions exhibit periodicity or not is compounded by our tendency to subjectively impose a pulse on temporal patterns that we hear. It is also unclear whether rhythmically-variable, text-based chant precludes the presence of a regular pulse, partly because this ‘pulse’ has not been clearly defined in relation to strict periodicity, and partly because little ethnomusicological research has been done on ‘free rhythm’. Of course, when thinking about entrainment in non-periodic contexts, one must consider the possibility that chanters may be entraining with each other rather than with the pulse. The relationship between group entrainment and non-periodic pulse will be empirically examined in Ch. 8.5.

In any case, our body movements in everyday social interaction have been found to be organised by periodicity, even though we are often unconscious of it (see 3.1.3). It would therefore seem that periodicity is a key component of social interaction, of which musical activity is but one form. However, it has also been found that the metrical/periodicity profile of body movements differs from culture to culture, as do musical traditions. But whilst there is significant cultural variation in timing systems in music, Clayton, who has done significant cross-cultural work on metre and rhythm, has concluded that it is possible to create a general theory of metre that can be used to underpin human musical interaction in all its variety. The fact that metre underpins all musical interaction will serve as a basis for our discussion of entrainment throughout this thesis in terms of what it is that performing singers are entraining to. By contrast, however, we will also discover that pairs of speakers can entrain their non-periodic and non-metrical speech to a high degree of accuracy (see Ch. 7.6).

Having surveyed the form and function of various cross-cultural examples of singing and chanting traditions, in the next chapter I focus on a particular case study of a singing ritual of the Amazonian Suyá people, the Mouse Ceremony, in order to contextualise the role of musical entrainment within Suyán societal process.

Chapter 4 - Singing, communitas and effervescence

4.1 Introduction

The previous chapter showed that the psychological construct of metre plays an important role in collective musical entrainment, and is relevant cross-culturally. It also argued that the metrical underpinning of common action can facilitate common feeling within and between groups. The argument of this chapter is that communal singing—built on entrainment processes—often plays a fundamental role in managing the flux between social stability and instability in ritual, which is also related to managing the tension between the needs of the individual and the collective.

Social instability, or anti-structure, will be associated with Durkheim’s and Turner’s conceptions of ‘collective effervescence’ and ‘communitas’, and social stability will be associated with social ‘structure’. The sense of being part of a collective is often generated by social ‘anti-structure’, whereas in everyday society an individual must find their place within social ‘structure’ (Turner, 1969:153). By social ‘structure’ Turner meant ‘‘the patterned arrangements of role-sets, status-sets and status sequences’ consciously recognised and regularly operative in a given society and closely bound up with legal and political norms and sanctions’ (Turner, 1974:201, quoting Merton). Thus, social anti-structure is an unusual experience with others that occurs ‘in the moment’, and social structure is a cognitive construct that pervades everyday life (Olaveson, 2001:105).

I believe it is important to study the role of singing in society, given that certain traditional societies sing nearly 3 or 4 hours a day, which is nearly as long as they spend doing manual work for subsistence (Seeger, 1979:373; Carneiro, 1961). Seeger makes the point that, strangely, we know much more about the socio-economic features of these traditional societies than the musical.

I will focus in this chapter on the role that communal singing plays in creating communitas and collective effervescence, because singing tends to be a common feature of rituals in which these phenomena are observed. On the other hand, social stability and structure are most often evident in everyday activities in which communal singing is not often observed.

4.2 Turner and Durkheim

As mentioned above, Turner’s ‘communitas’ and Durkheim’s ‘collective effervescence’ are both components of what Turner terms anti-structure, and are defined and synthesised below.

4.2.1 Turner’s ‘communitas’

Turner’s ‘anti-structure’ comprises two components, liminality and communitas, each of which he defines separately:

‘Liminality [a term borrowed from van Gennep] occurs in the middle phase of the rites of passage which mark changes in an individual’s or a group’s social status and/or cultural or psychological state in many societies past and present…Such rites characteristically begin [by] marking the separation of the subject from ordinary secular relationships…and conclude with a symbolic rebirth or reincorporation into society as shaped by the law and moral code…in liminality extreme authority of elders over juniors often coexists with scenes and episodes indicative of the utmost behavioural freedom and speculative license. Liminality is usually a sacred condition…[and] is a movement between fixed points and is essentially ambiguous, unsettled, and unsettling’.

‘Communitas tends to characterise relationships between those jointly undergoing ritual transition. The bonds of communitas are anti-structural in the sense that they are undifferentiated, equalitarian, direct, extant, nonrational, existential, I-Thou (in Feuerbach’s and Buber’s sense) relationships. Communitas is spontaneous, immediate, concrete—it is not shaped by norms, it is not institutionalised, it is not abstract. Communitas differs from the camaraderie found often in everyday life, which, though informal and egalitarian, still falls within the general domain of structure, which may include interaction rituals.’ (Turner, 1974:273-4).

Liminality, as defined above, is concerned with how ritual is situated within the societal process of a community, at the interface between different states of social structure (e.g. unmarried/married, boy/man). On the other hand, communitas refers to the nature of the relationships between people who are participating in ‘liminal’ rituals. Although the two concepts both involve a breaking-down of social structure, I am not so much concerned with the way that the ritual structure achieves this, but rather with the relationships between people engaged in a liminal ritual. This is necessarily a fundamental (though often overlooked) component of ritual, and I will, accordingly, focus on the concept of communitas rather than liminality.

The above quote explains how the bond formed in communitas is not the ‘pleasurable and effortless comradeship that can arise between friends, co-workers, or professional colleagues any day…[it] is a transformative experience that goes to the heart of each person’s being and finds in that root something profoundly communal and shared’ (Turner, 1969:138). Communitas itself is spontaneous, but will often occur during ‘liminal’ occasions that are planned for, such as ‘a seasonal change, a calendrical event, the initiation of young people, a wedding, funeral, or coronation—in other words, something that could be anticipated for weeks or months, and carefully prepared for. Appropriate foods had to be gathered and prepared in advance; costumes and masks designed; songs and dances rehearsed.’ (Ehrenreich, 2007:17).

4.2.2 Durkheim’s ‘collective effervescence’

Durkheim termed the creative form of collective effervescence (hereinafter effervescence, always understood as collective) ‘creative’ effervescence, which is ‘characterised by intense [collective] emotion, and in which the outcome is uncertain and may produce new ideas’ (Olaveson, 2001:101). Durkheim named the consolidating social power of effervescence ‘recreative’, ‘in which there is also intense emotion and excitement, and a bond of community and unity among participants, such that they feel morally strengthened’, but here the emphasis is on reminding the group, through sacred ritual, of the ‘oneness’ of their moral and spiritual life that had been created in a previous experience (Ibid.; see also Pickering, 1984). Communitas is not split into ‘creative’ and ‘recreative’ categories as effervescence is, even though both aspects are contained within the concept of communitas.

4.2.3 The similarity of communitas and effervescence

Both Durkheim and Turner see communitas and effervescence as a means for ‘making and remaking’ society (Olaveson, 2001:94). Olaveson (Ibid. 107) frames general similarities between communitas and collective effervescence, arguing that both concepts refer to a collective, egalitarian, intensely emotional, out-of-the-ordinary, creative or potentially destructive, ephemeral, re-enacting, revitalising, and fundamentally ambiguous mode of social interaction. Both authors recognised that ‘The experience of collective effervescence/communitas is a fundamental human need’ that counterbalances the ‘alienation’ created by (albeit necessary) social structure (Ibid.). Olaveson argues that finding a ‘harmonious balance’ between collective effervescence/communitas and social structure is the fundamental goal of all societies, whether global or local (Ibid.).

4.3 Talking about experience

I think that part of the reason communitas and effervescence are ambiguous concepts is that one cannot talk directly about the experience to which they both refer, owing to two insurmountable factors. First, any phenomenological account rests on the ability of individuals to introspect, which is ‘necessarily limited to an account of conscious experience, excluding what Edelman (1989) terms both non-conscious and unconscious processes.’ (Clarke, 2011:197). Second, words used to describe experience tend to ‘either run beyond their object—taking on a drama and dynamic of their own—or fall short of a phenomenon whose corporeality, temporality, and multiplicity elude the rational, spatial, and linear character of the written word’ (Clarke, 2011:197-8).

Even though these two factors militate against a phenomenological account of communitas and effervescence, I suggest that collective music-making facilitates these phenomena by creating the conditions for what Edelman (1989) calls ‘primary consciousness’, which is ‘roughly equivalent to what is in an organism’s current awareness, or the contents of its perceptual present’. By contrast, higher-order consciousness ‘brings with it the capacity to be aware of, and reflect on, a past and a future, and to construct and consider a narrative of events’ (Clarke, 2011:194-5). One means of living in ‘primary consciousness’ is being intensely engaged and focused on an activity ‘in the present’.

To be at least minimally engaged in collective musical activity, a person’s consciousness needs to be ‘taken up with the activities or events at hand’, such as being aware of fellow music-makers, and, crucially, paying continuous attention to the ongoing flow and directedness of the pulse and melody. In the extreme, this kind of ‘primary’ attention is conducive to what Csikszentmihalyi (2003:42-61; 2002) calls ‘flow’, which is, put simply, ‘super-concentration leading to super-absorption in whatever you are doing’ (Widdess, pers. comm.). For example, Widdess (2013:126-8) describes how ‘flow’ allows Newar Buddhists in Bhaktapur, Nepal, to sing for hours at a stretch without getting tired or bored. Music is a particularly powerful initiator of flow, as the Marshal of France, Maurice de Saxe, has acknowledged, talking of the importance of drum beats for military marching: ‘Everyone has seen people dancing all night. But take a man and make him dance for a quarter of an hour without music and see if he can bear it…’ (McNeill, 1995:9, 7).

Of course, in addition to attending to pulse, melody and other musicians, participants also require learned and explicit knowledge to engage even minimally with musical performance, and therefore any form of experience is likely to involve both ‘primary’ and ‘higher-order’ aspects of consciousness. Nevertheless, collective musical activity provides possible conditions for individuals to be ‘in the moment’ (see Custodero, 2005), and, by extension, to experience communitas and effervescence (and also flow), which are phenomena that can only be experienced in the present moment, and require experiencers to be aware of how they relate to others around them.

However, as mentioned above, representing experience directly is problematic. My ‘solution’ is to approach experience indirectly by observing the external manifestation of experience: behaviour. Behaviour never tells the complete story of human experience, but, regrettably, a phenomenological account is beyond the scope of this thesis (see also Ch. 6.2.2).

4.4 Communal singing, communitas and effervescence

My aim in this chapter is to show how the role of communal singing in ritual can manage cultural process, by reference to a detailed case study by Anthony Seeger of the ‘Mouse Ceremony’ of the Suyá Indian community, amongst other anthropological examples. This chapter is, to my knowledge, the first attempt at situating the role of communal singing within the theoretical framework of Turner’s communitas and Durkheim’s collective effervescence. In the overall context of this thesis, this chapter demonstrates the importance of field research for integrating the lower-order process of entrainment within higher-order cultural processes.

Just before I refer to the Suyán Mouse Ceremony, I want to introduce the Kalapalo people, a different Amazonian tribe, because they manage the structure/anti-structure tension in cultural process with communal singing in such a clearly defined way. Kalapalo public rituals consist of two parallel series of events: on the one hand, some of the ritual events use the verbal channel, which focuses attention upon various ritual ‘administrators’ who make sure everything economic and logistic runs smoothly; on the other hand, there are ritual events that use a musical channel, and these events manifest attributes of liminality, such as a ‘destructuring of social relations, of ordinary village space, and of normal time’ (Basso, 1981:283).

Therefore, for the Kalapalo, communal singing not only symbolises but is explicitly employed to manage liminal moments in the social process. We now turn to their neighbours, the Suyán Indians.

4.5 The Suyán Indian Mouse Ceremony

The Suyán Indians live in a single circular village with a population of about 120, on the Suiá-missu River in the Xingu National Park, in the state of Mato Grosso, Brazil. Their Mouse Ceremony is an example of a sacred ritual which lasts for two weeks, happens only once a year, requires communal investment of considerable time and resources, and involves a lot of speaking and singing (Seeger, 1987:3, 71). During the ceremony, the community, divided into separate family houses in everyday life, becomes increasingly integrated as a whole, and the feeling of togetherness climaxes on the final night of the ceremony.

The Mouse Ceremony is based on a myth of a mouse teaching a mother about how to find, prepare, and cook corn, an important event in the history of the Suyán community. The following quote by Seeger (1987:2) summarises what the Mouse Ceremony is about from a social perspective:

‘The Mouse Ceremony is a rite of passage in which a young boy begins his initiation into the male-oriented activities of the village plaza. It is one of a number of initiation rituals that punctuate a Suyá male’s life from birth through old age, with their greatest concentration around puberty. The Mouse Ceremony is one that focuses on the relationship between an adult man and the boy to whom he has transmitted his own names, and it highlights their relationships to other kinsmen (especially the man’s sister and boy’s mother) and to certain age groups within the society as a whole. Although one boy is the focus of the ceremony, each performance of it also reaffirms the relationships of all men with their name receivers, their sisters, their joking relatives, formal friends, and affines. Every performance also reestablishes certain relationships between human beings and animals, between the village and its surroundings, and between the Suyá and the cosmos they have created and within which they live.’

From this quote, we can see that the Mouse Ceremony marks a liminal stage in a young boy’s life. Songs are used as an accompaniment to this (presumably) painful initiation of the young boy into manhood; the ceremonial movement being his leaving his natal household to join the men’s house. It is interesting that Suyá songs that are part of this process of initiation are taught to boys from an early age; i.e. a crucial part of being ready for initiation involves song knowledge (see also Richards, 2007:2). The ceremony also integrates the wider relationships both among individuals in the community and between the Suyá community, its immediate environment, and the cosmos.

Seeger (1987:3) explains how a ‘period of heightened euphoria and ritual activity…[lasts] throughout the entire two-week period, created and accompanied by extended periods of singing, dancing, and collective activities’. Communal unison singing of seasonally-appropriate songs occurs every day at the pre-dawn and late-afternoon time slots, along with communal shout song cacophonies, sung invocations and instructions that occur at other times, and special unison songs for the final night of the ceremony, discussed later (Ibid. 6; see App. 3, track 2). Some genres of Suyá singing, e.g. shout songs, are, by contrast, unique to a certain individual—such as the song which opened the 1972 ceremony sung by a Suyá Indian named Hwinkradi.

The Mouse Ceremony characterises what Seeger has described as the ‘ritual mode’ for the Suyá Indians (Ibid.). The ‘non-ritual mode’, by contrast, is ‘everyday’, which means an Indian is more likely to interact with their nuclear family or be on their own, and that food gathering and distribution groups are smaller in number. One important contrast between the ritual and non-ritual mode is the presence of public forms of verbal art, i.e. unison singing, shout singing, and public speech forms, such as plaza speeches and solo myth-telling. This is emphasised to the extent that it is ‘common to hear singing all day for days or weeks on end’ (Ibid. 7).

It is worth noting at this stage that Seeger’s entire study of the Suyán Mouse Ceremony would not have been possible were it not for how much the Suyá Indians valued Anthony Seeger and his wife Judy’s singing. Anthony and Judy could not hunt, fish, or scrape maniocs, but they could sing, and this, Seeger believes, is the only reason the Suyá did not force them out of their camp (Ibid. 20). As we will learn in the following sections, singing is not merely valued by the Suyá community: it is considered essential.

4.5.1 The Mouse Ceremony songs

The akia and ngére song genres are performed in all the major Suyá ceremonies. The ngére genre consists of unison group singing in a low pitch register, contrasting with the akia genre of individual shout songs in a high register (Seeger, 1979:374).

In ngére unison songs ‘men try to blend their voices’ and ‘sound as one’, because ‘[t]he individuality of the singers of an ngére is not important—indeed it is suppressed’ (Ibid. 385, 390). All ngére songs relate to a ceremonial group, not to kinship-based groups or individuals, and ‘[Suyá] people praise the singing of their kinsmen and faction members [in these ceremonial groups] and criticise the singing of the others’ (Ibid. 385, 379; cf. ‘blason populaire’ in Ch. 2.6). Thus, groups demonstrate inter-group rivalry, and rival groups will sing the same ngére song, but different parts of it, or, ‘when there are two men’s houses the moieties will sing different songs: one slowly, the other rapidly’ (Ibid. 390).

The egalitarian nature of ngére singing, as per communitas and effervescence, is demonstrated by the fact that inside the singing group itself ‘there may be political opponents, brothers-in-law who never speak to each other, or the best of friends. The way they feel about each other has nothing to do with the way they sing except in extreme cases where, because he is angry, a man may refuse to sing at all. That in itself is a strong statement. Factional disputes occasionally come to a head in ceremonies because suddenly what had been covered up comes out into the open (literally: into the plaza [an open space in the middle of the village])’ (Ibid. 386-7).

The socially creative power of communal singing is reflected in the fact that it is only once a group sings together that the group can be said to be clearly established (Ibid. 385), and ‘[t]he care with which ngére are performed in unison is the musical expression of, and creation of, a group’ (Ibid. 390).

In contrast to ngére unison songs, each man or boy has his own individual akia song, and these are performed either as solos or in loud cacophony with others (Ibid. 374; see App. 3, track 3). Seeger explains that men sing their own akia songs loudly in a group cacophony because they ‘want to be heard [as individuals] in spite of everyone else singing…It must be possible to hear them, yet every akia must be recognisably different from the others so that its singer can be distinguished from the other singers’ (Ibid. 379). This means that ‘certain musical features will be regularly present: high pitch, strained vocal quality, descending contour, and individualising differences in rhythm, melody, and text’ (Ibid. 383; the opposite features characterise ngére group singing). Singers also shake their rattles, and move their bodies, together (Ibid. 379); all this combined is quite an effort, given that ‘the men may sing their akia for as long as 15 hours on the final day of a ceremony’ (Ibid. 383).

4.6 The role of musical entrainment in social process

Performing ngére unison songs is dependent on the process of entrainment, and my hypothesis is that entrainment to a sufficiently regular and predictable pulse is a key mechanism in the creation of communitas and effervescence (see Ch. 3.1). From a brief glance at the transcription of an ngére rainy season unison song in Fig. 4.1 below, one thing is clear: this kind of group song has a periodic pulse. I would argue this song exhibits isoperiodic, and possibly metrical (15/8) organisation, with the rattle keeping an unaccented constant beat (see App. 3, track 2). I chose to notate this extract in triple time (duple would have worked too), but either duple or triple time would in this case be approximations to a kind of ‘swung’ beat, which is neither duple nor triple (see London, 2012). Although there is no video for this particular rainy season unison song on the EVIA online video archive (see reference for Seeger, 1987), the videos for dry season unison songs (in 2/4 metre) that I analysed show about thirty Suyá men stamping in synchrony with their right foot on the spot on the downbeat every two rattle shakes, sometimes accompanied by a forward-backward torso movement for emphasis.

Fig. 4.1. Excerpt from beginning of a Rainy Unison song. (My own modified version of Roseman’s 1987 transcription.) — Seeger (1987:90) describes how the text is typical of the rainy season genre, agachi ngére. The words (spelt phonetically here) are ‘what the Suyá call ‘song words’ because they have no direct referents. These particular syllables are specific to the agachi ngére and less frequently found in other kinds of unison song’ (Ibid.; cf. discussion of meaningless vocables in Ch. 1.3).

Fig. 4.2 below shows rough transcriptions of three akia from a 1976 ceremony: ‘The first akia is sung by an older man…in a high register with a forced voice, as is the second akia, sung by a younger man. The third example is the akia of a seven-year-old boy’ (Seeger, 1979:379). As you can see in Fig. 4.2, the akia shout songs exhibit metrical organisation in duple (2/4) time (Ibid. 381-2; see App. 3, track 1). Akia shout songs (including those below) alternate between being sung either in a loud group cacophony or solo several times in each overall akia session, which often lasts twenty minutes and is performed every day during the Mouse Ceremony. Group metrical organisation is evident even in what Seeger calls ‘cacophony’, with each 2/4 bar being marked on the downbeat by each man shaking their rattle (see Fig. 4.2), accompanied by either a step forward with the right foot, or a step backward to normal standing position (this group movement is maintained during individual shout songs too). These collective synchronous movements accompanied every example of akia singing I found in the EVIA video archive (see also Figs. 4.3, 4.4). In a typical akia session in the plaza, after singing each complete solo/cacophony segment without moving around, the front men then bend forward and lead off stomping in synchrony around in a circle without singing (see Fig. 4.5), returning to their original position to start singing again.

Fig. 4.2 - Three Suyá “Amto Akia” from 1976 Mouse Ceremony: 1. older man; 2. younger man; 3. 7-year-old boy (transcription taken from Seeger, 1979:381-2).

Fig. 4.3. A photo of the typical spatial configuration of afternoon akia singing in the village plaza (men in front, boys behind, two-by-two in a line). This image and the following images are included by kind permission of the Suyá people and Anthony Seeger.

Fig. 4.4. Same as Fig. 4.3, but with image taken from other side.

Fig. 4.5. Suyá men stomping round in a circle in synchrony (not singing).

Another ‘bee ceremony’ unison song has a strong group unison ‘chant’ element in triple metre, with solo voices on top of the polyphonic texture and communal chanting at the bottom of the texture, with the rattle marking the beginning of each bar (not part of Mouse Ceremony; see App. 3, track 4). In fact, all Suyán communal singing displays a periodic pulse and metrical organisation. Usually, in male communal singing this pulse is sounded by rattle shakes or some other percussion, and in female communal singing, by foot stamps. The beat is therefore the lowest common denominator of the interactive field with which everyone, both high- and low-status, must align their sounds and movements, and is thus the great social simplifier.

Durkheim writes that ‘probably because a collective emotion cannot be expressed collectively without some order that permits harmony and unison of movement, these gestures and cries tend to fall into rhythm and regularity, and from there into songs and dances’ (Durkheim, 1995:218; quoted by Olaveson, 2001:113). Seeger illustrates Durkheim’s idea powerfully when he describes a part of the final night of the Mouse Ceremony:

‘…at this moment the assembled men are above all representing the collective men’s groups, and may be at a particularly powerful moment of their transformation, for they do not sing their individual songs [akia] in an individualising way, but stamp and hum them producing a definitely unison sound composed of fairly quiet individual melodies and a regular, loud, unison stamping rhythm’ (Seeger, 1987:118).

Similarly, the Kalapalo people’s dance movements, according to Basso, seem ‘designed to emphasise the rhythm of the music’ and the unison movements are continuously repetitive (Basso, 1981:289). For the Kalapalo, ‘A tune cannot be easily sung without the movement of the body, especially the legs, nor is the song complete without the rhythmic accompaniment of the dancers’ feet’ (Ibid.). Olaveson writes that:

‘I believe that it is not just a coincidence that both Durkheim and Turner discussed such phenomena as rhythmic percussion, singing, dancing [et al.], specifically in relation to their concepts of collective effervescence and communitas’ (Olaveson, 2001:113).

Indeed, McNeill (1995:2,94) defines his concept of ‘muscular bonding’ as ‘the euphoric fellow feeling that prolonged and rhythmic muscular movement arouses among nearly all participants’ in communal exercises such as music, song, chant, dance, and military drill etc., and suggests that muscular bonding facilitates energetic communal activity better than words and doctrine alone.

However, it is necessary to point out that group entrainment is only one factor amongst many associated with the phenomenon of collective ecstasy; indeed, entrained collective activity can occur without necessarily creating an ecstatic experience, and vice versa. For example, the Suyá say that without humour (often in the form of old men behaving like clowns) the heights of ecstasy that are reached in their ceremonies would be impossible. Although sociology has investigated group ecstasy to a certain extent, most sociological articles investigate crowd behaviour by focusing on non-ecstasy-creating aspects such as ‘the structure of the group…its pattern of recruitment, its ideology and its contradictions, the mechanisms used to gain commitment, and the maintenance and evolution of the group within a group context’ (Lindholm, 1990:66, 70; quoted in Ehrenreich, 2007:16). This is largely how Durkheim and Turner have explored effervescence and communitas. However, the dynamic process of entrainment may have an explanatory advantage over these more ‘static’ features of crowd behaviour, because it is expressed in real-time interaction.

Other dynamic musical parameters also map onto the structure and anti-structure distinction. For example, the repetition of known songs at traditional points of the ceremony or calendar year might be considered a means of reinforcing social ‘structure’, whereas the Suyán ‘cacophony’, where all individuals are singing their unique individual songs in a way that would not be possible in everyday life, could be considered an example of social ‘anti-structure’.

4.6.1 Entrainment for good or ill

As mentioned in Ch. 2.6, setting collectivity in motion does not always have positive consequences. Any form of group violence such as war requires collective solidarity too, which is often expressed by entrained action. Durkheim was aware of this and it would seem that the majority of strong examples of group solidarity, be they for good or ill, demonstrate collective effervescence or communitas.

The non-violent singing and drumming battles of the Afro-Brazilian religious ritual called Congado (in Minas Gerais, Brazil, lasting 3-7 days) demonstrate directly how the concept of rhythmic unity relates to the feeling of group identity and how that unity can be used as a ‘weapon’ against another group (Lucas, 2006). Lucas et al. (2011:77) describe how ‘There are different types of groups – e.g. Congo, Moçambique, and Candombe – each one having its own functions and associated with distinct uniforms, ritual objects, and musical instruments, and performing its own rhythms, dances, and songs throughout the whole ceremony…it is also important for ritual purposes that the separate identity of each group is not compromised, especially when groups belonging to different communities come to perform at the same event’.

When different groups belong to the same community they fall into synchrony easily, at least when their tempi are fairly close together and they are in close proximity; but not when the groups belong to different communities (Clayton, 2012:54). The following is a quote by a captain of the Moçambique group talking about rhythmic unity in a ritual battle:

‘If another Mozambique meets us and they get motivated and speed up their rhythm, this may indicate that they have a suspect intention towards us. So if our group enters in the same rhythm as theirs, this doesn’t mean that we are in harmony with them, it means we are losing ourselves. We can say it is a battle. And they are winning because we have followed them. Now, if we are concentrated in our own rhythm, in our verses and our devotion, and the other rhythm does not affect ours, then that group will simply go away and nothing will happen to us because we remained firm. But once we lose our rhythm and enter the other’s, it may take a long time for us to find ourselves and our normal rhythm again.’ (Lucas, 2006:6).

Therefore, in inter-group ritual battles like the Congado, communities must not only show within-group entrainment to demonstrate their ‘spiritual power’, but also be able to resist entrainment when different communities meet during the ritual. Thus, the Congado is a clear example of how entrainment directly relates to feelings of group identity and oneness, but also an example that shows that entrainment can be used to feel ‘at one’ against another group.

In a different, violent context, Peters (2006; quoted in Richards, 2007) reports that RUF fighter groups in Sierra Leone often prepared for attacks with long sessions of singing and dancing. Peters (2011:173) describes how the ‘repetitive, dance-like action, assisted by alcohol, [took] the group out of itself and on to a different plane’. One of the soldiers describes how ‘The whole night before the attack we are singing and dancing and drinking. We use our own voice, not an amplifier set…We also sing the RUF anthem. That one is the last one that we sing before we go to the battlefront. The dancing we do is like parading, but not like the official parade.’ (Ibid.)

McNeill (1995) refers to countless examples of the use of ‘rhythmic’ group activities (often involving music) for improving military capability throughout human history, such as those practised by the Dutch, French, Chinese, Moslems, Maoris, Sumerians, Romans, and Athenians, to name but a few. The most horrific example he gives is of how Hitler used ‘keeping together in time’ to ‘unite and barbarise a whole nation, regardless of how well educated, highly skilled and sophisticated it [was]’ (Ibid. 149). Indeed, referring to the Nuremberg rallies in which thousands of Nazis made synchronous salutes to him, and marched in time, Hitler once remarked:

‘The audience is not being informed…it is made to perform; and its performance makes history’ (Ibid.).

Thus, musical performance can be understood as offering a ‘rehearsal space’ for coordinating potential ‘real-life’ social formations and actions (cf. Cross, 2001). But this power could also be used for more subversive ends; for example, in British history, the ‘carnivals’ witnessed in the Middle Ages, full of music and dance, began to be used for more permanent (not merely playful) subversion of authority from the 1500s onwards. At that moment in time,

‘large numbers of people begin to use the masks and noises of their traditional festivities as a cover for armed rebellion, and to see, perhaps for the first time, the possibility of inverting hierarchy on a permanent basis, and not just for a few festive hours…[for example] Robin Hood—or at least figures representing him—began to play a starring role as lord of misrule in annual summer festivities’ (Ehrenreich, 2007:103).

Indeed, McNeill (1995:99) argues that:

‘…keeping together in time [has] played its principal role in urban, civilised history—partly by reconciling the poor and disinherited to their lot, partly by challenging constituted religious and political authority, and partly by reforming and renewing established forms of worship’.

Given that any form of collective action requires within-group coordination, and that entrainment is an associated mechanism of coordination, entrainment might underpin both peace and violence depending on what is being practised.

Due to music’s floating intentionality, facilitated by entrainment (see Chs. 2.2.1 & 6.2.2), and music’s ability to allow for the simultaneous sounding of multiple voices, Richards argues that music is perhaps uniquely placed to facilitate ‘polyphonic justice’, defined as ‘an ability to listen and respond to several distinct strands of argument about justice at once’ (Richards, 2007:12). For example, the Aka people have found a way of allowing for simultaneous polyphony and unity in their music, in which there are four parts: three are polyphonically-sung strings of meaningless syllables, while the fourth, the mòtángòlè, the principal solo male voice, sings the essential words of the song, allowing the other singers to know which song they are singing without ambiguity (Fürniss, 2006:11). Here we have an example of how a baseline reference point allows the others to confidently sing their own individual parts in harmony. This is analogous to the way I have argued that the hierarchical framework of metre allows people to perform in their own way whilst remaining in temporal harmony with the actions of others around them, and, intriguingly, even though the role of the mòtángòlè seems to be a linguistic one, the word ‘mòtángòlè’ literally means ‘the one who counts’ (Ibid.).

On the other hand, music can also coordinate a group to the extent that violent unified action can occur in the name of the group when a way of life is perceived to be threatened (e.g. in ‘piacular’ rites that collectively atone for a crisis situation). Thus, rather than multiple voices existing in harmony, the voices conform to become one autocratic voice; hence, the very same mechanism of entrainment that facilitates harmony can also facilitate violence (see also Ch. 3.1). Therefore, although many studies tend to associate entrainment with facilitating social interaction that has exclusively ‘positive’ consequences (see Ch. 3.1), it is more precise to think of entrainment as having power that can be used for good or ill.

4.7 The making of society through song

When Durkheim and Turner say rituals create society they refer to the way ritual provides the social energy which powers the formation of new beliefs and cultural practices (Richards, 2007:13). ‘Social energy’ is difficult to define and so too are its analogues: communitas and effervescence. Nevertheless, we can talk about the profound and specific effects on society that communitas and effervescence can have, for example:

‘…oppositions may have become alliances, and vice versa. High status may have become low status and the reverse. New power may have been channelled into new authority and old authority lost its legitimacy. Closeness may have become distance and vice versa. Formerly integral parts may have segmented, formerly independent parts may have fused. Some parts may no longer belong to the field after a drama's termination, and others may have entered it. Some institutionalized relationships may have become informal; some social regularities become irregularities or intermittences. New norms and rules may have been generated or devised during the attempts to redress conflict; old norms may have fallen into disrepute. Bases of political support may have altered. The distribution of the factors of legitimacy may have changed, as have the techniques (influence, persuasion, power, and so on) for gaining compliance with decisions.’ (Turner, 1987:27).

As argued above in 4.6, music can often be associated with the types of social change in this quote, and perhaps this is due in part to Cross’s idea that musical performance allows us to practise social moves before they are used (Richards, 2007:13; see Cross, 2001). Seeger expands on this idea in the quote below in the context of ritual/musical performance in Suyán society, although here he argues that rather than being ‘practice’ for social moves, the singing is the social move itself.

‘Each performance re-creates, re-establishes, or alters the significance of singing and also of the persons, times, places, and audiences involved. It expresses the status, sex, and feelings of the performers, and it brings these to the attention of the entire community, which interprets them in a variety of ways’ and ‘Although the Suyá could use the sun, stars, moon, and constellations to calculate time, its important social markers were imposed with song…When the new season’s song had begun, it was really that season—whether or not the rains suddenly stopped or began to fall once again…’. (Seeger, 1987:65, 70).

Thus, for the Suyá, ‘What is expressed by singing is crucial, not incidental. And the very importance of music in Suyá society—in the talk of its members and the amount of time and resources devoted to musical activities—may lie in the active role music plays in the creation and life of society itself: its musical creation and musical living’ (Seeger, 1979:392).

One obvious social change in the Mouse Ceremony is the transformation of men into mice through singing, during which the community’s social structure is in a liminal state, because mice are obviously not restricted by human social conventions. As Seeger explains: ‘This transformation…is established partly in the singing after the meal…The silent entry into the house by the rear wall is like that of a mouse, and the humming may well mark the crucial moment of the transformation, for at that point they do not even sing the words to their songs.’ (Seeger, 1987:116). Participants in the Mouse Ceremony also change from being energetic to exhausted; from healthy to wounded (the piercing of men’s faces by arrows is a part of the ceremony); and from ‘dying’ to being bathed and reincorporated into the community (Ibid. 125, 129). Indeed, the identity of the Suyá tribe is likely to change in the minds of those outsiders that witness or hear about the Mouse Ceremony; sometimes outside witnesses (such as nurses, doctors, casual visitors, and any others who wish to come) are invited to the ceremony, and one can imagine that these witnesses report back on their experience of the Suyá community to their own families and friends (Ibid. 105).

In the case of the Mouse Ceremony, one can also list many elements that display change or transformation made possible by song. Not only do the humans change, but the songs and ceremonies themselves undergo transformation as they are subjected to improvisation in an atmosphere of euphoric creativity. As Seeger (Ibid. 71) describes: ‘The Suyá did not usually repeat ceremonies in consecutive years. They said they did not like to sing the same thing all the time.’ For example, the ideal for akia songs is that they are newly-composed, not ‘old’ (Seeger, 1979:379).

4.7.1 The force that changes social structure

In order to ‘make’ society you need a social force powerful enough to change social structure. Turner describes how social ‘structure’ is ‘not…a permanent ordering of social relations but merely a temporary mutual accommodation of interests of the relevant social field’, and therefore can be seen as an evolutionary process, albeit one that moves slowly enough to appear fixed (Turner, 1974:44, 240). Moore (1978:39; quoted in Turner, 1987) shows how social structure is vulnerable to change:

‘social life presents an almost endless variety of finely distinguishable situations and quite an array of grossly different ones…It proceeds in a context of an ever-shifting set of persons, changing moments in time, altering situations and partially improvised interactions. Established rules, customs, and symbolic frameworks exist, but they operate in the presence of areas of indeterminacy, of ambiguity, of uncertainty, and manipulability. Order never fully takes over, nor could it. The cultural, contractual, and technical imperatives always leave gaps, require adjustments and interpretations to be applicable to particular situations, and are themselves full of ambiguities, inconsistencies, and often contradictions.’

The fact that social structure is in process means that it can evolve, especially if communitas or effervescence can generate a social force powerful enough to inspire change at the level of the collective. Durkheim (1995:217-218; quoted by Olaveson, 2001:99) describes how this force may be generated:

‘The very act of congregating is an exceptionally powerful stimulant. Once the individuals are gathered together, a sort of electricity is generated from their closeness and quickly launches them to an extraordinary height of exaltation. Every emotion expressed resonates without interference in consciousnesses that are wide open to external impressions, each one echoing the others. The initial impulse is thereby amplified each time it is echoed, like an avalanche that grows as it goes along.’

4.7.2 How is this force created by communal singing?

The Mouse Ceremony of the Suyá tribe is characterised by heightened euphoria, vigorous singing and dancing, huge feasts, and little sleep. Seeger (1987:73) describes how ‘Suyá ceremonies tended to snowball in complexity and enthusiasm, with occasional lulls’, similar to the ‘avalanche’ of emotion in Durkheim’s quote. Musical activity is the primary source of euphoria, the revitalising force. Singing can often be heard all day for days or weeks on end (Ibid. 7). Listeners can be made weak by laughter, as well as by the long duration of sound and activity (Ibid. 8). Being strong, able to dance vigorously and sing loudly for long periods of time are considered highly desirable attributes for the Suyá (Ibid. 121). Ecstasy is created by the saturation of individual shout songs in a cacophony, combined with the overwhelming sound of countless rattles being shaken and powerful stamping (App. 3, track 1). The ceremony comes to a climax when the parts that had previously been sung separately are transformed eventually into unison (Ibid. 113). Everyone is left exhausted at the end (Ibid. 126). Music is therefore a powerful force in Suyán society.

Durkheim (1995:327-8) describes how in certain forms of ritual behaviour participants feel and act upon intense emotions that often have an ‘essentially non-rational character’ (quoted by Olaveson, 2001:102). When one feels intense emotion, particularly in a group context, new possibilities for change often seem to present themselves, and if a group feels an intense emotion together at the same time, then they can form stronger relationships with each other that help to establish those changes in the community. The Kalapalo have a word ‘ail’ (‘being happy’) for the positive collective feeling that results from the group resolving a problem or accomplishing a difficult task, which is expressed, or even created, by music (Basso, 1981:290). The Aka people also have a similar expression for the feeling of ‘collectivity as happiness’ which relates to the common practice of polyphonic singing (multiple voices working separately but in harmony with each other); their verb kàmuz ‘denotes a musical aspect of the concept of happiness: ‘to be happy’, ‘to agree’, ‘to give the response in a song’’ (Bahuchet 1995:64-5; quoted in Fürniss, 2006:4; see also McNeill, 1995:7-8 & Ehrenreich, 2007). In Suyá society it is thought that ‘People who sing a lot express their ‘happiness’ (a kind of existential happiness), and their support for the way things are. People who do not sing are implicitly saying that they are not ‘happy’’ (Seeger, 1979:375).

As testament to the power of communal singing, one feature peculiar to the Suyá tribe and neighbouring Ge-speaking groups, in comparison with South American Indians in general, is that neither alcohol nor hallucinogenic plant drugs were part of their ceremonies, even though they had appropriated tobacco and manioc beer in the past hundred years from outside. Ingested substances would normally be essential euphoria-making elements of a South American ritual, but the Suyá tribe seem to need only collective musical activity, communal feasting, humour and sustained physical exertion to achieve euphoria.

The examples used in this chapter are from traditional societies where singing is commonplace. In other cultures, and more generally, collective singing (often accompanied by dance) is best at binding people together when ‘(1) it is intrinsically pleasurable, and (2) it provides a kind of pleasure not achievable by smaller groups…[e.g.] Practitioners of ecstatic dance rituals in ‘native’ societies attested to the pleasures of their rituals; so can any modern Westerners who have participated in the dances and other rhythmic activities associated with rock concerts, raves, or the current club scene’ (Ehrenreich, 2007:25). Depending on how it is done, singing is often an inherently pleasurable thing to do as a group and that in itself may provide motivation for a community to come together (see Kreutz et al. 2004 for a study on how singing reduces stress hormones).

4.8 The remaking of society through song

Both Turner and Durkheim wanted to find out how societies encourage individuals to conform to their ‘values, norms and deep knowledge of itself’, particularly when these norms thwarted individual needs and desires (Olaveson, 2001:93). Durkheim saw that during the highly emotional social experiences just described ‘a society’s collective ideals are presented or enacted in symbolic form’, and therefore argued that one needs ritual to motivate individuals to obey societal rules (Ibid. 97, 94; see also Wallwork, 1985:201-218).

Durkheim suggested that this ‘motivation’ comes from the collective force of recreative effervescence. Turner also argues, like Durkheim, that ritual can provide ‘a periodic re-statement of the terms in which men of a particular culture must interact if there is to be any kind of coherent social life’ (Ibid. 94, quoting Turner, 1968:6; see also Ch. 1.6). Moore describes how ‘[r]ituals…are cultural representations of fixed social reality, or continuity…By dint of repetition they deny the passage of time, the nature of change, and the implicit extent of potential indeterminacy in social relations…[here] the attempt is made to fix social life, to keep it from slipping into the sea of indeterminacy’ (Moore, 1978:41; quoted in Turner, 1987).

One example of collective articulation of ‘societal rules’ in many societies is chanting. Chanting collective teachings, stories, or shared beliefs together is a very powerful means of establishing collective agreement, and the practice of chanting gives life to these texts. For example, one can observe recreative effervescence in situations where a group collectively articulates moral teachings which can inspire feelings of communality, such as in sacred chant, and which often have temporal depth due to the use of archaic texts (see Ch. 2.5). Indeed, the Suyá ‘maintained that people who heard well also knew, understood, and acted properly’, reflected in the fact that their verbs ‘to hear’ (mba) and ‘to behave morally’ (ani mba) are very close (Seeger, 1987:79). In the Mouse Ceremony, the focus is on one boy and his initiation into the ways, the groups, and the music of the plaza as a man. The boy is taught new skills and ways of behaving and performing by his elders.

As mentioned above in 4.5, the ceremony also reaffirms the relationships between the men and those other men who share their name, their sisters, their relatives, friends, animals, the environment, and the cosmos (Seeger, 1987:2). Unison songs are used to associate performers with one or another name-based plaza group, and solo performances also determine name-set membership (Ibid. 129; see 4.6). The ceremony also allows for the singing together of informal companions, and in so doing establishes, reaffirms and reintensifies their friendship and mutual support (Ibid. 119; see also McNeill, 1995). The re-establishment and re-intensification of the brother and sister relationship is also a significant part of the ceremony, and is particularly interesting given the different modalities of expressing love used by each sibling: the sister provides food and the brother provides song (Ibid. 114).

Seeger describes how the style of Suyán ceremonial singing is appropriate for reaffirming relationships within social structure: ‘The singing, with its combination of individual and collective perspectives, the text with its animal names and first and third person verbs, and the leaping movement are all just the kind of exaggeration or combination that makes relationships an object of reflection’ (Seeger, 1987:117; quoting Turner, 1967:103). These exaggerated articulations and movements are reminiscent of the kinds of exaggerated sounds a mother or caregiver makes when bonding with a baby, and it would follow that exaggerated musical and vocal movements may enhance the bonding between participants in the Mouse Ceremony (Trainor et al., 2000).

Another critical component of remaking the Suyán community through collective ritual was that ‘without collective rituals there might not have been villages at all’ given that Suyá families spent a lot of time away from the village in everyday circumstances; hence, the community is literally ‘re-created’ by ritual. Ceremonial activity was perhaps one of the main reasons for a Suyá Indian individual to spend time in, and acknowledge, the village, and the promise of singing, dancing, and feasting would also be a way to attract surrounding people (e.g. both Suyáns and non-Suyán neighbours) (see also Widdess, 2013). Having said that, in Suyán society, communal singing was ‘not an option, but an obligation, where everyone sings but only a few people speak in public, where song structure and social groups replicate each other and are further reproduced in performance’.

Furthermore, singing and reenacted myth are seen to connect the Suyá not just with their living relatives and friends across space as just described, but also across time with their ancestors, who first taught them their ceremonial songs, and thus the music ‘makes possible a return to and renewal from the sacred past’ (Seeger, 1987:7; see also Menezes Bastos, 1978 & Basso, 1985). It is perhaps of interest, as McLeod (1974:103) has noted, that the conservatism of musical traditions in societies all over the world has puzzled ethnomusicologists for some time. It could be suggested that by retaining faithful repetition of the music of the past one feels connected to one’s ancestors and this has the effect of grounding each individual in a much wider community than the living community (Purce, pers. comm.; see also Howell, 1994; Goodman, 2003).

Moore, Durkheim, and Turner are saying that the re-creation of society can be thought of as a continuous process. This means that rituals serve as particularly powerful junctures within ‘the social process’, not only because they can change or reaffirm social reality in the present, but because this new reality can then be maintained into the future as a consequence of the collective witness serving to reaffirm whatever happens in the ritual (Purce, pers. comm.; see also Lienard & Boyer, 2006:825). In the Mouse Ceremony the collective witnessing of the boy’s transformation from one state of being (boyhood) to another (manhood) means that if the boy shows disbelief or apprehension regarding his new role then all those in the community who witnessed his change can remind him of the change and support him through the transition (Purce, pers. comm.).

4.9 Individuality and collectivity in musical ritual

It was argued in Ch. 1.6 that the primary importance of religious rituals all over the world is to ‘set collectivity in motion’, i.e. to form communities, often through collective sound. But there is a tension between an individual needing to be an individual and a community needing to be more than just a group of individuals. Turner explains how ‘communitas is intrinsically dynamic, never quite being realised…precisely because individuals and collectivities try to impose their cognitive schemata on one another…’ (Turner, 1987:16).

One aspect of both communitas and effervescence is the freeing of individuals from their normal status role (anti-structure), and yet this often occurs in the context of structured ritual in which the actions of the individual are contained and managed in relation to the group (cf. Olaveson, 2001:104; see Turner, 1977:36). Thus it is simplistic to categorise any given social activity as being either collectively or individually motivated, or, alternatively, as representing either ‘structure’ or ‘anti-structure’; instead, these distinctions should always be understood as referring to dialectic social forces present within all social activity.

However, certain kinds of rituals are particularly well placed to manage the flux between these forces because they create conditions for spontaneous action at the individual and collective level within the context of socially-sanctioned, ordered behaviour (see Ch. 1.6). Indeed, because of this, Turner says that ‘performances, particularly dramatic performances, are the manifestations par excellence of human social process’ (Turner, 1987:17). In the context of this chapter, a relevant example in collective musical performance of an interface between the individual and the group is the tacit negotiation of the ‘pulse’ and metrical framework, because although individuals have the freedom to alter the collective pulse, they must conform to this pulse in order to sustain group musical performance.

The association between a collective rhythmic pulse and egalitarianism is an ancient one; for example, Dionysian rituals that were based on democratic principles were associated with ‘rhythmic unity’. According to Ehrenreich (2007:34), ‘Dionysus was an accessible and democratic god, whose thiasos, or sacred band, stood open to the humble as well as the mighty…what [Dionysus] demanded, according to Nietzsche, was nothing less than the human soul, released by ecstatic ritual from the ‘horror of individual existence’ into the ‘mystical Oneness’ of rhythmic unity in the dance.’ In such ecstasy and mad excitement it is almost impossible for the ruling classes or the social elite to retain their social status (Ibid. 44). Indeed, there are many egalitarian rituals (likely to involve music, singing, and dance) that are explicitly about mocking those with power and high status; e.g. Israeli ‘Purim’ which ridicules the rabbis, Roman Saturnalia, Feast of Fools in medieval Christianity, Ecuadorian mocking of agricultural bosses, the Holi festival in Kishan Garhi (Ibid. 89; see also Orloff, 1981:178 & 187).

Another characteristic of ‘egalitarian’ ritual performances is that it is usually not that important whether those who participate are ‘good’ or ‘bad’ at chanting or singing; what is important is that they are singing together. As mentioned in Ch. 2.4, the music is likely to be conducive to participation because it is relatively easy to remember and sing. Therefore, even though chanters may vary in musical competence and certain chanters may ‘lead’ the chanting more than others, everyone in the community is levelled by doing the same thing.

4.9.1 Individuality and collectivity in the Mouse Ceremony

Although all the members of the Suyá community are obliged to attend the Mouse Ceremony and sing together, the ceremony also gives voice to individuals. For example, at certain times during the ceremony every individual has the right to be heard by the community and to express their ‘voice’ through song, revealing emotions such as anger, sadness, or euphoria; normally, direct public spoken confrontation would be rare amongst the men, and public oratory restricted to male elders (Seeger, 1987:130). In this ceremonial context expectations of behaviour are particularly open yet constrained by ritual structure and this creates a space for confrontational singing that is witnessed by the whole community. This means that ‘To a certain extent Suyá could sing who they were, what they would like to be, and how they felt’ (Ibid.). Indeed, according to McLeod, it is so common for song to allow people to sing what they cannot say that ethnomusicologists regard this as a truism (McLeod, 1974:112; see also Chapter 2).

Other egalitarian features of the ceremony are that [i] every Suyán shares in abundant feasts of communally-gathered food, available for all; [ii] the ceremony’s ‘rules’ apply to everyone; and [iii] visitors from outside the community are also welcome (Seeger, 1987:105). The fact that the ceremony is seen as ‘beautiful’, and that true euphoria is reached only when everyone is participating, demonstrates the fundamental importance of collectivity to the Suyá.

A musical example from the Mouse Ceremony (see quote below, Ibid. 18) shows how it is possible to act out the tension between the individual and collective modes in performance:

Then all of them [the Suyán men and boys] begin to sing the first half of their shout songs at once. The sound is a cacophony…each singer falls silent after their verse. Following a moment’s silence, starting at the back of the line with the smallest boys, each singer [in turn] sings the verse of the first half of their shout song. After the man at the front of the line ends his verse, cacophony returns for a moment…then each sings the full verse of the second half of his shout song, except for the young boys, whose songs have only one ‘half’ or part, and one adolescent who forgot the second half of his and repeated the first half. When the singers have finished their solos, they sing simultaneously again….

This quote highlights musical features that Turino (2008) argues are common to what he calls participatory performance: the collective ‘cacophony’ is reminiscent of the ‘cloaking function’, and the individual ‘solo’ section, where each singer sings their song in turn, is reminiscent of ‘sequential participation’. In track 5 of Appendix 3, you can hear the cacophony start again after the individual solos, and at this transition moment one can hear a ‘feathered beginning’ in which one young man starts singing metrically, followed by another, followed quickly by a few more men, and, shortly after that, many young boys (see Ch. 1.5.1 for discussion of features of participatory performance).

Thus, within the ostensibly ‘collective’ sections of cacophony each person is able to sing their own individual song, and in the ‘individual’ section everyone is singing their own unique song as part of the whole sequence of solos that together make up the community as a whole; and therefore the needs of both the individual and collective are met in each section. This same integration of two extremes is shown on the final night, when men sing akia (individual) alternately with ngére (collective). Seeger describes how, ‘after singing akia for several hours in the plaza the men regroup and go marching into each house. As they march in they sing their akia. At the end of his strophe each man falls silent until only the unison shaking of the rattles can be heard. Then they sing the ngére. As soon as the ngére is over each starts up on his own akia again and they all rush toward the door and charge out’ (Seeger, 1979:387).

More generally, an akia song marks a Suyán male’s ‘participation, strength, feelings, and individual existence…Through his singing he can reveal his attitudes about himself as well. For example, two men of the same age may sing differently—one singing in the style of an older man (starting at a lower pitch and forcing his voice less) thus stressing his seniority, while the other sings in the style of a younger man by forcing his voice to the fullest, thus stressing his strength and youth…When the Suyá hear each other singing akia they know not only about the general situation, but also how a particular man feels about something. Suyá akia are one of the ways Suyá men can say something publicly about themselves.’ (Ibid. 384).

Although the participatory nature of Suyá ceremonies has been emphasised in this chapter so far, these ceremonies do have a few presentational features. Most significantly, in many Suyá ceremonies men are the performers and women the audience (Seeger, 1987:75). Also, in a performance of the rainy season ngére song, Seeger recalls how there was segregation by age, where ‘the older, more knowledgeable, and prestigious men sat in the back, and the younger men sat in the front’ (Ibid. 96). In this song, although there is no official ‘conductor’, the ‘ritual specialist’ held the rattle and led the singing, and the rest of the singers were arranged according to how ‘expert’ they were (Ibid. 74, 97). Performing ‘correctly’ was also important and when the unison was broken by people making mistakes ‘there would be comments of consternation and the performance was not considered a good one’ (Ibid. 97). Furthermore, the style of singing presented Seeger with vocal challenges, particularly regarding the strength of his voice; indeed, the Suyá themselves said that ‘you had to be tough to do it well’ (Ibid.). These various ‘presentational’ elements can be individualising and thus may work against feelings of being part of a collective; however, on balance, Suyán singing is more participatory than presentational because all the men in the community are expected to join in the singing.

In exploring how the tension between individual and collective elements of the Mouse Ceremony is managed, my exclusive focus has so far been on the music, but music also interacts with space; for example, the ‘individual’ element can be mapped onto the shout songs of the plaza and the spaces in front of the houses, and the ‘collective’ element onto the unison songs inside the houses (Ibid. 121). Similarly, music and space interact when a man is initiated into the men’s house, never to return to his natal home, because ‘the spatial remove [is] established in the ceremony in which he is singing’. After that, he cannot hug his sister/s, or eat with them, and is ashamed to enter his natal house, but he is allowed to sing at a distance to them which means that he can communicate with his sisters without reneging on the new social contract (Seeger, 1979:384). In a parallel and geographically close but distinct society, the music and dance of Kalapalo rituals also ‘[unite] discrete places, dissolving the difference between autonomous houses, and uniting the residents into an undifferentiated whole’ (Basso, 1981:289).

It is difficult to conceive of what the middle ground between the modes of individuality and collectivity may be. However, the discussion above shows how a ceremony like the Mouse Ceremony can serve to keep both forces in balanced tension so that there is no undue imposition of the value of one mode over the other; i.e. there is a balance of tension between the self in collective (e.g. individual expression within collective sound) and the collective of selves (e.g. the whole sequence of individual solos).

4.10 Summary

Song is often used in contexts of social uncertainty, such as any situation where the community is undergoing change: a boy becoming a man, someone being born, dying, or getting married. As Seeger (1987:52) said, ‘Among the Suyá, where there was metamorphosis, there was song’. McLeod (1974:113) adds:

‘what music symbolises is an altered state of consciousness, be it a transition from one status to another, the adoption of a ritual attitude, or the acting out of personal or social impotence in the face of tensions implicit in the social structure. In all cases, music is directed at areas regarded as uncertain…music tends to occur at points of conflict, uncertainty, or stress within the social fabric. Its function may also be viewed as cyclical; it tends to damp down anxiety and irritation, but does not permanently alter the situation.’

In the examples of the ngére and akia communal singing, we have seen that a collectively-shared musical pulse can create conditions required for stable interaction within the bounds of ritual, and how this stability can help to manage a socially uncertain situation: a boy’s rite of passage to become a man (Cross, 2012). In the case of the Suyán Mouse Ceremony, the explicit aim of the ngére song genre is to ‘sound as one’ and, through being in unison, to establish groups of men that are formed only in a ceremonial context, which include people that would otherwise not get on. Music’s capacity to entrain the actions of a large group of people can intensify a general feeling of ‘oneness’, which in ritual can underscore the community’s ability to reaffirm collectively-witnessed change—e.g. a boy becoming a man—once people return to their everyday lives.

Turner and Durkheim’s concepts of communitas and effervescence are both associated with this general feeling of oneness, and are both often associated with participatory music-making. Furthermore, both authors explicitly make the link between rhythm and effervescence and communitas. Musical activity can function to create communitas because everyone, no matter what social status they have, is equal in the sense that they have to conform in some way to the fundamental pulse, as evidenced in the communal singing of Suyán ngére and akia songs (although, as expressed in Chs. 3.2.1 & 8, a fundamental pulse may not be necessary for participants to entrain with each other). Thus, singing within a collective sound allows an individual to simultaneously express their own personal sound and be heard by others, as well as be influenced by the sound of the collective.

In the case of the Mouse Ceremony of the Suyá Indians, the opportunity for individual expression through song combined with the feeling of oneness that must arise from such a long duration of communal singing annually ‘re-makes’ the community. The ceremony re-affirms the relationship structure between community members, whilst also integrating changes to this structure caused, for example, by the boy’s initiation. Singing offers an opportunity that speech does not for individual members of the community to safely express how they really feel; this is common in indigenous societies throughout the world. The vigorous singing, dancing and euphoria are thus the revitalising force behind Suyán social structure, and make compulsory participation in the Mouse Ceremony desirable, all without the need for ingesting plant substances.

In stark contrast to the ‘positive’ benefits of communal singing, we have also seen how it can be used for violence. Music thus has the energising potential to inject new life into, and even create, shared feelings that lead to both ‘positive’ and ‘negative’ effects. Musical performance also enables communities to practise coordinating ‘social moves’ (both peaceful and violent) before they use these moves for real.

In keeping with the central message of this chapter, that musical activity is a singularly powerful force within human community, I conclude with a quote from Basso (1981:291):

‘It is our own twentieth-century understanding of music—that it is essentially a matter of entertainment—that does injustice to the theoretical power of our anthropological understanding of religious symbolism, and contributes, of course, to our misunderstanding of other people's musicality. It also prevents us from clearly seeing the possibility of a musical religion.’

In this chapter I have explored the role that entrainment plays in social and cultural process. In Ch. 5 I will switch from an anthropological perspective towards exploring how the process of group entrainment works from an inter-disciplinary scientific perspective.

Chapter 5 - The Unity of the Community: more than the sum of its parts?

5.1 Introduction

In this chapter I shift away from situating collective singing within anthropological research towards examining how the process of group synchronisation works from a scientific perspective. Both anthropological and scientific approaches towards understanding group dynamics provide insights at different levels of explanation that can inform each other. The principal question in this chapter is how top-down and bottom-up explanations can be argued to operate within the dynamic process of group entrainment. Most of the theoretical basis of understanding this dynamic process comes from systems theory (also known as ‘complex systems theory’, ‘complexity theory’, ‘complexity science’, or ‘systems biology’) in the form of Artificial Life and Autopoiesis. Research on real-life collective animal behaviour (e.g. flocks of birds and schools of fish) and (human) jazz improvisation will also be used to illustrate group entrainment in the context of complexity theory.

One can think about musical group entrainment both at the level of the group (the system) and the individual. It will be argued that the tension between group and individual in musical entrainment is analogous to the way that the ‘top-down’ collective agreement of a stable pulse (or ‘tactus’) interplays with ‘bottom-up’ individual perturbations of timing through error or improvisation (see Ch. 3.2 for definition of pulse). Complex systems theory describes the dynamics of how this collective agreement might emerge out of the interaction between individuals and the ‘system’ of the pulse.

The distinctions drawn here between top-down versus bottom-up, and between system versus individual, share common ground with distinctions discussed in previous chapters; e.g. regular versus free rhythm in Ch. 3.2, stability versus instability in Ch. 4.2.3, and collective versus individual in Ch. 4.9. Indeed, the stability versus instability dynamic is also evident in musical interaction in the way that the actions of the group as a holistic entity (the system) tend to be more robust than the actions of an individual, who through creativity or error may tend to destabilise the actions of the group.

Just as there is always tension between structure and chaos, so there is always tension between any dualistic conceptions of a phenomenon, in this case top-down versus bottom-up processes in group entrainment. I will not argue for the dominance of one form of organisation over the other, nor provide an (ultimately desirable) integrated synthesis of both upward and downward causation. My aim is more modest than that: to clarify the ways in which both forms of organisation operate.

5.2 Group Interaction

In thinking about phenomena such as society, culture, and the individual, investigation into human nature has, until quite recently, focused on concerns such as the nature of human brains, the structure of cognition, the origins of language and its innate structures, and the high levels of social cooperation shown by our species (Levinson, 2006:39). Levinson argues that our largely exclusive focus on these aspects of human reality has meant that the very nature of everyday human interaction has been largely overlooked—perhaps the aspect that is most likely to have wide-ranging implications (Ibid.). I would also argue that due to the importance of singing and chanting in many societies around the world, and musical activity in general (see Ch. 2), it is essential for cross-cultural linguistic research to consider the structure of social interaction in both speech and music if we are to understand human interaction in all its fullness (see Ch. 1.3 for discussion of difference between speech and song).

Studying human interaction at the group level is, however, very difficult because studying interaction between just two people is hard enough, let alone unpicking the complexity of group interaction. Mitchell (2012:178) makes this point even more strongly, arguing that ‘[i]t is likely that all the factors contributing to the complete cause of some physical event, say a window breaking when hit by a rock, cannot be represented by any single theory in the syntax of logic or even the language of physics’. If we cannot ever gain a full understanding of a rock breaking a window then gaining an understanding of human social behaviour is even more of a challenge. Furthermore, from an academic perspective, human interaction lies in an ‘interdisciplinary no-man's land: it belongs equally to anthropology, sociology, biology, psychology, and ethology but is owned by none of them. Observations, generalizations and theory have therefore been pulled in different directions, and nothing close to a synthesis has emerged’ (Levinson, 2006:39). Synthesising the findings of these various disciplines would mean that we might be able to understand more about universal aspects of social interaction (Ibid. 40).

In this chapter I move away from analysis of cultural variation towards more prospectively universal features of social organisation. This is because descriptions of cultural variation are rarely grounded in a theory of interaction that has a few central organising principles against which cultural difference can be defined (Ibid. 61). Such a project necessarily simplifies our understanding of unimaginably complex systems, but it serves as a grounding device which we currently lack in our understanding of the way humans interact. The purpose is not to undermine the importance of ethnography but to examine the universal process of social interaction that makes cultural variation possible in the first place (Ibid. 55).

Levinson (Ibid.) argues that cultural variation may not even contradict his ‘universalist’ project, because the evidence suggests that the fundamentals of human face-to-face interaction can be observed in most societies. Sidnell (2001; quoted in Levinson; see also Sidnell, 2007) argues that although ethnography may seem to challenge the search for universals ‘as in Basso’s (1970) account of massively delayed greetings in Apache, or Albert (1972) on turn-taking according to rank in Burundi, or Reisman (1974) on ‘contrapuntal conversation’ in the West Indies, there is reason to believe they are describing something other than the unmarked conversational norm’. Even though aspects of human conversation (such as spacing, posture, gesture, and linguistic form) do show significant cultural variation, the actual fundamental structure of normal everyday human conversation is fairly stable across the world; e.g. the rhythm of who speaks when, or who gives and takes when (Levinson, 2006:46). For example, Stivers et al. (2009) describe how, whilst there is some cultural variation in the timings of turn taking, there are ‘robust universals’ such as the tendency to organise conversational interaction in order to minimise gaps and overlap.

The case of unison singing or chanting is fundamentally different to conversational rhythm because everyone makes a sound at the same time, i.e. there is no turn-taking. Levinson (2006:46) describes how interaction ‘is characterised by expectation of close timing—an action produced in an interactive context (say a hand wave) sets up an expectation for an immediate response’. Interaction between individuals in unison singing is therefore even more immediate. Furthermore, the examples of group singing explored so far all involve many more individuals than those involved in the typical one-on-one conversations that form the basis of the survey by Stivers et al. (though conversation may also involve more than two participants; see Sidnell, 2001, re. Goffman). Group singing therefore requires new ways of thinking about interaction between multiple individuals, which I explore in this chapter.

5.2.1 The top-down vs. bottom-up dynamic in group interaction

One useful distinction in the context of group interaction is the difference between top-down and bottom-up influences. In the case of group interaction, a top-down influence refers to a level of organisation that is distinct from ‘the sum of the parts’—the sum of parts being the aggregation of the actions of the individuals within the group—and which influences those individual actions. A better translation of Aristotle’s famous phrase ‘more than the sum of the parts’ is ‘the whole is over and above its parts, and not just the sum of them all’ (Mitchell, 2012:174); ‘over and above’ will be used in the following discussion as opposed to ‘more than’ in order that it is easier to understand top-down influence in the way that I define it.

Understanding top-down influences as ‘over and above’ the sum of their parts is a reasonable starting point for investigating collective action, given that ‘everywhere we look in nature, at whatever level or scale, we find wholes that are made up of parts that are themselves wholes at a lower level’ (Sheldrake, 2012:50). For example, a ‘whole’ musical performance can contain notes, within harmonies, within phrases, within sections, within pieces etc. The performance as a whole depends on the way all these elements combine and relate to each other. The same is true at other levels of performance, such as the fact that individual musicians are part of groups, in ‘local musics’, in ‘regional musics’, in ‘transregional musics’ etc. (see Ch. 1.4), or that instruments or voices are themselves made up of component physical parts. A community of singers is embedded in its immediate physical environment, which is within a region, country, continent etc. Thus, at whatever level one looks one finds organised systems that exist in nested hierarchies, and therefore it is usually necessary to focus on one level of the hierarchy when investigating phenomena, with an awareness of how that might relate to other levels. For example, in this chapter, I will be focusing on the level of collective behaviour in the form of collective physical movements of birds and fish, or the collective group sound of musical ensembles.

‘The sum of them all’ can refer to what is termed ‘aggregation’ in complexity science. ‘Aggregation’ is defined as ‘a particularly simple kind of compositional relationship between component parts and the whole. The weight of a pile of rocks being the aggregate of the weight of each component rock is an example’ (Mitchell,
2012:174). Therefore, in contrast with top-down explanations, bottom-up explanations of group interaction come from individuals whose actions, when aggregated, make up the collective action of the group. Top-down influences tend to be more stable because a group’s unity depends on the whole group adhering to principles that are ‘above’ any one individual. The point is that although the lower-order actions of the individual are often necessary for the maintenance of higher-order stability in the group, they may also act as destabilising influences.

Bottom-up aggregation is a linear part-whole relationship; however, in more complex system processes as witnessed in social behaviour, the part-whole relationship can also be represented by nonlinear dynamics which is based on ‘dynamical instability in which a physical system could end up in wildly different end states depending on very small differences in its initial state’ (Ibid. 179). Non-linear dynamics depend on feedback loops between individual components, with interactions going both ways: up towards stable structure and down towards instability. It is through these feedback loops that small variations in a system’s initial state are amplified to the extent that a system can end up in wildly different states. Therefore, ‘even if a behaviour, described at a higher level of organisation, is determined by the interactions of entities at a lower level of organisation, if the dynamics are nonlinear, the behaviour will not be predictable’ (Ibid. 180).
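The sensitivity to initial conditions described above can be illustrated with a minimal sketch. The logistic map used here is not taken from Mitchell; it is a standard textbook example of nonlinear dynamics, chosen purely for illustration, and the parameter value r = 4, the starting value 0.2 and the perturbation of one part in a billion are all arbitrary assumptions.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate x -> r * x * (1 - x) and return the list of visited states."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two runs whose initial states differ by a tiny, hypothetical perturbation.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-9)

# Distance between the two runs at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]

print("initial gap:", gaps[0])
print("largest gap within 50 steps:", max(gaps))
```

Although the rule is fully deterministic, the gap between the two runs grows until they bear no resemblance to each other, which is the sense in which behaviour determined at a lower level may still be unpredictable in practice.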

One well-known physical example of a non-linear, but relatively predictable, system is the organising influence of a magnetic field on a set of iron filings. Here, the individual filings are in a relationship with the field, but the field is not reducible to an aggregation of the interactions between the individual filings. As an iron bar is magnetised, individual magnetic domains start to line up in a particular direction, with others following suit, eventually resulting in all domains pointing in the same direction; thus, the field of the magnet as a whole emerges from the fields of the individual domains, and the magnetic field then in turn influences and organises the domains (Sheldrake, 2003:116).

An example of a non-linear system of living organisms in which the individuals move more freely is the way bees assess their colony’s nutritional status. Seeley (1989) describes how in a bee colony ‘forager bees’ fly out of the hive to forage for food. When they return they ‘unload’ their haul to a younger ‘unloading’ bee, which transfers the food to an empty cell. What tells the forager bee whether to leave the hive again and carry on foraging, or to stop, is the relative difficulty of finding an unloading bee, which is dependent on how easy it is for an unloading bee
to find an empty cell, which is dependent on how much nectar has been stored in the hive. The relative difficulty of offloading to an unloading bee tunes the number of foragers to the rate of nectar intake in the hive.

Mitchell (2012:183) relates this tuning process to the dynamics of emergence, an important concept in complexity theory, with the ‘emergent structure’ being the amount of stored nectar. Mitchell argues that ‘the amount of nectar stored in a hive is not a property of any of the individual bees, although it is the sum of the results of their individual behaviour’. The emergent structure at the higher level emerges from the actions of the individual bees and the amount of stored nectar in turn affects the subsequent actions of the individual bees, creating a feedback loop.

The most accessible examples of commonly-used emergent phenomena in the context of physical systems are temperature and pressure. Temperature and pressure are higher-level phenomena that are only meaningful in the context of large ensembles of interacting molecules because an individual molecule possesses neither temperature nor pressure. These phenomena emerge through the multiplicity of interactions between vast numbers of molecules (Sipper, 1995:2).

Mead (1932) first proposed the concept of emergent structure as relating to ‘the spontaneous evolution of structure and meaning’. Emergence is a useful concept for thinking about aspects of social interaction; for example, emergent meaning in conversation, and emergent structure in team interaction. Emergent structure also relates to musical entrainment, and this will be discussed below in 5.6 after I have discussed complex systems theory in its various manifestations, and in the context of collective animal behaviour.

5.3 Luhmann's systems theory

One manifestation of systems thinking, Niklas Luhmann’s ‘systems theory’, is primarily ‘top-down’. Luhmann argued that a theory of social systems does not need to concern itself with the individuals who are part of that system (Gershon, 2005:100; see also Luhmann, 2002). What matters from Luhmann’s perspective is that a system can determine what is system and what is environment, and individuals are merely ‘environment’ from a system’s perspective—their individual agency is irrelevant. In Gershon’s words, ‘people provide [inputs] for a social system to re-frame according to the system’s needs, an activity no different than what any part of the environment
contributes to its system’. An environment does not need a system, but a system needs an environment (Ibid. 100).

Thus, Luhmann’s ideas are of use in understanding something about how individuals act under the influence of overarching systems that to a large extent delimit their behaviour. The fact that Luhmann treats individuals as part of a system’s environment is what distinguishes him from another proponent of sociological systems theory, Talcott Parsons, who tried to integrate both individual agency and the system in one sociological theory (see Parsons, 1991). Interestingly, Durkheim would have agreed with both Luhmann and Parsons when he said: ‘Precisely because society has its own specific nature that is different from our nature as individuals, it pursues ends that are also specifically its own; but because it can achieve those ends only by working through us, it categorically demands our cooperation. Society requires us to make ourselves its servants, forgetful of our own interests.’ (Durkheim, 1995:209). Luhmann argues that a system’s nature is fundamentally different to an individual’s nature; Parsons argues that we need a theory for the interaction between individual and system, because a system only works if the individual conforms to it. Durkheim’s work on recreative effervescence (discussed in Ch. 4.2.2 & 4.8) was an attempt to show on the level of culture how society encourages an individual to conform to a societal system.

Luhmann’s conception of a system is one of reduced complexity in comparison with its far more complex environment. Gershon (2005:102) describes how ‘[a] system is constantly reformulating the noise and chaotic complexities that leave the environment and enter the system into order. But creating order is also always creating a simplification; it is reducing complexity to what is manageable’. A system can only bring order to complexity to a limited degree; if the environment becomes too complex the system will not be able to bring it into order. Related to this is the fact that a system can only select a limited amount of information from its environment; however, this ‘limit’ changes as the system adapts by changing its ability to structure its environment (Ibid.).

A dynamic process, such as real-life social interaction, would not survive if only top-down processes such as collective goals were functioning, because there would be no means of repairing any mistakes that individuals might make (the bottom-up mechanism of interaction between body movements being one such means; see Chs. 5.6.3 & 6.3). However, as Luhmann would seem to have it, top-down primary systems presuppose an ‘over and above’ or ‘God’s-eye’ perspective which basically views local versions (i.e. individuals) of the system as lesser and limited variants of
this more ‘perfect’ overseeing perspective. Thus local versions of the system are, by nature of being within the system, merely to be judged as juxtapositions to the system; i.e. they do not fundamentally challenge the system’s superiority (Gershon, 2005:105). Luhmann’s theory is top-heavy, but the fact that he allows for a system to adapt to its environment by changing its structure means that there is some potential for a bottom-up influence, although it is not clear as to how it would work.

There are examples of collective human behaviour that conform best to Luhmann’s theory where a top-down common goal is dominant. For example, as Canetti (1973:32) says about crowds:

‘A goal outside the individual members and common to all of them drives underground all the private differing goals which are fatal to the crowd as such. Direction is essential for the continuing existence of the crowd…A crowd exists so long as it has an unattained goal.’

In the context of the Haka ritual, Canetti (Ibid. 34; App. 1.26) describes how ‘the tribe feel themselves a crowd. They make use of it whenever they feel a need to be a crowd, or to appear as one in front of others. In the rhythmic perfection it has attained the haka serves this purpose reliably. Thanks to it their unity is never seriously threatened from within’. The top-down ‘common goal’ in the Haka is rhythmic unity, but this description illustrates how certain group activities seem to have the explicit goal of reducing individual perturbation and increasing singularity of purpose and action.

Similarly, a school of fish moves in such ways as to demonstrate a singularity of purpose. Here, large groups of individual fishes swim ‘in tight formations, more or less parallel to each other, changing direction and reversing in near unison’ (Sheldrake, 2003:117; App. 1.36 & 1.37). Furthermore, most species of fish, including herring and mackerel, ‘form schools that have no leaders’ (Ibid.; see also Partridge, 1981). The fact there is no easily decipherable hierarchy within schools of fish suggests that top-down organisation is most dominant (for the collective behaviour of ants and termites, see also Wilson, 1971; von Frisch, 1975; Marais, 1973; Gordon, 1999). We will return to schooling behaviour in fish in 5.5.3.

5.4 Complex systems theory

Complex systems theory attempts to explain the crowd and schooling behaviour in the examples just given, as well as group organisation in general. The term ‘complex system’ refers to ‘a structure whose behaviour cannot be extrapolated from the behaviours of its individual components. A large number of such systems can be found in fields as diverse as particle physics, ecology, economics, neurology, sociology and computer science. Their
behaviour cannot be controlled or designed in a hierarchical way and they evolve unusual characteristics as they interact with and adapt to the environment in which they operate.’ (Worrall, 2004:121).

The fact that a complex system ‘cannot be extrapolated from the behaviours of its individual components’ would suggest that the system has top-down influence. However, the mathematical models associated with complex systems theory model a complex system as a bottom-up synthesis of the network of interactions between individual components. But it is not clear if a complex system acts in a primarily bottom-up or top-down manner. Whilst abnormal actions of individual components may become replicated through the system and force the system to readjust (bottom-up), if the system is sufficiently complex then it is probably able to tolerate quite a bit of individual disturbance before it needs to adapt (top-down), as in Luhmann’s theory. The challenge for mathematical models of complex systems is to be able to determine which disturbances are a threat to the system and which are not. However, the complexity of non-linear dynamics with its multiple feedback loops makes such an analysis difficult.

‘Emergent structure’ is a core concept in complex systems theory (related to ‘systems theory’), and is analogous to Luhmann’s concept of a ‘system’ (see 5.2.1). Emergence’s defining features are ‘novelty, unpredictability and the causal efficacy of emergent properties or structures, sometimes referred to as downward causation’ (Mitchell, 2012:173). Downward causation is the idea that the emergent property or structure can influence the behaviour of lower-order components; i.e. top-down influence. However, although an emergent structure is a high-level structure that has downward causation like Luhmann’s ‘system’, it is a structure that emerges from the complex interactions between individual components, and therefore arises from a ‘bottom-up’ synthesis of the network of local interactions.

It is easier to describe a snap-shot of the emergent structure once it has emerged, as opposed to describing the process by which it emerges. But snap-shots are static, and the process is dynamic, so snap-shots are of limited descriptive or explanatory use. Having said that, writing about dynamic process is very difficult and so a tool such as a snap-shot of a complex system can be very useful if you want a description of a particular moment in the process.

Emergence is also a key concept in another systems theory of complex behaviour, autopoiesis, which refers to a closed system that can create itself (‘auto’ meaning self and ‘poiesis’, creation or production). Autopoietic organisation is defined as ‘a unity by a network of productions of components which (i) participate recursively in the same network of productions of components which produced these components, and (ii) realise the network of productions as a unity in the space in which the components exist’. The central tenet of the philosophy of autopoiesis is that of a bottom-up synthesis: ‘the properties of a unity [i.e. a cell, organism, etc.] cannot be accounted for only through accounting for the properties of its components…the living organisation can only be characterised unambiguously by specifying the network of interactions of components which constitute a living system as a whole, that is, as a “unity”’ (Varela et al. 1974: 187). A ‘unity’ (or ‘system’) is thus either an ‘unanalysable whole’ with properties that define its unity, or a ‘complex system’ of mutual relations between its components through time (not the properties of these components) (Ibid. 187 & 188). In the latter case, however, a ‘unity’ is hard to pin down because it is constantly changing.

‘Artificial Life’ is an applied version of systems theory that relates to the concept of autopoiesis. The ‘artificial’ in Artificial Life (hereinafter ALife) ‘signifies that the systems in question are human-made; that is, the basic components [i.e. computer representations] were not created by nature through evolution’ (Sipper, 1995:1; see also Newtson, 1993). The most important properties of ALife systems, which are ‘large collection[s] of simple, basic units’, are those which emerge at higher levels, i.e. emergent properties (Ibid. 4). ALife is ‘devoted to understanding life by attempting to abstract the fundamental dynamical principles underlying biological phenomena, and recreating these dynamics in other physical media, such as computers, making them accessible to new kinds of experimental manipulation and testing’ (Ibid. 2, quoting Langton). ALife is distinct from traditional artificial intelligence (AI), which is top-down in that ‘complex behaviours (for example, chess playing) are identified and an attempt is made to build a system that presents all the details of said behaviour’ (Ibid. 5).

Like autopoiesis, ALife aims not to define the properties of the individual components of a system, but to define the properties of the network of mutual relations between individual components. ALife is thus synthetic, attempting to construct phenomena from their elemental units, as opposed to analytic, trying to break down complex phenomena into their basic components (which is characteristic of traditional biological research) (Ibid. 2). Therefore, like autopoiesis, ALife is a bottom-up project.

ALife offers opportunities for conducting experiments ‘that are extremely complicated in traditional biology or not feasible at all’ (Ibid. 1). Varela et al. (1974:189) have said their autopoiesis model ‘permits the observation of the
autopoietic organisation at work in a system simpler than any known living system’. However, it remains to be seen whether autopoietic and ALife models can model the full complexity of living systems situated within their ever-changing environment, nested as they are in systems within systems. Even if autopoiesis and ALife were able to model real-life systems in principle, we may never have enough computing power to test this. Furthermore, as we will see in the next section, in light of a few animal collective behaviour studies, the initial assumptions of ‘fundamental dynamical principles’ made by these models may not be correct. Another criticism might be that the computer models used in ALife experiments are generative rather than predictive, with each outcome determined by the initial assumptions.

As Rene Thom (1975) pointed out, the power of mathematical models declines rapidly as systems become more complex:

‘The excellent beginning made by quantum mechanics with the hydrogen atom peters out slowly in the sands of approximations in as much as we move towards more complex situations…This decline in the efficiency of mathematical algorithms accelerates when we go into chemistry. The interactions between two molecules of any degree of complexity evades precise mathematical description…In biology, if we make exceptions of the theory of population and of formal genetics, the use of mathematics is confined to modelling a few local situations (transmission of nerve impulses, blood flow in the arteries, etc.) of slight theoretical interest and limited practical value…The relatively rapid degeneration in possible uses of mathematics when one moves from physics to biology [let alone social psychology] is certainly known among specialists, but there is a reluctance to reveal it to the public at large…[T]he feeling of security given by the reductionist approach is in fact illusory’.

Various versions of systems theory (ALife, Autopoiesis etc.) have been attacked from within the sciences as anti-reductionist because they focus on the ways in which the mass behaviour of collections of simple-attribute units cannot be explained by reference to the attributes alone, only to their aggregation. Systems approaches attempt to be holistic, and also take into account the fact that complex systems need a certain degree of noise (i.e. internal and external perturbations) throughout the system for order to emerge (e.g. the ‘mean field theory’ of Toner & Tu, 1998:4854). However, I would argue that whilst complex systems theory is presented as holistic instead of reductionist, one must ask whether modelling the web of known local interactions based on minimalist assumptions is possibly just another reductionist endeavour, cloaked in holistic language. Any model starts with assumptions that may be wrong. These assumptions can generate hugely complex aggregations of inter-relationships but if the assumptions and algorithms themselves reduce
individual behaviour to its barest essentials, then complex systems theory is still reductionist.

Having said that, systems theory does provide a useful framework for understanding complex social behaviour. For example, Worrall (2004:121) defines complex systems as displaying self-organisation, emergent behaviour, and as adaptable and robust even in the context of an ever-changing environment. These characteristics can certainly be attributed to group behaviour, and also the phenomenon of entrainment (see Ch. 1.2). A recent multidisciplinary overview of social behaviour, edited by Moore, Szekely, & Komdeur (2010), concluded that in order to understand social behaviour researchers needed to adopt systems biology as the primary approach.

Moore et al. (2010:544) have found that ‘when researchers dissect [animal] behaviour they reveal unexpectedly complex networks of mechanisms’. Blackwell & Young (2004:123) also note how ‘[m]any animals exhibit remarkable collective behaviour. Social insects gather in large numbers—swarms—to forage and build nests. The ability of flocking birds to coordinate their motion in order to avoid obstacles and to rapidly change direction of flight is well known to us all’. Complex systems theory has therefore been used ‘in contemporary computer animation practice for the generation of such visual effects as fire, ocean waves, as well as schooling, flocking, herding and swarming’ (Worrall, 2004:121; see also Reynolds 1987; Bonabeau et al., 1999). As advocates of complex systems theory and the bottom-up synthetic approach, Blackwell & Young (2004:123) argue that ‘this collective behaviour does not necessarily derive from central organisational control or leadership, but arises from the local behaviour and interaction of (relatively) simple organisms…each swarm member is only aware of other members in its immediate neighbourhood. A dramatic example is to be found with the huge shoals of migrating herring, sometimes up to seventeen miles long and with millions of members; it is hard to conceive of any centralised method of communication that can account for this collective behaviour (Reynolds, 1987)’.

In a small flock of ten pigeons it has been found that the flock’s movements are hierarchically organised (Nagy et al., 2010). However, a response governed by leaders or hierarchical structure in flocking and schooling behaviours in large groups of thousands or millions of individuals would not make sense, because, in the context of predatory attack, unless the leader or leaders happen to be right next to the predator there would be no global reaction of the group and the flock or
school’s safety would be threatened (Cavagna et al. 2010a:1; e.g. App. 1.37 & 1.38). So how do they do it?

5.5 Complex systems and collective animal behaviour

Complex systems theory is a project which aims to be able to simulate ‘real-life’ natural behaviour by artificial means. Sheldrake (2003:113) notes how, in contrast to the numerous attempts to simulate flocking, swarming, and schooling behaviour on computers, there have been ‘surprisingly few studies of the detailed behaviour of [real-life] flocks of birds’. This is especially striking given the observation of Cavagna et al. (2010a:1) that ‘Of all distinctive traits of collective animal behaviour the most conspicuous is the emergence of global order, namely the fact that all individuals within the group synchronise to some extent their behavioural state’.

The best known of the computer models that attempt to model coordinated animal motion is Craig Reynolds’ boids model, developed in the 1980s, which demonstrated ‘the basic architecture of ALife systems—a large number of elemental units [i.e. birds], relatively simple, interacting with a small number of nearby neighbours, with no central controller…[and] [h]igh-level, emergent phenomena resulting from these low-level interactions are observed’ (Sipper, 1995:7). Blackwell & Young (2004:123) also claim that Reynolds demonstrated that the behaviour of flocks, schools and swarms can arise ‘merely from local interactions between the entities…[t]here is no need for global coordination.’ They also note that ‘a common theme of [Reynolds’ and Bonabeau et al.’s] investigations is the desirability of decentralised organisation [i.e. power does not reside in one location within the flock, school etc.], from the perspective of stability and adaptability.’ (Ibid.).

Sheldrake (2003:113) explains how the two-dimensional boids model, based on complex systems theory, starts from individual boids, which are programmed to behave according to three simple rules:

1. Steer to avoid being too close to neighbours.
2. Steer towards the average direction that neighbours are heading in.
3. Steer to move towards the average position of neighbours.
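The three rules listed above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than a reproduction of Reynolds' original program: the function name `step`, the neighbourhood radius, the 'too close' distance and the three steering weights are all hypothetical values chosen for the example.

```python
import math
import random


def step(boids, radius=10.0, too_close=2.0,
         w_sep=0.05, w_align=0.05, w_coh=0.01, dt=1.0):
    """Advance a list of (x, y, vx, vy) tuples by one time step.

    Each boid reacts only to neighbours within `radius` (local interaction);
    all parameter values are illustrative assumptions.
    """
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        neighbours = [b for j, b in enumerate(boids)
                      if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        ax = ay = 0.0
        if neighbours:
            n = len(neighbours)
            # Rule 1: steer away from neighbours that are too close.
            for nx, ny, _, _ in neighbours:
                if math.hypot(nx - x, ny - y) < too_close:
                    ax += w_sep * (x - nx)
                    ay += w_sep * (y - ny)
            # Rule 2: steer towards the average heading of neighbours.
            ax += w_align * (sum(b[2] for b in neighbours) / n - vx)
            ay += w_align * (sum(b[3] for b in neighbours) / n - vy)
            # Rule 3: steer towards the average position of neighbours.
            ax += w_coh * (sum(b[0] for b in neighbours) / n - x)
            ay += w_coh * (sum(b[1] for b in neighbours) / n - y)
        vx, vy = vx + ax * dt, vy + ay * dt
        updated.append((x + vx * dt, y + vy * dt, vx, vy))
    return updated


# Usage: a small random flock, with an arbitrary seed so the run is repeatable.
random.seed(0)
flock = [(random.uniform(0, 20), random.uniform(0, 20),
          random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(20)]
for _ in range(100):
    flock = step(flock)
```

Nothing in the sketch refers to the flock as a whole: every boid consults only its own neighbours, which is the sense in which any global pattern produced by such a model is a bottom-up synthesis.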

However, Sheldrake argues that while this model, based on local interactions only, allows a computer screen to convincingly imitate collective animal motion (e.g. in films like
‘Lion King’ and ‘Batman Returns’), ‘it bears little relation to the behaviour of real, three-dimensional flocks of birds’, for example (Ibid. 114).

5.5.1 Potts' 'Chorus Line Hypothesis'

Wayne Potts (1984) performed real-life biological research on the banking movements of large flocks of dunlins. By analysing films of dunlin flock movements he found that ‘a single bird may initiate a manoeuvre which spreads through the flock in a wave. The propagation of this ‘manoeuvre wave’ begins relatively slowly but reaches mean speeds three times higher than would be possible if birds were simply reacting to their immediate neighbours. These propagation speeds appear to be achieved in much the same way as they are in a human chorus line: individuals observe the approaching manoeuvre wave and time their own execution to coincide with its arrival’ (Potts, 1984:345; App. 1.39). Potts argues that dunlin flock coordination is achieved through visual communication, leading to the chorus line hypothesis that the neighbours that follow the initiating bird ‘will be delayed by at least their own reaction time but, further away, response times should fall as birds are able to estimate the arrival of the approaching manoeuvre wave’ (similar to that observed in the ‘Mexican Wave’ phenomenon). This fits with his research on human chorus lines which indicates that ‘rehearsed manoeuvres, initiated without warning, propagate from person to person approximately twice as fast (107.7 +/- 6.8ms, n=3) as the 194ms human visual reaction time’ (Ibid. my italics; see Teichner, 1954).

In line with his theory, Potts found that, in those dunlin films in which initiators of banking movements and their neighbours were discernible, the movements ‘were initiated by one or a few individuals (one initiator, n=9; two, n=3; three, n=2; >three, n=0; where n = no. of banking movements)’. Potts observed that the waves that ‘radiated’ outwards from the initiating bird travelled along every major axis, even from back to front, suggesting that any region of the flock could initiate a manoeuvre. The mean propagation time of the wave from neighbour to neighbour was 14.6ms (+/- 6.7 ms; n=9), which was lower than the laboratory measured startle reaction time of 38.3ms (+/- 3.1 ms; n=110). The reaction time of the first neighbours to respond to the initiating bird was 67ms (+/- 24 ms; n=14). Potts’ conclusion was that the mean manoeuvre waves ‘travel at speeds nearly three times faster than possible if flocks are following the actions of adjacent neighbours’ (Ibid.). This conclusion would seem at first glance to contrast with the assumption of Reynolds’ boids model that flocking behaviour is based solely on neighbour-to-neighbour interaction. However, a slower initiation speed (67ms) which accelerates to high propagation speeds (14.6ms) is consistent with the chorus line hypothesis as defined
above, because it means that these ‘early birds’ did not have the advantage that later birds had of anticipating from a greater distance the arrival of an already-established manoeuvre wave. Potts also did not observe any unison manoeuvres, which also supports the chorus line hypothesis, and would suggest that flocking behaviour of dunlins arises from bottom-up organisation. However, more mysteriously, ‘no preliminary movements which might signal that a turn is imminent were seen, although such movements should be visible on film if they are to be visible to what is often thousands of tightly packed flock members’ (Ibid.).
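The ratios behind the phrases ‘nearly three times faster’ and ‘approximately twice as fast’ can be checked directly from the mean timings quoted above. The sketch below simply divides the reported means; it is an illustrative check and not necessarily the calculation Potts himself performed, and the variable names are my own.

```python
# Mean timings in milliseconds, as quoted from Potts (1984) in the text above.
neighbour_wave_ms = 14.6    # dunlin: wave propagation, neighbour to neighbour
startle_ms = 38.3           # dunlin: laboratory startle reaction time
first_response_ms = 67.0    # dunlin: first neighbours responding to initiator
chorus_wave_ms = 107.7      # human chorus line: person-to-person propagation
human_visual_ms = 194.0     # human visual reaction time

# How much faster the wave travels than a chain of pure reactions would allow.
dunlin_ratio = startle_ms / neighbour_wave_ms
chorus_ratio = human_visual_ms / chorus_wave_ms

# How much slower the initial response is than the established wave.
first_vs_wave = first_response_ms / neighbour_wave_ms

print(f"dunlin wave vs startle reaction: {dunlin_ratio:.2f}x")
print(f"chorus line vs visual reaction:  {chorus_ratio:.2f}x")
print(f"first response vs wave:          {first_vs_wave:.2f}x")
```

On these means the dunlin wave is roughly 2.6 times faster than the startle reaction and the chorus line roughly 1.8 times faster than the visual reaction time, while the first neighbours take about 4.6 times longer to respond than birds within the established wave, which is the slow-start, fast-propagation pattern that the chorus line hypothesis predicts.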

5.5.2 Criticism of current explanations of flocking behaviour

Sheldrake argues that Potts’ assumption that the birds exclusively employ the visual channel to coordinate the movement of the flock ‘would entail practically continuous, unblinking, 360-degree visual attention. Even assuming total, continuous attention, how could this work when birds were reacting to waves approaching from behind [which are common in dunlin flocks, which are not V-formation flocks]? No birds have 360-degree vision, whether they have their eyes at the front, like owls, or at the side of their head like geese, dunlins and starlings’ (Sheldrake, 2003:114-15). For example, starlings have ‘lateral visual axes and a blind rear sector’ (Ballerini et al. 2008b:213; see Martin, 1986). It is also difficult to imagine how sonic communication between individual chirping birds could explain this phenomenon, given that individual chirps would be hard to distinguish in the context of the chatter of thousands of birds.

Sheldrake’s second criticism of Potts is that banking manoeuvres are far more complex in their quantitative details than well-rehearsed standard human chorus line manoeuvres; i.e. the dunlins have to sense exactly how to turn as well as sensing the advancing wave in order to change the overall pattern of flight within a densely-packed flock without bumping into each other, and this would involve coordinating the speed, angle, and duration of any turning movement (Sheldrake, 2011:362). Moreover, what is particularly intriguing about banking movements is that this precise and complex organisation happens faster than a startle reaction, such as a response to a sudden flash of light; this is extraordinary given that startle responses are non-specific in their directional movement (Ibid.).

Indeed, the implications of flocking behaviour may get even more remarkable in light of further observations of flocks of starlings near Rome by Cavagna et al. (2010a), who measured the velocity fluctuations of each individual bird and determined to what extent they were correlated with other birds (see App. 1.40). They found that ‘every bird in the flock was influenced by every other bird, however large the flock’. The birds showed ‘scale-free correlations’, meaning that ‘the group cannot be divided into independent subparts, because the behavioural change of one individual influences, and is influenced by, the behavioural change of all other individuals in the group. Scale-free correlations imply that the group is, in a strict sense, different from and more than the sum of its parts…The effective perception range of each individual is as large as the entire group and it becomes possible to transfer undamped information to all animals, no matter their distance, making the group respond as one.’ (Ibid. 2).
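The kind of measurement described above can be made concrete with a short sketch. It assumes, as a simplification, that a bird’s velocity fluctuation is its velocity minus the mean velocity of the flock, and that the correlation at a given separation is the average dot product of the fluctuations of all pairs of birds roughly that far apart. The function names, the distance-band parameters and the toy data are hypothetical; this is an illustration of the idea, not the analysis code used by Cavagna et al.

```python
import math


def velocity_fluctuations(velocities):
    """Subtract the flock's mean velocity from each bird's velocity."""
    n = len(velocities)
    dims = len(velocities[0])
    mean = [sum(v[d] for v in velocities) / n for d in range(dims)]
    return [tuple(v[d] - mean[d] for d in range(dims)) for v in velocities]


def fluctuation_correlation(positions, velocities, r_min, r_max):
    """Average dot product of velocity fluctuations over all pairs of
    birds whose separation lies in the band [r_min, r_max).

    Returns None when no pair falls inside the band.
    """
    u = velocity_fluctuations(velocities)
    total, count = 0.0, 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if r_min <= math.dist(positions[i], positions[j]) < r_max:
                total += sum(a * b for a, b in zip(u[i], u[j]))
                count += 1
    return total / count if count else None


# Toy flock (hypothetical data): four birds in a line; the left pair
# drifts upwards and the right pair downwards relative to the flock.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
velocities = [(1.0, 1.0), (1.0, 1.0), (1.0, -1.0), (1.0, -1.0)]
near = fluctuation_correlation(positions, velocities, 0.5, 1.5)
far = fluctuation_correlation(positions, velocities, 2.5, 3.5)
```

In this framing, correlations would count as ‘scale-free’ if the separation at which the correlation falls to zero grew in proportion to the size of the flock instead of staying at a fixed distance.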

The implications of these findings are hard to accommodate within complex systems computer models, which are often based on the assumption that any emergence arises from local interactions, because scale-free correlations are non-local. Another finding by Ballerini et al. (2008b:210) was that at different sizes ‘flocks seemed to have a characteristic shape, being thin in the direction of gravity and more extended perpendicular to it’. The authors found that the flock was organised in a way that maintained its proportions, which poses some interesting questions, such as ‘Is it the individuals themselves or some external stimulus that keeps the group’s proportions constant?’ and ‘If the individual birds are responsible, how do they achieve this, starting from a purely local perception of the aggregation?’. Cavagna et al.’s and Ballerini et al.’s findings led Cavagna et al. (2010a) to invoke the concept of the ‘collective mind’: ‘Our empirical results, together with further study on the role of criticality in animal groups, may contribute to move the fascinating ‘collective mind’ metaphor to a more quantitative level’ (see also Couzin, 2007 & 2009; McDougall, 1920). The importance of the ‘collective mind’ that is implied by Cavagna et al. would suggest that the flocking behaviour in birds displays top-down organisation in the ‘over and above the sum of its parts’ sense. In other words, the birds are being influenced by a higher entity—i.e. the flock as an entity in its own right.

In a different study, taking into account other factors in addition to velocity, Cavagna et al. (2010b; see also Ballerini et al., 2008a) found that ‘the interaction at play between starlings during flocking has a topological nature, each bird coordinating with a fixed number of interacting neighbours during motion, approximately seven, irrespective of their distances’. Although this means that interaction does not depend on metric distance, in fact birds have an ‘exclusion zone’ around them that is comparable to the average wing span of an individual which means they do not collide. Therefore interaction is topological, but becomes metric if birds get too close to each other (Ballerini et al. 2008b:213). The finding also means that for parameters apart from velocity, correlation is not scale-free. Even so, Cavagna et al.’s collective mind hypothesis is still an option given that we still do not know which channel of communication the starlings are using, taking into account Sheldrake’s point that an explanation by the visual channel only would require unblinking 360-degree vision given the complexity, spontaneity and speed of the movements.
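The distinction between topological and metric interaction can be illustrated with a minimal sketch. It assumes that a ‘metric’ rule selects every bird within a fixed radius, whereas a ‘topological’ rule selects a fixed number of nearest birds whatever their distance. The function names, the radius and the toy coordinates are hypothetical, and the default of seven neighbours simply follows the figure quoted above.

```python
import math


def metric_neighbours(positions, i, radius):
    """Indices of all birds lying within a fixed distance of bird i."""
    return [j for j, p in enumerate(positions)
            if j != i and math.dist(positions[i], p) <= radius]


def topological_neighbours(positions, i, k=7):
    """Indices of the k birds nearest to bird i, whatever their distance."""
    others = [j for j in range(len(positions)) if j != i]
    others.sort(key=lambda j: math.dist(positions[i], positions[j]))
    return others[:k]


# The same toy flock at two densities (hypothetical coordinates).
dense = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
sparse = [(5.0 * x, 5.0 * y) for x, y in dense]
```

With these toy data the topological rule returns the same two neighbours for bird 0 in both flocks when `k=2`, whereas the metric rule with `radius=2.5` finds two neighbours in the dense flock and none in the sparse one. This is the sense in which a topological interaction is independent of the density of the flock.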

One of the interesting developments in computer modelling of flock behaviour is that several models have now been proposed that attempt to improve upon the boid-type models by treating a flock as a field (e.g. magnetic or gravitational) (Sheldrake, 2003:116; see also section 5.2.1, Vicsek, 2012, and for field theory in social science, Lewin, 1952). Similarly, other models of ‘flock behaviour’ are based on analogies with the flow of fluids, and when physicists model flow behaviour, ‘they do not start with individual atoms or molecules, but rather with the fluid as a whole’ (Ibid.).

5.5.3 Schooling behaviour in fish

As previously mentioned in 5.3, the ‘schooling’ behaviour of fish also shows interesting properties. It is clear that a fish school, at least visually, could be said to resemble a large composite organism, often with millions of individuals ‘wheeling and reversing in near unison’ (Sheldrake, 2011:357; App. 1.36). As Niwa (1994:123) describes: ‘One striking feature of a school of fish is its polarisation [cf. magnets], i.e. the parallel arrangement of the members. The distance between individuals is uniform and the motion of individual fish is synchronised (Hunter, 1966; van Olst & Hunter, 1970). The tendency of the fish to remain at the preferred distance serves to maintain the structure’.

Indeed, the fish have no leaders, i.e. ‘speed and heading are not closely related to those of any other single fish. The strong correlations are observed between the velocity of the individual and average velocity of the entire school…Thus, in a sense, the entire school is the leader and an individual is a follower. This raises the question of self-organisation (Haken, 1983)’ (Niwa, 1994:123). Thus, Niwa and Haken suggest that schooling is non-hierarchical, and the ‘school as leader, individual as follower’ observation would suggest a dominant top-down influence.

Vicsek (2012) describes how from one second to another a disorganised shoal can become a disciplined school in situations such as ‘avoiding a predator, resting, feeding or travelling’ (see also Moyle & Cech, 2003). The most interesting example of one of these rapid shifts is the so-called ‘flash expansion’, in which ‘each fish simultaneously darts away from the centre of the school as the group is attacked [by a predator]’ (Sheldrake, 2003:117; App. 1.37). The complete expansion may take only 20 milliseconds, with the fish accelerating to a speed of 10 to 20 body-lengths per second within this short time (Ibid.). The most extraordinary fact in all of this is that they do not collide, because this means that ‘[n]ot only does each fish know in advance where it will swim, if attacked, but it must also know where each of its neighbours will swim [like the dunlins, see 5.5.1]’ (Ibid.). Such behaviour is difficult to explain ‘in terms of sensory information from neighbouring fish because it happens far too fast for nerve impulses to move from the eye to the brain and then from the brain to the muscles’ (Ibid.). This is all made even more extraordinary by the fact that schools of fish still swim at night in pitch-black water, and in laboratory experiments fish have still schooled normally even when fitted with opaque contact lenses to blind them temporarily, and therefore vision is not essential. Neither is the detection of pressure changes in the water essential, as demonstrated in an experiment where the fishes’ pressure-sensitive organs (the lateral lines which run along their length) were lacerated, and yet they could still swim as a school in the normal way (Ibid.; see Partridge, 1981). And although the fish schooled abnormally if both vision and lateral lines were eliminated simultaneously, the distress caused to the fish by such measures may have been the reason for the altered behaviour (Partridge, 1981; Sheldrake, pers. comm.). Nor is sonic communication between individual fish via displacement of water caused by their movements likely to be an explanation, given that, as with the dunlins, individual sound signals would be hard to distinguish in the context of thousands or millions of fish.

What is clear from all of this is that the ‘bottom-up’ approach starting from individuals and their neighbours needs to be complemented by a ‘top-down’ model of the group as a whole. Indeed, Makris et al. (2009) argue that the collective rapid shifts they observed, involving hundreds of millions of individuals in oceanic shoals, are ‘indicative of the advantage the group has over the isolated individual in transferring information over great distances [e.g. 40km]’. It would seem that the empirical research on real-life flocking and schooling behaviour suggests a strong ‘over and above the sum of the parts’ form of downward causation on individual birds or fish. Indeed, it would make sense to have an ordered, collective form of organisation, rather than one based on local interactions in the context of such vast numbers of fish or birds performing complex coordinated movements (cf. Luhmann’s description of a central system reducing the complexity and noise of its individual members and its environment).

5.6 Top-down vs. bottom-up processes in musical group interaction

The perspective of complex systems can also be applied to the process of group musical entrainment. Entrainment is organised by a pulse (or ‘beat’) that functions as an ‘emergent property’ of the performance. Of course, metre, as a hierarchy of pulses, can be constructed individually around this pulse, and on occasion the level of coordination is such that the term metre is appropriate. However, I will refer to a shared pulse throughout this chapter because it applies over a wider variety of cases.

The pulse can be said to have a downward causation if we assume that the organisation of mutual interactions between individuals in music-making depends on a shared focus to synchronise with the pulse. For example, if performers attempt to join a performance that is already in motion, then they need to entrain in the same way as the whole group is entraining to the pulse; i.e. the pulse is ‘over and above’ the individual performers. Other aspects of the form of the musical performance can change (e.g. the melody, who is participating), but in most cases a performance can only be described as ‘together’ if the group of performers align themselves with a shared pulse. In a Luhmannian sense the pulse can be thought of as a ‘system’ that reduces the complexity of timing interactions in musical process.

However, due to the non-linear feedback interaction between top-down and bottom-up organisation, it is not obvious whether a pulse arises from negotiation between the individual performers (bottom-up), or whether the pulse exists ‘over and above’ individual local interactions (top-down). For large groups of singers it is possible for a pulse to be maintained even when one or two individuals are not entrained with it, which would suggest that pulse can operate as a system that can tolerate environmental ‘noise’. On the other hand, for groups with few performers, the direction of causation is more likely to be bottom-up because the system is relatively less complex, with less noise, and therefore requires less of a top-down influence. However, determining the critical number of individuals required for a particular instance of musical interaction to be primarily top-down or bottom-up is a complex task, because there may be other factors involved such as [i] the complexity of the music being performed, or [ii] the musical competence of the individuals, or [iii] the varying degrees to which there is a hierarchy between performers, where even in small groups one performer may lead more than others.

5.6.1 Stability vs. instability in musical interaction

The tension between top-down and bottom-up primacy in complex systems is analogous to the tension between the uncertainty that underpins any live musical performance and the stable structure that often emerges within it. Visell (2004:151) argues that uncertainty is a universal feature of music-making: ‘uncertainty in any live performance, especially (but not exclusively) those involving direct human intervention, is a physical and inevitable fact which is essential to the character of being live’.

What I term ‘stable’ music-making is highly predictable and has top-down primacy, and is therefore easy for a group to follow as a whole and more difficult for individuals to disrupt. What I term ‘uncertain’ music-making has bottom-up primacy because each individual member of the group makes unpredictable (often improvisational) contributions that can disrupt the action of the group as a whole. The fact that stable music-making is primarily top-down and uncertain music-making is primarily bottom-up implies that both forms of music-making are subject to both top-down and bottom-up influence, and it is just a question of which level of organisation is dominant.

Stable musical performance is perhaps epitomised by the ceremonial chanting of mantras in Hinduism and Buddhism, when short melodies accompany short phrases of text, repeated often hundreds of times, and thousands of people can participate for long periods at a time (see App. 1.17-19). At the other extreme, examples such as ‘free jazz’ and some spontaneous forms of highly-participatory communal singing represent uncertain musical performance (App. 1.41). In these ‘uncertain’ cases, the degree of uncertainty is dependent on ‘the presence (or absence) of a priori agreements, whether explicit or tacit’ (Blackwell & Young, 2004:124). One ‘uncertain’ a priori agreement might be that each member of the ensemble is expected to improvise, as opposed to one individual improvising with the rest of the ensemble accompanying. Another a priori agreement might be ‘the avoidance of recourse to notation or other pre-existing materials [or memorised songs]’ (Ibid.). An example of a group of musicians managing the transition between stability and uncertainty is the ‘feathered beginning’ that occurs in the transition between the ‘stable’ solo singing, ‘uncertain’ silence, and then ‘stable’ cacophony of Suyán akia performance (see Ch. 4.9.1).

Many parameters of music-making can be either ‘stable’ or ‘uncertain’, but the parameter I will focus on here is pulse. I argued in Ch. 3.2 that the stability of a regular pulse contrasts with the uncertainty of free-rhythm. London (2012:24) argues that attending to a pulse ‘involves both the discovery of temporal invariants in the music and the projection of temporal invariants onto the music…once we have established a pattern of temporal attending we tend to maintain it in the face of surprises, noncongruent events, or even contradictory invariants… Music often depends on our making an effort to project and maintain an established metre, as in passages that involve syncopation and hemiola’. Temporal attending is therefore both bottom-up in terms of ‘discovering’ temporal invariants in the music, and top-down in terms of ‘projecting’ an established pulse. Furthermore, metre is both bottom-up in the sense that it is an emergent property that arises from ‘our engagement with the production and perception of tones in time’, but also top-down in the sense that it is ‘learned…rehearsed and practised…[given that] musical rhythms are often stereotypical, stylistically regular, and hence familiar’ (London, 2012:4; see Ch. 3.2). This would suggest that the principle of invariance (or stability) is fundamental to musical process. But of course, even if live musical performance exhibits a relatively stable periodicity, the pulse will nevertheless vary with each separate performance, and therefore the specific structure that emerges in each performance is ‘uncertain’.

As discussed in the first few chapters, in addition to the structures that emerge within the performance, one also has to consider the more ‘static’ structures in which the live performance is embedded; e.g. institutional, personal, political, social, self-consciously subcultural. In a personal communication, Ian Cross described how free jazz collectives in the 1970s often spent ‘almost as much time discussing the political context of what they were doing as doing it, such as Tony Williams’ Lifetime, and their inadvertent performance of an excerpt from Cage's 4:33 live on air, after they had decided shortly before the broadcast that to start overtly was a claim to dominance and thus to be rejected’ (from an anecdote of Humph).

More typically, however, the free improvisation strands of modern jazz and Western classical music attempt to distance themselves from outside influences, and seek ever-newer creative contributions by participants in performance, and, according to Blackwell & Young (2004:124), are ‘deliberately and self-consciously uncertain’. The ideal of freely-improvised music is to resist classification in terms of any one genre or influence, and, although it may seem impossible to achieve this ideal, the authors (Ibid. 124, 125) suggest that whilst the ‘experiences, prior learning, practices and habits (whether individual or culturally determined)’ of individual performers will constrain their own solo improvisations, in a group setting the individual performers and their past profiles are entered into a complex dynamic together, making for ‘less-certain’ improvisation. Of course, it could also be argued that once a group forms, the group itself, not just its individuals, will build up its own experiences, practices and habits too.

5.6.2 Is musical improvisation inherently unstable?

In more general (not explicitly ‘free’) musical improvisation there are top-down influences at play, such as stylistic training, rehearsed performance approach, the pulse, etc. For example, at the pulse level, the ‘rhythm’ players in early jazz would be expected to ‘establish and maintain a clear and easily heard rhythmic pulse you could orient yourself to as you played, always knowing “where [beat] One was”’, such as the ‘oompah’ rhythm: a strong bass note on the first and third beats of a bar, and a firm chord in the right hand on beats two and four (Faulkner & Becker, 2009:125). However, a stable pulse can also play an important role in smoothing uncertain transitions between qualitatively-distinct phases of performance. For example, Schögler (2000) found that ‘when two [jazz] musicians improvise together, their playing…become[s] significantly more synchronous just prior to points of qualitative ‘musical’ change’ [defined as a change in rhythmic structure from 4/4 to 3/4, or a change in dynamics from fortissimo to pianissimo].

Another example of disturbance resulting in structure in jazz improvisation might be the group process of ‘substitution’. Every time a melody is repeated, an individual might spontaneously decide to change a given note/chord/harmony/rhythm in a melody, which is then picked up by the rest of the group. The next time the melody is repeated, the previous modification may or may not become the norm, depending on whether the change is accepted by the group. The process is then repeated many times, sometimes to the point where the melody is almost unrecognisable from the original, yet agreed upon by the band. One can see from this example of this particular jazz convention how spontaneity/uncertainty can give rise to an emergent structure, and how, if a group disagrees in the process of live performance about accepting or rejecting each individual substitution, this structure is held in tension with uncertainty.

On this way of understanding jazz improvisation, i.e. from spontaneity to structure, each player responds to each other player in a series of ‘local interactions’ which, after multiple iterations, eventually produces ‘structure’ from the ‘bottom-up’. However, the emergent structure is often fragile and can change quickly, and this also relates to what jazz musicians call ‘groove’ (see Prögler, 1995). For example, Doffman (2008) found that ensemble timing interactions cannot be reduced to a single phase relationship or degree of entrainment in groove (see Clayton, 2012:54; App. 1.42). Instead, ‘the ideal [phase] relationship is inherently dynamic and playing jazz involves meaningful variations within the permissible range of looseness and out-of-phaseness’ (Ibid.; see also Feld, 1988). For example, one player is quoted by Faulkner & Becker (2009:8) saying ‘If sometimes I might play a phrase differently from another man, it’s not that critical. As long as we’re together most of the time’. Therefore, the relationship between the perceived pulse and the timing of note onsets in jazz is constantly shifting, i.e. dynamic.

There is also a more fixed hierarchical structure in terms of the way musicians interact with each other in live performance. For example, even though the goal of free jazz is musical ‘freedom’, in reality each musician within the group will have a place within the hierarchy, depending on competence and experience, and at different times will have specified functions; a drummer has a primarily rhythmic function whereas a saxophonist would likely have a more melodic function. Vallacher & Jackson (2009:1228) have suggested that factors such as power asymmetry and role relationship may result in the timing of one person’s behaviour lagging behind the other’s. Although the authors were referring to interaction between two people, asymmetrical entrainment behaviour due to power relationships is likely to occur between members of a group interacting too. This is certainly the case anecdotally in a jazz ensemble—Kernfeld has been quoted as saying ‘You learn that it’s the bassist, not the drummer, who has the greatest responsibility for maintaining the beat, and at some basic level of near incompetence, it’s actually much more important to keep playing boom boom boom at a steady tempo on some indecipherable low note than to get the changes right while dragging the rhythm. You are the rock upon which the band rests…’ (Faulkner & Becker, 2009:123; emphasis added). The structure of power relationships and how it affects timing within group musical interaction will be discussed further in Ch. 7.4 & 7.5.

5.6.3 Emergent structure in musical interaction

The power hierarchy in a jazz ensemble will have a relatively static top-down influence on the performance, but even this static influence can become dynamic, given that the function and hierarchical position of each performer may change during the performance. The structure of this power hierarchy at any one moment in performance can thus be described as an ‘emergent structure’. Blackwell & Young (2004:125) argue that ‘In improvised music, the macro-level [i.e. the overall form of a performance] can only be described with the benefit of hindsight and reflection, once the complex interactions that cause structure to emerge are complete.’

A structure is only emergent when it can affect local interactions by downward causation, in line with Mitchell’s (2012) definition (see 5.4). I argue that emergence, or self-organisation, functions at a level ‘over and above’ local interactions, precisely because of the notion of a system ‘self’ doing the organising, and I therefore suggest that the most likely explanation of how group musical performance works will involve a more holistic view of non-linear dynamics between higher-order and lower-order components of a system, but where the higher-order components of a system are distinct from the lower-order components. These higher-order components would have to be defined in terms of attractor states within the system (cf. discussion of field metaphor in 5.2.1 & 5.5.2), rather than networks of relationships between local components (Sheldrake, 2012:138-9). Furthermore, identifying emergent structure by computational analysis is tricky, given that the specific initial assumptions made by programmers might lead to, for example, the identification of different moments at which structure is said to emerge depending on whether it is the computer algorithm or the performers themselves doing the identification.

This computational difficulty is illustrated by the fact that emergent structure not only manifests in embodied group interaction as ‘the synchronised motoric behavior of interacting individuals’ but also as ‘a shared reality among interacting individuals engaged in joint action’ (Vallacher & Jackson, 2009:1227). From a scientific perspective it is easier to measure and model physical motor behaviour than mental ‘shared representations’, which is the reason for the current trend of studying measurable aspects of entrainment behaviour such as body movements and sounds (see also Ch. 6.2.2 for further discussion). This trend is referred to by Marsh (2011) as the ‘embedded-embodied approach’. According to Himberg (2013:44) embodied approaches that focus on body movements explain processes in bottom-up, ‘dynamic’ terms (see also Wilson & Golonka, 2013). Himberg (2013:69) acknowledges the importance of thinking in dynamic terms for studying musical entrainment when he says that ‘each repetition of the [pulse] takes place in a new environment: every reaction is also an action, and the linear error correction model no longer fits what is going on’.
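The contrast that Himberg draws can be illustrated with a short sketch. It assumes a noise-free, first-order linear correction rule in which a performer removes a fixed proportion (`alpha`) of the previous asynchrony on each beat, first against a fixed metronome and then against a partner who is correcting at the same time. The function names, parameters and values are hypothetical, and the sketch does not reproduce Himberg’s own analysis.

```python
def metronome_asynchronies(a0, alpha, n_beats):
    """Asynchronies (ms) of one performer tapping with a fixed metronome,
    removing a proportion alpha of the previous asynchrony on each beat."""
    out = [a0]
    for _ in range(n_beats - 1):
        out.append((1 - alpha) * out[-1])
    return out


def mutual_asynchronies(a0, alpha_1, alpha_2, n_beats):
    """Asynchronies (ms) between two performers who both correct towards
    each other on every beat, so that each one's target keeps moving."""
    t1, t2 = a0, 0.0  # onset offsets relative to a shared nominal beat
    out = [t1 - t2]
    for _ in range(n_beats - 1):
        a = t1 - t2
        # Performer 1 moves back by alpha_1 * a; performer 2 moves
        # forward by alpha_2 * a, both reacting to the same asynchrony.
        t1, t2 = t1 - alpha_1 * a, t2 + alpha_2 * a
        out.append(t1 - t2)
    return out
```

Under these assumptions, a performer who corrects fully (`alpha = 1.0`) locks to a metronome after a single beat, whereas two performers who both correct fully overshoot one another and swap places on every beat without ever converging. This is one simple sense in which ‘every reaction is also an action’ once the reference is another adapting performer.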

Having said that, as argued before, the pulse also exerts a top-down influence on the process of entraining to a pulse in the sense that performers can anticipate the timing of their future actions based on the pulse that has gone before, as opposed to reacting to the movements and sounds of the other performers. The pulse is also top-down in the sense that it can be shared by all performers. Indeed, other musical parameters, such as melody and harmony, are dependent on a stable operational pulse (even though they too display systemic properties), because if performers get out of time with each other then their various melodic and harmonic contributions will not line up with one another, resulting in discordance. Therefore, pulse would seem to be the most fundamental and widely-applicable top-down parameter in musical performance. However, as Himberg argues, the process of entraining to a pulse is also dynamic, insofar as some musicians in a group might start moving in a new tempo, and yet the others will still be able to adapt to the new beat.

As discussed before, we will never fully understand any system like a pulse, unless we also acknowledge that it will always be embedded in other systems which are also embedded in systems, etc. However, in the context of studying bodily entrainment, Vallacher & Jackson (2009:1227) describe how ‘In practice, there are diminishing returns in expanding the level of analysis to include systems at the macro end (e.g., culture) or the micro end (e.g., neural dynamics) when attempting to capture mental and behavioural processes of interest to social psychologists. Where one draws the line, however, is an unsettled matter and warrants further consideration, particularly on the part of those who are wedded to the embedded–embodied approach’. The pulse is arguably a good level of analysis for understanding musical performance because it is a system of organisation that comes somewhere between ‘culture’ and ‘neural dynamics’, and is directly observable in the performance itself through the body movements and sounds produced by performers.

5.7 Summary

The aim of this chapter was to start to explore the processes by which a group of singers come to organise their action through entrainment. I have examined social systems in various disciplinary domains such as sociology, complex systems theory, animal behaviour and embodied psychology in order to integrate the complementary approaches by which different disciplines investigate group behaviour, each with their own valuable perspective. In terms of empirical studies of musical activity, music psychology is progressing in understanding how synchronisation (‘entrainment’) works in laboratory settings, but almost exclusively in the context of interaction between two individuals (see Ch. 6.4). I am interested here in how synchronisation occurs in systems that are much more complex, in which many individuals perform together, each with their own perspective, in real-life contexts.

In the context of choral singing, Potts’ chorus line hypothesis might suggest that an initial onset that is not preceded by any sound starts with some initial movement (a visible breath or gesture) from one or a few individuals, setting in motion a very fast chain reaction which might potentially result in a perfectly unison onset, where every chorister sings exactly at the same time as everyone else (this potential explanation will be tested in Ch. 8.4.3).

In those cases when a choir has a conductor, one might say it is the conductor who provides a single focus that reduces the choristers’ dependency on being aware of one another, and therefore the ‘chorus line hypothesis’ may not be appropriate in these contexts. However, conductors sometimes ask their choir to ‘triangulate’; i.e. to look at other choristers as well as the conductor in order to improve the cohesion of the choir (e.g. in Anglican and Catholic churches, when two ‘sides’ of the choir face each other). This would suggest that visual contact with as many choristers in the group as possible improves group synchronisation, which is compatible with the chorus line hypothesis. One must also ask whether, in the absence of a conductor, one to a few confident choristers function as ‘leaders’ in order to reduce effort and increase accuracy for choristers, or whether the choir as a whole organises its own actions.

In unison choral singing everyone makes a sound at the same time, and in order to explore how this might be possible one might think in terms of how the timings of individual choristers’ actions are related to each other through webs of relationships, including feedback loops, which are the basis of complex systems approaches. However, an approach based exclusively on unstable person-to-person dynamic interactions is incomplete because, for a musical performance to hold together, all performers need to have a stable top-down collective understanding of how their individual role fits in to the performance as a whole, both in terms of pulse and other musical parameters. Of course, this collective understanding must also be tolerant of individual variation that happens ‘in the moment’, so that when an individual does something unexpected the group can flow with that individual and maintain a shared framework of performance. In the sense of being stable, emergent, and tolerant of perturbation, the pulse can be considered an emergent product of a complex system, which is required to exhibit qualities such as self-organisation, emergence, and robustness.

Bottom-up computational approaches such as ALife and autopoietic organisation, which are useful for understanding unstable musical performance conditions such as when performers speed up or slow down pulse, are based on fundamental assumptions. The most important of these assumptions is that emergent structures can be modelled exclusively on their bottom-up local interactions. Reynolds’ boids model, a computer program based on this assumption, was designed to emulate the real-life behaviour of flocking and schooling movements in birds and fish. Potts challenges the exclusive focus of the boids model on neighbour interactions with his chorus line hypothesis, arguing that members of a flock of dunlins anticipate an incoming ‘manoeuvre wave’. However, the near-instantaneous collective movement implied by scale-free correlations of collective velocities in flocking movements in starlings rules out a ‘manoeuvre wave’ explanation, and this may apply to dunlins too.

It is clear from analysing real-life flocking and schooling behaviour that the speed, complexity, diversity, ‘scale-free’ correlations, and leader-less organisation of the collective movements combined suggest that it might be fruitful to think about these kinds of mass behaviour in terms of the top-down metaphor of a ‘collective mind’, which is similar to Luhmann’s concept of a ‘system’ in that it is ‘over and above’ the sum of the local interactions. Exclusively bottom-up explanations of flocking and schooling behaviour based on merely the sensory channels of vision and hearing are limited. Indeed, it has been argued that even when birds cannot see other birds’ movements in flocking movements, they can still react to those birds’ movements. Similarly, communication by the hearing channel is also unlikely given the cacophony of thousands of birds, with the same going for fish too. Fish also seem to be able to communicate in schooling movements without vision or sensing pressure changes.

The movements and sounds of mass choirs of humans singing in unison are similar to flocking and schooling behaviour in that an astonishing degree of synchrony can often emerge in a group of individuals who are moving and singing together (sometimes in the hundreds or thousands; e.g. football stadium chanting). To understand complex group entrainment in both animals and humans, we therefore need an explanation that integrates bottom-up and top-down processes; i.e. local-to-local interaction and the ‘collective mind’. One of the parameters in group music-making that lends itself to being understood in terms of the ‘collective mind’ metaphor, both because it is a mental construct and because it is shared collectively, is the pulse. Hence, understanding musical entrainment in computational terms is a huge challenge because it would require that the computer model’s assumptions are representative of musical process from both embodied and mental perspectives. This would require an understanding of all the factors involved in musical performance which could, in turn, be expressed in computational language. As we saw from Mitchell, this is highly unlikely given that even a simple event like a rock breaking a window ‘cannot be represented by any single theory in the syntax of logic or even the language of physics’, let alone complex musical interaction. Although this may be a disheartening conclusion, it should not stop us from inquiry into musical interaction in general, or entrainment processes in particular. The next chapter will attempt to describe the various theoretical approaches by which we can explore musical entrainment.

Chapter 6 - Group Entrainment: is it planned or does it emerge?

6.1 Introduction

In this chapter I turn to experimental psychology to examine how the process of embodied entrainment, which forms the basis of most communal music-making, is currently explained. The ethnographical survey of Ch. 2 and the Suyán case study of Ch. 4 investigated why communities entrain their singing. This chapter, following on from the cross-cultural study of time structuring in Ch. 3, and the analysis of dynamic group organisation in Ch. 5, looks at how groups entrain their singing. Group vocalising takes various forms, some of which were mentioned in Ch. 1.3; for example, chanting, choral singing, choral speaking, group prayer/recitation, a group’s pledge of allegiance to a higher authority or idea, etc. (see Cummins, 2013). All these forms of group vocalising are globally widespread, and underpinned by a capacity to synchronise in large groups. Children in classrooms today often speak in collective synchrony, such as when they all say ‘Good Morning, Mrs. Johnson’ together (see also Occupy Wall Street & Syrian protest chants, Malaysian Choral Speaking competitions, Finnish shouting choirs; and group recitation of the Nicene Creed in the Christian Church; Apps. 1.7, 1.8, 1.43, 1.44, 1.45).

Synchrony may occur in many collective activities partly because, in terms of information flow and coordination, it is easier if a group of individuals are orientated in time and space in such a way that everyone can see, hear and anticipate each other, and can thereby coordinate with each other. In the context of group singing, seeing, hearing and anticipating each other is necessary for singers to share a stable sense of pulse. The most visible and measurable feature of any kind of social entrainment process is body movement. As Lakens (2010) has suggested, understanding social interaction requires ‘a deeper insight into the role movement synchrony plays in social psychological processes’ (cf. Marsh’s embedded-embodied approach in Ch. 5.6.3, and list of social entrainment papers in Ch. 3.1). In the relatively brief history of entrainment studies, quantitative study of timing interactions has only been performed in the context of two-person (dyadic) interaction (although see Himberg & Thompson, 2011, & Wing et al., 2014). Explanations of entrainment behaviour on the dyadic level are a long way off accounting for the collective behaviour of multi-person choirs. The present survey of the empirical studies regarding entrainment in social interaction points towards ways of making a small beginning in tackling these more complex group interactions. I will first introduce joint action in terms of how it relates to the top-down vs. bottom-up dynamic introduced in Ch. 5.2.1, then show how entrainment underpins such joint action, and finally reframe the top-down vs. bottom-up dynamic in group entrainment in terms of planned vs. emergent coordination.

6.2 Joint action

As mentioned above, the majority of social coordination research has focused on ‘joint action’, which usually refers to dyadic interaction. Although group coordination is more complex than, and qualitatively different to, dyadic interaction, some basic principles from the ‘joint action’ literature are useful in understanding group interaction. Consider, for example, two boys carrying a log which neither could lift on his own. As Woodworth (1939:823; quoted in Knoblich, 2011) points out, ‘You cannot speak of either boy as carrying half the log. . . .Nor can you speak of either boy as half carrying the log. . . .The two boys, coordinating their efforts upon the log, perform a joint action and achieve a result which is not divisible between the component members of this elementary group’. As Knoblich (2011:60) puts it, the coordination of a joint action ‘seems to require some kind of interlocking of individuals’ behaviours, motor commands, action plans, perceptions, or intentions’. This interlocking of minds and bodies is fundamental to group coordination too.

The ‘two boys carrying a log’ example displays different aspects of the bottom-up vs. top-down distinction in Ch. 5.2.1. The interaction is clearly made up of two individuals, whose body movements can contribute differently in a bottom-up manner, but they also share an understanding about how to lift the log that exerts a top-down influence on their behaviour, ordering their actions. However, if one boy’s grip on the log were to slip, he would be exerting a bottom-up influence on the interaction. Such actions are also instanced in group action; for example, Marsh et al. (2006:20) give the example of a truck getting ‘accidentally’ stuck and requiring two groups of boys to pull it out. In this example, the shared understanding of the task that emerges joins both groups together toward a common goal, i.e. pulling the truck out of the mud, just as the two boys share the need to lift the log. In chanting, the ‘common goal’ is to perform the chant together, and in unison chants this requires individuals to sing together at the same time, at the same pitch, and with the same words. This is similar to the examples above, which required individuals to either lift or pull at the same time.

The nonverbal communication literature has for a long time assumed the separateness of individuals, in the language of one person ‘sending’ a message and the other individual ‘receiving’ it (Ibid. 14). However, this theoretical perspective cannot account for the indivisible shared understanding or ‘common goal’ described in the previous examples, because it divides the interaction into two isolated selves (Clark, 1996; quoted in Marsh et al. 2006). This shortcoming is compounded when many more than two individuals are interacting, such as in a flock of starlings, school of fish, or crowd of football supporters.

This shared understanding or goal, conceived in the literature as a shared internal cognitive structure, is a top-down representation of social interaction that leaves out the moment-to-moment intricacies of each individual’s contribution to its formation; it is a goal that acts as an attractor for interacting individuals. The ‘moment-to-moment intricacies’ might be aspects of social interaction such as individual spoken or sung utterances, body movements, gaze direction, moments of touching, spatial positioning etc. In short, theories which categorise and order social interaction in terms of cognitive models are not telling the messier, bottom-up half of the story, which may include a lot of information that does not fit a given model. Furthermore, social interaction emerges in time, and therefore requires a framework that can articulate process in terms of nonlinear dynamical systems (see Ch. 5.2.1), which creates difficulty because we often think and communicate in linear form (Marsh et al. 2006:18; Bavassi et al., 2013).

6.2.1 Entrainment as the basis for joint action

Underpinning all joint action is the phenomenon of entrainment, which, in the broadest sense, is the process of the coming together of behaviour between two or more coupled individuals (even thousands, see Strogatz, 2003), and is therefore fundamental to understanding joint action, which requires two or more people to time their actions in such a way that they can work more efficiently. As discussed in Ch. 1.2, entrained action comes in many forms and different degrees of synchrony, and spans a spectrum of time-scales from the small-scale rhythms found in music-making to much larger-scale ‘circadian’ or seasonal rhythms.

This thesis is primarily concerned with entrainment behaviour in groups of humans. Over the past few years, an increasing number of experiments have provided evidence that people ‘cannot resist’ entraining their behaviour with others (Knoblich et al., 2011:67). It seems that when we act in the presence of others they exert an organising influence on our own behaviour, not only in the way we think and feel, but also in the ways we are oriented ‘physically and perceptually’ (Marsh et al. 2006). These are general statements that account for the whole spectrum of entrainment behaviour between humans, which includes that which is periodic, such as music-making, as well as that which involves less periodicity but is still temporally precise, such as turn-taking in conversation (e.g. Stivers et al. 2009), and the ‘log’ and ‘truck’ examples of joint action (see 6.2 above). Group singing, the focus of this chapter, is associated with the more ‘periodic’ end of the spectrum of human entrainment.

Entrainment between humans occurs by informational coupling such as hearing and vision, not just physical coupling in the manner of, for example, the wall that physically couples the two hanging clocks in Huygens’ example, described in Ch. 1.2. Both forms of coupling govern limb coordination, for example, within and across individuals (Knoblich et al., 2011:63, 85). In the context of interpersonal embodied interaction the bio-mechanical and informational limitations of body movement provide ‘strong gravitational and inertial constraints’ on entrainment (Cummins, 2012b). One example of a bio-mechanical constraint is that a torso has larger physical dimensions and larger mass than an arm and ‘therefore has a higher moment of inertia and consequently a longer specific period of oscillation’ (Toiviainen et al., 2010; see also Van Noorden & Moelants, 1999 for the resonance phenomenon; and MacDougall & Moore, 2005 for constraints on human locomotion). An example of an informational constraint is the notion of a shared mental framework that individuals attend to in joint action, e.g. the perceived metre shared by a group of musicians, because the actions of the individual are constrained by the framework to which they are collectively attending.

The empirical literature suggests that human rhythmic movements exert a stronger coupling attraction than rhythmic movements made by machines, even though they are less rhythmically accurate. Kirschner & Tomasello (2009) found that 2.5-year-old children strayed from their own default drumming tempo more when they drummed with a real human than when they drummed with a mechanical device producing the same rhythmic sequence as the human partner. In another experiment, Himberg (2011) found that pairs of people who were told to tap to a mechanical metronome first drifted away from the metronome as they were attracted to each other’s beat, were then pulled back to the reference point of the metronome, and were then pulled towards each other’s beat again.

A key aspect of group coordination is that interacting individuals have to adjust to any ‘errors’ in timing that their partner may make, and this has been shown in ‘tapping’ or ‘drumming’ experiments between pairs of individuals, where the goal is to achieve synchrony between their rhythmic actions. The number of errors is magnified by the presence of more people. Keller (2007) argues that in larger musical ensembles, ‘cohesion’ ‘may vary as a function of the sensitivity of ensemble members to each other’s use of error correction’. This is a theory that has not yet been tested in group contexts, probably because the feedback loops present between all the different interacting individuals make it difficult to study error correction (see Ch. 5.2.1). Indeed, given that in group contexts ‘the beat is jointly abstracted and redefined on a continuous basis, the ground truth of ‘correct’ beat timing does not exist. As perturbations are the norm and not the exception, error correction should be renamed continuous mutual maintenance, as it becomes impossible to say what constitutes an error and what would the correction then be’ (Himberg, 2013:35; see also Nowicki et al. 2007).
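The idea that mutual adjustment leaves no ‘correct’ beat can be illustrated with a minimal sketch. The Python code below is not taken from any of the studies cited; it assumes a simple linear phase-correction rule, in which each of two tappers shifts their next tap towards their partner’s last tap by a fixed proportion of the asynchrony between them. The function name `simulate_dyad`, the preferred periods, the starting offset, and the correction gain `alpha` are all hypothetical values chosen for illustration, and timing noise is deliberately left out.

```python
def simulate_dyad(period_a=0.50, period_b=0.54, alpha=0.25, n_taps=40):
    """Two tappers who each correct towards the other (illustrative assumption).

    Returns three lists, one entry per tap:
    asynchronies (tap_a - tap_b, in seconds) and the inter-tap
    intervals that tappers A and B actually produce.
    """
    t_a, t_b = 0.0, 0.05  # hypothetical start: B begins 50 ms after A
    asynchronies, iti_a, iti_b = [], [], []
    for _ in range(n_taps):
        asyn = t_a - t_b  # negative when A is ahead of B
        # Each tapper's next interval is their preferred period, adjusted
        # by a fraction (alpha) of the current asynchrony.
        next_a = period_a - alpha * asyn
        next_b = period_b + alpha * asyn
        asynchronies.append(asyn)
        iti_a.append(next_a)
        iti_b.append(next_b)
        t_a += next_a
        t_b += next_b
    return asynchronies, iti_a, iti_b


if __name__ == "__main__":
    asyn, iti_a, iti_b = simulate_dyad()
    print(f"final asynchrony: {asyn[-1]:.3f} s")
    print(f"final intervals:  A {iti_a[-1]:.3f} s, B {iti_b[-1]:.3f} s")
```

Under these assumptions the two tappers settle on a common inter-tap interval of 0.52 s, which is neither tapper’s preferred period (0.50 s and 0.54 s), while a constant asynchrony of about 80 ms remains between them. Neither tapper can be said to have made the ‘error’ or the ‘correction’: the shared tempo exists only as a product of their continuous mutual adjustment. With `alpha=0.0` the tappers simply drift apart.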

Even in musical scenarios in which the top-down influence of ‘leaders’, such as conductors or soloists, potentially reduces the number of interactions that would need to be analysed, it would still be difficult to study group interaction due to the fact that pairwise statistical analysis can only focus on a pair of individuals at a time, rather than the network as a whole. Some statistical advances have made the quantitative study of timing interactions in group entrainment easier, such as the ‘Kuramoto model’ (Acebrón et al., 2005) and the Hilbert-Huang transform (Huang et al. 1998), but these resort to extracting a single measure across the whole group at any given time, rather than detailed quantitative analysis of the web of error-correction processes (see Himberg, 2013:165-6 and Ch. 1.2).
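To make concrete what ‘a single measure across the whole group’ means, the following sketch uses the standard mean-field form of the Kuramoto model, in which each oscillator is pulled towards the group’s average phase, and summarises the group at each time step by the order parameter r (near 0 for incoherence, 1 for perfect phase synchrony). It is a sketch under stated assumptions rather than the analysis used in the studies cited: the function names, the group size, the 2 Hz mean tempo, the frequency spread, and the coupling strength are all hypothetical.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r: length of the mean phase vector (0 to 1)."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate_kuramoto(n=50, coupling=2.0, dt=0.01, steps=3000, seed=0):
    """Euler simulation of the mean-field Kuramoto model.

    d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i),
    where r and psi are the magnitude and angle of the group's mean phase
    vector. Returns the order parameter r at every time step.
    """
    rng = random.Random(seed)
    # Hypothetical natural frequencies: around 2 Hz, in radians per second.
    freqs = [rng.gauss(2 * math.pi * 2.0, 0.5) for _ in range(n)]
    phases = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    r_series = []
    for _ in range(steps):
        mean_field = sum(cmath.exp(1j * p) for p in phases) / n
        r, psi = abs(mean_field), cmath.phase(mean_field)
        r_series.append(r)
        phases = [
            p + dt * (w + coupling * r * math.sin(psi - p))
            for p, w in zip(phases, freqs)
        ]
    return r_series


if __name__ == "__main__":
    coupled = simulate_kuramoto(coupling=2.0)
    uncoupled = simulate_kuramoto(coupling=0.0)
    print(f"coupled:   r starts at {coupled[0]:.2f}, ends at {coupled[-1]:.2f}")
    print(f"uncoupled: r starts at {uncoupled[0]:.2f}, ends at {uncoupled[-1]:.2f}")
```

The series returned by `simulate_kuramoto` is the kind of group-level summary described above: it shows whether coherence across the group is rising or falling, but it does not reveal which individuals are adjusting to which.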

6.2.2 The collective ‘we’

Marsh et al. argue that to understand entrainment in joint action, we need an integrative approach which focuses on ‘[both] an emergent collective that is nonequivalent to the summation of individuals’ responses…and the physical dynamics of movement in an emergent cooperative social entity’ (Marsh et al. 2006:18; cf. Mitchell’s definition of emergence in Chs. 5.2.1 & 5.4; see also Pacherie, 2012). Marsh et al.’s call to examine the emergent collective, or the experience of ‘we’ as I interpret it, in the context of the physical dynamics of movement, without ‘decomposition or analysis of ‘we’ into its linear parts’ is thus a challenging proposition (see also Baron, 2002a, 2002b). Firstly, according to De Jaegher & Di Paolo (2007), this requires us to see the interaction itself, as well as each individual agent within the interaction, as an autonomous entity; i.e. the agents sustain the interaction, and the interaction sustains them. Secondly, it would seem impossible to analyse the collective interaction of individual body movements in such a way that the emergent collective mutuality is not expressed in terms that break it down into individual actions. Thirdly, and more generally, it is hard for us to think about, and use language to talk about, ‘wholes’ without immediately thinking or talking about their parts. What is needed is a new language of holistic mutuality for understanding collective phenomena that goes beyond the analysis of parts.

We know that even as onlookers we can perceive the notion of the collective ‘we’ through collective body movement because of that familiar experience of noticing one person falling out of sync with everyone else in group performance (e.g. theatre, music, dance, military drill; see Apps. 1.28, 1.47, 1.48 & 1.49), and perceiving the group to be no longer a group (see Lakens, 2010; Lakens & Stel, 2011). Even in activity that is not supposed to be synchronous it was found that ‘when one person’s body goes still for longer that it is supposed to, it becomes noticed’ (Gill & Borchers, 2003). Gill (2011) argues that group awareness is a shared sensitivity to the ‘spatial-temporal trajectory of the group flow’ and any changes that may occur.

However, the study of the ‘group experience’ has a problem: we know that individuals subjectively experience feeling part of something bigger than themselves in group performance, but it is hard to tease apart the subjective experience of an individual from the subjective experience of a group, if indeed a group can be thought of as a subject. As argued previously, one solution might be to turn our attention to bodies, in terms of our own body and the bodies of other people, because they ‘enable us to recognise the reality of other selves, other experiencing beings’ (Abram, 1991:37). Bodies seem to be the interface between individual and collective experience, as Abram (Ibid.; see also Husserl, 1960) argues:

‘While one’s own body is experienced, as it were, only from within, these other bodies are experienced from outside; one can vary one’s distance from these bodies and can move around them, while it is impossible in relation to one’s own body…Despite this difference, Husserl discerned that there was an inescapable affinity, or affiliation, between these other bodies and one’s own. The gestures and expressions of these other bodies, viewed from without, echo and resonate one’s own bodily movements and gestures, experienced from within. By an associative ‘empathy’ the embodied subject comes to recognise these other bodies as other centres of experience, other subjects.’

Bloch extends this point to voices, arguing that when we synchronise with others ‘one is not sure whether it is oneself or another inside oneself who is acting and using one’s voice and one’s body’ (Bloch, 2002:142; quoted in Schüler, 2012:83-4).

Therefore, many ostensibly ‘subjective’ experiences in fact involve collective phenomena that are responded to, and experienced by, multiple sensing subjects that interact from different bodily perspectives. These collective phenomena are ‘not merely subjective; they are intersubjective phenomena’ (Abram, 1991:38). Therefore ‘the conventional contrast between ‘subjective’ and ‘objective’ realities could now be reframed as a contrast within the subjective field of experience itself—as the felt contrast between subjective and intersubjective phenomena’; i.e. an objective reality can be thought of as an intersubjective consensus (Ibid.).

The point of this phenomenological digression is that when discussing the experience of commonality or a shared goal within a group of singers, one can only talk in terms of intersubjective experience. However, most of the following discussion about shared ‘goals’ or ‘representations’ in the next few sections is necessarily reductive because the multiple perspectives that are implicit in creating these shared goals and representations tend not to be included within an abstract, top-down model.

We know intuitively that in a dyadic conversational context each self maintains some sense of individuality, yet is able to be aware of the other person and their perspective (Gill & Borchers, 2003). Indeed, the Middle English sense of the word ‘conversation’ is ‘living among, familiarity, intimacy’ (ODE), and an OED definition is ‘[t]o hold inward communion, commune with’. It is more challenging for an individual to be aware of every other person’s unique perspective in a group context than it is in a dyadic context. Therefore, although the idea of a single entity such as a ‘shared mind’ is attractive because each individual would need to use less cognitive processing power than if they were required to be aware of multiple individual perspectives simultaneously (cf. Luhmann’s idea of a system reducing complexity), it still carries the same difficult philosophical implications as thinking of a group as a subject in itself.

The idea of a ‘shared mind’ is related to the discourse on ‘shared intentionality’. Intentionality refers to the ‘power of minds to be about, to represent, or to stand for, things, properties and states of affairs’ (Jacob, 2010). Shared intentionality refers to ‘some jointly focused entity that we know we share but are viewing from different angles’, and is based on our capacity and motivation to understand, and cooperate with, other people ‘as intentional agents who have a perspective on the world that can be followed into, directed, and shared’ (Tomasello & Rakoczy, 2003:125; Tomasello, 2008:344; see also Schweikard & Schmid, 2013 for SEP entry on ‘Collective Intentionality’). In music and the kinds of group phenomena dealt with in this thesis, individuals within a group may not always share the meaning or ‘intentionality’ of the collective performance, but they may be able to jointly focus on, and act in relation with, an entity that the whole group attends to, e.g. the pulse or any other shared ordering principle. Each individual’s representation of the performance can therefore ‘float’ above their joint commitment to focus on this shared entity, and Cross’s concept of ‘floating intentionality’ may be appropriate in these scenarios (see also Ch. 2.2.1).

The beat in this context is an intersubjective consensus, but the thorny issue still remains of how to talk about the experience of intersubjectivity implied by ‘floating intentionality’. Shared aspects like the ‘beat’ or musical metre are more empirically accessible than intersubjective experience. However, most empirical analysis or investigation is done after the fact; yet experience happens in the moment. The idea of having a separate person interviewing each singer whilst they are singing is absurd, but even if it were possible, it would be very difficult, if not impossible, for each person to translate their multi-sensory subjective experience into words. Therefore, although concepts like the collective ‘we’ and intersubjective experience are essential elements in group coordination, they represent considerable obstacles to the project of understanding such coordination. Bodies, and the sounds made by bodies, on the other hand, are a surer way forward for investigation; nevertheless, the experience (either subjective or intersubjective) of group coordination will always be the theoretical ‘elephant in the room’ in the rush to understand social coordination.

6.3 Entrainment in two forms

Vesper et al. (2010:1002) have proposed that a few processes need to be in place for any form of joint action to occur: ‘planning for immediate actions, action monitoring and action prediction, as well as ways of simplifying coordination [e.g. exaggerating movements and reducing temporal variability]’. To return to the distinction between top-down and bottom-up processes in coordination, ‘planning for immediate actions’ and employing a strategy of modifying one’s own movements for the purposes of simplifying coordination are examples of top-down influences on coordination, whilst monitoring and predicting the actions of others are examples of bottom-up processes. Knoblich et al. (2011:62) make the distinction between top-down ‘planned coordination’, and bottom-up ‘emergent coordination’ in entrainment.

In collective musical performance individual musicians have to organise their own sounds through time, but they also need to organise their sounds in relation to those of others, which are often unpredictable (Keller, 2008; cf. ‘uncertainty’ in Ch. 5.6.1). The following discussion will concern the processes underlying the tension between the individual and the group within group entrainment. Keller also makes a distinction that is similar to that between planned and emergent coordination when he describes how musical ensemble entrainment requires performers to both ‘share common goal representations of the ideal sound [cf. planned coordination]…[and] possess a suite of ensemble skills…that enable these goals to be realised [cf. emergent coordination]’ (Keller, 2008). Keller (Ibid.) also notes that ‘social factors, knowledge of the music, and familiarity with the stylistic tendencies of one’s co-performers’ influence shared goal representations which may also, in turn, affect these ensemble skills. Thus, in terms of a musical analogy, one might say that emergent coordination is trying to explain improvisation while planned coordination is trying to explain composition (Himberg, 2013:48). I will now define ‘planned’ and ‘emergent’ coordination in more detail, starting with planned coordination.

6.3.1 Planned coordination

Planned coordination is ‘driven by representations that specify the desired outcomes of joint action and the agent’s own part in achieving these outcomes’ (Knoblich et al. 2011:62; see also Decety & Sommerville, 2003). Planned coordination thus relates to music in the most general sense by referring to culture-wide representations about how music should sound, and, in a more focused sense, the conventions of certain genres, and then more focused still, groups of specific individuals getting to know each other’s idiosyncrasies and rehearsing how they want their music to sound. Like Keller (see 6.3 above), Will (2011:181) describes how ‘our bodily responses to music are also influenced by familiarity with the music, by musical training, cultural practices, and even belief systems (see Will & Turow, 2011, for examples and references; [Goebl & Palmer, 2009])’.

Shared task representations ‘specify in advance the individual parts each agent…is going to perform [and] also govern monitoring and prediction processes that enable interpersonal coordination in real time’ (Knoblich et al. 2011:65; see also Sebanz et al. 2005). Vesper et al. (2011) have found empirically that in situations of unpredictability and limited task knowledge, co-actors use coordination strategies to make their actions more predictable as a means of achieving coordination and task success. Pecenka & Keller (2007) found that ‘high-predicting’ pairs of individuals who synchronise by predicting each other’s tapping were more accurate and less variable than ‘low-predicting’ pairs of individuals, who were more likely to react to each other’s tapping. It is difficult to know whether these predictive or reactive behaviours were conscious strategies, yet they might have been, given that Stephan et al. (2002) found that individuals can consciously influence their strategy for entraining and tapping to stimuli with variable periodicity (although the variable stimuli were produced mechanically, not by humans).

An example of the integration of individual and shared task representations in group music-making can be found in orchestral music, where the flute part, for example, is different from the violin part, which is different from the clarinet part, but, nonetheless, all parts constrain each individual’s movements and exist within the shared framework of the piece that binds the entire orchestra. However, we do not know how many parts (i.e. task representations) instrumentalists are able to run in parallel with their own. In the case of dyadic joint action there are only two actors, but in an orchestra there can be up to 120 individuals each performing complex tasks; is it really the case that a single individual will mentally attend to every orchestra member’s part alongside their own? To common sense, this sounds costly and unnecessary, and the idea of a single centralised task representation (i.e. the full score or the conductor) to which all instrumentalists relate their own individual contributions makes some sense. However, even if such a centralised representation does exist, it would be very difficult to describe it beyond simply referring to the score or interviewing the conductor, and even more difficult when the music is improvised and there is no conductor or obvious leader.

Researchers attempting to explain coordinated movement typically appeal to representational processes: ‘Perceiving an action activates the mental representation of this action, which in turn leads to the performance of the action’ (Dijksterhuis & Bargh, 2001:8; quoted in Marsh et al., 2006:22). Marsh et al. identify a problematic implication of this approach which is that ‘[p]articipants in an interaction would have to gauge the others’ changing body positions in space–time, project that into the future, and then move their own limbs and body in a similar fashion’. For example, even in the highly-planned scenario where a conductor and orchestral players may share similar mental representations of the score and thus have a good idea of how each other might move, they still often need to rehearse together to improve coordination (Pacherie, 2012:366).

Keller (2007) takes the idea of a shared task representation in musical ensemble performance to a theoretical extreme. He argues that ensemble performance is ‘predicated upon group members sharing a common goal; a unified concept of the ideal sound’ and that ‘[t]hese goals then reside in memory as idealised mental representations of the sounds constituting the musical piece’. Keller is referring to presentational music-making when he talks of the ‘musical piece’ and therefore a common goal, e.g. playing the right notes in the right order at the right time, is to some extent feasible in this context (see Ch. 1.5 for definition of presentational performance). However, the idea of a ‘unified concept of the ideal sound’ seems rather abstract.

From my own personal experience as a classical singer who has performed with piano accompanists, I would say that, to the extent that such things are fixed or even known, each performer has their own individual conception of their preferred performance which is then negotiated (often tacitly and sometimes preconsciously) in ensemble performance. However, for the sake of argument, even if both individuals happened to share exactly the same conception of the ideal performance we have no way of measuring how conscious each individual was of their own conception, or even what this representation might be.

We are therefore reminded that the fundamental problem with the idea of top-down shared task representation in social coordination is that it is very unclear as to what these representations actually are. It would seem that the current explanatory models for coordinated movement, even those for ‘simple’ dyadic interaction, rely on simultaneously complex and vague notions of shared representations of movement patterns. It is even more challenging to imagine how such (presumably fixed) representations behave in dynamic contexts such as free improvisation, and how the interaction between these representations and ‘unplanned’ emergent behaviour might be theorised.

As Shapiro (2007:340) says: ‘Why bother with a representation of the world if the world is right there in front of you?’ However, one might reply thus: ‘Given our limited capabilities, representations simplify the world to a level of complexity that is more practical to work with’. Nevertheless, it is always worth bearing in mind that representations never tell us what actually is.

Lewis Carroll illustrates the necessity of simplification with his story about a group of German scientists who are intent on making bigger and better maps and who, in the end, make a map with a scale of a mile to the mile, which has everything on it. Unfortunately, as the narrator describes, ‘the farmers objected: they said it would cover the whole country, and shut out the sunlight! So we now use the country itself, as its own map, and I assure you it does nearly as well’ (Carroll, 1893).

We now turn in the next section to ‘emergent coordination’, which attempts to deal with the world that is right in front of us, or ‘using the country as its own map’.

6.3.2 Emergent coordination

Emergent coordination is based on bottom-up mechanisms that are, for theoretical purposes, distinguished from ‘planned’ top-down representations. Emergent coordination occurs ‘due to perception-action couplings that make multiple individuals act in similar ways; it is independent of any joint plans or common knowledge (which may be altogether absent)’ (Knoblich et al., 2011:62). Perception-action coupling is the largely unconscious phenomenon whereby ‘the act of perceiving another person’s behaviour creates a tendency to behave similarly oneself’ (Chartrand & Bargh, 1999:893).

Emergent coordination is therefore a kind of ‘resonance’ phenomenon that leads individuals to ‘start to act as a single coordinated entity’ (Knoblich et al., 2011:62; see also Van Noorden & Moelants, 1999; Marsh et al., 2009; Spivey, 2007). Resonance can be defined in this context as the process of being exposed to an ‘external force [which is] equal or very close to…[that] of the system’ in such a way that one moves in an increasingly similar way to the external force (Van Noorden & Moelants, 1999). ‘Resonant’ perception-action couplings thus have the effect of blurring the boundaries that separate individual selves, and therefore emergent coordination promotes rapport, which for individuals in groups can facilitate the sense of belonging to the group (Knoblich et al., 2011:77; see Ch. 3.1).

Emergent coordination is spontaneous; for example, ‘pedestrians often fall into the same periodic walking patterns (Van Ulzen, et al. 2008; Zivotofsky & Hausdorff, 2007; [Zivotofsky et al., 2012]) and people engaged in conversation synchronize their body sway (Shockley, Santana, & Fowler, 2003) and mimic one another’s mannerisms (Chartrand & Bargh, 1999)’ (Knoblich et al., 2011:62).

Emergent coordination is also often unintentional. For example, in Himberg’s (2011) experiment (mentioned in 6.2.1) both participants were instructed to keep perfect time with a metronome, which they reported was easy to do, and yet the data show that they had in fact been entraining with each other, not with the metronome. Similarly, I found that walkers unintentionally entrained their walking while consciously performing other tasks (Hayward, 2009), and Clayton (2007a:48&49) has observed in his study of North Indian classical music that even though tanpura players intended their rhythms to be mutually independent, they were not aware of a complex set of ‘proto-metrical’ relationships between different periodic rhythms that

emerged nonetheless. These examples of small group entrainment demonstrate that the dynamics of rhythmic attraction may also be unconscious to a large extent (see also Konvalinka et al., 2010; Ch. 7.2.3 and App. 1.46). In a mass group context it has been observed that audiences in theatres tend to clap in unison spontaneously, presumably with a high degree of periodicity (Neda et al., 2000a,b). All these examples, displaying varying degrees of periodicity, demonstrate the principle of emergence through interaction. The kinds of communication channels that make joint emergent action possible will be discussed in the next chapter.

6.3.3 The integration of emergent and planned coordination

To summarise the emergent and planned perspectives, one could say that the emergent perspective understands behaviour as an a posteriori consequence of psycho-physical principles, and the planned perspective understands behaviour as following from a priori prescriptions in the form of mental representations (Schmidt et al., 2011:836; see also Kugler & Turvey, 1987; Turvey, 2005; Turvey & Shaw, 1995).

However, although this distinction may seem definite, specific instances of collective entrainment usually involve both sides of the bottom-up vs. top-down dynamic; in ‘planned’ coordination scenarios there is room for creativity and error, and in ‘emergent’ coordination scenarios a spontaneous shared structure will often emerge. For example, in terms of musical genre, ‘shared task representations’ may be the dominant mechanism behind highly-rehearsed and notated music, and ‘emergence’ the dominant mechanism behind freely-improvised music. Having said that, even highly-rehearsed, notated music can be performed with spontaneity, and free improvisation can be restricted by the training and enculturation a performer has had.

In the current dichotomy between emergent and planned coordination, ‘shared task representations’ at the group level are static and too reductionist, but, equally, analysing and modelling the emergence of the entrainment of individual body movements within group coordination is made difficult by its inherently complex, dynamic processes. Furthermore, although the perception-action couplings responsible for emergent coordination are explained by a process of ‘resonance’, resonance does not in itself explain how/why someone makes the first move, or how someone consciously and flexibly adjusts their coordination strategy in phases of transition, for example, and therefore some element of ‘planning’ may be necessary. Hence, there is a need for an integrative theory for both types of coordination.

Keller (2008) found that it is possible to predict the degree of body sway coordination in seven pairs of pianists using a combination of three separate indices of skill level (related to visual-motor representations (planned), metric attending (planned/emergent), and adaptive timing (emergent)), but not using any single index on its own. This suggests that musical coordination relies on both planned and emergent coordination. Furthermore, when individuals need to perform different actions from each other, it can be important for each individual to be able to resist the perception-action couplings with others that form the basis for emergent coordination (Knoblich et al., 2011:88). Thus, studying the hierarchical structure within a group is one approach to integrating planned and emergent coordination (see Chs. 7.4 & 7.5).

It is also difficult to grasp how an individual’s ‘planned’ task representation, based on their own life experience, preferences, and perspective, can be transformed into a shared task representation with a group of individuals who have similarly individual perspectives. Is the ‘shared task representation’ truly shared, in the sense that there is one unified representation acting downwards on the whole group, or is it the result of a continual ‘emergent’ process of negotiation between the individual perspectives of the different members?

Perhaps coordinating as a whole is achieved by ‘predictive’ planning, and accommodating individual variation is achieved by ‘reactive’ negotiation (Maduell & Wing, 2007:617). In terms of timing coordination, a shared sense of pulse allows individuals to predict the next beat, whereas individual variation from that shared pulse forces the group to react and renegotiate the pulse. For example, a given cycle of tal (see Ch. 3.2.3) has a fixed number of beats, yet musicians can sometimes make mistakes, such as adding or dropping a beat, or, more often, adding a segment of two or four beats because they have counted one segment when they have actually played it twice. At these moments the music is not conforming to the metrical ‘plan’, and therefore the musicians suddenly need to renegotiate the tal between each other in live interaction in order to keep going (Widdess, pers. comm.).
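The contrast between ‘predictive’ planning and ‘reactive’ negotiation can be illustrated with a toy simulation. The sketch below is not a model proposed in this thesis or by Maduell & Wing; it assumes, purely for illustration, a simple linear correction rule in which each of two players places the next beat one period ahead of their last one (prediction) and then shifts it by a fraction of the current timing gap to their partner (reaction). The function name `simulate_dyad` and the parameter `alpha` are hypothetical.

```python
def simulate_dyad(period_a, period_b, alpha, start_gap, n_beats):
    """Return the asynchrony (a's onset minus b's onset, in ms) at each beat."""
    onset_a, onset_b = start_gap, 0.0
    asynchronies = []
    for _ in range(n_beats):
        gap = onset_a - onset_b
        asynchronies.append(gap)
        # Predictive part: each player expects the next beat one period ahead.
        # Reactive part: each player shifts towards the partner by alpha * gap.
        onset_a = onset_a + period_a - alpha * gap
        onset_b = onset_b + period_b + alpha * gap
    return asynchronies


# Shared pulse (equal periods) plus mutual reaction: an initial 40 ms gap
# is halved on every beat because both players correct by a quarter of it.
shared_pulse = simulate_dyad(500.0, 500.0, alpha=0.25, start_gap=40.0, n_beats=8)

# Different individual pulses and no reaction: the players drift apart
# by 10 ms on every beat.
no_reaction = simulate_dyad(500.0, 510.0, alpha=0.0, start_gap=0.0, n_beats=8)
```

In this sketch prediction alone keeps two players together only if their pulses are already identical, while the reactive term is what absorbs a slip such as an added or dropped beat; real ensemble timing is, of course, far noisier than this deterministic example.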

This brings us back to the (by now) familiar tension—stability/uncertainty, global/local, representation/emergence, order/chaos—this time situating it within the context of musical entrainment. Prediction may also be associated with musical training, as well as a kind of ‘social memory’ (i.e. ‘shared task representation’) of past rhythmic interaction with familiar musicians (Oullier et al., 2008:3). However,

the comparison between predictive planning and ‘shared representations’ is limited because shared task representations seem to be characterised more like a Platonic ‘idea’ (a kind of unchanging ‘map’ or ‘manual’ of how to perform a particular joint activity) than like a time-dependent process of prediction that changes and evolves in response to creativity and error, as discussed in relation to any dynamic system.

To conclude this exploration of planned vs. emergent coordination, analogous to top-down vs. bottom-up processes, I want to make two main observations. First, it is unclear what ‘planned’ or ‘emergent’ coordination actually are in more than a theoretical sense, once one starts to observe the messy and complicated realities of live interaction. Second, as argued above, we still know very little about how these two forms of coordination interact, if, indeed, they can be separated in the first instance. Indeed, Knoblich et al. (2011:85) make the point that ‘numerous studies indicate that planning joint actions taps into several different mechanisms of emergent coordination recruiting the functionality of these fast and parallel mechanisms’. The following quote elaborates on this mutual interdependence:

‘Most forms of joint action likely require both emergent and planned coordination because there are complementary limits on what each can achieve. On the one hand, planning alone does not make people act at the right time, fall into synchrony, or predict others’ upcoming actions based on their own action repertoire. Although planning can prepare actors to perform their individual parts of a joint action, it does not guarantee successful implementation. Emergent coordination is likely the key to dealing with the real-time aspects of joint action. On the other hand, emergent coordination alone is limited in that it does not allow people to distribute different parts of a task among themselves, nor to adjust their actions to others so as to flexibly achieve joint outcomes. These aspects of joint action require planned coordination. The complementary limits of emergent and planned coordination suggest that it is the synergy of emergent and planned coordination that allows people to make music together, play team sports, or build a house.’ (Ibid. 91).

It is therefore clear that emergent coordination and planned coordination are interdependent aspects of the singular phenomenon of joint action. It seems that a ‘joint action’ can refer to a structured ‘task’ or activity that involves unpredictable human beings, not predictable machines. Groups of humans achieve these tasks and accommodate the inherent unpredictability of their members through messy bottom-up interaction guided by top-down shared goals. For Knoblich et al., many unanswered questions remain that need to be tackled if an integrative theory is to be formed. These are: “How can shared task representations tap into mechanisms of entrainment, perception-action matching, and predictive action simulation? Which perceptions need to be shared so that mechanisms of planned and emergent coordination will act in combination? Does emergent coordination have a role in how joint action plans are set up and how roles are distributed between individual actors? What is the role of emergent coordination in generating joint perceptions?…How do attributions of intention and knowledge in the pursuit of joint action goals interact with the mechanisms of emergent and planned coordination?…To what extent can shared task representations also be modulated by explicit beliefs about the partner’s task, or by beliefs about the partner’s beliefs, or intentions about one’s own task?” (Ibid. 92).

The next two chapters of this thesis will attempt to provide partial answers to these big questions.

6.4 Empirical challenges in investigating group entrainment processes

The laboratory findings in this chapter regarding entrainment are largely based on dyadic ‘tapping’ or other simplified rhythmic tasks outside real musical contexts. Focusing on dyadic interaction enables ‘experimental control of information exchange and a precise quantification of the nature and strength of the social interaction’ (Oullier et al., 2008:3), which would be more difficult to achieve with group interaction. Marsh et al. (2006:28; emphasis added) believe that ‘the logic of emergent phenomena of coordinated movement at the dyadic level should also hold for larger groups of individuals’. However, as soon as one starts to think about empirically testing multi-person timing interactions, there is an ‘interactional complexity explosion’ of the kind associated with the bottom-up approach of complex systems theory and associated computer models. Oullier et al. (2008:3) state that this is the most obvious obstacle in investigating group entrainment behaviour, but even dyadic forms of joint action, e.g. the rhythmic interaction between a mother and child, are themselves extraordinarily complex.

Impressive examples of mass synchronisation in human behaviour display intriguing ‘emergent properties’ (see Ch. 5.5). For example, Neda et al. (2000a,b) examined synchronised clapping in audiences and found that the applauding audience often starts by clapping in a highly unsynchronised (or uniformly distributed) way and then after a while suddenly claps in synchrony. What Marsh et al. (2006) find important is how ‘a clapping beat spontaneously emerges, disappears, and reemerges despite the fact that every member of the audience has a preferred clapping tempo and there is no external beat being brought to bear on the audience’

(see App. 1.51). Neda et al. (2000a,b) found that a cycle of synchrony emerged in the clapping: it started off fast and loud, because of more clapping per unit of time, but then slowed down to half the initial tempo to avoid synchronisation breakdown, and therefore lost volume, but then sped up again, presumably to get louder (Knoblich et al., 2011:69).

Furthermore, and especially relevant in the context of flocking and schooling behaviour (see Ch. 5.5), in the mass context of Neda et al.’s study ‘the common tempo of synchronous clapping emerged as a result of each clapper affecting the other, locally as well as globally’, and, as expected, there was a strong correlation between the local and global values (Marsh et al., 2006:28). However, it is difficult to describe the relationship between the local and the global to a satisfactory level of detail, because Neda et al. only compared the global noise intensity with the noise intensity of one (rather than many) local vicinity of individuals. Noise intensity spread over even a small group of individuals is a crude measurement of interaction compared with Cavagna et al.’s (2010a) measurement of the velocity of each individual starling (see Ch. 5.5.2).
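The difference between ‘uniformly distributed’ and synchronised clapping can be expressed as a single number using the order parameter of the Kuramoto model, which this section goes on to mention. The following is a minimal sketch rather than the analysis used by Neda et al. or Himberg: it assumes that each clapper’s phase (their position within the clapping cycle, in radians) is already known, and the eight-person audience and the function name `order_parameter` are hypothetical.

```python
import cmath
import math


def order_parameter(phases):
    """Return r in [0, 1]: the length of the mean unit vector of the phases."""
    if not phases:
        raise ValueError("need at least one phase")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical audience of 8 clappers; phase = position in the clapping cycle.
n = 8
scattered = [2 * math.pi * k / n for k in range(n)]  # evenly spread over the cycle
in_unison = [0.3] * n                                # everyone at the same point

r_scattered = order_parameter(scattered)  # close to 0: no common beat
r_unison = order_parameter(in_unison)     # close to 1: clapping as one
```

A value near 0 corresponds to the unsynchronised phase of applause and a value near 1 to the synchronised phase. Note that this is a snapshot of one moment: tracking how the value rises and falls over time would require recomputing it at every time step from phases extracted from each clapper’s signal.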

An issue of ecological validity is that laboratory studies that examine the intricate temporal dynamics of entrainment behaviour—e.g. those that employ the ‘tapping’ paradigm—use exclusively isochronous stimuli or make participants perform isochronous movements. Only a small minority of studies use real pieces of music for tapping along to, instead of metronomes (Himberg, 2013:198; see also Repp, 2005:985; Snyder & Krumhansl, 2001). Experimental evidence shows that musicians vary periodicity even when they are asked to perform isochronously (Palmer, 1989; Repp, 1999a; see review by Loehr, Large and Palmer, 2011). We also know that the average person is able to clap or tap along to musical and metronomic stimuli with variable tempo changes (Drake, Penel, & Bigand, 2000; Stephan et al., 2002; Repp, 2011; Repp & Keller, 2008). Moreover, in group music-making, even when individual musicians alter their tempos the rest of the group can still accommodate these changes (Goebl & Palmer, 2009; Shaffer, 1984; Loehr et al. 2011). Cottrell (2007:83) also argues that in group contexts although ‘it may appear that all musicians are playing exactly together, in fact there are likely to be minute differences in what are perceived as synchronous events’; and therefore periodic variability also occurs between musicians within the same group. Amusingly, when referring to these minor discrepancies of timing, Keil

(1994:96) quips that ‘music, to be personally involving and socially valuable, must be ‘out of time’ and ‘out of tune’’. The presence of non-isochronous variation in real-life contexts suggests that experiments which exclusively focus on isochrony are not ‘ecologically valid’ in the fullest sense. Experiments very rarely use stimuli of variable periodicity, but such designs are needed in order to cope with the messy reality of musical entrainment (for three exceptions, see Stephan et al., 2002; Repp & Keller, 2008; and Himberg, 2013). Such designs must also be contextualised within the particular tradition or culture that is being investigated, because the range of time values in which timing variability is acceptable may differ.

The next important step in entrainment research must be to develop means of analysing the moment-by-moment dynamic process of entrainment. Himberg (2013:226) argues that the windowed cross-correlation is a good analytical tool for dyads, but for groups the order parameter of the Kuramoto model and Hilbert transform (see 6.2.1) is able to provide ‘a similarly useful, continuous view of the entrainment’. However, an order parameter relies on stationarity, and therefore cannot be used to analyse ensemble performance such as Gregorian chant, which does not exhibit stationarity even though the choristers are ostensibly ‘in time’ with each other (see Ch. 1.2 for a definition of ‘stationarity’, and 8.5). As we will see in Chs. 7.4 & 8.4.2, another factor that influences the moment-by-moment process of group entrainment is the structure of power relationships.

6.5 Summary

This chapter has been about how groups are able to entrain their movements and sounds. The basis of the embodied approach to understanding social entrainment is the well-corroborated finding that people seem unable to resist entraining their movement behaviour with others in virtually all forms of interaction.
The downside of a focus on physical bodies is that it regrettably turns attention away from the direct intersubjective experience of social coordination, which is much harder, if not impossible, to access. However, focusing on embodied movements is useful because measurable aspects of musical interaction—e.g. the ‘pulse’ or musical metre, or communicative gestures displayed by moving bodies—are more amenable to

empirical analysis than subjective and intersubjective experience. Thus, the ‘embodied’ approach is perhaps the most reliable for studying entrainment. Having said that, although embodied aspects are empirically accessible, they are still difficult to measure because they exhibit highly complex temporal characteristics: even though human rhythmic movements exert a strong coupling attraction for other humans, the norm in entrainment is that they are temporally irregular and ‘error-prone’. The error correction processes that this temporal irregularity makes necessary in order to keep entrainment stable require bottom-up empirical analysis of coordination, which is harder to achieve in groups because there are currently no statistical techniques for analysing the web of error correction processes in a group, whereas pairwise analysis is sufficient for dyadic interaction. Thus, embodied entrainment researchers have, so far, focused primarily on dyadic, rather than group, interaction.

Embodied coordination can be split into two categories, planned and emergent: one top-down, the other bottom-up. Planned coordination refers mainly to top-down shared task representations that stipulate both the desired goal and the role for each individual within the group, and also to a certain extent govern entrainment processes between group members in real time. A top-down approach is useful for understanding the aspect of coordination that attracts individuals towards one state of coordination rather than another. However, the problem with ‘shared task representations’ that govern the actual dynamic process of entrainment is that they require individuals to project predictions into the future of where another individual might be, or how they might sound, and then on the basis of those predictions make movements or sounds in such a way as to maintain coordination with the other person.
First, such a process is unsustainable and unrealistic because it requires the ‘predicting’ individual to be correct in their prediction at all times in order to remain coordinated. Second, it is unclear how this prediction process would work when multiple individuals are involved. Third, due to the various ways in which live interaction is expressed, and its inherently messy and complicated nature, it is difficult to define what representations actually are. Fourth, the notion of intersubjectivity challenges the feasibility of a truly ‘shared’ representation, because it is hard to see how an individual can process and integrate the multiple perspectives of others when forming their own individual experience of a ‘shared’ reality.

Emergent coordination refers to the network of unpredictable, non-linear and cross-modal interactions that occur between interacting individuals, and is therefore the antidote to the explanatory limitations of planned coordination: it does not require individuals to predict, only to interact; its basis in non-linear bottom-up principles accommodates the actions of multiple individuals; and it assumes that coordination does not conform to a fixed plan, and is thus more flexible. Emergent coordination is a largely unconscious bottom-up form of coordinated behaviour that relies on the tendency for multiple individuals to act in similar ways by mirroring each other’s actions, and is based on a posteriori dynamic outcomes rather than a priori plans or representations.

However, there are limitations to the explanatory power of emergent coordination, which has been argued to generate data too complex and dynamic to analyse, and which does not make any testable predictions. On the other hand, planned coordination at the group level is static, too reductionist, and erroneously based on predictive certainty, rather than the reality of error-prone subjects. It is likely that there are no forms of interaction based exclusively on either emergent or planned coordination, and therefore these two forms of coordination are unlikely to operate independently. However, we do not know much about how they interact, and there is currently no model for integrating them. Therefore, the case study of Ch. 8 will attempt to go some way in exploring the integration of planned and emergent coordination by combining [i] interviews with choristers about aspects of consciously planned coordination with [ii] video analysis of body movements and sounds that display emergent coordination; i.e. by finding meeting points between what choristers say they do, and what they actually do.

Most of the discussion in this chapter was primarily theoretical, backed up by a range of empirical evidence. However, the current empirical approaches to studying entrainment behaviour are limited in a few specific ways. For example, the laboratory evidence often reduces musical entrainment to simplistic rhythmic tasks, such as ‘tapping’ to machine-generated isochronous beats, rather than human-generated stimuli of variable periodicity. The real-life setting of the Gregorian case study in Ch. 8 has the benefit of being ecologically valid, although, being fieldwork, it is not empirically ‘controlled’.

On another point, we do not yet know precisely how interpersonal coupling works. Hearing and vision are sensory channels used for coupling, but the recent discovery of a ‘social memory’ of a previous interaction which allows subjects to tap in time even when they cannot see each other suggests that there are forms of coupling that we may not yet be aware of (see Ch. 7.3 for more detail). The interviews in Ch. 8 will go some way in identifying forms of coupling by which choristers think that they interact, but, naturally, they can only speak about those of which they are aware.

The advanced forms of measuring equipment and video analysis techniques used by Cavagna et al. (2010a), which can monitor the movement of each individual starling in a flock of thousands, are available (in theory) for studying entrainment in large groups of humans (see Neda et al., 2000a,b). However, most entrainment researchers do not use them yet, possibly for financial reasons. Nevertheless, advanced forms of mass movement research (e.g. Cavagna et al.’s study) are constrained by whatever initial computer algorithms are used to analyse the video data, which may not take into account all the relevant aspects of the interaction. Human analysis of videos performed subjectively ‘by eye’ is sometimes preferable to more sophisticated research, because not only is it less expensive, but it also has the advantage that the observer can notice aspects of entrainment that might be missed by a computer. Analysing videos ‘by eye’ is, however, very time-consuming and therefore unappealing to researchers.

Now that the general theoretical forms of coordination behaviour in entrainment studies have been explored, the next chapter will focus on how these forms are embodied in group musical entrainment.

Chapter 7 - Group Entrainment: gestures, sensory communication, and group hierarchy

7.1 Introduction

In this chapter I will review the empirical literature that examines gesture, visual communication, leader-follower relationships in group hierarchy, and inter-group entrainment in the context of musical interaction. By studying this particular collection of aspects of musical interaction, I hope to examine further some of the questions that arise from thinking about bottom-up vs. top-down (or emergent vs. planned) aspects of group coordination. I will also explore the literature on joint speech entrainment, given that Gregorian chant is partly based on speech rhythm. The theoretical exploration of the various literatures in this chapter will prepare the ground for the empirical case study in Ch. 8, which investigates group musical entrainment by analysing videos in which two small choirs alternate verses of Gregorian psalmody, often under the direction of a conductor.

7.2 Gesture

I will begin by looking at bodily gestures that can be seen and not heard. Gestures tend to be body movements that have a clearly defined beginning and end, with the same start and end position (Kendon, 2013). They are also distinguished from body movements that have a practical purpose by having semantic intent or an expressive character (Ibid.). It may seem strange in a music thesis that one might investigate things which cannot be heard in order to gain understanding of the process of musical entrainment. However, sounds that originate from individual sources are often perceived as blending into each other into a collective sound, making the interaction between these individual sounds harder to analyse than gesture, particularly in live performances, rather than laboratory contexts. By contrast, each individual’s body gestures are easily differentiated from the next person’s gestures.

In the case study of Ch. 8 I will be investigating how the members of a choir can start singing at the same time as each other. Gestures are important cues for onset synchronisation because there is, by nature, no sound before a group of singers starts singing. Audible breathing cues are also important, but these are difficult to analyse in real-life contexts without individual microphones for each singer. Conductors and

ensemble musicians often speak of the importance of ‘breathing together’ in order to synchronise the onset of music, and therefore it may be that the audible sound of multiple breaths together is a cue used in synchronisation; although, of course, it is possible to inhale silently, or at least inaudibly. Probably for the above reasons, very little research has been done on breathing together before ‘coming in’ in ensemble playing, and therefore we do not know much about the role it plays in onset synchronisation. However, breathing often, though not always, has a visible gestural component of torso and head movement that can be studied (see Ch. 8.4.1).

Gestures can communicate intention, and in the context of a dyadic conversation there can be a kind of reciprocal error-correction process of gesturing where one participant proposes a notion and then the other attempts to clarify what they mean until participants eventually converge on a shared understanding (Levinson, 2006:43-4). In speech, Cummins (2012a:33) describes how gestures are ‘integrated into the temporal unfolding of speech in ways that are still being uncovered (Cassell et al., 1999; Leonard & Cummins, 2011; Wachsmuth, 1999)’; for example, ‘speakers and listeners coordinate their eye movements (Richardson et al., 2007) and even their posture (Shockley et al., 2003)’. Similarly, in the context of group musical performance, gestures are used as an effective means of communication, and in the context of temporal coordination one can observe an error-correction process similar to that which Levinson describes above (see Keller, 2008 in Ch. 6.2.1).

Tomasello (2008:60) distinguishes two basic types of human gesture: the first is used to ‘direct the attention of a recipient spatially to something in the immediate perceptual environment (deictically)’, and the second type of gesture is used to ‘direct the imagination of a recipient to something that, typically, is not in the immediate perceptual environment by behaviourally simulating an action, relation, or object (iconically)’. Tomasello (Ibid.) argues that underlying both types of gesture is the communicator’s intention ‘to induce the recipient to infer the communicator’s social intention—to do, know, or feel something’.

The best and most tangible way of directing somebody to something in the immediate environment is to point (Ibid. 70, 202). To refer to something outside of the immediate environment—e.g. an action that one wants someone else to perform—the best way of directing somebody’s imagination is to pantomime (e.g. in music-making, a conductor’s wide sweeping arc made with his hands to communicate desire for an expansive sound) (Ibid. 202-4, 233). One advantage of

iconic (pantomiming) gestures is that they require less ‘common ground’ because most of the information is in the gesture itself (Ibid. 203)—‘common ground’ refers to any implicit shared knowledge that communication between individuals is dependent upon, i.e. ‘the things that I know you know, you know I know, and I know you know I know’ (Levinson, 2006:49). In musical interaction, the ‘shared pulse’, for example, is an integral component of the ‘common ground’ (cf. ‘floating intentionality’ in Chs. 2.2.1 & 6.2.2).

Tomasello (Ibid. 221) argues, however, that at some point we need to go beyond iconic gestures that need ‘to be invented anew on every occasion’ and use communicative conventions instead. He describes conventions as ‘ways of doing things that are somewhat arbitrary—there are other ways they could be done—but it is to everyone’s advantage if everyone does it in the same way, and so everyone just does what everyone else is doing because that is what everyone is doing (Lewis, 1969)’ (Ibid.).

The key distinction between iconic gestures and conventional gestures is that ‘conventions require that they be ‘shared’, so that everyone can rely on everyone else in the group knowing how the convention is used communicatively’ (Ibid.). And the more that is shared and predictable among those that are communicating, the more the communicated messages become ‘reduced in form’, and therefore ‘system-like’ in the Luhmannian sense (Ibid. 301; see Ch. 5.3 on Luhmann). Therefore, in highly-stylised forms of music-making, such as Gregorian chant, conventional gestures are often used.

To summarise and contextualise, pointing (deictic) and pantomiming (iconic) gestures relate to bottom-up emergent coordination, and conventional gestures are analogous to the top-down ‘shared task representations’ of planned coordination (see Ch. 6.3). We will now investigate how useful these distinctions are in the context of musical performance.

7.2.1 Gesture in Music

Tomasello’s distinction between iconic and deictic gestures can be mapped onto gestures in musical interaction. Leman (2012:64) argues that a distinction can be drawn between gross and fine motor gestures. Gross gestures are typified by the kind of loose synchronisation of body movements observed in dancing, whereas ‘fine’ gestures are associated with more precise synchronisation. In ‘gross’ dancing gestures ‘there is a large spatial region where the body part can be at a particular point in time, which ensures tolerance for synchronisation variability’. This large spatial region allows for the ‘shape’ of the gesture to be drawn through space, and is therefore suitable for ‘iconic’ or ‘pantomiming’ communication. By contrast, fine motor gestures are deployed more precisely within space and time, and are thus more suited to the more precise demands of certain forms of musical interaction that are less tolerant of temporal variability. The precision of fine gestures lends itself to more ‘deictic’ purposes: just as pointing is deployed precisely within space and time, so are fine gestures.

The central point of Leman’s (2012) paper is that gestures travel through space and therefore take a certain amount of time, and therefore gestures can play a role in musical entrainment (see also Leman & Naveda, 2010). Put another way, ‘our interactions with music are based on our body, and therefore…corporeal movements will have an impact on entrainment’ (Leman, 2012:65; see also Leman et al., 2009, and Naveda & Leman, 2009). Indeed, there are certain musical traditions where if nobody is dancing then no music making is taking place (Cottrell, 2007:77; quoting Small, 1998:9).

Leman also draws attention to the ‘action-perception loops’ between various aspects of musical interaction: gestures that have an immediate effect on the sound produced; preparatory gestures made in advance of the sound that they relate to; biomechanical, physical, and acoustic constraints on the timing of these gestures; and, as discussed in Ch. 6.3.2, interactions between individuals in a group. All these feedback loops result in a ‘combinatorial explosion’ that is typical of the bottom-up approach. However, Leman comes to a similar conclusion to that reached in Ch. 6.2.2: he suggests that because the temporal and spatial aspects of gesture can be compared with the timing of acoustic cues in the music itself, gestural analysis is as good a pathway as any into investigating musical entrainment. Having said that, not all musical entrainment requires performers to communicate visually with gesture; for example, Seeger (1987:96) describes how the Suyá do not look at each other when performing unison songs, and, more generally, blind musicians can entrain with other blind musicians (see Chs. 4.5.1 & 4.6, and App. 1.50).

7.2.2 A conductor’s gestures

A conductor is required in certain musical contexts, usually large ensembles (although see Ch. 8 and App. 2.5.1), and is responsible for entraining the behaviour of many others to a high degree of temporal precision by exclusively visual means, i.e. silent body movements or gestures and no sounds. Luck & Sloboda investigated the beat-inducing aspects of conductors’ gestures empirically and found that ‘temporal information plays a role in visual beat induction, while spatial information does not’ (Luck & Sloboda, 2008:237). Having said that, Burger et al. (2012) found that when dancing to music with a clear pulse people’s movements showed relatively small spatial variation, but when dancing to an unclear pulse they moved with large spatial variation, and therefore the role of spatial aspects of a conductor’s gestures in music with an unclear pulse may need to be investigated further.

7.2.3 Emergent multi-level pulse hierarchies in gesture

The next three studies investigated the role of gesture in musical ensemble performance. Clayton’s (2007a) study, which analysed video material of tanpura performance (see App. 1.46), discovered an emergent and unintentional multi-level pulse hierarchy in the performers’ gestures that was more complex than the ‘much simpler consciously-perceived metrical structure’ that has been assumed to underlie tanpura performance. Clayton also found that each performer employed a different strategy for entraining with their fellow performers based on their role within the group hierarchy (discussed in more detail in 7.4.1 below). Toiviainen et al. (2010) found, using motion capture, that when people danced to a 12-bar blues progression in 4/4 metre at various tempi, different metrical levels were embodied in the movements of different parts of the body (see also Burger et al. 2012).
They found that faster metric levels (a single ‘tactus’ beat) are embodied in the extremities such as arms and hands, and slower levels (periods of two and four beats) are embodied in the central parts of the body such as the upper torso. A different study of the choral singing of a South African song also used motion capture technology, and, similarly, found that different metrical levels are embodied in the movements of different parts of the body, and that experts (four South Africans) are able to simultaneously embody these multiple levels in different limbs more flexibly than novices (four Finns) (Himberg, 2013:211; Himberg & Thompson, 2011; see also App. 1.49). In the context of the Gregorian study in Ch. 8 it is interesting that Himberg found that head movements tended to express more metrical levels than other body parts, and that the body movements of a group of experts were more coherent than those expressed by a group of novices (Himberg, 2013:211, 213; see Chs. 8.4.1 & 8.4.2).

7.2.4 Categories of musical gesture

The video material in the study in Ch. 8 will involve the coding of specific aspects of behaviour—e.g. gaze direction, head nods, visible breaths etc. (Clayton, 2007b:75; see also Clayton, 2007a). Clayton (2007b) describes how bodily gestures can be divided into two categories: [i] Markers (nondepictive gestures) of musical process or structure, which include marking focal moments such as the beginning of a phrase, or beating out a regular pulse (markers are equivalent to Tomasello’s notion of ‘deictic’ gestures, and Leman’s notion of ‘fine’ motor gestures; see video clip 1 of App. 3); [ii] Illustrators (depictive gestures), which express the emotional or semantic content of the music, or ‘appearing analogous to the melodic flow or ‘motion’’ (illustrators are equivalent to Tomasello’s notion of ‘iconic’ gestures, and Leman’s notion of ‘gross’ motor gestures; see clip 2). Any of these gestures can exist as part of a repertoire of conventional gestures, and musicians in a context as highly-stylised as Gregorian chanting may over time learn such a repertoire.

In the Western context of jazz performance, players may click their fingers in order to start a piece in the right tempo, and, in order to finish a piece, may nod their heads ‘meaningfully’ some time in the last few bars of a piece (Faulkner & Becker, 2009:155). These and any other negotiations ‘can be accomplished with a nod, a small hand gesture, or meaningful look whose meaning arises contextually’ (Ibid. 156; see clip 1). Gestures like these are deictic ‘markers’ because they mark out specific structural moments, and, in the case of finger clicks, communicate the beat. However, such gestures may also contain illustrative, iconic elements to describe how the players want the changes to happen. Similar to jazz coordination, the most important gesture category for synchronisation at critical moments in Gregorian chant is that of ‘marker’ gestures, and, in particular, head nods (see clip 15). Gregorian chant also employs illustrator gestures associated with gross torso movements and the conductor’s hand movements, which ‘shape’ the sound for stylistic purposes (see clip 2).

7.3 Sensory channels

As well as investigating the temporal characteristics of body movements in entrainment it is also important to study the role of the visual channel (through which temporal-gestural information is communicated), and various behaviours associated with it, such as gazing (eye direction) and facial expressions etc. (Kawase, 2012:522). Without vision, achieving the temporal complexity evident in some forms of music-making would arguably be more difficult. For example, Will et al. (2005) found that ‘there was significantly higher between-subject agreement in [pulse] tapping to the audio-visual than to the audio-only presentation’ in vocal North Indian alap performance. Will (2011:181) lists various other studies on cross-modal rhythm perception that, taken as a whole, would suggest that auditory information is dominant in rhythm perception (see Wada et al., 2003; Guttman et al., 2005; McAuley & Henry, 2010; Patel et al., 2005). However, this conclusion is not definitive because the artificial visual stimuli used in some of these experiments were very different to those of real body movements, and therefore the following discussion about the visual channel in entrainment will focus on real bodies moving in time.

7.3.1 Do we need visual contact with another to entrain?

Oullier et al. (2008) investigated how visual communication affects entrainment by testing the ability of pairs of people to tap in synchrony with each other with either their eyes closed or open. The authors’ hypothesis was that the visual coupling between participants would induce spontaneous entrainment between their taps (Ibid.). They found that each person was behaving more independently of the other when they couldn’t see each other’s finger movements, and that they fell more into synchrony when they could see each other (Ibid. 6).
This was also shown to be unintentional because ‘no participant reported having intentionally tracked the finger movements of the other during the experiment’ (Ibid. 7; see also Richardson et al., 2005). This finding leads to the conclusion that what was being demonstrated in this experiment was ‘an emergent behaviour spontaneously [and immediately] brought about by [visual] information exchange’ (see also Katahira et al., 2007). This is to be differentiated from studies that either ask one person out of the pair to lead or follow the other, or interfere with the spontaneous behaviour by any other instruction (e.g. Schmidt & O’Brien, 1997). Clayton (2007a) has also highlighted ‘the importance of visual information in musical entrainment’ in real-life contexts, describing how a certain tanpura performer ‘might not be able to hear the periodicity of the singer’s tanpura pattern to which she
seems to be entraining in this extract. It is more likely that she is entraining her pattern to a periodic visual rhythm (e.g. in the movement of the singer’s back and shoulder)’. From Oullier’s and Clayton’s research, and common sense, it is clear that being able to see a rhythmic action helps us synchronise with it. Indeed, an experiment by Schmidt, Carello, and Turvey (1990; quoted in Cummins, 2009) demonstrated that the only basis for maintaining an inter-person coordination of limb movements, where each limb belongs to a different person, is visual contact. However, in addition to bottom-up processes observed in their experiment, Oullier et al. also found a mysterious ‘social memory’ effect. They found that within a single trial consisting of three segments—‘eyes closed’, ‘eyes open’, then ‘eyes closed’ again, each segment starting with a sound and running seamlessly into the next segment—participants in the final ‘eyes closed’ segment were still influenced by the rhythm established in the previous ‘eyes open’ segment, rather than reverting to their personal preferred tapping frequency in the first ‘eyes closed’ segment. Even when the duration of the final ‘eyes open’ segment was extended this effect was maintained, but, surprisingly, when a separate new trial started, the influence was lost.

In summarising this section on whether we need visual contact to entrain, it would seem that the visual channel plays an important role in body movement entrainment. Of course, this is not the whole story in music-making because musicians entrain with sounds, as well as movements, and, as mentioned above, groups of people who are blind can still make music very well (e.g. App. 1.50). Maybe one explanation for not needing visual contact to entrain is that musicians become familiar with each other’s timing patterns (among other aspects) over the course of repeated rehearsal or performance, and therefore there may be a top-down ‘social memory’ influence acting on their subsequent music-making together (e.g.,
see Williamon & Davidson, 2002). However, as discussed in Ch. 6, static top-down influences such as this cannot explain adaptation to dynamic and unpredictable timing variability. Thus, it would seem that the role of visual coupling in music-making is complex, and is but one of a number of possible forms of coupling.

7.3.2 Gazing behaviour at boundary points in musical performance

Kawase (2013) analysed the role of gazing behaviour in coordinating piano duo performance (i.e. two pianos, two players), choosing to focus on tempo changes because other researchers had found that performers tend to look at other performers ‘at major boundary points of a piece and barely look at them during pieces with only small tempo changes’ (e.g. Kawase, 2009; Keller & Appel, 2010; Williamon & Davidson, 2002; Luck & Sloboda, 2008). Kawase (2013) found that ‘gazing alone could to some extent enhance coordination even though movement cues were not available [performers’ heads were fixed in position]…although movement cues are necessary for strict coordination’. Furthermore, he found that ‘although mutual gazing (i.e., partners looking toward each other) just before the coordination moment facilitated synchronisation, eye contact (i.e., looking into partners’ eyes) might not be of much importance’, and that ‘only the mutual gazing just before the onset of sound [not at the onset of sound itself] reduced the timing lag within coordination’. What is interesting is that the gazing had to be mutual, not merely one-way, and that synchronisation of onset was negatively impacted if there was no gaze contact just before the onset of sound. Luck & Sloboda (2008:225) also argue that at particular structural moments in musical interaction in larger ensembles, the visual modality comes to the fore: ‘the role of vision is particularly important at the very beginning of a sequence of coordinative movement, especially when there is no auditory information provided…like the first note of an orchestral performance. In such cases, synchronisation can only be achieved by watching other dancers, other musicians, or the conductor’ (see also 7.4.2 below). The fact that visual communication seems to be most common at points of tempo change, or at the beginning of phrases, may help to explain why Moran (2010) found that in North Indian classical, Western jazz and folk music videos musicians’ gazing behaviours have fairly consistent durations of only one to four seconds, and why Fredrickson (1994) found that performers only looked at a video recording of the conductor 28% of the time in performance, and for an average duration of 1 second each time, even when playing on their own listening to a band via headphones. The use of visual movement cues at head height at difficult moments of coordination, such as the beginning of a phrase, is also evident in Gregorian chant performance (i.e. head nod movements) (see Ch. 8.4.1).

7.3.3 Which cues do performers use most for entrainment?

Kawase et al. (2007) ran a survey to investigate which communication channels performers and listeners use in musical performance, comparing classical with pop music performers. It is worth bearing in mind that these are the channels that participants in the survey thought they used most in musical performance, not necessarily the channels they actually use most. The results were that ‘musical sound’ was the most commonly-perceived communication channel, an expected result, with eye contact and breathing in second place.

The degree to which each communication channel is used depends on whether musicians are rehearsing or performing. During rehearsal, according to the survey research, speech was perceived to be as important as body movement and facial expression, but speech was not used as much in live performance (Ibid. 78). However, this is probably due to extra focus in rehearsal on ‘tricky’ moments of entrainment, whereas for the majority of entrainment in live performance I would suggest that body movements are more important than speech.

In rehearsal, body movements are more important than breathing for entrainment for popular music performers, and vice versa for classical music performers (Ibid.). The authors also found that ‘breathing together’ at pauses during live performance not only happens between co-performers but also with listeners too (Ibid. 77). The fact that breathing together is seen to be more important than body movements in classical music is relevant in the context of the Gregorian case study which focuses on the role of visible (gestural), but not audible, breathing.

Kawase et al. (Ibid.) suggest that the preference for breathing cues in classical music might be due to the fact that in a ‘classical’ configuration performers face a single direction and have limited movement and range of vision and therefore need to rely on audible cues, whereas popular music performers spend more time looking at each other and moving around. This is certainly the case with Gregorian chanting, where choirs stand in rows facing a particular direction. However, although it would seem from the Gregorian study in Ch. 8.4.1 that being able to see one’s neighbour directly is not required for synchronisation, peripheral vision may be an important form of coupling. The authors do not mention peripheral vision in their study.

Touching and interpersonal distance were, on the whole, not highly rated in comparison with other communication factors, but there may have been a tendency for popular music performers to rate these factors higher than classical musicians (Ibid. 78). In Corsican and Sardinian singing touch is an essential part of paghjella performance, and perhaps touch is used because it helps group coordination in the context of metrical irregularity (see Chs. 3.1.3 & 3.2.1; it is also common for Corsican performers to close their eyes when singing, see App. 1.12).

It is clear from the above survey that the visual and auditory channels are the most important for group entrainment in music. However, it is difficult to pin down one communication channel as being totally responsible, because entrainment seems to combine multiple channels most of the time. There may also be forms of sensory communication channel that we do not yet understand. For example, Konvalinka et al. (2011) showed that even with no obvious auditory, visual, or haptic coupling, the individual heart rates of participants in a fire-walking ritual and those of their family relatives were synchronised, even when the family relatives were not synchronising their body movements with the family member as they watched them walk across the hot coals (see App. 1.52).

7.3.4 The effect of group hierarchy on visual communication

The hierarchy of performers who have different roles within a group also affects bottom-up gestural and visual communication. Sometimes accompanying musicians will pay closer attention to the gestures that a soloist makes. In flamenco performance, for example, Maduell & Wing (2007:607) explain that the focal performer is usually responsible for the greatest number of gestural cues, and they give more visual cues than auditory cues, in comparison with, for example, a ‘palmero’—an ensemble member who is not actively engaged in a piece, but who provides vocal and rhythmic encouragement. Rhythmical cues of a flamenco dancer might include ‘palmas, taconeo, pitos (‘fingersnapping’), and rhythmical body movements’, but a singer, or a palmero, ‘will usually control the rhythm via palmas, although stomping and rhythmical speech may also be used’, whereas a guitarist controls rhythm ‘by rhythmical strumming, picking and tapping on the guitar.’ (Ibid.; App. 1.53). Many of these cues are both visual and auditory, and each has a different degree of coordinating influence depending on who makes them within the hierarchy of power relationships.

The kinds of rhythmical movements observed in Gregorian chanting (e.g. a head nod) are simpler than in flamenco performance. When there is no conductor, Gregorian chanting also differs from flamenco in that there is a less pronounced group hierarchy, with more equality amongst singers; although even in this context, performers look to other performers whom they perceive to be more competent.

7.4 Intra-group hierarchy in entrainment

As we have just seen, individuals within a group have differing degrees of influence on the overall musical entrainment; in other words, certain performers are looked towards, or listened to, more than others in order for the whole group to be ‘in time’. The focal individual/s can also vary throughout a single piece depending on whether the group is starting, stopping, transitioning between tempos, or rhythmically improvising or accenting (Maduell & Wing, 2007:622). In those cases where there is no obvious leader, the complex system of power relations that results from aspects such as how competent an individual is perceived to be by others is also likely to affect timing interactions between group members. Therefore, in terms of analysing group entrainment, each group member can be thought of as a separate channel in addition to the multiple sensory channels that have been discussed. Consequently, there is a huge number of possible permutations of these sensory and hierarchical interactions. As mentioned previously, this makes the phenomenon of group entrainment very difficult to study.
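
The scale of this combinatorial problem can be illustrated with a rough count. The sketch below assumes, purely for illustration, that every performer can influence every other performer through each sensory channel independently; the function name, the group sizes, and the choice of three channels (auditory, visual, haptic) are assumptions introduced here rather than figures taken from the studies cited.

```python
def directed_couplings(group_size: int, channels: int) -> int:
    """Count potential one-way couplings within a group.

    Assumption (illustrative only): each performer may influence every
    other performer separately through each sensory channel, so the
    count is group_size * (group_size - 1) * channels.
    """
    if group_size < 0 or channels < 0:
        raise ValueError("group_size and channels must be non-negative")
    return group_size * (group_size - 1) * channels


if __name__ == "__main__":
    # Hypothetical group sizes: a quartet, a small choir, a large orchestra.
    for size in (4, 16, 80):
        print(size, directed_couplings(size, channels=3))
```

Even under this simplification the count grows roughly with the square of the group size, which is one way of seeing why larger ensembles are harder to analyse than small ones.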

7.4.1 Group hierarchy in performer-led small ensembles

The complexity of group hierarchy depends on both the size of the group and the number of different musical roles involved. Flamenco performance is a good example of a relatively small ensemble context with a complex set of power relations between performers, as compared with other types of small musical ensembles. Many different artists at different levels within the flamenco company can function as leaders, as the following list shows: ‘[the ‘company head’ performer (highest status),] company head’s partner (if the head is a performer), guest artists, ‘first dancers’ (second to company heads in importance and possibly equal in status to any guest artist), ensemble members who are well known to (and respected by) both the performing community and knowledgeable members of the audience in their own right, and long-term members of the ensemble’ (Maduell & Wing, 2007:603; see also App. 1.53).

A performer’s status in the hierarchy of a flamenco troupe usually correlates with the amount of control they have over the timings of a piece, as well as other musical parameters. Having said that, on some occasions a lower status member, e.g. a palmero, may have a musical influence when, by prior arrangement, they are given a specific purpose such as starting the palmas at a certain tempo (Ibid.). If a timing error develops, a leader might take control of the situation to restore order, or, if that fails, the task falls to ‘whoever has the ear, strength and volume to lead the others back into the rhythm, even if that person has lower status in the ensemble’ (Ibid.). When the metre is well established, other members apart from the leader may on occasion ‘provide rhythmical variation, especially if the performer is engaged in dance, singing etc., but their contribution is normally restricted to support, taking care not to overpower the focal performer’ (Ibid. 604).

There is also a complex distribution of entrainment power relations in North Indian khyäl performance. For example, the harmonium player is fixed on the singer, yet the tabla player interacts more widely with other players (Clayton, 2007b:81; see App. 1.29). Clayton found that illustrator gestures (see 7.2.4 above) were mainly performed by the lead performer, and so too were beat marker gestures (particularly on sam, which is the most important beat of a metrical cycle, perceived as either the first or last beat, even though usually written as the first). The following quote by Sadanand Naimpalli, from his manual of tabla playing, concerns the primacy of the main artist’s gestural contributions and would be compatible with advice that might be given for flamenco performance (and many other traditions too): ‘Be very alert and observe every movement or nuance of the main artist. Many times instructions are conveyed to the accompanist by small gestures of the hands, head or eyes’ (Naimpalli, 2005:69; quoted in Clayton, 2007b:81). By contrast, the movements of ‘supporting’ tanpura performers, i.e. those who provide a consistent drone, play less of a proactive role in entrainment given that they only occasionally make eye contact with the main artist, and sometimes even with audience members, or show their enjoyment with inconspicuous head movements, but most of the time look straight ahead or downwards.

Doffman (2008) investigated the relationship between group dynamics and timing relationships in small jazz ensembles. A drummer, bassist, guitarist and soloist in a jazz quartet tend to be closely entrained. Within this overall tight entrainment, ‘small nuances and shifts between relatively tight and loose coordination, or between a particular musician being slightly ahead or slightly behind another, can be intensely meaningful for these musicians, and these phenomena are tightly interwoven with musicians’ estimates of their own and others’ capabilities and characteristics as musicians, and with their understanding of the ideals to which jazz performers should aspire’ (Clayton, 2012:54). It is clear that the timing relationships themselves between musicians are a means for assessing the relative competency of each individual, which is often determined by how well a player is perceived to ‘groove’ by varying the timing of their onsets in relation to a highly-stable abstract pulse that all players perceive and share (see Prögler, 1995; see App. 1.42). In this sense, therefore, timing relationships also feed back into the group hierarchy.

In ensemble performance contexts questions such as ‘who can set the rhythm, who can originate accent or rhythmical variation, and how stops, starts and other changes can be cued (and by whom)’ may not always lead to straightforward answers (Maduell & Wing, 2007:622). In flamenco performance, for example, determining who the ‘leader’ is is very tricky, even though each member knows their place within the ‘compás’ hierarchy (Ibid. 602). Maduell & Wing suggest that focal or higher-status flamenco ensemble members are usually responsible for coordinating such changes; having said that, rather than merely following the leader, the lower-status members are expected to be sensitive to the timing of each member in the ensemble, not just the leader. The same is likely to be said for khyäl and jazz ensembles, and many other ensembles too.

To give an example, Murnighan & Conlon (1991) examined the relationship between the group hierarchy dynamics of professional string quartets and their performance as a group; what seemed to transpire was that a quartet’s success in performance rested upon their ability to manage (but not necessarily resolve) the tensions between leadership vs. democracy and confrontation vs. compromise. However, this tension is not always in play; for example, Wing et al. (2014) found, by analysing the timing interactions between two players in two world-class string quartets, that although synchrony was maintained by democracy in one quartet (i.e. the timing interactions revealed no obvious leaders or followers), synchrony was maintained by autocracy in the other (i.e. the first violinist was the leader and the timing of the other players’ notes followed) (see also App. 1.54).

Flamenco, khyäl, jazz, and string quartet ensembles are not unique: any other situation that involves individual musicians within a group who have different roles or statuses—some rhythmic, melodic, dancing, singing, or other leader or follower roles—involves complex group dynamics. Some examples of larger ensembles that display this kind of organisation are opera companies, ballet companies, orchestras, circus bands, folk bands and dancers etc., and we will now turn to larger ensembles that require a centralised form of coordination: the conductor.

7.4.2 Group hierarchy in conductor-led large ensembles

Larger ensembles with complex role hierarchies, such as the examples just given, tend to fare best under the guidance of a conductor, because as ensembles become larger they become increasingly unmanageable without a centralising guiding force. However, a conductor can be seen as the centre of a network in which sections within an orchestra, for example, also often have their own leaders, so it is not a simple case of orchestra as follower and conductor as leader. There are different levels of interaction, such as that between section leaders and their section members, between members of sections themselves, pairs working from a single sheet of music at the same desk, and cross-section interactions too. Indeed, Himberg (2013:107) argues that different subgroups of a musical ensemble like an orchestra or choir might involve ‘entrain[ing] more tightly with each other than the group as a whole, or even competing subgroups all performing together, allowing within-group and between-groups entrainment to be compared’; i.e. that ‘the orchestra can be radically out of phase with the conductor, yet still be perfectly in sync within [itself]’ (Ibid. 112). In other words, the conductor may be part of a ‘between-groups’ feedback loop given that he/she is likely to not only exert an inflexible top-down influence but also respond in a bottom-up way to the unexpected variations of sections performing as a whole, and possibly even individual members of sections.

Even though a conductor’s clear visual beat may help an ensemble, an experiment performed by Luck & Toiviainen (2006:189) found that the ‘visual beat’ of the conductor’s gestures is not as significant an entraining influence as the orchestral sound itself (see also Repp & Penel, 2004; and Cottrell, 2007:80). As an illustration, Himberg (2013:112) argues that if ‘the rhythmic blasts of the tuba and the equally rhythmic but silent waves of the conductor’ were put against each other to see which had the bigger effect on synchronisation, the result ‘would not be a difficult one to bet on’.

Of course, at tempo transitions and ‘stops and starts’, for instance, the conductor’s gestures are particularly important (Cottrell, 2007:80; see also Luck & Sloboda, 2008). However, other coordination cues are also in play at these moments, such as the audible breathing and body movements of other performers etc. Furthermore, during the normal flow of performance in between these stops and starts, ‘the conductor can catch the gaze only of certain key performers at any given time, and there may be many musicians, particularly in the middle of the large string sections, who do not make direct eye contact with the conductor at all during the performance’ (Ibid. 86).

Due to this overall preference for the auditory signal and the fact that performers tend to ‘lag behind’ the conductor’s gestures whilst being in synchrony with each other, Luck & Toiviainen (2006) infer that musicians synchronise mainly with their fellow musicians, and ‘in a somewhat looser fashion’ with the conductor. Similarly, Clayton (1986) investigated the influence of other players, the conductor, the score, and a performer’s own sense of rhythm on ‘ensemble performance’, and found that ‘of these, the other players and the conductor were the most important ones, in this order’ (reported in Himberg, 2013:111). Nevertheless, in spite of this order of priority, orchestras tend to be arranged ‘in a semi-circular layout that has as its focal point the conductor’s rostrum’, thus reducing the eye contact between orchestra members, but maximising eye contact with the conductor (Cottrell, 2007:86; see also App. 1.55).

All these various interactions create an even more complex heterarchy than that observed in smaller ensembles, which may be the reason why ‘smaller ensembles synchronise better without a conductor than larger ones’ (Maduell & Wing, 2007:595-6). Thus, a threshold exists somewhere between small ensembles, where members can each respond to each other in a self-organising way, and larger ensembles, where a singular reference point (the conductor) is necessary because it becomes unfeasible for an individual performer to pay attention to every other ensemble member’s timing. Indeed, with more rhythmically complex music, e.g. some contemporary works, the conductor is often necessary even for smaller ensembles to keep the players in time with each other, and the gesturing of the beat may become more rigid and less expressive (Cottrell, 2007:84). By contrast, in rhythmically-simple, as well as rubato, music the conductor’s need to set a rigid beat is lessened, and the bottom-up influence of individual players may have a stronger effect on the overall timing patterns (Ibid.). Currently, the work that has been done on the role of a conductor in group entrainment (e.g. Luck & Toiviainen, 2006; Luck & Sloboda, 2008) is not very informative about the impact of metrical complexity on group entrainment in large ensemble contexts.

In light of the above considerations, the fact that the eight-member choir performing Gregorian chant in Ch. 8 has a conductor raises questions such as: is Gregorian chanting rhythmically complex? Does the lack of an obviously predictable metre in Gregorian chanting mean that conductors are necessary even for very small ensembles? And which moments in musical entrainment require the conductor (e.g. starts, stops, and tempo changes), and which do not?

7.4.3 The effect of the leader-follower relationship on entrainment

It has already been mentioned that most of the study of entrainment has focused on dyadic contexts (see Ch. 6.4). Even though dyadic interaction is different from group interaction, dyadic studies can nonetheless point towards the effect of group hierarchy on entrainment. One particular experiment carried out by Goebl & Palmer (2009:427) attempts to examine the temporal dynamics of a dyadic leader-follower relationship in piano duet performance (i.e. two players, two pianos; e.g. App. 1.56). The pianist who played the upper part, which had more notes in it, was referred to as the leader; the other pianist played the bottom part and was the follower. There were three experimental conditions: [i] full auditory feedback, [ii] one-way feedback (leaders heard themselves while followers heard both parts), or [iii] self-feedback only. As auditory feedback was reduced, the ‘leader’, who was playing more notes per beat, played in increasing asynchrony with the follower. Furthermore, the reduced auditory feedback meant that the visual channel became more important: the pianists’ head movements became more synchronised, and leaders raised their fingers higher above the keyboard to give the follower a stronger visual gesture in place of the reduced auditory feedback (Ibid. 436).
Goebl & Palmer found that head movements on their own were not enough to keep in synchrony without auditory feedback, although the authors suggest that if more of the body, rather than just the head, was visible then this finding might be different (Ibid.). Anticipatory asynchrony has been argued to be an index of leadership; Maduell & Wing (2007:595) describe how Rasch (1979, 1988) also found this in three professional trios that he examined. For example, in the Goebl & Palmer study, the leader’s head movements and piano sounds tended to come before the follower’s. Like Goebl & Palmer, Rasch also found that melody parts tended to lead but that ‘both musicians and observers were in general unaware of the degree of asynchrony present’. Indeed, Uhlig et al. (2013:53) argue that asynchrony between a leader and follower may even be beneficial, because it has been shown that ‘a certain degree of asynchrony between parts facilitates the perception of separate tones and is required for stream segregation (Handel, 1989; Rasch, 1979; Wright & Bregman, 1987)’ (see also Keller, 2008; Large & Palmer, 2002; Hove et al. 2007; Prögler, 1995). Kawase (2011) also found an asymmetry in looking behaviour in piano duo leader-follower relationships because the followers looked at the leaders for longer than the leaders looked at the followers. Both the fact that the leader anticipates the gestures of the follower, and the fact that followers look at the leader for longer than the leader looks at the follower, suggest that the leader-follower relationship is similar to the conductor-orchestra or conductor-choir relationship. One might expect that because the leader had been explicitly told they were the ‘leader’ in the piano duo study there might be an asymmetrical relationship between the pair; however, ‘interonset timing suggested bidirectional adjustments during full feedback despite the leader/follower instruction, and unidirectional adjustment only during reduced feedback’ (Goebl & Palmer, 2009:427). These bidirectional adjustments may explain Keller’s (2008) observation that ‘the typical degree of asynchrony in musical ensembles (around 30-50 ms) is far smaller than would be expected if musicians were sheepishly reacting to the sounds of an individual serving as the leader’ (see Rasch, 1979; Shaffer, 1984).
The notion of bidirectional adjustment, therefore, combined with the previous example of the feedback that flows both ways between a conductor and his orchestra (see 7.4.2), provides evidence for the inherent mutuality of the process of entrainment in these dyadic and group contexts. It also fits with evidence from two-person tapping studies by Himberg (2011, 2013) and Konvalinka et al. (2010), and a group choir study (Himberg & Thompson, 2011; see also Himberg, 2013), which show that the strongest cross-correlations occurred at lags 1 and -1, not at lag 0 as would be expected for exact synchronisation. A cross-correlation at only one of these lags (either 1 or -1) would be evidence for a leader and a follower, but cross-correlation at both 1 and -1 means that mutual adaptation is inherent to two-person entrainment (Himberg, 2013:104). In one of the two-person tapping studies, Himberg (Ibid. 196) found that mutual adaptation between humans ‘can lead to very high entrainment, even simultaneously with high variability in period and phase’. Hence, mutual adaptation makes sense in the context of naturalistic entrainment, because such entrainment is noisy and unpredictable when compared with metronomic entrainment (Ibid. 105; see also Moore & Chen, 2010).

Goebl & Palmer (2009:436) found that a melody with twice as many notes as the accompaniment required more head movements from each player in order to synchronise than the reverse, an accompaniment with twice the notes of the melody. Uhlig et al. (2013:53) found that melody and accompaniment roles affect the leader-follower relationship in piano duets, but added that in some genres melody is not always ‘primary’; for example, in Western classical music the melody often leads, whereas in jazz the accompaniment often leads. The findings of these two studies would suggest that in addition to power relations between performers, there are power relations inherent within the musical structure, in the sense that the ‘melody’ or the ‘accompaniment’ has interactional priority. This might also be extended beyond the dyadic context to suggest that the interactional complexity of a large ensemble depends on how many players within the ensemble are playing the melody. In other words, in ‘melody-primary’ contexts, the larger the proportion of musicians that are singing or playing the melody, the less complex the gestural and auditory coordination processes of the ‘accompanying’ musicians will be, and vice versa. In the context of the case study in Ch. 8, given that Gregorian chant is melodically and rhythmically simple, and that there is only one unison part with no accompaniment, the main influence on interactional complexity is most likely the total number of choristers in the choir. In summary, Himberg (2013:114) observes that the literature on social and musical interaction in ensembles shows that ‘while there might be different social roles, most of the negotiation and decision-making is done via music and musical interaction, and relies on the mutual adaptation process to achieve a common musical goal’.
Therefore, the relative importance of social hierarchy in group musical entrainment is entangled in a web of factors. However, one can conclude that the most common function of leaders in performance is to coordinate transitional moments, such as tempo changes and starts/stops (as well as extra-performance factors such as ‘who to include in the group, what to play, how to play it’) (Ibid. 119). But even transition in live performance is always negotiated within the context of mutual adaptation.
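
The lag 1 and lag -1 signature of mutual adaptation reported in the tapping studies cited in this section can be made concrete with a toy simulation. The following Python sketch is purely illustrative and is not the analysis used by Himberg or by Konvalinka et al.: the linear coupling model, the function names (`simulate_dyad`, `lagged_correlation`) and every parameter value are my own assumptions. Each simulated tapper sets the next inter-tap interval partly from the partner's previous interval, and the two interval series are then correlated at lags -1, 0 and 1.

```python
import random


def pearson(u, v):
    """Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag]; lag may be negative."""
    if lag >= 0:
        return pearson(x[:len(x) - lag], y[lag:])
    return pearson(x[-lag:], y[:len(y) + lag])


def simulate_dyad(n, follow_a, follow_b, seed=0, base_ms=500.0, noise_ms=20.0):
    """Hypothetical coupling model (an assumption, not taken from the literature).

    Each tapper's next inter-tap interval is the base interval, plus a
    fraction (follow_a or follow_b) of the partner's previous deviation
    from that base, plus Gaussian noise.
    """
    rng = random.Random(seed)
    a, b = [base_ms], [base_ms]
    for _ in range(n - 1):
        next_a = base_ms + follow_a * (b[-1] - base_ms) + rng.gauss(0, noise_ms)
        next_b = base_ms + follow_b * (a[-1] - base_ms) + rng.gauss(0, noise_ms)
        a.append(next_a)
        b.append(next_b)
    return a, b


# Compare mutual adaptation with a one-way relationship in which A ignores B.
for label, fa, fb in [("mutual", 0.6, 0.6), ("A leads B", 0.0, 0.6)]:
    a, b = simulate_dyad(2000, fa, fb, seed=1)
    print(label, [round(lagged_correlation(a, b, k), 2) for k in (-1, 0, 1)])
```

Under these assumptions, mutual coupling should produce clearly positive correlations at both lag 1 and lag -1 with a value near zero at lag 0, whereas one-way coupling should produce a positive correlation at only one of the two lags.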

7.5 Inter-group entrainment

Having discussed the hierarchy of power relationships in entrainment within a group of musicians, i.e. intra-group hierarchy, I now explore the entrainment dynamics that exist between groups of musicians, i.e. inter-group hierarchy. This is relevant in the context of the Gregorian case study, because Gregorian psalmody usually has two choirs who alternate verses, with one choir starting to sing immediately after the other; hence, there is likely to be entrainment between the groups, even though they sing at separate moments.

The only study that has, to my knowledge, investigated entrainment behaviour between separate groups of musicians is Lucas et al.’s (2011) study of the Afro-Brazilian religious ritual called Congado. The ritual involves a great deal of processional music performed by different groups of musicians (Ibid. 77). Each individual group is self-contained, and ‘typically comprises a lead singer, three drummers, three to six other percussionists, and up to 40 dancers’, and is affiliated with one of a few larger communities that produce a number of groups of musicians (Ibid.). Individual groups will try to demonstrate their own unique collective identity by not falling into time with other performing groups that they come into contact with. Lucas et al. found that ‘different groups belonging to the same community entrain, and fall into synchrony, relatively easily – at least when their tempi are fairly close together and they are in close proximity. When the groups belong to different communities this is not so: sometimes they manage to retain their mutual independence, using strategies such as exaggerating tempo differences and looking away from each other; at other times one or both groups will simply stop playing to avert the possibility of falling into time with the other’ (Clayton, 2012:54).

Lucas (2006:6) describes how a drummer in each of the Congado ritual groups is charged with maintaining ‘musical unity’ by staying in time with the other players and singers within their group, but also at the same time has to ‘resist and avoid entrainment with the other group’s music, which becomes a complex task, especially if the groups happen to be playing patterns with close periodicities’.

The most dramatic instance in this study of the human tendency to entrain with others was displayed by two groups from different communities coming into contact with each other and performing a mutual greeting ceremony with the conscious intention of playing out-of-time with each other. What was so surprising was that ‘they actually fell into a tightly entrained relationship for over two and a half minutes (r = 0.988), but they managed to do so out of phase by 223°, which meant that although they were tightly entrained they did not perceive the relationship as such’ (Clayton, 2012:54). Therefore, even though the two groups were playing similar rhythms at similar tempi and were therefore highly likely to entrain to some degree, the groups thought that they had fulfilled their intention not to entrain, thus demonstrating the strong, and often unconscious, pull or attraction towards entrainment in inter-group contexts.
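
The idea that two groups can be tightly entrained and yet far from in-phase can be illustrated with circular statistics, in which each beat of one group is expressed as a phase angle within the other group's beat cycle. The Python sketch below is a minimal illustration under my own assumptions: the function names and the beat times are hypothetical, and the passage quoted above does not state whether the reported r value was computed in this way. A mean resultant length near 1 indicates a stable phase relationship, whatever the mean angle happens to be.

```python
import math


def relative_phases(beats_a, beats_b):
    """Phase in degrees (0-360) of each beat of B within the A cycle containing it."""
    phases = []
    for t in beats_b:
        for start, end in zip(beats_a, beats_a[1:]):
            if start <= t < end:
                phases.append(360.0 * (t - start) / (end - start))
                break
    return phases


def circular_summary(phases_deg):
    """Return (mean resultant length, mean angle in degrees) of a set of phases."""
    n = len(phases_deg)
    c = sum(math.cos(math.radians(p)) for p in phases_deg) / n
    s = sum(math.sin(math.radians(p)) for p in phases_deg) / n
    return math.hypot(c, s), math.degrees(math.atan2(s, c)) % 360.0


# Hypothetical beat times in seconds: group B always plays 223/360 of a cycle after A.
period = 0.5
beats_a = [period * k for k in range(11)]
beats_b = [period * k + period * 223.0 / 360.0 for k in range(10)]

length, angle = circular_summary(relative_phases(beats_a, beats_b))
print(round(length, 3), round(angle, 1))
```

In this constructed case the phases are all identical, so the resultant length is at its maximum even though the two groups never sound a beat together.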

Lucas (2006:6) describes how he witnessed all possible forms of entrainment behaviour result from various greeting rituals: ‘synchronization of pulse and phase-locking between groups playing the same rhythmic pattern or different patterns; synchronization of pulse between groups playing the same rhythmic pattern, but remaining out of phase, regarding the rhythmic period; synchronization of pulse between two distinct rhythmic patterns; absence of synchronization between groups either playing the same rhythmic pattern or different ones’. Furthermore, when two groups synchronised their pulses Lucas found that they were likely to remain locked in to each other until the end of the greeting ritual.

The Congado study found that visual contact and proximity between the groups were necessary for the ‘bottom-up’ aspect of emergent coordination (Lucas et al., 2011:99). In one of the extracts analysed, coupling between groups was found to be stronger the closer the two groups were to each other. However, proximity on its own did not mean the two groups entrained, because the groups also had to maintain visual contact. This is illustrated in one of the extracts, in which two groups who were approaching each other became entrained at the moment they made visual contact. In this example, no other explanation apart from visual contact could be given. Furthermore, participants in the ritual seemed to be conscious that visual contact was necessary for entrainment, because in an extract in which the two groups managed to resist entraining with each other, avoiding visual contact was a clearly observable strategy (Ibid.).

The ‘top-down’ conditions necessary for inter-group entrainment were [i] similarity of tempi between the groups and [ii] a clear intention to either entrain or not entrain (Ibid.). In terms of similarity of tempo, all the observed cases of inter-group entrainment involved a 1:1 tempo relationship. In terms of intention, when two groups came from the same community and therefore intended to entrain, they strongly entrained in an in-phase relationship. But in two examples in which groups from different communities met and attempted to resist and avoid entrainment (probably by focusing exclusively on their own group), they did not entrain in one example, and in the other the two groups were in an out-of-phase relationship for most of the time they were entrained, although they were not consciously aware of being entrained.

The most important observation of the Lucas et al. study is that there ‘is no fundamental difference between the dynamics of intra-group and inter-group entrainment, each of which involves the coordination of autonomous rhythmic entities’. This fits with the concept of the nested hierarchy of forms of entrainment—intra-individual, intra-group, and inter-group—mentioned in Ch. 1.2. The authors argue that studying inter-group entrainment is important because ‘events involving multiple co-present groups are in fact not uncommon around the world’; for example, other events might be carnival processions or football matches (see also Chs. 4.5.1 & 4.6.1). The concept of a nested hierarchy of forms of entrainment is relevant in the case of larger ensembles (such as orchestras) where, Lucas et al. argue, ‘the entrainment dynamics are best considered on two levels, intra-group (e.g. within the string section) and inter-group [e.g. between orchestral sections]’ (Lucas et al., 2011:100; see also Chs. 7.4.1 & 7.4.2, and Maduell & Wing, 2007, on flamenco, with its different ‘groups’ of dancers, singers, instrumentalists, and palmeros).

Another exciting implication of Lucas et al.’s study is that it demonstrates the importance of understanding the cultural context within which a given entrainment behaviour is being observed. This is encouraging given that I have attempted in this thesis to situate the real-world behaviour of Gregorian chant within an anthropological and sociological context before analysing the specific entrainment behaviour it displays in an empirical context. The Congado entrainment study and my Gregorian chant case study are in each case informed by a close understanding of both the musical repertoire and the ritual context of that which they are studying; furthermore, in both cases it was possible to put specific unusual findings within a larger context by interviewing participants in the ritual.

7.6 Joint speech entrainment

Chanting, the subject of the next chapter, can be thought of as ‘in between’ speech and song (see Ch. 1.3) and therefore it seems right to consider entrainment in the context of speech, as well as in musical contexts. Cummins (2002) has done empirical work on entrainment in joint speech where two people speak the same text at the same time. Although the task may seem artificial, because it asks two people to speak the same novel text in synchrony with each other in a laboratory setting, we actually entrain our speech in real-life group contexts as well (see Ch. 6.1 for examples). As described in Chs. 1.3 and 3.2.1, chanting is based to a large extent on speech rhythm and therefore shares common ground with joint speech.

Whilst an emergent common periodicity is found in mass clapping and singing, it is not usually observed in synchronous group speech (e.g. App. 1.43). This is perhaps because speech is not usually periodic (Cummins, 2012a; Dauer, 1983; Turk & Shattuck-Hufnagel, 2013; Kohler, 2009; Arvaniti, 2009; Nolan & Asu, 2009); having said that, different registers of speech may display different degrees of periodicity (see Knight, 2013). Cummins (2009:17) argues that even though music displays periodicity it does not mean that embodied synchronisation is only possible when periodicity is present. He argues that music itself often displays a slightly irregular periodicity, and slowing-down is a common feature of the end of phrases, and yet people do not perceive a ‘destruction of rhythm’, and can still move ‘in time’. Indeed, careful analysis of expressive timing suggests that performers often deviate from written durations in notated music in systematic ways (see London, 2012, for discussion of the interaction between metre and expressive timing). Therefore, Cummins concludes that musical synchronisation does not necessarily need to be based on isochrony—i.e. perfectly regular time intervals.

The basic assumption of entrained speech is that ‘the speech of one speaker acts as the entraining signal for the production of the other in a symmetrical, reciprocal relationship’ (Cummins, 2009:18). Cummins found that even though subjects probably had not had much practice at the task of reading a novel text in synchrony with another person reading the same text, the asynchronies reported were smaller than those found within and across speakers in other situations (Ibid.). These asynchronies were about 40 ms throughout the text, rising to 60 ms at phrase onsets (Ibid.; see Cummins, 2002). Visual contact between participants helped a little, reducing asynchronies by 20 ms, demonstrating that bottom-up coordination is present in entrained speech (Cummins, 2012b).

Cummins (2009:18) argues that the tight coupling demonstrated by these small asynchronies is difficult to explain, given that even the short paragraph used in his experiment ‘string[s] hundreds of segments together’ and that this was achieved with no practice; it was also shown that practice did not ‘substantially improve their performance (Cummins, 2003)’. In addition, any brief predictability in timing lasted ‘no more than a few syllables, at best’ (Cummins, 2013:28). This capability to entrain to non-isochronous stimuli could also not be explained by a simple leader-follower relationship, because Cummins found that in the hundreds of recordings he made the ‘very small lead’ that was observed alternated between speakers throughout the short paragraph. Hence, Cummins found evidence for his claim that a pair of speakers are in a ‘symmetrical, reciprocal relationship’ when entraining their speech (see also Ch. 7.4.3 on ‘mutual adaptation’).
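
The two measures discussed in this section, the size of the asynchrony between speakers and the alternation of the lead, can be sketched in a few lines of Python. This is a minimal illustration, not Cummins' procedure: the function names are my own, and the onset times are invented values chosen only to show the calculation.

```python
def signed_asynchronies(onsets_a_ms, onsets_b_ms):
    """A minus B for each matched onset; a negative value means A was earlier."""
    return [a - b for a, b in zip(onsets_a_ms, onsets_b_ms)]


def mean_absolute_asynchrony(asynchronies):
    """Average size of the asynchrony, ignoring which speaker was ahead."""
    return sum(abs(d) for d in asynchronies) / len(asynchronies)


def lead_changes(asynchronies):
    """Count how often the earlier speaker switches (exact ties are ignored)."""
    a_leads = [d < 0 for d in asynchronies if d != 0]
    return sum(1 for prev, cur in zip(a_leads, a_leads[1:]) if prev != cur)


# Hypothetical syllable onset times in milliseconds for two speakers.
speaker_a = [0, 210, 400, 640]
speaker_b = [30, 180, 450, 600]

asyncs = signed_asynchronies(speaker_a, speaker_b)
print(asyncs, mean_absolute_asynchrony(asyncs), lead_changes(asyncs))
```

A lead that changes hands repeatedly, as in this invented example, is what one would expect from a symmetrical relationship; a consistent sign across a recording would instead point towards a stable leader and follower.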

In summary, Cummins’ findings form a useful basis for the Gregorian psalmody study in Ch. 8 because they demonstrate: [i] that joint speech entrainment is possible even when the periodicity of speech may not be consistent; [ii] that there is reciprocity in the interaction between speakers in joint speech entrainment; and [iii] that visual contact reduces asynchrony in joint speech. The Gregorian case study in the next chapter, however, will be concerned with the gestural, visual and auditory cues that make it possible to initiate a sequence, whereas Cummins focuses on maintaining synchrony as each utterance unfolds over time.

7.7 Summary

In this chapter I have focused on how emergent coordination is expressed through gestures that are communicated via the visual channel, and how planned coordination is expressed by the way that intra- and inter-group hierarchy affects coordination in musical performance. The sensory channel of touch might also play its part, but I did not focus on it here because singers tend not to stand touching each other in Gregorian chant performance. The focus on gestures is justified, it would seem, by Kawase’s study, which found that the visual and auditory channels are the most important for entrainment, with touch being important only in certain contexts. Visual contact with moving bodies was argued to be most useful at stops, starts, and transitions within musical pieces. The audible cue of ‘breathing together’ before starting a phrase is also important for coordination, but, due to methodological constraints, the case study in Ch. 8 was not able to investigate this, and therefore it was not discussed in this chapter. In addition to visible gestures, the auditory signal will be explored in the next chapter by reference to perceived metre.

In investigating the contribution of individual sensory channels I have been aware of the fact that it is impossible to isolate a single channel responsible for entrainment behaviour, because entrainment seems to combine multiple channels most of the time. One obstacle to studying interactions between sounds and gestures in musical interaction is that, to varying degrees, a gesture often happens before the sound it is connected with; partly because a gesture is often signalled in anticipation of a musical event, but also possibly because the visual modality is slower than the auditory modality and therefore needs to happen sooner (Repp, 2005:973; Repp & Penel, 2002). Levinson (2006) calls this a ‘binding problem’, ‘requiring linking of elements which belong to one another across time and modality’. In the Gregorian case study it is usually clear which head nods go with which sound, for example, but some other gestures (like torso movements) might be harder to ‘bind’ with their corresponding sounds.

Gestures can roughly be categorised into gross vs. fine, deictic vs. iconic, and marker vs. illustrator gestures, although such theoretical distinctions are never as clear when it comes to the practice of gesturing. However, these distinctions provide a framework for understanding gesture in the context of Gregorian chant, by broadly differentiating between gestures that are either temporally-communicative or spatially-communicative. The most important gesture category for synchronisation at critical moments in Gregorian chant is that of temporal ‘marker’ gestures, e.g. head nods. Gregorian chant also employs ‘spatial’ illustrator gestures, e.g. the conductor’s hand movements, that ‘shape’ the sound for stylistic purposes, and which may nevertheless also affect the timing coordination.

In small ensembles, where there is no conductor, it was found that the timing of interactions between gestures made by different performers in a group can sometimes display very complex metrical patterns (e.g. in North Indian musical performance). One reason for the complexity of the temporal patterning of these gestures may be that, depending on the size of the ensemble and the type of music, gestures may be the most salient cues for ensemble coordination because they are easier to distinguish than the sounds that other ensemble members make, which may merge together in perception. It would seem that eye contact is beneficial for most forms of synchronisation; for example, one tapping experiment by Oullier showed how eye contact was essential for spontaneous, unplanned coordination.

In terms of group hierarchy, it was found that in small ensembles certain performers are looked towards, listened to, or respected more than others in various musical contexts. In the case of some large ensembles, an obvious example of such a performer would be the conductor. Centralised guiding agents like a conductor can function to reduce interactional complexity within ensembles by providing a single reference point. But even a centralising agent belies a greater complexity of feedback loops between different nested levels in the group hierarchy (e.g. between the conductor, orchestral sections, instrumental sections, section leaders, desk leaders etc.). And conductors are not needed all the time; they are required more at stops, starts, and transitions to establish coordination, rather than maintain coordination that has already been established. Indeed, orchestral musicians seem to entrain with their fellow musicians in the main, and less so with the conductor. It was found that even if piano duo musicians are instructed to play roles of either leader or follower, there is still a bidirectional timing relationship between them (as long as both visual and auditory channels are present), suggesting that boundaries between the roles of leader and follower are blurred in dyadic musical entrainment. Having said that, the study suggested a clear leader-follower relationship in the gestural domain, but, even so, from an exploration of intra-group power hierarchies in various contexts (ranging in scale from two speakers to a whole orchestra) I conclude that there are no set rules of power structure which are applicable all the time in ensemble coordination. Nonetheless, thinking about the influence of intra-group power structures is still a useful tool for investigating group entrainment because having a clearer understanding of which between-person interactions one should focus on is likely to simplify an analysis of bottom-up local interaction. 
The strong (often unconscious) mutuality in the entrainment dynamics of leader-follower relationships and intra-group hierarchies was also found to be present in dyadic speech entrainment, and even in inter-group musical contexts. Thus mutuality would seem to be a central aspect of entrainment behaviour. A study of inter-group entrainment in the Afro-Brazilian ‘Congado’ ritual also found that eye contact between groups was necessary for bottom-up inter-group entrainment to occur, and that the main top-down influence on inter-group entrainment was the choice of whether or not to entrain, with the choice not to entrain usually enacted by avoiding visual contact with the other group.

An implication of Lucas et al.’s Congado study is that it demonstrates the importance of understanding the cultural context within which entrainment behaviour is being observed. This is what I set out to do next, by interviewing choristers about the role of certain gestural cues, visual contact, group hierarchy and other aspects in Gregorian chanting.

Chapter 8 - Gregorian Psalmody Study

8.1 Introduction to Gregorian Chant

‘Is any among you merry? let him sing psalms.’ James 5:13, KJB.

This chapter comprises a Gregorian psalmody case study, which provides a real-world context in which to situate and connect some themes of this thesis: the function of chanting, the rhythm/metre of chant, bottom-up vs. top-down approaches to understanding group entrainment, and an investigation of the communication channels by which groups entrain in music.

Hiley (2009:2) has described how ‘Chanting the [Latin] texts in a measured, disciplined manner is a good way for [a] group of worshippers to act together; the more harmonious the singing, the more inspiring the communal act.’ The focus of this thesis is ‘singing as one’, which I am operationalising in this empirical study as ‘singing at the same time’; and while this disregards issues such as blend that may contribute to a sense of ‘one-ness’ in choral singing, it is necessitated by the opportunistic approach to data collection adopted in this study.

‘Gregorian chant is the single voice (‘monophonic’) music sung in the services of the Roman church’ (Hiley, 2009:1). Gregorian chant, also commonly known as ‘plainsong’, has been described as ‘uniformly monophonic, lack[ing] harmony, and…characterised by a free-flowing rhythm’ (Chen, 1983:86) and also as having ‘No beat, no harmony, such simple note patterns!’ (Hiley, 2009:xvii). As mentioned in Ch. 1.5.3, Gregorian psalmody, the focus of this chapter, is a form of Gregorian chant where the biblical psalm texts are set to fixed melodic formulas known as psalm tones (Chen, 1983:87; see Fig. 8.1 below). The pitched notes that correspond with syllables in the text have to be performed in unison by the choir, and over the course of this chapter my focus will be on the rhythmic aspect of Gregorian chant and how choral chanters can synchronise with each other, even though there is ‘no beat’ and the rhythm is ‘free-flowing’.

Fig. 8.1. An example of the standard form of Gregorian psalmody notation, taken from the Liber Usualis. The notation separates the words and the music from each other on the page, with the melody at the top of the page and the words underneath.

Each psalm melody (or ‘tone’) can be used for many different texts, as Chen (1983:87) explains:

Given that the eight basic psalm tones (melodic patterns) have nine variants, and that each of the 150 psalms may be under ten to over a hundred verses in length, there are 60,000 different patterns of tune-text pairing, i.e. much too large a corpus to print or learn individually. Instead, singers need only learn a few rules or principles to be able to sing any psalm verse to any psalm tone intoned by the cantor. (Ibid., paraphrased).

Due to its simplicity and the fact that ‘singers need only learn a few rules or principles to be able to sing any psalm verse to any psalm tone’ (a fact corroborated by the conductor interviewed in this study), Gregorian psalmody lends itself to empirical analysis, as opposed to ‘melismatic’ melodic formulae (see Ch. 2.5). The psalmody is uniform enough from verse to verse for multiple verses (i.e. trials) to be analysed as if they were equivalent, in that the majority of phrases are governed by the same rhythmic rules and the same melodic formula, and each complete psalm (ranging from c. 6-50 verses) will usually employ only one or two psalm tones (or melodies), repeated for multiple verses (Hiley, 2009:5).


Psalm tones bear similarities with the intonation or prosodic structure of a sentence of ordinary speech; however, they are more stylised and regular (Dresher, 2008). Prosody, as exemplified in intonation contours, can be understood as the ‘musicality’ of speech, relating more to the rhythmic, melodic, and timbral (etc.) dimensions of speech than to the semantic content. As Cutler & Isard (1980:245) have put it, ‘the major sources of prosodic effects…can be grouped into four main categories: lexical stress patterns of individual words; the placement of sentence accent [i.e. accents on words within a sentence]; syntactic structure; and a variety of pragmatic factors such as choice of speech act and attitudinal indicators, which influence the overall shape of the intonation contour’. The ‘natural intonation contours’ of English phrases (and those of many other Western languages) can be roughly broken down into sections. In terms of pitch, the three sections of a typical spoken phrase are: [i] a short initial rise to a particular level; [ii] the next section in which the pitch remains constant or declines gradually; and then [iii] a close, or cadence, where something ‘dramatic’ happens (a rise-fall or fall-rise) to signal the end of the spoken phrase (Dresher, 2008:47).

The guidelines for intonation below provide a basic structure that governs all the psalm verses analysed in this study. On the first verse of a psalm, but not other verses, a syllabic ‘intonation’ of a few tones (usually rising in pitch) is added to the front of the verse, like the short initial rise of section [i] (see melodic notation for ‘Laudate’ in Fig. 8.1 above). After this initial rise on the first verse, and at the start of every other half-verse in the psalm, there is the ‘reciting note’ of section [ii] (see words ‘Dominum in sanctis’ and ‘laudate…firmamento vir-’ in notation example), on which the majority of the text of that particular verse is chanted; this single pitch is repeated as often as needed according to the number of syllables in the half-verse (in the notation, the verse is divided into half-verses by *). Occasionally, a pitch ‘inflection’ called a flex (†) marks appropriate grammatical points (e.g. comma, colon, and full stop), with a pitch-change that is usually down a tone (see notation and v. 5 of text relating to the word ‘benesonantibus’) on the syllable/s that mark that grammatical point (in this case, ‘-tibus’), and, after a brief pause (see video clip 3 of App. 3 for the duration of this pause), the recitation resumes on the previous reciting note. Relating to [iii] above, the final section of a typical phrase is the cadential formula, which refers to the pitch notation that corresponds with ‘-tis é-jus’ and ‘tú-tis é-jus’ in the text underlay. These italicised and emboldened characters in verse 1 associated with the pitch changes of the cadential formula are also intended to correspond in location with those in the text for verses 2-7. The cadential formula is syllabic, and contains one or two ictic (i.e. accented or rhythmically strong) notes of longer duration (in the notation, the two syllables of ‘é-jus’), and the cadence is usually more elaborate than the starting intonation (Dresher, 2008:48).

An extended discussion in Ch. 3.2.1 showed that rhythm in Gregorian chant can be characterised by three different interpretations (see Randel, 2003:364): [i] ‘mensuralist’, i.e. the ‘long’ syllables are exactly twice the duration of ‘short’ syllables, and therefore the chant has a regular periodic pulse (Bailey, 1979:103); [ii] ‘accentualist’, i.e. each syllable has a ‘more or less equal’ duration, although, depending on word-accent, some syllables are longer than others, and this means that the pulse is ‘weak’ and constantly varying (Crocker, 2000:44,53); and [iii] ‘figural’, i.e. each syllable has equal duration on the whole, but groups of two or three syllables exist to delineate a rhythmic ‘rising and falling’, with less of an emphasis on word-accent, hence, like [ii], the chant exhibits an irregular pulse (Randel, 2003:364).

The conclusion in Ch. 3.2.1 was that it is still an open question as to whether Gregorian chant is periodic or not. In 8.5.3 below, my empirical analysis of metrical perception in Gregorian psalmody shows, in line with the ‘accentualist’ and ‘figural’ interpretations above, that whilst psalmody exhibits a more regular pulse than conversational speech, it is not as regular as the pulse observed in other forms of singing, such as the polyphonic motets that are also sung in Catholic services. However, my findings, the above interpretations, and the rules of intonation outlined above should all be placed in the context of the ‘changing, living tradition’ of Gregorian chant performance, which has several different schools and is likely to both vary from choir to choir and change over time (Hiley, 2001; see also Chen, 1983).

8.2 Introduction to empirical study

I was fortunate to be given the rare opportunity to sing Gregorian chant as part of the 2011 Tridentine Easter celebrations in the Chiesa Cattolica Parrocchiale Beata Vergine Del Rosario, Trieste, for five full days as part of an eight-person choir (which typically split into two choirs of four people) plus a conductor. The choristers had not sung as a whole group before, although some of them had previously sung with each other. Each day for five days (Holy Wednesday-Easter Sunday) the group rehearsed for about three hours, and performed in services for about three hours. I participated fully, singing everything that the other singers were singing, whilst simultaneously recording the sessions (using a handheld iPhone for the majority of the video footage), and interviewed all the participants, the interviews being done mainly on the last day of singing, Easter Sunday, and the free day afterwards.

8.2.1 Summary of interviews with choristers

Below is a summary of answers to interview questions given by the seven choristers and conductor. The interview questions gave an insight into the relative importance of visual vs. auditory cues, audible breathing, and group hierarchy for synchronisation, and the specific nature of psalm singing as opposed to other forms of singing. These reflections will provide a context within which to analyse and understand the data from the audiovisual recordings of the psalmody performance from section 8.3 onwards. A fuller account of the answers to these interview questions can be found in Appendix 2.

The form of interviewing was ‘semi-structured’, which means that ‘all respondents are asked the same questions, but the order in which they are asked differs from one person to the next. In some cases, even the manner in which they are asked varies, for example, changing the wording or sentence structure to better fit the respondent or the situation’ (Sommer & Sommer, 2002:115). My interviews posed the same set of questions (sometimes paraphrased) to multiple interviewees, usually in the same order. Because I was only able to speak to each chorister for a short time owing to their schedule, I sometimes did not have time to ask every question of each chorister.

The first question concerned whether Gregorian psalmody employs an ‘even rhythm’ (i.e. each syllable has an equal duration), or a ‘speech rhythm’ style (see video clips 4 & 5 for the conductor’s demonstration of ‘even rhythm’). The consensus as to the ideal style of performance was that the rhythm of the Latin text needed to be accommodated within the restrictions of an ‘even rhythm’ style, i.e. a kind of mixture of the two styles of rhythm. There seemed to be no consensus on rules for the tempo of Gregorian chant performance, but there was agreement that any decisions about tempo should reflect the nature of the text being sung at that moment in time. There was also consensus among choristers that, although the notation played a role in understanding the rhythm of the text, it was more important
to make sure that they agreed with each other on how the notation relates to sung rhythm—i.e. each choir will have their ‘house’ style. Choristers agreed that both aural and visual cues are used for coordinating the collective singing—respectively, the audible breath and body movements. It was clear from a discussion about bench formation that choristers need to see each other’s movements and hear each other’s sounds in order to synchronise together. This was corroborated by the fact that choristers thought that being aware of other choristers around them was just as important as reading the notation correctly. It seems that peripheral vision is activated most at points of uncertainty within the notation, or at the beginning of phrases. Choristers were also conscious of being gesturally demonstrative at moments of timing ambiguity, such as the beginning of phrases and those moments when another chorister is in need of correction. They also unanimously agreed that audible breathing was an important, if not the most important, cue for coming in together. It would seem that the intra-group leadership hierarchy was, for some, based on how competent they perceived other choristers to be; whereas, by contrast, others said they ‘followed’ the choristers who were closest to them (in spatial terms). There was consensus that a conductor is not essential for group synchronisation in Gregorian chant, but can be helpful in the case of less experienced groups. In terms of the effect of the size of the choir on performance, the general consensus was that as the number of people in the choir increased inertia would probably increase, resulting in a slower tempo, but ‘mistakes’ would become less obvious. A couple of choristers made the point that mistakes are more likely within less experienced groups, and therefore experience was possibly more of a decisive factor than the size of the choir. 
However, the length of time each group had to form would seem to be another factor, given that there was consensus that chanting had improved during the week. This could be thought to be due to the consolidation of the specific top-down ideal for how the chant should sound, which emerged from verbal interaction between the conductor and particular choristers as the week went on. The improvement was also likely to have been due to choristers learning to ‘listen more’, and practising other bottom-up coordination processes such as being aware of each other’s movements, as well as each chorister forming a stable top-down understanding of what the other choristers’ various movements and singing styles communicated.

8.3 Analysis of Gregorian Psalmody performance

In line with aspects of entrainment behaviour that were discussed in Ch. 7 and in the interviews in 8.2.1, I will be focusing in the following empirical study on the role of visual gesture, group hierarchy, and metre perception in entrainment. In the first part, I will associate visual gestures with specific choristers, then look at how, at the start of each phrase, some choristers ‘lead’ the singing with their gestures more than others, and finally look at the relationship between the synchrony of the collective gestures and the synchrony of the collective sound at the initial onset of each phrase. In the second part, I will investigate the role of metre perception in entrainment when choristers start singing together after a short pause between phrases.

A choir’s ability to synchronise with each other note-by-note throughout each syllable of each verse of Gregorian psalmody is certainly of empirical interest, but my analytical focus here is comparatively modest. I examine the moment in which choristers go from silence to sound all as one, i.e. the ‘onset’ of sound. As we discovered in Chs. 7.3 & 7.4.2, the process of going from silence to sound requires communication through both visual and auditory channels. For this reason, and because the onsets at the beginning of phrases of psalmody are the most difficult to coordinate (visually illustrated by the fact that this is where discrete ‘marker’ gestures tend to happen, whereas singers tended to stay still during the rest of the phrase), onsets of musical phrases are rich focal points for the study of group entrainment behaviour. From a pragmatic perspective, it made sense to focus on a small part of the data, i.e.
the phrase onset, in order to be able to make use of all the hundreds of trials I collected, because if I had studied the entrainment process throughout the whole phrase, syllable-to-syllable, this would have taken too long, given the nature of the data and the manual analysis it would have required.

In Gregorian psalmody, choirs are usually divided into two halves, singing alternate psalm verses (see clips 7 & 8). Each verse is split into two phrases (or half-verses), and therefore the moment of sound onset for each half-choir happens typically twice per verse: [i] at the beginning of the first part of a verse, where one choir starts singing immediately after the other choir finishes the previous verse; and [ii] at the beginning of the second half-verse after a short silence following the end of the first half-verse (see clip 4 & Fig. 8.1). By focusing on these two types of moments I will be looking at the ability of a group of choristers to coordinate their sound onset. Furthermore, the fact that moments [i] and [ii] are examples of, respectively, inter- and intra-group coordination allows for a comparison of these two forms of entrainment behaviour.

This current study is a participant-observation study in a ‘live’ sacred performance context. It was therefore not possible in this study to examine the role of subjective metre perception in synchronisation because this would have required ‘tapping’ equipment to measure each chorister’s perception of the metre during performance, which is not appropriate in sacred performance. Furthermore, this artificial task would have cancelled out the benefits of studying the behaviour of a ‘real-life’ performance context. For these reasons, only the gestural and vocal aspects of behaviour were available for examination; the mental aspects of chanting performance were inaccessible.

It was also not possible to distinguish the individual sound signals and audible breaths coming from each chorister. This would have required each person to have their own microphone, which again would not have been appropriate because of the need to be discreet in the religious ritual that was taking place. The choir typically sat on benches in a ‘V’ formation in a raised organ loft behind the congregation, who were looking in the opposite direction (see clips 9-11). I was pushing the boundary of respect by holding up an iPhone, which I used to video-record the two choirs, but having a full recording setup would have been inappropriate and disrespectful, especially given that we were performing during Holy Week. The main disadvantage of not having an audio track for each chorister was that one could not be sure if, and when, each chorister was singing. This meant it was difficult to draw conclusions about the relationship between gesture and sound, because one could not be certain whether a chorister had gestured and not sung, or sung and not gestured.


8.4 Gesture Analysis

I used the video recordings to perform an analysis of gesture at the sound onset at both intra- and inter-group ‘moments’ (see 8.3). First I determined the individual gesture contributions of the choristers at these moments of onset synchronisation, divided into: where each performer’s eye gaze is focused (sheet music, conductor, or another chorister); head movements (e.g. a head nod); torso movements (forward/backward, up/down, side-to-side); visible breaths; sheet-music movements (moved by hand); and, for the conductor, hand and mouth movements. Unfortunately, given the absence of motion capture equipment, it was only possible to analyse whether these individual categories of gesture happened or not, not precise timing data of each gesture, or any other of its specific characteristics like size, shape etc.

Second, I determined who ‘led’ the collective gesturing made by the half-choir of choristers at each ‘moment’ with a view to exploring the dynamics of group hierarchy. Third, at each ‘moment’, I measured the synchrony of the collective gesturing and how it interacted with the synchrony of the half-choir’s collective sound onset. Any judgements regarding the timing of gestures I made myself solely by ‘eye’ and in real time (i.e. not slowed down). My reasoning for this was, firstly, that a detailed quantitative analysis of the intra- and inter-individual timing interactions between gestures is beyond my mathematical competence (see Ch. 1.2). Secondly, given that I did not have individual audio tracks for each individual I was not able to pair each gesture with each sound to a sufficient degree of timing precision. Thirdly, I did not slow the video down because, for the purposes of the study, I am interested in what a human can see in real time; having said that, a chorister is likely to use cues subconsciously much faster than the length of time it takes to consciously analyse a video, and a chorister is not able to rewind the video footage repeatedly like a researcher. A disadvantage of real-time analysis is that because it all happens so fast one cannot easily determine which gesture causes another gesture, or which person is leading the movement.

This analysis was also limited by time resources and cost, so I was not able to hire independent raters to annotate the videos and corroborate my findings. Another limitation was the visual perspective that the data gave me; i.e. what I was able to see from a camera placed in front of the chanters was different from what the chanters saw looking out of the corner of their eye. The fixed camera perspective also meant that it was hard to see the 3D profile of the torso movements of some of the choristers.

Furthermore, due to the narrow-angle constraints of my iPhone lens it was not possible to include the conductor as well as the choir in the video frame at all times. When holding the iPhone myself (see clips 12 & 13), in the majority of cases I could only focus on either the conductor or the choir. This was problematic because when the camera was focused on the conductor I could not know who was looking at him; and when the camera was focused on the choir only, I could not know whether the conductor was conducting or not. Therefore, all I could refer to was the gestural synchrony among the choristers themselves, which, if the conductor was conducting, was not the whole story. However, even in those cases when someone else video-recorded the performance and was able to include conductor and choir within the video frame, it was not possible to determine whether the choristers could see the conductor in their peripheral vision, given that when the camera was focused on them they looked at their sheet music virtually all the time (see clip 14). In any case, one has to assume they do use their peripheral vision, but exactly how is not clear (see 8.7).

There are clearly many limitations to this study, caused by constraints on resources, time, equipment, and experimental context. Nevertheless, some aspects of chant performance were possible to analyse, and it was possible to draw some statistically-reliable conclusions due to the large sample size of 210 ‘onset moments’ that were suitable for analysis. Furthermore, for analysing individual gestures (e.g. head nod, visible breath etc.), the total number of trials performed that were examining each separate individual was 742 (spread over the eight individual singers), because each individual chorister counted as a single individual trial at each of the 210 onset moments, and there were on average 3.5 individual trials per onset moment. Although there were 4 people for each moment in the standard two-choir setup, and 5 people in the ‘upper voice’ setup used on Easter Sunday, the average number of individuals was lower than 4 because there were many ‘moments’ where not every chorister was present or visible in the video frame.
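The trial accounting described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the per-chorister trial counts reported in Table 8.4 (excluding the conductor) and the 210 onset moments, and the variable names are my own rather than part of the original analysis.

```python
# Per-chorister individual trial counts as reported in Table 8.4
# (the conductor's 77 trials are counted separately and excluded here).
trials_per_chorister = {
    "A": 127, "B": 90, "C": 98, "D": 95,
    "E": 70, "F": 64, "G": 110, "H": 88,
}

onset_moments = 210  # number of onset moments suitable for analysis

# Each visible chorister contributes one individual trial per onset moment.
total_individual_trials = sum(trials_per_chorister.values())
mean_visible_per_moment = total_individual_trials / onset_moments

print(total_individual_trials)             # 742
print(round(mean_visible_per_moment, 1))   # 3.5
```

The mean falls below 4 because not every chorister was present or visible in the video frame at every onset moment.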


8.4.1 Individual Gesture Analysis

Table 8.1 below shows the percentage values for the number of trials in which individual choristers, at onset moments, gazed either at their sheet music, the conductor, another chorister, or anywhere else; nodded their head, moved their torso in various axes, or made a gesture with their sheet music or hand; and breathed visibly.

Table 8.1 (total no. of trials: 742)

| Gesture | No. of trials | % |
|---|---|---|
| Gaze direction | | |
| Sheet Music | 718 | 96.8 |
| Conductor | 19 | 2.6 |
| Another Chorister | 5 | 0.7 |
| Elsewhere | 0 | 0.0 |
| Body movements | | |
| Head Nod | 463 | 62.4 |
| Torso Forward/Backward | 141 | 19.0 |
| Torso Side-to-Side | 105 | 14.2 |
| Torso Up/Down | 13 | 1.8 |
| Hands Lift Sheet Music Up/Down | 57 | 7.7 |
| Free Hand Up/Down | 2 | 0.3 |
| Breathing | | |
| Visible | 563 | 75.9 |

There was a highly significant difference between the number of head nods recorded in the ‘silence’, intra-group moments (which come after the silence between the two halves of a verse) and in the ‘adjoining’, inter-group moments (which immediately adjoin the previous verse sung by the other choir) (see Table 8.2 below). The differences between conditions for the other types of gesture were not significant.

Table 8.2

| Moment | No head nod | Head nod |
|---|---|---|
| Silence | 178 | 237 |
| Adjoining | 101 | 226 |

χ² = 11.2336; p < 0.001
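As a worked check, the reported statistic is consistent with a Pearson chi-square test without continuity correction applied to the counts in Table 8.2. The sketch below is an assumption about how the value could have been obtained, not a record of the software actually used; the function name is hypothetical.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], using the shortcut formula
    N * (ad - bc)^2 / (row1 * row2 * col1 * col2)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Counts from Table 8.2: rows are Silence / Adjoining,
# columns are No head nod / Head nod.
chi2 = pearson_chi_square_2x2(178, 237, 101, 226)
print(round(chi2, 4))  # 11.2336

# With 1 degree of freedom, the critical value at p = 0.001 is about 10.828,
# so the statistic corresponds to p < 0.001.
print(chi2 > 10.828)  # True
```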


The choristers had only come together for the first time as a whole choir that week; hence, I had the opportunity to study how specific features of the group’s coordination processes evolved over the course of the week. To this end, each day was analysed separately, including the rehearsal in Westminster Cathedral (in England) on the Monday of Holy Week (see Table 8.3 below). With the ‘silence’ and ‘adjoining’ trials divided further by the day on which they were performed, Table 8.3 shows the head nod percentages for three groups of data sets: for Choir I (comprising choristers B, C, E, G), Monday Rehearsal I (data set hereinafter ‘Mon I’), Maundy Thursday I (‘Thu I’), and Good Friday I (‘Fri I’); for Choir II (comprising choristers A, D, F, H), Monday Rehearsal II (‘Mon II’) and Good Friday II (‘Fri II’); and for the upper voices choir on Easter Day (comprising choristers A, B, C, D, G, who were four women sopranos and altos and a male countertenor, all singing a unison pitch; hereinafter ‘Sun U’). The most head nodding (79%) was observed in the upper voices choir singing on the final day, Easter Sunday (or ‘Sun U’), for ‘silence’ trials, and the same occurred in ‘adjoining’ trials (94.4%), but there was no obvious trend in head nodding through the week for Choirs I & II in either type of moment. However, the ‘adjoining’ values are all significantly higher, apart from Thu I & Fri I. The results were similar for visible breaths too: the upper choir on Easter Sunday displayed visible breaths in virtually every silence (95.1%) and adjoining (97.2%) moment, but there was no obvious trend through the week for Choirs I & II.

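Table 8.3

| Measure | Moment | Mon I | Thu I | Fri I | Mon II | Fri II | Sun U |
|---|---|---|---|---|---|---|---|
| No. of trials | Silence | 37 | 55 | 48 | 52 | 142 | 81 |
| No. of trials | Adjoining | 38 | 30 | 54 | 68 | 65 | 72 |
| Head Nod (%) | Silence | 35.1 | 56.4 | 39.6 | 61.5 | 54.9 | 79.0 |
| Head Nod (%) | Adjoining | 81.6 | 36.7 | 44.4 | 70.6 | 67.7 | 94.4 |
| Visible Breath (%) | Silence | 81.1 | 74.5 | 64.6 | 82.7 | 65.5 | 95.1 |
| Visible Breath (%) | Adjoining | 65.8 | 63.3 | 59.3 | 76.5 | 76.9 | 97.2 |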
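The day-by-day percentages in Table 8.3 can be cross-checked against the pooled head nod counts in Table 8.2. The following sketch converts each percentage back to a whole-number count and sums across data sets; it assumes that each percentage is a rounded ratio of head nod trials to total trials for that data set, and the names used are illustrative.

```python
# Trial counts and head nod percentages per data set, in the column order
# Mon I, Thu I, Fri I, Mon II, Fri II, Sun U (values from Table 8.3).
trials = {
    "silence": [37, 55, 48, 52, 142, 81],
    "adjoining": [38, 30, 54, 68, 65, 72],
}
head_nod_pct = {
    "silence": [35.1, 56.4, 39.6, 61.5, 54.9, 79.0],
    "adjoining": [81.6, 36.7, 44.4, 70.6, 67.7, 94.4],
}


def head_nod_count(condition):
    """Recover whole-number head nod counts from percentages and sum them."""
    return sum(
        round(pct / 100 * n)
        for pct, n in zip(head_nod_pct[condition], trials[condition])
    )


for condition in ("silence", "adjoining"):
    total = sum(trials[condition])
    nods = head_nod_count(condition)
    print(condition, total, nods, total - nods)
# silence 415 237 178
# adjoining 327 226 101
```

The recovered totals agree with the ‘Head nod’ and ‘No head nod’ columns of Table 8.2.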


The individual variation in the frequency of each type of gesture between persons can be seen in Table 8.4 below.

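Table 8.4

| Choristers | A | B | C | D | E | F | G | H | Conductor |
|---|---|---|---|---|---|---|---|---|---|
| No. of trials | 127 | 90 | 98 | 95 | 70 | 64 | 110 | 88 | 77 |
| Gaze direction (%) | | | | | | | | | |
| Sheet music | 97.6 | 100 | 98.0 | 100 | 98.6 | 100 | 88.2 | 94.3 | 94.8 |
| Conductor | 1.6 | 0.0 | 2.0 | 0.0 | 1.4 | 0.0 | 11.8 | 1.1 | 0.0 |
| Another chorister | 0.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.5 | 0.0 |
| Elsewhere | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.3 |
| Body movements (%) | | | | | | | | | |
| Head Nod | 89.0 | 38.9 | 61.2 | 93.7 | 44.3 | 14.1 | 81.8 | 42.0 | 87.0 |
| Torso Forward/Backward | 16.5 | 27.8 | 4.1 | 18.9 | 4.3 | 7.8 | 51.8 | 9.1 | 36.4 |
| Torso Side-to-Side | 18.1 | 2.2 | 18.4 | 0.0 | 24.3 | 7.8 | 19.1 | 21.6 | 20.8 |
| Torso Up/Down | 0.0 | 0.0 | 1.0 | 0.0 | 2.9 | 7.8 | 4.5 | 0.0 | 29.9 |
| Hands Lift Sheet Music Up/Down | 0.0 | 1.1 | 0.0 | 0.0 | 61.4 | 9.4 | 6.4 | 0.0 | 20.8 |
| Free Hand Up/Down | 0.0 | 0.0 | 0.0 | 0.0 | 1.4 | 1.6 | 0.0 | 0.0 | 50.6 |
| Mouth/Lip | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 27.3 |
| Breathing (%) | | | | | | | | | |
| Visible | 91.3 | 57.8 | 82.7 | 90.5 | 64.3 | 51.6 | 90.9 | 56.8 | 35.1 |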

There was almost no variation between individuals in where they were looking at onset moments: all looked at their sheet music virtually all the time (apart from Chorister G, who looked at the conductor in 11.8% of trials). However, there was considerable variation between choristers in how often they nodded their head (range 14-94%). There was a moderate amount of variation between individuals in how much they moved their torsos in the side-to-side and up/down axes, but considerable variation in the forward/backward axis (4-52%). There was considerable variation in how visible each chorister’s breath movements were (range 52-91%). There were two anomalous results: Chorister G moved his torso in the forward/backward axis in half of his individual trials, and Chorister E lifted his sheet music with his hands in 61% of his trials. The conductor performed the same gestures at similar frequencies, apart from: torso movements in the up/down axis (30%); free hand (51%) and mouth/lip (27%) movements, which are associated almost exclusively with the conductor because choristers need to hold their sheet music and sing; and visible breaths, which were less frequent (35%) because he was not singing.

8.4.2 Leadership Analysis

In addition to determining which gestures were being performed at each ‘moment’, I also looked to see which chorister ‘led’ at each moment, by two primary criteria: [i] who moved first (usually with a head nod), and [ii] who made the largest and most ‘directive’ movements. Out of the two criteria, who moved first was treated as the primary cue. However, sometimes involuntary movements seemed to come too early to count as ‘leading’ movements, and therefore I chose a chorister who gestured later as leader for that onset moment based on a balance of the two criteria.

In a small minority of the total number of trials the conductor was visible in the video frame, and did indeed lead the collective movement in the majority of those trials, but my focus here was the hierarchy between the choristers within each half-choir. It was impossible to determine whose gestures the choristers were attending to because all of them relied on their peripheral vision in virtually every moment due to the need to look at their sheet music, and there was no way of knowing whether they were attending to movements from their fellow choristers sideways, or movements from the conductor in front of them.

Only trials in which all members of a choir were visible in the video frame were used. The only sub-set of seven trials in which some of the choristers were in a different setup to their normal Choir I & II configurations were those of the upper voices choir on Easter Sunday. This may have had an effect on the dynamics of leadership, but for simplicity I have not analysed this aspect. Table 8.5 below shows which individuals ‘lead’ the movement most often, and also their differing percentages based on whether the onset was at a ‘silence’ or an ‘adjoining’ moment.


Table 8.5

| Choristers | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| Total no. of trials | 84 | 50 | 50 | 84 | 43 | 77 | 50 | 77 |
| Total lead (%) | 22.6 | 6.0 | 4.0 | 51.2 | 48.8 | 9.1 | 44.0 | 14.3 |
| No. of silence, intra-group moments trials | 52 | 26 | 24 | 52 | 22 | 48 | 26 | 48 |
| Leader in silence, intra-group moments trials (%) | 15.4 | 0.0 | 3.8 | 61.5 | 45.5 | 10.4 | 46.2 | 12.5 |
| No. of adjoining, inter-group moments trials | 32 | 24 | 24 | 32 | 21 | 29 | 24 | 29 |
| Leader in adjoining, inter-group moments trials (%) | 34.4 | 12.5 | 4.2 | 34.4 | 52.4 | 6.9 | 41.7 | 17.2 |

It is clear from Table 8.5 that some people led more than others, but all choristers led at some point, and that in some cases their propensity to lead varied according to whether the moment was a ‘silence’ or ‘adjoining’ moment. The leaders in Choir I were choristers E & G, and in Choir II, Chorister D, and less so, Chorister A. The leadership values across all choristers roughly matched their individual experience and confidence with Gregorian Chant (see Appendix 2).

8.4.3 Collective Gestural vs. Vocal Onset Synchrony

In addition to judging who was the leader in each video clip of each onset moment, I also made a judgement of a value between 1 and 5 (1, 1.5, 2 etc.) about the overall collective synchrony of the half-choir at each onset moment in terms of their visible gestures and their audible sound onsets. If all choristers moved or sounded at the same time I gave a score of 1, if each chorister moved or sounded at a different time to each other chorister then I gave a score of 5. I made both gestural and aural judgements intuitively based on many years of Anglican psalm singing experience, and a few years’ experience of Gregorian chanting (I was a choral scholar in Trinity College Chapel Choir, Cambridge during my undergraduate degree).

I made judgements on collective gestural synchrony based on the synchronous movement of head nods between the choristers, because these seemed to be the most salient ‘marker’ gestures for temporal coordination (see Ch. 7.2.4). It is hard to know from which stage of the collective nod movement one should measure synchrony, i.e. the beginning or the end of the nod movement. I varied this from moment to moment depending on what seemed right in that particular context.

One thing I did notice was that a ‘domino effect’ seemed to occur for certain onset moments, where there is a kind of Mexican wave from one end of the line of choristers to the other. This could be explained by the fact that each person is paying attention to the person next to them; however, in the majority of cases it was a staggered nod, i.e. not in a line from one person to the next (see clip 15). Both cases could be explained by Potts’ chorus line hypothesis (see Ch. 5.5.1) which suggests that humans observe prior movements by others in a group in order to time one’s own, although in the ‘staggered nod’ examples who moved when varied in different instances. Therefore, I judged ‘Mexican waves’ of both kinds as non-synchrony, even though one might equally judge them as entrainment of some form, even if not exact synchrony.

Also, often only one or two people gestured (e.g. nodded their head) at the onset moment out of a group of four choristers—e.g. often the ‘leaders’ mentioned in the previous section (see clips 16 & 17). On these occasions I based my judgement on the relationship between these clear leading movements and the corresponding subtle movements of the other choristers in the half-choir. Like the ‘leader analysis’ data, in some of the trials not all choristers were in the video frame and therefore not all gestures were visible, and so these trials were discarded.

The values obtained are obviously abstract for the reader here, because it is difficult to imagine what a score of 2, 3.5 or 5 in synchrony actually means. I found, in general, that there was a narrower range of absolute time duration for the asynchrony between individual vocal onsets than that of the accompanying collective gestures, as one might expect, given that the sound, not gesture, is the most important aspect of chant and therefore requires more precision in its timing (see Ch. 7.3). The perceived range of timing errors reflected in scores 1 to 5 is correspondingly smaller in the collective synchrony of vocal onsets as opposed to gestures.

Overall, the average perceived value of collective gestural synchrony in 124 trials was 3.30, and of collective vocal onset synchrony in 124 trials was 2.45. A t-test (two-tailed, unequal variance, n=124, p = 2.12E-09) revealed a highly significant difference between the scores for collective synchrony of gestures and sound onsets. In adjoining conditions the difference between the collective gestural synchrony (mean 3.09) and the collective vocal onset synchrony (mean 2.71) was of marginal significance (t-test two-tailed, unequal variance, n=54, p = 0.068), but was highly significant in silence conditions (n=70, p = 2.25E-10), for which the collective gestural synchrony mean was 3.47 and the collective vocal onset synchrony mean was 2.25. However, if the two subjective ranges of collective gesture and vocal onset synchrony scores had been scaled in terms of absolute asynchrony (see above) these differences would have been even more significant. This weak relationship between gestural movements and vocal onset synchrony might be explained by the fact that more than just gestural cues are required for a choir to start on time: for example, metrical entrainment and audible breathing too.

Collective gesture and vocal onset scores differed depending on whether they were associated with ‘silence’ or ‘adjoining’ moments. A two-tailed t-test (unequal variance) revealed a marginally significant difference between the ‘silence’ and ‘adjoining’ data sets for the collective gesture scores (p =0.071), and a highly significant difference for the vocal onset scores (p=0.009). Therefore, the collective gestures were less synchronous in silence moments than in adjoining moments, which may have been because the increased sound intensity level resulting from the other choir’s singing in inter-group ‘adjoining’ moments makes audible breaths harder to hear than in the silent pause of a ‘silence’ moment, possibly making gestural synchrony more important at ‘adjoining’ moments. The opposite was true for vocal onsets, which were significantly more synchronous in silence moments than adjoining moments, and this might be because audible breaths are possibly the most important coordination cue. However, it may be difficult to show the relationship between audible breaths and vocal onset synchrony because the pause between them may not be regular; in any case, I could not test either of the above hypotheses due to the lack of individual microphones (see 8.3 and 8.7 for further discussion).
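For readers unfamiliar with the unequal-variance (Welch) t-test used for these comparisons, the following is a minimal sketch of how its test statistic and degrees of freedom are computed. The function name `welch_t` and the two lists of ratings are hypothetical and invented purely for illustration; they are not the ratings collected in this study, and the sketch is not a record of the software actually used.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Sample variances (n - 1 denominator), each divided by its sample size.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical 1-5 ratings, invented for illustration only.
silence_scores = [2, 2.5, 2, 3, 1.5, 2.5]
adjoining_scores = [3, 2.5, 3.5, 2, 3, 3.5, 2.5]
t_stat, dof = welch_t(silence_scores, adjoining_scores)
print(round(t_stat, 2), round(dof, 1))
```

Converting the statistic into a two-tailed p-value additionally requires the cumulative t-distribution, which a statistics package would normally supply (for example, SciPy's `ttest_ind` with `equal_var=False`).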

The low number of trials per day of rehearsal meant that a comparison of the average values as the week of chanting progressed is less valid (see Table 8.6 below). Nevertheless, there seemed to be improvement in gestural and vocal synchrony within Choir II from Monday to Friday, but not within Choir I. The highest level of vocal synchrony was shown by the ‘upper voices’ choir on Easter Sunday.

Table 8.6

Day                Mon    Thu    Fri    Mon    Fri    Sun
Choir              I      I      I      II     II     U
No. of trials      14     10     21     30     42     7
Average score (range of 1-5)
Gesture            3.18   3.30   3.12   3.83   3.01   3.50
Vocal Onset        2.68   2.23   2.50   2.92   2.21   1.57
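As a consistency check on Table 8.6, the per-day averages can be combined into trial-weighted means. The sketch below uses only the values printed in the table; the helper name `weighted_mean` is an illustrative assumption. Weighting each day by its number of trials gives 124 trials in total and, to two decimal places, the overall means of 3.30 (gesture) and 2.45 (vocal onset) reported in 8.4.3.

```python
# Values transcribed from Table 8.6: (day, choir, trials, gesture, vocal onset).
rows = [
    ("Mon", "I", 14, 3.18, 2.68),
    ("Thu", "I", 10, 3.30, 2.23),
    ("Fri", "I", 21, 3.12, 2.50),
    ("Mon", "II", 30, 3.83, 2.92),
    ("Fri", "II", 42, 3.01, 2.21),
    ("Sun", "U", 7, 3.50, 1.57),
]

def weighted_mean(rows, column):
    """Trial-weighted mean of one score column (3 = gesture, 4 = vocal onset)."""
    trials = sum(r[2] for r in rows)
    return sum(r[2] * r[column] for r in rows) / trials

total_trials = sum(r[2] for r in rows)
gesture_mean = weighted_mean(rows, 3)
vocal_mean = weighted_mean(rows, 4)
print(total_trials, round(gesture_mean, 2), round(vocal_mean, 2))
```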

8.5 Metrical perception

In addition to examining the dynamics of leadership and the role of gesture in onset synchrony, I also investigated whether performed Gregorian chant has a regular pulse (see 8.1 and Ch. 3.2.1). I also examined the metrical relationship between the syllable timings of the first half-verse of each verse and the silent pause that followed on immediately after, just before the second half-verse onset. The reason I did this was to probe deeper into the finding that the collective gestural synchrony was less precise than the sound onset synchrony at the intra-group onset (see moment [ii] in 8.3). I wondered whether choristers were ‘counting in their head’ during the silent pause based on the metre of the previous half-verse in order to start the next phrase together in synchrony. If this was the case then it might offer another explanation by which collective vocal onset synchrony is possible in Gregorian chant, even though gestural signals are likely to play a role in synchronisation given the amount of collective gestural synchrony the choristers often exhibited.

At the adjoining moment, where one choir takes over immediately after the other choir without overlap, it would make sense that the choir taking over needs to time their onset by perceiving the metre of the previous half-verse sung by the other choir, thus allowing them to anticipate the precise ending of the phrase. However, rather than examining both ‘silence’ and ‘adjoining’ moments, I focused on the intra-group ‘silence’ moments. This was because the conductor taught us to count ‘two clicks’ through the pause in order to synchronise the following onset, and I wanted to know what those two clicks refer to (see clips 4 & 6). This meant examining the pulse in the preceding sung half-verse before the period of silence, and the pulse during the period of silence itself. Although a comparison between inter- and intra-group moments would have been useful, the ‘adjoining’ gap of silence that comes at an inter-group moment was too short to conduct a meaningful study for metrical perception of this gap.

8.5.1 Method

In order to perceive the metre I tapped along to each sung syllable during the half-verse, and then maintained the pulse (if there was one) into the silence until the choir sang the first syllable of the next half-verse in order to ‘measure’ the strength of the metre, in terms of whether the onset came on the downbeat, or another strong beat. I made all judgements intuitively based on my previous choral experience (see 8.4.3). The downside of intuitive judgement is that the criteria are not prescriptive; the upside is that for ‘special cases’ I can make a judgement without needing to contravene a particular system of criteria. For similar reasons, Lucas et al. (2011), who were analysing group drumming in real-life contexts (see Ch. 7.5), measured ‘perceived tempo [by tapping] rather than by drum onsets, which were impossible to extract from these recordings because each track contains the sounds of many drums playing simultaneously’. In this study there were also multiple choristers and they did not have individual microphones, so I too decided that I would tap to the metre of the collective chanting as I perceived it, and make a subjective judgement about each half-verse’s overall metrical consistency (see 8.5.2 for a test of my accuracy). Even though this may seem unsatisfactory, the fact that the conductor told the choristers to count two ‘clicks’ through the pause (implying that they should maintain their perception of the pulse in the previous half-verse), and that I am an experienced choral singer, would suggest that perceived metre was an appropriate measure in the context of this study. In the Lucas et al. (2011) experiment, it was the individual most experienced with the repertoire who did all the tapping, which was Lucas, so I also believe that having only a single person’s perspective is sufficient, though obviously not ideal.

In general terms, for strong ‘metrical consistency’ I listened for the presence of a regular isoperiodic pulse, and a metre (e.g. 4/4 or 3/4) if there was one; and for weak metrical consistency I listened out for both a lack of isoperiodicity and a highly changeable pulse, where the pulse seems to change several times over the course of the half-verse. If a sung half-verse had a perfectly regular pulse for its entire length then it would be given a score of 1; if it had a pulse that was constantly variable then it would be given a score of 5.

For the second judgement, and in the context of the pulse of the sung half-verse which often had a metrical feel, I finger-tapped through the silence to determine how ‘metrical’ the silence was by judging whether the first syllable of the next half-verse felt like it started on a ‘strong’ beat metrically. If the first syllable of the next half-verse landed precisely on a strong ‘downbeat’ I would give it a score of 1, and if it landed imprecisely on a weak beat, then I would give it a score of 5.

One challenge was that the periodicity often changed during each sung half-verse and therefore it was difficult to know how to judge the metre of the silence. For example, the cadential melodic formulas at the end of the sung half-verse (see 8.1 above) tended to slow the tempo of the reciting tone, and did not always have a regular pulse even if the reciting tone itself had been isoperiodic. The way I coped with this difficulty was to continue tapping the pulse of the reciting tone (from whenever the last section of stable periodicity started) through the cadential formula and silence as best as I could; but, of course, this may not be how the choristers do it. However, I had to assume that choristers were recalling the previous general tempo of the reciting tone, as opposed to the tempo of the few syllables associated with the cadential formula, in order to make sense of the metre of the silence.

8.5.2 Pilot test

In order to compare my overall subjective judgement of ‘metrical consistency’ with the specific timings of each syllable, I isolated a few examples from the ‘Easter Sunday’ trials (see Extracts 1-3 of App. 3) and used the software program ‘Praat’ to determine the specific timings. Unfortunately, it was not possible either to determine periodicity using the automatic beat detection feature on the program ‘Sonic Visualiser’, or determine where the syllable onsets occurred on the intensity (dB) graph in Praat. However, it was possible to slow down the audio files sufficiently for each example on Praat in order to record the timings of the syllable onsets. This method too required me to use subjective judgement, and was therefore not the objective verification I wanted, but the process was comparatively more accurate due to the slowed-down audio file.

I recorded the timings of each syllable during the phrase and, by calculating the time intervals between the syllables (i.e. the inter-onset intervals, or ‘IOIs’), I determined the mean and standard deviation for each set of IOIs. Prior to using Praat, I had subjectively rated these three sung half-verse examples as ‘2’, i.e. almost isoperiodic. The three examples had standard deviations of 12ms, 9ms, and 10ms, respectively. The consistency between the three standard deviation values shows that my subjective ratings for regularity of pulse in the sung chant were consistent with the fairly precise timing data I was able to obtain, albeit subjectively.
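The IOI calculation described above can be sketched as follows. The onset times are hypothetical values invented for illustration (they are not the Praat measurements from Extracts 1-3), and the function name `ioi_stats` is likewise an assumption.

```python
import statistics

def ioi_stats(onsets_s):
    """Return (IOIs, mean, SD) in seconds for a list of syllable onset times."""
    # Each IOI is the gap between one syllable onset and the next.
    iois = [later - earlier for earlier, later in zip(onsets_s, onsets_s[1:])]
    return iois, statistics.mean(iois), statistics.stdev(iois)

# Hypothetical onset times in seconds, not taken from the Praat measurements.
onsets = [0.00, 0.41, 0.80, 1.22, 1.61, 2.02]
iois, mean_ioi, sd_ioi = ioi_stats(onsets)
print(f"mean IOI = {mean_ioi * 1000:.0f} ms, SD = {sd_ioi * 1000:.0f} ms")
```

A small standard deviation relative to the mean IOI corresponds to a nearly isoperiodic half-verse; the sample standard deviation (n - 1 denominator) is used here, which is a choice of this sketch rather than something stated in the text.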

In order to test the accuracy of my subjective scores relating to the ‘metrical consistency’ of the silent pause, I measured the duration of time between the onset of the final syllable of the sung half-verse and the onset of the first syllable of the next half-verse, and divided this time interval by the mean of the previous set of IOIs, which I took as an indication of the ‘beat’. The subjective non-Praat ‘silence’ scores I gave for the three examples were 2, 4, and 4, and the number of beats calculated for each silence was 6.5, 6.9 and 6.2. Each sung half-verse example had a different number of syllables (9, 12 and 13), so I rounded the silence beats to 7, 7 and 6 respectively, and added the half-verse and silence numbers together (i.e. 9+7, 12+7, 13+6), which resulted in 16, 19, and 19. The sung text seemed to be in 3/4 metre in the first example, and 4/4 metre in the latter two examples. Applying the respective metres to each of the total beat values of 16, 19, and 19 revealed that the first example fell on the strong beat 1 of 3, and the other two examples fell on the weak beat 3 of 4. This fitted with my judgement of 2, 4, and 4 for the silence scores. It seems from the above pilot tests that although my non-Praat subjective methodology could be said to be flawed in the sense that it was subjective and only had one person (myself) making the judgements, an analysis using Praat showed that it was a relatively reliable method of scoring ‘metrical consistency’ at both points.
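The beat arithmetic in the paragraph above can be traced step by step. The sketch below is one possible reading of the procedure: it assumes that beats are counted from 1 at the first syllable, that the silence length is rounded half up, and that the position within the bar is the running count taken modulo the number of beats per bar. The function name `onset_beat` is hypothetical.

```python
import math

def onset_beat(n_syllables, silence_beats, beats_per_bar):
    """Return (total beat count, position in the bar) for the next onset."""
    # Round half up, as the text rounds 6.5 to 7 (Python's round() would give 6).
    rounded = math.floor(silence_beats + 0.5)
    total = n_syllables + rounded
    # Beat 1 is the downbeat, so convert the running count to a 1-based position.
    position = (total - 1) % beats_per_bar + 1
    return total, position

# (syllables, silence length in beats, assumed beats per bar) for Extracts 1-3.
examples = [(9, 6.5, 3), (12, 6.9, 4), (13, 6.2, 4)]
for syllables, silence, metre in examples:
    print(onset_beat(syllables, silence, metre))
```

Under these assumptions the first extract lands on beat 1 of 3 and the other two on beat 3 of 4, which agrees with the strong-beat and weak-beat outcomes described in the text.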

8.5.3 Main results

Overall, the average perceived value of metrical consistency during the sung half-verse was 2.58 (across 124 trials), the average perceived value of the ‘metrical consistency’ through the silence before the following half-verse’s onset was 3.09 (across 124 trials), and a t-test between these two values showed that they were significantly different (2-tailed, unequal variance, n=124, p=5.90E-05). Therefore, when a choir was singing one would be more likely to perceive a metre, but less likely to perceive the onset of the next half-verse as falling on a downbeat if one were to carry the previously-established metre through the silence.

As mentioned above, the conductor suggested to the choristers that counting ‘two clicks’ in one’s head during the silence was a good way of knowing when to start together in synchrony after a silent pause. It would seem from these findings that counting two clicks may not be the most reliable way of achieving this, given that the mean of 2.58 represents a half-way point between isoperiodicity and aperiodicity, and therefore it could be said that the sung half-verse rarely exhibited stationarity (see Ch. 1.2). This meant that in the average verse there is likely to be no established metrical framework to which the ‘two clicks’ could refer.

The average values for perceived metrical consistency throughout the week are displayed in Table 8.7 below, but there are no clear trends either within or across the three different choirs.

Table 8.7

Day/Choir                  Mon I   Thu I   Fri I   Mon II   Fri II   Sun U
No. of trials              9       26      13      14       40       21
Average score (range 1-5)
Metre in sung half-verse   2.50    2.88    2.96    2.61     2.25     2.64
Metre through silence      2.40    3.13    2.58    2.82     3.23     3.62

8.6 General Discussion

This chapter started with a couple of observations that Gregorian chant was ‘characterised by a free-flowing rhythm’ (Chen, 1983:86) and that it had ‘no beat’ (Hiley, 2009:xvii). I would argue that this study shows that these statements are both true and not true at the same time. It was found that on a subjective rating scale of 1 to 5, the perceived metre of Gregorian chanting was 2.58; i.e. roughly half-way between perfect isoperiodicity and a total lack of isoperiodicity. The finding also tallies with the consensus among the choristers that each syllable in Gregorian chant should have the same duration, hence a regular pulse, but that the chant should also reflect the ‘free-flowing’ spoken rhythm of Latin, hence an irregular pulse. Thus, the findings of this study and those of another joint speech entrainment study suggest that entrainment can occur without a regular periodicity, which implies that the current definition of entrainment as dependent on periodicity needs to be modified (see Chs. 1.2 & 7.6, and Cummins, 2002, 2009, 2013; see also London, 2012, on non-isochronous metres in Ch. 3.2.1; it is also worth saying that the degree of isochrony demonstrated in the video clips in Appendix 3 is not representative of the total data corpus).

This finding also fits Crocker’s view, from a musicological perspective, that a pulse exists in Gregorian chant, but one that is weak and constantly varying (see 8.1 & Ch. 3.2.1, Crocker, 2000:44,53). Crocker’s suggestion that the rhythm of Gregorian chant shows evidence of grouping might offer a possible explanation for this kind of half-regular pulse, and he points out that ‘long before European musicians developed a notation for metre, they had a notation that showed various kinds of grouping of the pitches used in Gregorian chant’ (Ibid. 45).

Crocker argues that in modern-day performance ‘the most basic way in which the pitches of Gregorian chant are grouped is by the Latin syllables to which they are sung’, and that although very few people speak Latin, ‘the sound [and therefore the rhythm] of Latin syllables is familiar in European languages, including English, many of whose syllables are derived from Latin’ (Ibid. 45-6). It makes sense that if you are chanting in a stylised form of speech rhythm then syllables with long vowels will tend towards a longer duration than syllables with short vowels—e.g. ‘o’ in ‘note’ rather than ‘not’—but it would seem that Crocker does not just mean distinguishing syllable-to-syllable between long and short syllables, but grouping multiple syllables together. However, it is unclear whether Crocker’s ‘groups’ refer to groups of syllables, or groups of words, and he does not suggest systematic criteria by which to test his theory. In this case, perhaps empirically testing the notion of rising and falling groups of two or three syllables associated with the ‘figural’ interpretation in 8.1 might be one means of clarifying the debate surrounding rhythm in chant.

The beginning onset of a verse of psalmody was chosen as a focal point by which to investigate group entrainment behaviour from different angles. Gestural analysis showed that choristers looked at their sheet music in 97% of all the onset moments analysed; i.e. any visual perception had to be peripheral. Their primary visual gestures for temporal coordination were head nod gestures (62% of onsets) and visible breaths (76%). Head nods are at eye level and therefore probably more effective in visual communication, and they are also more instrumental, given that visible breaths are a relatively natural result of breathing. The ‘gross’ torso gestures did not occur as frequently as the ‘fine’ movements of head nods across all choristers, and also hands in the case of the conductor (see Ch. 7.2.1 for definition of ‘gross’ and ‘fine’ gestures). This is probably because ‘fine’ gestures are more temporally precise and therefore more useful for the purpose of coordination. Having said that, the head nod movement was accompanied by a ‘gross’ backward/forward and/or up/down movement of the torso in a fifth of the trials.

Head nodding also seemed to increase as the week went on, given that the highest frequency of head nods occurred on the final day of Holy Week, in 87% of onsets. The same was true for visible breath gestures, which, on the final day, occurred in virtually every silence and adjoining moment. As we will see below, these figures correlate with an improvement in vocal onset synchrony too. However, it is hard to say with confidence in both cases that this was due to the cumulative result of repeated rehearsal, rather than the fact that the choirs only split into upper and lower voices on the final day. For example, Chorister G, one of the leaders in Choir I (see 8.4.2), had moved into the upper voices choir, thus combining with the leaders A & D in Choir II, resulting in a higher proportion of choristers who gestured frequently in the upper voices choir (3 out of 5 choristers).

Indeed, there was extreme variation between individual choristers in how often they nodded their head (14-94%) and significant variation in how visible each chorister’s breath movements were (52-91%). Gesturing patterns were therefore not unanimous—choristers did not all move in the same way at onset moments. The conductor performed similar gestures with a similar frequency as compared with the choristers, apart from a higher frequency of up/down torso movements, and conductor-specific hand and mouth movements. Interestingly, his movements were generally larger and more ‘directive’ than the rest; a detail which was hidden by the data, which only said whether a gesture happened or not (see clip 11). However, it was hard to make generalisations about the conductor’s influence due to the fact that he appeared in a small number of data trials in comparison with the singers, and often the conductor and singers were not in the same video frame.

In terms of leading others with visible gestures, there seemed to be roughly two ‘leaders’ per choir who consistently made the first gesture at onset more than others, which seemed to be roughly correlated with what some of the choristers said about their own and each other’s confidence and competence (see interviews in 8.2.1 and App. 2). By contrast, some of the choristers said that they ‘followed’ whoever they happened to be sitting next to, which would seem to challenge the idea that group hierarchy was determined exclusively by confidence and ability.

I found that my subjective ratings for collective gestures at onset were significantly less synchronous than for the collective sounds at onset. One must also take into account that if my ranges of subjective ratings had been scaled according to absolute values of asynchrony, then the difference would have been even more significant (see 8.4.3). This finding shows that there is more collective asynchrony between the individual gestures of choir members at onset than between their corresponding sounds. Having said that, for the majority of onsets there did seem to be a tendency, perhaps even a need, for all choristers to contribute gesturally in some way in order that the group coordinated. These gestural contributions may have been for the purpose of actively negotiating with others in order to synchronise the onset, but they may also have played a more confirmatory function, i.e. letting others know that temporal cues had been perceived, which could explain why collective gesturing was often asynchronous. However, this specific explanation was not tested in this study. Either way, this dynamic process has a very short duration (c. <1sec) and therefore it is hard to know how conscious individual choristers are of their own actions and how they relate to the actions of others.

One interesting finding in the individual gesture analysis was that choristers nodded their head more frequently in ‘adjoining’ onsets than ‘silence’ onsets, which suggests that inter-group coordination with no gap of silence requires more gesturing than intra-group coordination with a two-second gap of silence. This perhaps relates to the findings that ‘adjoining’ moments showed greater collective gestural synchrony, and that ‘silence’ moments showed significantly greater collective vocal onset synchrony. In 8.4.3 I argued that both these findings might be due to the varying audibility of breathing in the different kinds of onset. Another reason might be that the cadential formula at the end of the phrase just before the ‘adjoining’ onset is not rhythmically predictable, because it sometimes maintains the tempo of the rest of the half-verse and sometimes slows down in tempo. This variability in tempo, coupled with the inaudibility of the breathing of the fellow choristers, might mean that the choir coming in would need to compensate at ‘adjoining’ moments for the fact they don’t know exactly when the other choir will finish, by performing more frequent and more synchronous coordinating gestures (e.g. head nods). However, the ‘silence’ moments are also affected by the variable tempo of cadential formulas (see 8.5.1), and we do not know how audible breathing affects entrainment, so it is quite possible that neither explanation of the above findings is viable.

As mentioned above, I found that some form of ‘half-regular pulse’ exists in the average half-verse of Gregorian psalmody. Given that there was a significant disparity between the collective gesture and vocal onset synchrony scores, I wondered whether choristers were coordinating at ‘silence’ onsets by counting a metrical pulse that they had established during the previous half-verse through the silence, with a view to coordinating the following vocal onset by coming in on an imaginary, but shared, strong metrical beat. I found that the perceived pulse was more ‘metrical’ during the sung half-verse than during the silence, which would suggest that onset synchrony is not fully dependent on metrical perception, or ‘counting in your head’ throughout the silence.

8.7 Future Directions

The opportunity of spending a week with a choir and singing with them had advantages; for example, because many of the choir did not regularly perform Gregorian chant, they were learning how to chant at that very moment and so were able to give explicit accounts in their interviews, as opposed to them being so proficient that they were not very conscious of how they did it. Also, as a participant observer, I had myself been immersed in the group dynamics, and both this and my getting to know each individual was helpful when it came to interviewing them.

However, due to the ad-hoc nature of an empirical study in a real-life performance context as a participant observer, there were some weaknesses to this study. For example, it is clear from the lack of detail afforded by a single collective gestural synchrony score that more detailed quantitative work needs to be done, which was restricted by the fact I was not allowed to use motion capture equipment in the religious context even if I had been able to acquire it.

An example of this quantitative work might be examining whether the timing of each chorister’s gestures has a predictable relationship with the sounds they make, such as discovering whether a particular chorister’s head nod always comes at a certain time-interval before their vocal onset. One could then compare the timing of each chorister’s gestures and sounds with those of the other choristers. In order to test this correlation, one could perform a multiple regression, taking the gestural onsets as the predictors and looking at the extent to which they predict the timing of the vocal onsets, or, indeed, vice versa. If it turned out that the group of choristers all share a similar intra-individual timing profile then there might be some predictability to the way that choirs negotiate an onset gesturally; i.e. if the whole group waits until the final person has made their ‘confirmatory gesture’ (see 8.6 above) then each individual in the group would know roughly how long to wait before making their sound, based on the average intra-individual time delay between gesture and sound. One is also likely to find that the more synchronous the collective gesture is, the more the complexity of the intra-individual timing interactions is reduced, making it easier to synchronise the onset. Furthermore, it might also be useful to investigate the spatial characteristics of each individual’s gestures, because these might have a subtle influence on entrainment. Investigating these questions above would require motion capture equipment and individual microphones for each chorister, and would therefore need to be performed in a controlled laboratory context.
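The regression proposed above could begin from something as simple as the following single-predictor sketch, which is a deliberate simplification of a multiple regression. It fits a least-squares line predicting one chorister's vocal onset times from their head-nod times. All times are hypothetical, invented for illustration, since no per-chorister timing data were collected in this study; the name `fit_line` is also an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical times (s) for one chorister across five onsets.
nod_times = [10.2, 25.7, 41.1, 56.9, 72.4]
voice_times = [10.5, 26.0, 41.5, 57.2, 72.7]
intercept, slope = fit_line(nod_times, voice_times)
# The gesture-to-voice lag for each onset, and its mean across onsets.
lags = [v - g for g, v in zip(nod_times, voice_times)]
print(round(slope, 3), round(sum(lags) / len(lags), 3))
```

A slope close to 1 with small residuals would indicate that the nod precedes the voice by a stable interval; comparing the mean lag and its spread between choristers would then give the intra-individual timing profiles discussed above.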

Another aspect of entrainment in psalmody that will need to be investigated is the top-down ‘emergent structure’ of subjective metre perception within the group (see Ch. 5.5). Given that the pulse of Gregorian psalmody is fairly irregular it offers a lot of scope for interpretation, and therefore it is important to know the extent to which a group of choristers agree on a ‘shared’ pulse. If there was significant disagreement then one could perhaps discount the hypothesis that perception of a shared pulse in psalmody is necessary for entrainment; indeed, we already know that a pulse is not needed for two speakers to synchronise their speech (see Ch. 7.6). Testing the perceived pulse in a live performance of psalmody would have involved asking each and every chorister to ‘tap’ the beat on MIDI drums whilst singing (see Himberg, 2013). However, tapping during performance might disrupt natural singing behaviour and therefore it might be better to ask the group of choristers to tap along to a recording of their own psalmody outside of the performance context. Alternatively, metre perception could have been tested using a more ‘objective’ method such as EEG to explore the temporal course of brain activity, although it might not be possible to abstract information about metrical perception and production from the noisy data that would inevitably result, given that EEG traces would be a mixture of motoric and perceptual signals.

By measuring metre perception by either tapping or EEG methods I might have also been able to test my assumption that choristers carried on counting the pulse established in the reciting tone just before the cadential formula through the silent pause (see 8.1), by seeing how the choristers tap through the reciting tone, the cadential formula and the silent pause. I had to make this assumption because the slowing-down of tempo at cadential formulas at the end of each half-verse made it difficult to tap a metre through the silence. It was a reasonable assumption because otherwise it would be difficult to explain what the ‘two clicks’ that the conductor suggested the singers count in their minds during the pause refer to. It may be that the ‘two clicks’ are related to the duration of the inhalation of breath, but the duration is most likely not very consistent. Another explanation might be that singers memorise the duration of the ‘two clicks’ to a high degree of consistency, no matter what the pulse of the previous phrase might have been; although this is very difficult to do, and unlikely to be very precise among a number of singers. Furthermore, whether or not the first syllable after the pause is accented or unaccented may affect when the singers time their onset; e.g. they may start singing earlier after the pause if there is one or more unaccented syllables coming before an accented syllable at the beginning of the next half-verse. However, if these possible explanations were explored further, then the findings in this study might potentially gain more weight, and whilst my study may have been simplistic in this regard, it did nevertheless show that onset synchronisation is helped by a number of factors, including visual gestures.

Of course, investigating synchronisation from the point of view of periodic or metrical organisation may be the wrong way of approaching this kind of ‘imprecise’ entrainment, and Van Noorden (pers. comm.) has suggested that it might be better to investigate how close the tappers are to each other, rather than the pulse. I was doing this when I investigated the collective gestural synchrony at the onset, but this was only at an isolated time point rather than a whole phrase. Thus, in addition to onset synchronisation, it may also be instructive to analyse whether certain syllables in the text are more synchronous than others, which might offer an approach into understanding where the (possibly non-periodic) ‘pulse’ lies (see Ch. 3.2.2). Widdess (pers. comm.) has also suggested that heterometric organisation may be relevant for certain musical traditions, where there is a regular pulse, but at the next level of organisation the groupings of beats are not equal (e.g. a mix of twos or threes), so there is no downbeat coming round at regular intervals. Overall, future experiments might gain more from analysing interpersonal entrainment with an open mind as to whether the music is periodically, or at least metrically, organised.
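The suggestion of measuring how close performers are to each other, syllable by syllable, rather than measuring a pulse, could be sketched as below. The measure used (mean absolute difference over all pairs of singers) is one plausible choice and is not prescribed by the text; the syllable labels and onset times are hypothetical, and the function name `mean_pairwise_asynchrony` is an assumption.

```python
from itertools import combinations

def mean_pairwise_asynchrony(onsets_s):
    """Mean absolute onset difference (s) over all pairs of singers."""
    pairs = list(combinations(onsets_s, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Hypothetical onset times (s): one list per syllable, one value per chorister.
syllable_onsets = {
    "Di": [0.00, 0.03, 0.05, 0.02],
    "xit": [0.42, 0.43, 0.42, 0.44],
}
for syllable, onsets in syllable_onsets.items():
    print(syllable, round(mean_pairwise_asynchrony(onsets) * 1000), "ms")
```

Syllables with consistently low values across verses would be candidates for the points at which the group is most tightly coordinated, without any assumption that those points recur periodically. Obtaining such data would require an individual microphone for each chorister.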

Another aspect to explore would be the peripheral visual perception that choristers have of other choristers as they perform. For example, it would be useful to test the choristers’ observation that peripheral vision was most activated at ‘uncertain’ performance moments. However, measuring what a person is looking at out of the corner of their eye in a real-life group entrainment context is very difficult, given that looking at another chorister might move the eyes but not necessarily the head, and therefore head-cameras would not be suitable. A high-definition camera would be able to show eye movements, and possibly gaze direction, but analysing such data would be very laborious. Even so, it is necessary to investigate visual perception from the chorister’s perspective given that the view from a camera placed in front of the choristers is not representative of what the choristers themselves can see, and although motion capture equipment is very useful for studying gestures from the position of the observer, it tells us very little about what choristers are actually attending to in their vision.

Peripheral vision is the only way choristers can see each other because the choristers in this study, as most other choristers would, focused their gaze on their sheet music at virtually all onset moments, since they were not able to sing the psalms by heart in Latin. Unless choristers are experienced monks who chant the psalms throughout the year, it is unlikely that one can find any for whom chanting psalmody from memory is normal. However, perhaps the dynamics of group coordination, especially in the visual domain, would be radically different if they did not need to look at their music. When singing from memory, unencumbered by an instrument or sheet music, eyes are freed for joint attending and hence the possibility for full, embodied interaction with others (Bithell, 2007:69). If one gave choristers a short psalm to memorise in order that they would be able to look around, one could determine which chorister is looked at most by the other choristers and, thus, the hierarchy of leadership, or, by contrast,
determine whether choristers look for an equal length of time at each individual in the group, demonstrating a leader-less organisation.
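As a rough illustration of how such gaze data might be summarised, the sketch below assumes that the total time each chorister is looked at by the others has already been extracted (the hard part, as noted above). It reports each chorister’s share of the gaze time and a normalised Shannon entropy as one possible evenness index: values near 1 would be consistent with leader-less organisation, values near 0 with a single dominant leader. The function names and the figures are hypothetical, and the choice of entropy as the index is an assumption rather than a method used in this thesis.

```python
import math


def gaze_shares(gaze_seconds):
    """Convert total seconds of gaze received per chorister into proportions."""
    total = sum(gaze_seconds.values())
    if total <= 0:
        raise ValueError("no gaze time recorded")
    return {name: t / total for name, t in gaze_seconds.items()}


def gaze_evenness(gaze_seconds):
    """Normalised Shannon entropy of the gaze shares.

    1.0 means every chorister is looked at for an equal length of time;
    0.0 means all gaze is directed at one chorister.
    """
    n = len(gaze_seconds)
    if n < 2:
        raise ValueError("at least two choristers are needed")
    shares = [p for p in gaze_shares(gaze_seconds).values() if p > 0]
    entropy = -sum(p * math.log(p) for p in shares)
    return entropy / math.log(n)


# Hypothetical totals (seconds of gaze received from the other choristers).
hierarchical = {"A": 40.0, "B": 5.0, "C": 5.0}
leaderless = {"A": 10.0, "B": 10.0, "C": 10.0}

shares = gaze_shares(hierarchical)
most_watched = max(shares, key=shares.get)
```

In this invented example chorister A receives 80% of the gaze time, so the evenness index for the first group is well below that of the second.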

In addition to gesture and pulse, the other cue for vocal onset entrainment is audible breathing. As mentioned previously, because there was no separate audio track for each chorister it was not possible to determine whether they were breathing or singing, and, therefore, one could not examine the intra- and inter-individual relationships between gestures, audible breaths, and sound. A combination of individual audio tracks and motion capture sensors would have allowed us to know whether a chorister had gestured and not sung, sung and not gestured, breathed and not gestured, breathed and not sung, etc. I would hypothesise that the correlation between audible breathing and vocal onset synchrony is not precise, because at the end of the audible intake of air the pause before the vocal onset may often vary in duration, and, therefore, although the audible intake might ‘prime’ the choristers for imminent sound, it in itself might not be enough to communicate the exact moment of onset (see also Apps. 2.3.4 & 2.4).

However, it may also be the case that the singers are entraining to the length of the breath; i.e. each sung phrase is roughly the same duration of several seconds as another sung phrase, because that is roughly how long a breath lasts (Van Noorden, pers. comm.). In other words, one quickens/slows the tempo of each phrase depending on the number of syllables (more/fewer) to get through in each half-verse. Indeed, the ‘flex’ device mentioned in Ch. 8.1 gives credence to this idea, because it seems to be used to split a half-verse into manageable breath lengths. However, it is hard to see how entraining to the overall breath length (representing a higher level of entrainment) can produce the kind of precision of onset synchrony witnessed in this study, especially as breathing is a stochastic process influenced by many factors, and the silent pause did not seem to be regular in duration. Furthermore, if the breath lengths are equal, this raises the question: why were some onsets more synchronous than others? Equally, entrainment at the level of the overall arc of the breath cannot explain at a lower level the precise syllable-to-syllable synchrony that is evident during the course of the sung phrase.

I would imagine that onset synchrony is achieved by a combination of audible breath cues and gestural cues. However, the precise correlation between audible breaths, gestures, and vocal onset synchrony might be difficult to demonstrate given that in one individual alone, before the eventual sound onset, a breath might come before a gesture, or vice versa, and these timing relationships would then interact with other choristers’ own breaths and gestures. To offer a full bottom-up explanation would require us to understand the relationship between every one of these interactions, but the field of entrainment studies has not yet proposed methods for examining the intricate workings of cross-modal networks of interactions, only methods for extracting single measures of group entrainment across the whole system of cross-modal timing interactions (see Ch. 6.4; although see Wing et al., 2014).

Another variable to test would be the size of the choir (see Ch. 8.2.1). The number of choristers in an ensemble is likely to radically change the dynamics of coordination, as discussed in the interviews in App. 2.6. Groups of different sizes could be tested under two experimental conditions, one where the group sings with a conductor, the other without. The dependent variable would be overall collective synchrony, which could potentially be measured statistically using group order parameters that take data from individual audio tracks for each chorister (see Ch. 1.2). The benefit of using continuous statistical measures such as these would be that one would not be restricted to focusing exclusively on onset moments. Having said that, these measures rely on stationarity in the timing data, which may not be appropriate for the irregular pulse of Gregorian chant, and therefore new statistical methods need to be developed that do not rely on stationarity (see Ch. 1.2). However, in musical contexts that exhibit stationarity this experiment would explore two questions: [i] what is the effect of being conducted, or not being conducted, on collective synchrony?; and [ii] how large can a group become before a conductor is required? The influence of the conductor is particularly relevant, and yet this study did not investigate this important aspect of group synchronisation to an acceptable degree of rigour, because the conductor was not in the frame often enough in my video data. For choirs experienced with Gregorian chant it is common for there to be no conductor, but I would imagine that when a conductor does gesture, it is likely that the conductor has an important influence, and that this influence varies on a phrase-to-phrase basis.
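For readers unfamiliar with group order parameters, the sketch below shows one widely used form, the Kuramoto-style order parameter, which is the length of the mean of the singers’ phases represented as unit vectors. This is offered as an assumption about the kind of measure meant; the definition given in Ch. 1.2 may differ. It also presupposes that a phase can be assigned to each singer at each moment, which is exactly the step that becomes problematic when the pulse is irregular. The function name and the phase values are hypothetical.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto-style order parameter for one moment in time.

    phases: one phase angle (radians) per singer.
    Returns r between 0 and 1: r close to 1 means the singers share
    nearly the same phase; r close to 0 means the phases are spread
    evenly around the cycle.
    """
    if not phases:
        raise ValueError("at least one phase is needed")
    mean_vector = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical phases for a three-singer group.
r_aligned = order_parameter([0.30, 0.30, 0.30])
r_spread = order_parameter([0.0, 2 * math.pi / 3, 4 * math.pi / 3])
r_partial = order_parameter([0.0, 0.4, 0.8])
```

Computed at every time step, values of this kind would give a continuous trace of collective synchrony that could then be compared across choir sizes and across the conducted and unconducted conditions.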

8.8 Summary

This study has resulted in a number of provisional findings—provisional because they are primarily the result of my own subjective judgements. It would seem that Gregorian chant displays elements of both regular and irregular periodicity, reflecting a hybrid between even syllable rhythm and speech rhythm, yet, even so, choirs were able to entrain. I also found that the process of synchronising vocal onsets after the silence in the middle of a psalm verse is not dependent on counting a metrical pulse through the silence. These two findings imply that musical contexts which involve an irregular pulse can still be defined as exhibiting entrainment.

It was also found that visual communication had to be peripheral because choristers looked only at their sheet music at the start of verses, and that, at the onset of a verse, collective gesturing was less synchronous than the collective singing. These two findings suggest that vision is not the only form of coupling responsible for synchronous onset. Another potential aspect of coordination, audible breathing, was not investigated. In terms of group hierarchy, there seemed to be two ‘leader’ choristers per choir, determined by who made ‘leading’ movements in the largest number of trials. Thus, it would seem that, rather than being dependent on a single form of perception (e.g. vision, gesture, metre), entrainment in Gregorian psalmody emerges in part from various factors, such as top-down pedagogical instruction and the network of bottom-up interactions between multiple modes of communication.

Chapter 9 - Conclusions

9.1 How does Gregorian chant fit within this thesis?

‘Feelings in common are expressed through actions in common’ (Durkheim, 1995:390).

This quote by Durkheim has been presented twice already, but I place it here
again because it is the central pivot of this thesis. The first tenet of this thesis is that by acting in common a group of individuals feel like they ‘belong’ to each other, and one way of understanding ‘acting in common’ is ‘acting in synchrony’. The second tenet of this thesis is that by exploring how it is possible for a group of individuals to act in synchrony or ‘entrain’, we can gain more of an understanding of feeling and acting in common, both at the level of the individual and at the level of the group.

The first half of this chapter integrates the most salient implications of this thesis within the context of Gregorian chant. The second half of this chapter identifies fruitful future directions in the fields of group singing and group entrainment.

I started in Chapter 1 by introducing various concepts that help to contextualise
Gregorian chant. Gregorian chant is an activity that involves ‘entrainment’, the process by which independent rhythmical systems (e.g. the singers) interact with each other. It also has a high degree of melodic and rhythmic repetition, which formed the basis of my distinction between chant and song. Gregorian chant is also a transregional ‘music’ (cf. Slobin), but one which expresses significant ‘local’ variation between different ‘schools’ of chanting. In terms of Turino’s distinction between participatory music-making, which aims to allow as many people to perform as possible, and presentational music-making, where a select group provides music for an audience, Gregorian chant is a bit of both, but on balance more presentational; in particular, my empirical focus on the degree of onset synchrony derived from the ‘presentational’ requirement of precise synchrony (see Ch. 1.5.3). Finally, Gregorian chant is firmly embedded within the larger context of communal ritual: it is an activity that conforms to, and maintains, a shared code of ordered behaviour; it creates social continuity through the repetition of liturgy; it relates the community to that which it holds sacred; it provides a collective focus that brings the worshipping community together. However, in comparison with the Amazonian Suyá Indians,
Gregorian chant is less about relating the community to its immediate physical environment than to its more abstract ‘environment’ of liturgy, theology and scripture.

In Chapter 2, two aspects that seemed to be associated with almost all of the
traditions I looked at were the ‘together the one’ experience—the experience of feeling ‘sameness’ with a group of people, even if some of those people are outsiders to the group—and the notion that the singing did ‘work’ (in a mental, physical, and spiritual sense). In the choral form of Gregorian chant, individual singers have to entrain with other singers, and thus, according to the interviews and from my own personal experience, this fundamental pull towards entraining with others led each individual to feel unified with the larger whole of the choir itself, and occasionally even at one with God (see Ch. 2.3). The point here is that no matter how much an individual singer is focusing on the textual/liturgical aspect of Gregorian chant, they cannot avoid moving and singing with their fellow chanters, which is arguably as powerful, or even more so, for forming community.

Chapter 3 argued that group entrained action embodies a blurring of an
individual’s boundaries between self and other because their actions are performed in alignment with others. Musical activity is a particularly strong example of entrained activity, and periodicity of pulse in music, often associated with the psychological concept of ‘metre’, can be understood as having a structuring influence on the dynamic mental process that makes entrainment possible. However, in the context of Gregorian chant, it was found that there was a tension created by the need to be both faithful to the speech rhythm of the text (i.e. non-periodic temporal organisation) as well as to chant the text in unison with each other, which would ideally require a periodic pulse.

I found in the study of Ch. 8 that the pulse of Gregorian psalmody was rarely periodic, and observed that if entrainment is based on regular periodicity, then it should be, in theory, impossible to entrain with others when singing Gregorian chant; yet it was found to be possible. This is consistent with other chanting traditions with irregular periodicity such as Gyantse Buddhist chant in Tibet and paghjella singing in Corsica (for other examples see Ch. 3.2.1). Indeed, the synchronous vocal onsets that were observed after the silent pause in the middle of psalm verses did not appear to be dependent on counting a metrical pulse through
the silence, as was taught by the conductor. This finding is reminiscent of the striking example of Malaysian Choral Speaking, where the synchronous vocal onsets of choirs of ‘speakers’ do not seem to need any form of regular pulse (see Ch. 6.1 & App. 1.43). As mentioned in Ch. 8.6, the implication of these findings is that the definition of entrainment needs to be modified to include contexts which involve an irregular pulse. Consequently, we need to develop new statistical methods for analysing entrainment that do not rely on stationarity (see Ch. 1.2).

However, the finding that a periodic pulse is not required for entrainment only
makes things more complicated because in Chapter 4 I described how music’s ability to organise entrained actions through a periodic pulse (amongst other ritual factors) creates the conditions for stable interaction that are required to manage social uncertainty, such as that experienced in the Suyán initiation rite. This kind of pulse was argued to facilitate the music-making that drives certain social contexts associated with both Durkheim’s concept of collective effervescence (characterised by intense collective emotion that is generated both in social contexts in which new forms of social structure can be created and in contexts in which a community is reminded of its moral code) and Turner’s concept of communitas, a transformative experience that reveals to a group of individuals something that is profoundly communal and shared between them (see Ch. 4.2, and cf. the ‘together the one’ experience).

Of course, one would be hard pushed to say that Gregorian chant creates feelings of intense collective emotion in the manner described by Turner and Durkheim. Indeed, the more presentational requirements (reading the music, and thus standing still; needing to be precisely in synchrony with each other), alongside the slowed-down breathing due to the relatively long verses, are more likely to make Gregorian chant a ‘very effective crowd-sedating mechanism’, rather than a generator of effervescence (Van Noorden, pers. comm.).

Indeed, one might argue that the move away from bodily dancing towards a kind of ‘dance of the mind’ at the time of Saints Ambrose and Augustine (d. 397 & 430 AD), presumably with a view to focusing more on the Latin text of the liturgy, led to performance being governed more by non-isochronous speech rhythm than the more isochronous pulse usually associated with dance, and thus to less enthusiasm (see McNeill, 1995:75-77). The point of this historical digression is that in order to understand contemporary chanting practice it may be necessary to understand the
history of the cultural/religious/philosophical trends associated with a given chanting tradition.

It was also argued in Ch. 4 that metre, which operates on many hierarchical levels, also allows for individuality and collectivity to be in tension simultaneously by giving individuals a shared framework to align their actions with, whilst also giving them space for rhythmic self-expression (cf. Suyán Indian akia songs). However, Gregorian chant, in which everyone is expected to make the same sound at the same time—i.e. to blend together as one voice—does not allow for this expression of individuality, but nevertheless has a certain social ‘equalising’ effect. Furthermore, when individuals sing solo Gregorian chant, their singing is subject to certain rules of performance shared by all, to the extent that the opportunity for self-expression is minimal, and certainly not encouraged (in contrast with the Suyán akia song genre).

Gregorian chant is, however, similar to Suyán song in that it is often used in ritual contexts that involve social uncertainty, such as when the community, or an individual within the community, is undergoing change between two states of being (e.g. baptisms, weddings, funerals, taking communion etc.). It was argued that perhaps the purpose of music in these contexts is to bond the community with a feeling of ‘togetherness’ through entrained action, that then becomes the foundation of collective support for the individual and community in the new state of being, even after the ritual has finished. Having said that, Gregorian chant involves less communal participation than Suyán song, because it is usually performed by a select group of singers within the community (i.e. the choir), and the moments where all the congregation chant together do not usually last long enough to engender much of a sense of togetherness.

Finally, I also argued in Ch. 4 that music within ritual can have a powerful vitalising effect in both giving new life to old tradition as well as creating and supporting change within community. However, it is fair to say that Gregorian chant is more about preserving the old traditions than inspiring ecstatic worshippers to bring in revolutionary changes to the liturgy, or the way the community functions.

The juncture between Chapters 4 and 5 constituted a shift from seeing how the
‘together the one’ experience can be described from an anthropological perspective to how the experience is made possible by coordinated embodied action from a scientific perspective. Therefore, as well as studying entrainment in group singing,
this thesis has been about studying how two very different disciplinary approaches can inform each other, even though they offer different levels of explanation. This was necessary for situating my ‘real-life’ empirical study of entrainment within the wider cultural context of Gregorian psalmody. A precedent for this is Lucas et al.’s (2011) study in which the ethnographical finding that groups of musicians in the Congado ritual would consciously intend to avoid entraining with each other, and had particular strategies for doing so, framed the way the researchers analysed their timing data (see Ch. 7). This shows that when analysing a particular entrainment behaviour scientifically, it is important to understand the cultural context within which that entrainment behaviour is observed.

Chapter 5 attempted to integrate the approaches of sociology, complex systems
theory, psychology and animal behaviour for the purpose of exploring group behaviour. Group entrainment often manifests in systems that are very complex, with multiple individuals performing together in real-life contexts, such as Gregorian psalmody. In attempting some form of general explanation of how choral contexts coordinate multiple individual contributions, I argued that a bottom-up ‘sum of the parts’ systems approach, based exclusively on individual interactions between individual singers, is incomplete because performers also need to be guided by a top-down ‘over and above the sum of the parts’ understanding of how their individual role fits into the group performance as a whole.

Potts’ ‘chorus line hypothesis’, which attempted to offer one ‘bottom-up’ explanation for a specific kind of collective synchronous movement based on the information passing from person to person, argued that members at the end of a human chorus line plan the synchronisation of their dancing movements by anticipating an incoming ‘manoeuvre wave’ from other members; i.e. that some initial movement from one or a few individuals starts a very fast chain reaction which results in the process of synchronisation. However, this kind of explanation cannot account for unison movements or sounds that are too synchronous to be based on a fast chain reaction, such as the precise onset synchrony witnessed in Gregorian psalmody.

Furthermore, an examination of the real-life behaviours of animal flocking and schooling demonstrated that a bottom-up computational approach, based on local interactions through visual communication in birds and vision and pressure sensing in fish, is not wholly valid. Also, in both cases, auditory communication probably
cannot explain these behaviours either because of the complexity and multitude of the sound signals (see Chs. 5.2.2 and 5.2.3).

The combination of factors such as speed, complexity, diversity, leader-less organisation, and ‘scale-free’ correlations involved with these collective movements by animals suggests the presence of something like a ‘collective mind’. Indeed, in the context of human activity (although not necessarily human collective movement), there are occasions when people act as a group in such a way that individuals may say something like: ‘I do not know why I did that…it was as if the group took over’. Of course, in all these group interactions, the top-down influence of ‘collective mind’ may have to be complemented by bottom-up coordination, in order that disturbances from individuals within the group do not cause the interaction to break down.

One might argue that exploring the entrainment process in Gregorian psalmody, which uses both bottom-up coordination (gestures/audible breath) and top-down coordination (rules for stylistically-appropriate performance/shared collective pulse) may provide a disciplinary focus for investigating the collective mind metaphor. This is because entrainment often involves a collective mental percept that is responsive to perturbation: the shared pulse. Indeed, the finding that no single ‘normal’ sensory channel could explain on its own how synchronous onsets in Gregorian psalmody were achieved would suggest that there is reason to speculate that something like a collective mind is operational (see Ch. 9.2). Of course, it is plausible that some form of integration between the normal sensory channels exists which is responsible for precise synchrony, and which does not require an explanation involving a new channel of communication (i.e. a collective mind). However, this was not tested because there is currently no model for analysing the network of cross-modal interactions in group entrainment (see Ch. 8.7).

Chapter 6 argued that shared aspects like the ‘beat’ or musical metre are most
accessible and measurable by investigating body movements, and that the ‘embodied’ approach is therefore perhaps the most reliable empirical approach for studying entrainment; this was accordingly the primary focus of the study of Gregorian psalmody. Embodied coordination was separated into two forms: planned and emergent. Planned coordination refers mainly to models of top-down coordination termed ‘shared task representations’ that guide individuals in their predictions of the timings of other individuals, allowing them to entrain with those individuals. In
the context of Gregorian psalmody, planned coordination might refer to pedagogical instruction, whereas emergent coordination, its opposite, might refer to the network of non-linear, unpredictable, cross-modal interactions between individual singers, but there is currently no model for integrating top-down and bottom-up coordination (and see also Ch. 6.5 for a summary of problems associated with investigating both planned and emergent coordination). Although no integrative model was formulated in this thesis, the Gregorian case study was an attempt to integrate these two explanatory approaches to some extent by finding meeting points between what choristers say they do in interviews, and what they actually do in video data.

Chapter 7 focused on how emergent coordination is expressed through gestures
that are communicated via the visual channel, and how planned coordination is expressed by the way that group hierarchy affects coordination in musical performance.

Visual contact with moving bodies was argued to be useful in most forms of entrainment, particularly at stops, starts, and transitions within musical pieces, which is why visual gestures were investigated with regard to the onset moment in Gregorian psalmody. The auditory channel is also an obvious means of coupling, and, less obviously, the touch channel too, but these were not discussed. I found in my empirical study of Gregorian psalmody that each chorister’s visual communication was only peripheral, which is supported by Kawase’s finding that direct eye contact does not improve coordination as much as being able to see movement (see Ch. 7.3.2). Visual gestures in the form of head nods were less synchronous than the collective sound onsets, suggesting that collective gestural entrainment was not the only factor responsible for vocal onset synchrony; for example, it is very likely that audible breathing plays a significant role too. However, gestural entrainment would seem to be an important feature of other group contexts that exhibit precise musical entrainment; e.g. Malaysian choral speaking, Indian tanpura performance, and Sotho singing in South Africa (see Chs. 6.1 & 7.2.3). Nevertheless, it is impossible to isolate any single sensory channel as responsible for entrainment behaviour, because entrainment seems to combine multiple channels most of the time.

In terms of the top-down effect of group hierarchy on entrainment processes, certain performers are looked towards, listened to, or respected more than others in
most musical ensemble contexts. However, the Goebl & Palmer piano duo study suggested that even if musicians are told to act as either leader or follower they still interact mutually with each other, evidenced by a bi-directional timing relationship between their sounds; having said that, the study suggested a clear leader-follower relationship in the gestural domain. Certainly in the case of Gregorian psalmody, although some individuals were observed as having ‘led’ gesturally more than others, all singers led at some point; therefore the data in my study would support both of these findings. Whilst the power hierarchy would seem to play a significant role in group musical entrainment, mutuality of timing coordination can be observed in both intra- and inter-group contexts (e.g. the two different types of ‘moment’ in the Gregorian study), and it would seem that mutuality is a central aspect of entrainment behaviour. Overall, the results of my study, which come from the interviews with the choristers and the empirical video analysis of ritual performance, imply that group entrainment in Gregorian psalmody, and possibly entrained musical action in general, is the result of an interaction between top-down pedagogical instruction and a network of bottom-up interactions between the various channels of sensory communication and embodied movements (see also Ch. 6.3). The next section offers potential directions that the field of entrainment research might take in investigating this interaction further.

9.2 Future Directions

The ‘together the one’ experience is central to this thesis, because it may be one of the main reasons humans make music together. More exploration of this form of experience is necessary for understanding the appeal of collective music-making worldwide. One reason for this is that the current focus on embodied action in the entrainment literature has led to a move away from the study of subjective feelings that result from entrained activity. Although many studies (see Ch. 3) have looked at the effect of entrained action on perception and cooperation, the tasks performed in these studies inform us about the product of the experience, rather than the experience of being in synchrony itself. One might ask how essential entrained action is for the ‘together the one’ experience, or for communitas/effervescence etc.? In other words, what is the relationship between how ‘entrained’ participants are and how much they feel like a unified group, or an egalitarian group? And if there is a direct correlation between the degree of entrainment between a group’s members and their subjective experience of ‘oneness’, then does such entrained behaviour have to have a regular periodicity?

One approach to exploring these questions would be to interview and survey musicians about their experience of music-making. However, the passage of time between the experience itself (the performance) and the conscious recollection of that experience (the interview) would serve to limit the scope of the findings. On the other hand, exploring the subjective experience of musical performance in real-time in a laboratory context is optimistic, because the experience may not conveniently happen in artificial conditions. Whilst there may be many challenges to such a project, it is worth doing because the expressions of experience (e.g. body movements and sounds) only tell a partial story.

Whether or not periodicity is required for entrained activity is an open question, because although periodicity has been used to explain entrained action at various points throughout this thesis, my finding that Gregorian chant displays semi-periodic organisation creates explanatory problems (see Ch. 9.1), compounded by the fact that probably any musical tradition in which the transmission of text is paramount is likely to exhibit semi-periodic organisation (Clayton, 1996:330). In the Preface, I referred to entrainment as being the interactive process between rhythmical systems. This definition may be vague in not mentioning how periodic and stable the interaction has to be, but it does leave open the possibility that certain forms of music are more about the participants being entrained with each other, rather than the pulse. Thus, the pulse may make entrainment easier, but it may not necessarily be required.

Frigyesi (1993:78) has argued that if we ignore the issue of non-periodic musical traditions, then ‘we may overlook something that is deeply ingrained in and central to the conceptualisation of most music cultures’. However, exploring traditions that involve ‘free rhythm’, or at least the middle ground between regular periodicity and free rhythm, is not easy and we are far from understanding them. As yet, no specific methodology for understanding ‘free rhythm’ has been offered, even though many of the issues and problems associated with ‘free rhythm’ have been discussed (e.g. see Clayton, 1996; Frigyesi, 1993). Part of the difficulty with free rhythm is that it is metrically and periodically ‘free’ in too many ways to make such a methodology possible: for example, it is often difficult to determine where beat one is, sometimes each beat in a bar is a different duration, and sometimes the structure of periodicity cannot be perceived anyway (see Clayton, 1996). In addition, no non-Western theories of free rhythm have been offered, and this lack inevitably adds a ‘Western’ interpretational bias to the project. Furthermore, the transcriptions that already exist for other musical traditions may not be reliable, given that they have not been made with reference to some form of a ‘universal’ theory of rhythm (Clayton, 1996:327).

One terminological challenge throughout this thesis was choosing whether to describe vocal performance as ‘singing’ or ‘chanting’. It would seem the boundaries are not easy to draw, particularly cross-culturally, given that there are so many factors involved. My working definition of chanting in Ch. 1.3.2 used the degree of repetition in parameters such as melody, rhythm, and text as the best criterion by which to define chant. However, this raises the question: how repetitive does a chant have to be before it can be called a chant? An answer would require a comparison of levels of repetition in traditions specifically termed ‘chant’ and more conventional forms of group song. However, this would rely on there being enough traditions that are clearly distinguished as ‘chant’ rather than song, and this is unlikely given the difficulty of finding an acceptable definition across cultures (Widdess, pers. comm.).

A lot of work has been done on understanding music that occurs in the ritual context. However, there seems to have been little work done on the precise relationship between the music and the ritual. The need to study this was highlighted by Basso’s (1981) notion that we have been prevented from conceiving of the possibility that such a thing as a truly ‘musical religion’ can exist. Similarly, Seeger (1987:xiii) has suggested that it is just as important to study how music itself creates society as it is to study music’s place within society. One striking example of music having a direct impact on society is the institution of the ‘work song’, which has the function of helping groups of people to work better together; however, ‘work songs’ were an important omission in this thesis, and would be useful to explore in the future.

I would argue that this enterprise to study the powerful influence that song has on ritual and society, inspired by Basso and Seeger, also needs to place an understanding of musical interaction at its heart. This would require an inter-disciplinary approach. Accordingly, the fields of entrainment and ‘musical anthropology’, amongst others, should speak to each other. Ideally, a researcher would need to be fluent in both areas to give as rounded a picture as possible; otherwise we are left with empirical findings that cannot be grounded in culture, or cultural context that tells us little of how that culture manifests itself in real-time interaction. We also need more ‘hybrid’ studies, such as the Gregorian study in Ch. 8 and the Lucas et al. (2011) study, that are more rigorously experimental than many anthropological field studies, but more ‘ecologically-valid’ than laboratory studies. Such an approach, while imperfect, creates a more rounded understanding. And even though it is a big ask for researchers to become acquainted with both disciplines, it is necessary.

One theme that connects the anthropological and scientific discussions in this thesis is that of the ‘collective mind’ suggested by animal behaviour, and the ‘together the one’ experience in anthropology. It is hard to know whether the ‘together the one’ experience is referring to a ‘collective mind’, but both of these terms (one a scientific metaphor, the other an anthropological metaphor) are trying to describe something that seems necessary for explaining collective interaction in its various forms. However, should these terms be understood as merely metaphorical, or are they based in physical reality? Sheldrake (2003) asks whether, in the same way that magnetic or gravitational fields are metaphors for some kind of physical influence that cannot be seen, the metaphor of a ‘collective mind’ stands for a similar kind of physical ‘consciousness’ influence. Only the effects of the magnetic field can be measured, not the field itself. Is the same occurring with a collective mind, in that we can observe the effects of this ‘collective mental field’, e.g. the flocking movements of starlings, but not the ‘collective mind’ itself?

In terms of the implications of the Gregorian case study, the most obvious is that more work needs to be done on entrainment in group contexts. It is almost as if the field of entrainment studies is waiting until it is mature enough to cope with group interaction, but I think that it should just get messy now and start learning. The results and methodologies may not be perfect, but the complications that arise from studying groups can only serve to revitalise the field of entrainment study rather than diminish it. Furthermore, advanced forms of measurement and video analysis techniques do exist, as used by Cavagna et al. (2010a), that can monitor the movement of each individual starling in a flock of thousands, and entrainment research might benefit from adapting these techniques for the purpose of studying large numbers of humans (although for a cautionary note, see Ch. 6.5).

The following implications from the case study have already been discussed in more detail in Ch. 8.7, and therefore will be summarised briefly: [i] one could explore peripheral visual perception, particularly from the performer’s perspective, in order to know what performers attend to in their vision when looking at sheet music; [ii] one could analyse the hierarchy of leadership in more detail by analysing who looks at whom and for how long they look at each other when performing from memory (i.e. no sheet music) in an attempt to see who the leaders are, or if the interaction is characterised by a hierarchy at all; [iii] one could explore how the dynamics of musical interaction are affected by the number of performers in an ensemble; [iv] one could determine whether the timing of each individual’s gestures has a predictable relationship with the timing of the sounds they are associated with, because this might reveal something about how intra-group entrainment works; [v] one could analyse the network of timing interactions between gestures, audible breaths and sound onset in order to extend the current paradigm beyond investigating each coordination cue in isolation; [vi] one could investigate individuals’ subjective perception of pulse and metre, particularly in musical contexts where the periodicity is irregular, because this would determine whether or not it is appropriate to think of pulse and metre as shared ‘emergent structures’ in the minds of a group of individuals (see also Ch. 5.5).

The final implication of the Gregorian study is that it studied ‘dynamic’ factors like collective gestural entrainment and metrical perception alongside ‘static’ factors like types of gesture, leadership structure and pedagogical instructions, but it did not show how these interact. The distinction between planned (static) and emergent (dynamic) coordination is useful and necessary for delineating different forms of entrainment study, but we must now start to design studies that demonstrate how emergent and planned coordination interact (see Chs. 5 & 6). From there we can start to build an integrative theory of entrainment processes. This will be a challenge, but one that is at the core of understanding how we act and feel in common with each other.

9.3 Concluding thoughts

The most important message of this thesis is that being in time with each other, in singing, chanting and dancing, will always remain ‘the most powerful [surest, speedy and efficacious] way to create and sustain a community that we have at our command’ (McNeill, 1995:150-51). And now, as our communities, which give guidance and meaning to people’s lives, continue to crumble, due to increasing urbanisation and technologisation, any community-forming techniques we have at hand (and voice) should be used to the fullest and most benign extent possible. We may worry that such a powerful technique might be used for violence (cf. the Nazis), but as so many people lack a sense of being part of a community nowadays, especially in big cities, we need to use its power for good, and one way to make people feel like they belong is to get them singing and dancing with others. This project will be met with resistance, however: in the West many of us now ‘cling strenuously’ to our individuality, and traditional (and harmless) forms of being together in time, such as folk singing/dancing, choral societies and chanting groups, struggle to compete with the lure of modern entertainment (Ibid. 151). As mentioned in the Preface, sports like football often involve chanting together in the stands, but this is not always used for positive ends. The words and idealistic messages that our leaders will use over the coming years to inspire us to do good would undoubtedly benefit from being communicated in conjunction with some form of singing, chanting or dancing in time together. In such cases, the trickier parts of the leader’s message will be absorbed by the communal warmth of feeling, and obstacles will be hurdled. As McNeill (Ibid. 155) argues: ‘The shared euphoria aroused by keeping together in time is intrinsically diffuse, without definite external object or significance. Ideas and words can therefore turn the warm sentiments of group solidarity it arouses in many different directions’.

Words may not be enough to inspire real change, but we have the tools—we just need to use them, making sure we do so for the right reasons, and in the right way.

Bibliography

Abram, D. (1997). The Spell of the Sensuous. Vintage Books.

Acebrón, J., Bonilla, L., Vicente, C., Ritort, F., & Spigler, R. (2005). The Kuramoto model: A simple paradigm for synchronization phenomena. Reviews of Modern Physics, 77(1), 137–185.

Agawu, K. (2003). Representing African Music: Postcolonial Notes, Queries, Positions. New York: Routledge/Taylor & Francis.

Albert, E. (1972). Culture patterning of speech behaviour in Burundi. In J. J. Gumperz & D. Hymes (Eds.), Directions in sociolinguistics (pp. 72-105). New York: Holt.

Anshel, A., & Kipper, D. (1988). The influence of group singing on trust and cooperation. Journal of Music Therapy, 25(3), 145–155.

Arom, S. (1991). African polyphony and polyrhythm. Cambridge, UK: Cambridge University Press.

Aroui, J.-L. (2009). Introduction: Proposals for Metrical Typology. In J.-L. Aroui & A. Arleo (Eds.), Towards a typology of poetic forms: From language to metrics and beyond (pp. 1-42). Amsterdam: John Benjamins.

Arvaniti, A. (2009). Rhythm, Timing and the Timing of Rhythm. Phonetica, 66(1-2), 46-63.

Augustine (trans. H. Chadwick). (2008/397). The Confessions (Oxford World Classics). Oxford: Oxford University Press.

Bahuchet, S. (1995). De la musique considérée comme une philosophie (chez les Pygmées Aka de Centrafrique). In V. Dehoux et al. (Eds.), Ndroje balendro, Musiques, terrains et disciplines. Textes offerts à Simha Arom (pp. 57-65). Paris-Louvain: Peeters.

Bailey, T. (Ed.). (1979). Commemoratio brevis de tonis et psalmis modulandis: Introduction, Critical Edition, Translation. Ottawa.

Bailey, B. A., & Davidson, J. W. (2005). Effects of group singing and performance for marginalized and middle-class singers. Psychology of Music, 33(3), 269-303.

Ballerini, M., Cabibbo, N., Candelier, R., Cavagna, A., Cisbani, E., Giardina, I., Lecomte, V., Orlandi, A., Parisi, G., Procaccini, A., Viale, M., & Zdravkovic, V. (2008a). Interaction ruling animal collective behaviour depends on topological rather than metric distance: Evidence from a field study. Proceedings of the National Academy of Sciences, 105, 1232-1237.

Ballerini, M., Cabibbo, N., Candelier, R., Cavagna, A., Cisbani, E., Giardina, I., Orlandi, A., Parisi, G., Procaccini, A., Viale, M., & Zdravkovic, V. (2008b). Empirical investigation of starling flocks: a benchmark study in collective animal behaviour. Animal Behaviour, 76, 201-15.

Balzer, M. M. (1997). The Poetry of Shamanism. In J. Leavitt (Ed.), Poetry and Prophecy: Cross-Cultural Perspectives on Inspiration and Verbal Art (pp. 93-127). Ann Arbor: University of Michigan.

Baron, R. M. (2002a, October). A dynamical systems perspective on individual–group relations: Theoretical and applied considerations. Paper presented at the meeting of the Society for Experimental Social Psychology, Columbus, OH.

Baron, R. M. (2002b). Exchange and development: A dynamical, complex systems perspective. In B. Laursen & W. G. Graziano (Eds.), Social exchange in development. New directions for child and adolescent development (pp. 53–71). San Francisco, CA: Jossey-Bass/Pfeiffer.

Barry, W., Andreeva, B., & Koreman, J. (2009). Do Rhythm Measures Reflect Perceived Rhythm? Phonetica, 66(1-2), 78-94.

Bashkow, I. (2004). A Neo-Boasian Conception of Cultural Boundaries. American Anthropologist, 106(3), 443-458.

Basso, E. B. (1981). A ‘Musical View of the Universe’: Kalapalo Myth and Ritual as Religious Performance. The Journal of American Folklore, 94(373), 273-291.

Basso, E. B. (1985). A Musical View of the Universe. Philadelphia: University of Pennsylvania Press.

Basso, K. (1970). To give up on words: Silence in the western Apache culture. Southwestern Journal of Anthropology, 26(3), 213-230.

Bateson, G. (1972a). Style, Grace, and Information in Primitive Art. In Steps to an Ecology of Mind (pp. 128-52). New York: Ballantine.

Bavassi, M. L., Tagliazucchi, E., & Laje, R. (2013). Small perturbations in a finger-tapping task reveal inherent nonlinearities of the underlying error correction mechanism. Human Movement Science, 32(1), 21-47.

Beck, R. J., Cesario, T. C., Yousefi, A., & Enamoto, H. (2000). Choral singing, performance perception, and immune system changes in salivary immunoglobulin A and cortisol. Music Perception, 18(2), 87–106.

Becker, J. (1994). Music and Trance. Leonardo Music Journal, 4, 41-51.

Becker, J. O. (2004). Deep listeners: Music, emotion, and trancing. Indiana University Press.

Benedictines of Solesmes, (Ed.). (1950). Liber Usualis. Tournai: Society of St. John Evangelist.

Benedictines of Solesmes, (Ed.). (1962). The Liber Usualis, with Introduction and Rubrics in English. Tournai, Belgium & New York: Desclée.

Bernardi, L., Sleight, P., Bandinelli, G., Cencetti, S., Fattorini, L., Wdowczyc-Szulc, J., & Lagi, A. (2001). Effect of rosary prayer and yoga mantras on autonomic cardiovascular rhythms: comparative study. British Medical Journal, 323, 1446-9.

Bithell, C. (2007). Transported by Song: Corsican Voices from Oral Tradition to World Stage. Lanham, MD: Scarecrow Press.

Blackburn, S. H. (2010). The Sun Rises: A Shaman's Chant, Ritual Exchange and Fertility in the Apatani Valley. Leiden: Brill.

Blackwell, T. & Young, M. (2004). Self-organised music. Organised Sound, 9(2), 123-136.

Bloch, M. (1974). Symbols, Song, Dance, and Features of Articulation: Is Religion an Extreme Form of Traditional Authority? European Journal of Sociology, 15, 55–81.

Bloch, M. (2002). Are religious beliefs counter-intuitive? In N.K. Frankenberry (Ed.), Radical Interpretation in Religion (pp. 129-46). Cambridge, MA: Cambridge University Press.

Bonabeau, E., Dorigo, M., and Theraulaz, T. (1999). From Natural to Artificial Swarm Intelligence. New York: Oxford University Press.

Brightman, R. (1995). Forget Culture: Replacement, Transcendence, Relexification. Cultural Anthropology, 10(4), 509–546.

Bungay, H., Clift, S., & Skingley, A. (2010). The Silver Song Club Project: A sense of well-being through participatory singing. Journal of Applied Arts & Health, 1(2), 165-178.

Burger, B., Thompson, M. R., Luck, G., Saarikallio, S., & Toiviainen, P. (2012). Music moves us: Beat-related musical features influence regularity of music-induced movement. In 12th International Conference on Music Perception and Cognition, Thessaloniki, Greece.

Canetti, E. (1973). Crowds and Power. Harmondsworth: Penguin.

Carneiro, R. (1961). Slash-and-Burn Cultivation among the Kuikuru and its Implications for Cultural Development in the Amazon Basin. In J. Wilbert, (Ed.), The Evolution of Horticultural Systems in Native South America, Causes and Consequences: A Symposium (pp. 47-67). Caracas: Editoria Sucre (Antropologia, Supplement 2).

Carroll, L. (1893/1982). Sylvie and Bruno Concluded. In The Complete, Fully Illustrated Works. Gramercy Books.

Cassell, J., McNeill, D., & McCullough, K. (1999). Speech-gesture mismatches: evidence for one underlying representation of linguistic and nonlinguistic information. Pragmatics and Cognition, 7(1), 1.

Cavagna, A., Cimarelli, A., Giardina, I., Parisi, G., Santagati, R., Stefanini, F., & Viale, M. (2010a). Scale-free correlations in starling flocks. Proceedings of the National Academy of Sciences, USA, 107, 11865-70.

Cavagna, A., Cimarelli, A., Giardina, I., Parisi, G., Santagati, R., Stefanini, F., & Tavarone, R. (2010b). From empirical data to inter-individual interactions: unveiling the rules of collective animal behavior. Mathematical Models and Methods in Applied Sciences, 20, 1491-1510.

Chartrand, T. L., & Bargh, J. A. (1999). The chameleon effect: The perception–behavior link and social interaction. Journal of Personality and Social Psychology, 76(6), 893-910.

Chela-Flores, B. (1994). On the acquisition of English rhythm: Theoretical and practical issue. International Review of Applied Linguistics, 32, 232-242.

Chen, M. (1983). Toward a grammar of singing: tune-text association in the Gregorian chant. Music Perception, 1, 84-122.

Chong, H. J. (2010). Do we all enjoy singing? A content analysis of non-vocalists’ attitudes toward singing. The Arts in Psychotherapy, 37(2), 120-124.

Clark, H. H. (1996). Using language. New York: Cambridge University Press.

Clarke, E. (2011). Music perception and musical consciousness. In D. Clarke & E. Clarke (Eds.), Music and Consciousness: Philosophical, Psychological, and Cultural Perspectives (pp. 193-214). Oxford: Oxford University Press.

Clayton, A. (1986). Coordination between players in musical performance. Unpublished doctoral dissertation, Edinburgh University, Edinburgh, UK.

Clayton, M. (1996). Free Rhythm: Ethnomusicology and the Study of Music Without Metre. Bulletin of the School of Oriental and African Studies, 59, 323-332.

Clayton, M. (2005). Time in Indian Music: Rhythm, Metre, and Form in North Indian Rag Performance. Oxford: Oxford University Press.

Clayton, M. R. (2007a). Observing entrainment in music performance: Video-based observational analysis of Indian musicians’ tanpura playing and beat marking. Musicae Scientiae, 11(1), 27-59.

Clayton, M. R. (2007b). Time, Gesture and Attention in a ‘Khyāl’ Performance. Asian Music, 38(2), 71-96.

Clayton, M. R. (2012). What is Entrainment? Definition and applications in musical research. Empirical Musicology Review, 7(1-2), 49-56.

Clift, S., Hancox, G., Morrison, I., Hess, B., Kreutz, G., & Stewart, D. (2007). Choral singing and psychological wellbeing: Findings from English choirs in a cross-national survey using the WHOQOL-BREF. In Proceedings, international symposium on performance science, Porto, Portugal (pp. 22–23).

Clift, S., Hancox, G., Morrison, I., Hess, B., Kreutz, G., & Stewart, D. (2010). Choral singing and psychological wellbeing: Quantitative and qualitative findings from English choirs in a cross-national survey. Journal of Applied Arts and Health, 1(1), 19–34.

Cohen, E. E., Ejsmond-Frey, R., Knight, N., & Dunbar, R. I. (2010). Rowers' high: behavioural synchrony is correlated with elevated pain thresholds. Biology Letters, 6(1), 106-8.

Cohen, M. L. (2009). Choral singing and prison inmates: Influences of performing in a prison choir. Journal of Correctional Education, 60(1), 52-65.

Condon, W. (1985). Sound-Film Microanalysis: A Means for Correlating Brain and Behaviour. In F. H. Duffy & N. Geschwind (Eds.), Dyslexia. Boston/Toronto.

Connor, R. C., Smolker, R., & Bejder, L. (2006). Synchrony, social behaviour and alliance affiliation in Indian Ocean bottlenose dolphins, Tursiops aduncus. Animal Behaviour, 72(6), 1371–1378.

Copeman, H. (1990). Singing in Latin, or Pronunciation Explor’d. Oxford: Oxford University Press.

Cottrell, S. (2007). Music, Time, and Dance in Orchestral Performance: The Conductor as Shaman. Twentieth-Century Music, 3(1), 73-96.

Couzin, I. D. (2007). Collective minds. Nature, 445, 715.

Couzin, I. D. (2009). Collective cognition in animal groups. Trends in Cognitive Sciences, 13, 36-43.

Crocker, R. L. (1958). ‘Musica rhythmica’ and ‘musica metrica’ in antique and medieval theory. Journal of Music Theory, 2, 2-23.

Crocker, R. L. (2000). An Introduction to Gregorian Chant. London: Yale University Press.

Cross, I. (2001). Music, mind and evolution. Psychology of Music, 29(1), 95-102.

Cross, I. (2003a). Music and biocultural evolution. In M. Clayton, T. Herbert & R. Middleton (Eds.), The cultural study of music: a critical introduction (pp. 19-30). London: Routledge.

Cross, I. (2005). Music and meaning, ambiguity and evolution. In D. Miell, R. MacDonald & D. Hargreaves (Eds.), Musical Communication (pp. 27-43). Oxford: Oxford University Press.

Cross, I. (2012). Music as a social and cognitive process. In P. Rebuschat, M. Rohrmeier, J. A. Hawkins & I. Cross (Eds.), Language and Music as Cognitive Systems (pp. 315-328). Oxford: Oxford University Press.

Cross, I. (2013). ‘Does not compute’? Music as real-time communicative interaction. AI & Society, 28(4), 415-430.

Csikszentmihalyi, M. (2002). Flow: the psychology of optimal experience. New York: Harper Row.

Csikszentmihalyi, M. (2003). Good Business: Leadership, Flow, and the Making of Meaning. New York: Viking.

Cummins, F. (2002). On synchronous speech. Acoustic Research Letters Online, 3(1), 7–11.

Cummins, F. (2003). Practice and performance in speech produced synchronously. Journal of Phonetics, 31(2), 139-148.

Cummins, F. (2009). Rhythm as entrainment: The case of synchronous speech. Journal of Phonetics, 37, 16-28.

Cummins, F. (2012a). Looking for Rhythm in Speech. Empirical Musicology Review, 7(1-2).

Cummins, F. (2012b). Synchronized Speaking: What speaking together can tell us about skilled action. Talk at MRC-CBU, Cambridge, 5 Dec.

Cummins, F. (2013). Joint speech: The missing link between speech and music? Percepta, 1(1), 17-32.

Custodero, L. A. (2005). Observable indicators of flow experience: a developmental perspective on musical engagement in young children from infancy to school age. Music Education Research, 7(2), 185-209.

Cutler, A. & Isard, S.D. (1980). The Production of Prosody. In B. Butterworth (Ed.), Language production (pp. 245-269). London: Academic Press.

Dauer, R. M. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11, 51-62.

De Jaegher, H., & Di Paolo, E. (2007). Participatory sense-making. Phenomenology and the Cognitive Sciences, 6(4), 485-507.

de Menezes Bastos, R. J. (1978). A Musicológica Kamayurá. Brasilia: Fundaçao Nacional do Indio.

Decety, J., & Sommerville, J. A. (2003). Shared representations between self and other: A social cognitive neuroscience view. Trends in Cognitive Sciences, 7, 527-533.

Deutsch, D., Henthorn, T., and Lapidis, R. (2011). Illusory transformation from speech to song. Journal of the Acoustical Society of America, 129, 2245-2252.

Dijksterhuis, A., & Bargh, J. A. (2001). The perception–behavior expressway: Automatic effects of social perception on social behavior. In M. P. Zanna (Ed.), Advances in experimental social psychology, 33 (pp. 1-40). San Diego, CA: Academic.

Doffman, M. R. (2008). Feeling the groove: Shared time and its meanings for three jazz trios. Unpublished doctoral dissertation, Music Department, Open University.

Drake, C., Penel, A., & Bigand, E. (2000). Tapping in time with mechanically and expressively performed music. Music Perception, 18, 1–23.

Dresher, B.E. (2008). Between music and speech: The relationship between Gregorian and Hebrew chant. Toronto Working Papers in Linguistics, 27, 43-58.

Durkheim, É. (1912/1995). The Elementary Forms of the Religious Life (transl. Karen E. Fields). New York: Free Press.

Edelman, G. (1989). The Remembered Present: A Biological Theory of Consciousness. New York, NY: Basic Books.

Ehrenreich, B. (2007). Dancing in the streets: A history of collective joy. London: Granta Books.

Elkin, A. P. (1945/1980). Aboriginal men of high degree. Brisbane: University of Queensland Press.

Faber, D. (1986). Teaching the rhythms of English: A new theoretical base. International Review of Applied Linguistics, 24, 205-216.

Faulkner, R. R., & Becker, H. S. (2009). ‘Do You Know…?’: The Jazz Repertoire in Action. University of Chicago Press.

Feld, S. (1988). Aesthetics as Iconicity of Style, or ‘Lift-up-over Sounding’: Getting into the Kaluli Groove. Yearbook for Traditional Music, 20, 74-113.

Fischer, R., Callander, R., Reddish, P. & Bulbulia, J. (2013). How do rituals affect cooperation? An experimental field study comparing nine ritual types. Human Nature, 24(2), 115-125.

Franks, A. (1996). The Enchantress. The Times (Dec 14, 1996).

Fredrickson, W. E. (1994). Band Musicians' Performance and Eye Contact as Influenced by Loss of a Visual and/or Aural Stimulus. Journal of Research in Music Education, 42, 306–317.

Frigyesi, J. (1993). Preliminary thoughts toward the study of music without clear beat: the example of ‘flowing rhythm’ in Jewish Nusah. Asian Music, 24(2), 59-88.

Frisbie, C. J. (1980). Vocables in Navajo Ceremonial Music. Ethnomusicology, 24(3), 347-392.

Fujioka, T., Trainor, L. J., Large, E. W. & Ross, B. (2012). Internalized timing of isochronous sounds is represented in neuromagnetic beta oscillations. The Journal of Neuroscience, 32, 1791-1802.

Fürniss, S. (2006). Aka polyphony: Music, theory, back and forth. In M. Tenzer (Ed.), Analytical Studies in World Music (pp. 163-204). New York: Oxford University Press.

Gale, N. S., Enright, S., Reagon, C., Lewis, I., & van Deursen, R. (2012). A pilot investigation of quality of life and lung function following choral singing in cancer survivors and their carers. ecancermedicalscience, 6, 1-13.

Gershon, I. (2005). Seeing like a system: Luhmann for anthropologists. Anthropological Theory, 5, 99-116.

Ghitza, O., & Greenberg, S. (2009). On the Possible Role of Brain Rhythms in Speech Perception: Intelligibility of Time-Compressed Speech with Periodic and Aperiodic Insertions of Silence. Phonetica, 66(1-2), 113-126.

Gill, S. P. (2011). Rhythmic synchrony and mediated interaction: towards a framework of rhythm in embodied interaction. AI & Society, 27(1), 111-127.

Gill, S. P. & Borchers, J. (2003). Knowledge in co-action: social intelligence in collaborative design activity. AI & Society, 17, 322-339.

Glass, T. A., de Leon, C. M., Marottoli, R. A., & Berkman, L. F. (1999). Population based study of social and productive activities as predictors of survival among elderly Americans. British Medical Journal, 319(7208), 478-83.

Goebl, W., & Palmer, C. (2009). Synchronisation of Timing and Motion among Performing Musicians. Music Perception, 26(5), 427-438.

Goodman, L. (2003). Singing the songs of my ancestors: the life and music of Helma Swan, Makah elder (Vol. 244). University of Oklahoma Press.

Gordon, D.M. (1999). Ants at Work: How an insect society is organised. New York: The Free Press.

Graham, W. A. & Kermani, N. (2007). Recitation and Aesthetic Reception. In J. D. McAuliffe (Ed.), The Cambridge Companion to the Qur'an. Cambridge: Cambridge University Press.

Green, A. E. & Widdowson, J. (2003). Traditional English language genres: Continuity and change, 1950-2000. Sheffield: NATCECT Occasional Publications No. 9.

Guttman, S. E., Gilroy, L. A., & Blake, R. (2005). Hearing what the eyes see: Auditory encoding of visual temporal sequences. Psychological Science, 16, 228-235.

Hagen, E. H., & Bryant, G. A. (2003). Music and dance as a coalition signaling system. Human Nature, 14(1), 21-51.

Haken, H. (1983). Synergetics, An Introduction: Nonequilibrium Phase Transitions and Self-Organisation in Physics, Chemistry, and Biology (3rd edn.). Berlin: Springer-Verlag.

Hall, E.T. (1976). Beyond Culture. New York: Doubleday.

Hall, E.T. (1983). The Dance of Life, The Other Dimension of Time. New York: Doubleday.

Handel, S. (1989). Listening: An Introduction to the Perception of Auditory Events. Cambridge, MA: The MIT Press.

Hannon, E.E. (2009). Perceiving speech rhythm in music: Listeners classify instrumental songs according to language of origin. Cognition, 111, 403-409.

Hayward, G. D. (2009). ‘Let’s Walk and Talk’: the Effect of Social Interaction on Gait Entrainment. Unpublished Master’s thesis, University of Cambridge, UK.

Hiley, D. (1993). Western Plainchant: A Handbook. Oxford: Oxford University Press.

Hiley, D. (2001). ‘Medieval Monophony’. Entry in H. M. Brown et al., ‘Performing practice’, Grove Music Online, Oxford Music Online. Oxford: Oxford University Press.

Hiley, D. (2009). Gregorian Chant. Cambridge: Cambridge University Press.

Himberg, T. (2011). Interacting with responsive and non responsive tapping partners. 13th International rhythm perception and production workshop, Leipzig 13-15th July, 32.

Himberg, T. (2013). Interaction in Musical Time. Unpublished doctoral dissertation, University of Cambridge, UK.

Himberg, T., & Spiro, N. (2012). Beating to each other’s drum: Towards a comprehensive understanding of pairwise rhythmic interaction. In 7th Nordic Music Therapy Congress, Jyvaskyla, Finland.

Himberg, T., & Thompson, M. R. (2011). Learning and synchronising dance movements in South African songs: Cross-cultural motion-capture study. Dance Research, 29(2), 303-326.

Hockett, C. F. (1959). Animal ‘Languages’ and Human Language. Human Biology, 31(1), 32-39.

Hove, M. J., & Risen, J. L. (2009). It’s all in the timing: Interpersonal synchrony increases affiliation. Social Cognition, 27(6), 949-960.

Hove, M. J., Keller, P. E., & Krumhansl, C. L. (2007). Sensorimotor synchronization with chords containing tone-onset asynchronies. Perception & Psychophysics, 69(5), 699-708.

Howard, W. (1977). Samavedic Chant. New Haven: Yale University Press.

Howell, S. (1994). Singing to the spirits and praying to the ancestors: a comparative study of Chewong and Lio invocations. L'Homme, Anthropologie de la prière: Rites oraux en Asie du Sud-Est, 34e Année, 132, 15-34.

Huang, N. E., Shen, Z., Long, S. R., Wu, M. C., Shih, H. H., Zheng, Q., & Liu, H. H. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 454(1971), 903-995.

Humphrey, C., & Laidlaw, J. (1993). Archetypal Actions: A Theory of Ritual as a Mode of Action and the Case of the Jain Puja. Oxford: Clarendon Press.

Hunter, J. R. (1966). Procedure for analysis of schooling behaviour. Journal of the Fisheries Research Board of Canada, 23, 547-62.

Husserl, E. (1960/1929; trans. D. Cairns). Cartesian Meditations: An Introduction to Phenomenology. The Hague: Martinus Nijhoff Publishers.

Huygens, C. (1673/1986). The pendulum clock or geometrical demonstrations concerning the motion of pendula as applied to clocks (R. J. Blackwell, trans.). Ames: Iowa State University Press.

Iacoboni, M. (2009). Imitation, Empathy, and Mirror Neurons. Annual Review of Psychology, 60(1), 653-670.

Jackendoff, R. (2002). Foundations of language. Oxford: Oxford University Press.

Jacob, P. (2010). Intentionality. In E.D. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2010 Edition). <http://plato.stanford.edu/archives/fall2010/entries/intentionality/>.

Jeffery, P. (1992). Re-Envisioning Past Musical Cultures: Ethnomusicology in the Study of Gregorian Chant. Chicago: University of Chicago Press.

Kapferer, B. (1986). Performance and the Structuring of Meaning and Experience. In V. W. Turner & E. M. Bruner (Eds.), The Anthropology of Experience (pp. 188-203). Urbana and Chicago: University of Illinois Press.

Katahira, K., Nakamura, T., Kawase, S., Yasuda, S., Shoda, H., & Draguna, M. R. (2007). The role of body movement in co-performers’ temporal coordination. Proceedings of the inaugural International Conference on Music Communication Science, 72-75.

Kauffman, S. (1995). At Home in the Universe: The Search for Laws of Self-Organisation and Complexity. Oxford: Oxford University Press.

Kaufmann, W. (1975). Tibetan Buddhist chant: musical notations and interpretations of a song book by the Bkah Brgyud Pa and Sa Skya Pa sects (translations from the Tibetan by T.J. Norbu). Bloomington: Indiana University Press.

Kawase, S. (2009). An exploratory study of gazing behavior during live performance. Proceedings of the 7th Triennial Conference of European Society for the Cognitive Sciences of Music (ESCOM 2009), Jyväskylä, Finland.

Kawase, S. (2011). The effect of performers’ leadership on gazing behaviour during piano duo performance. Proceedings of the 2011 conference of the Japanese Society for Music Perception and Cognition, 23-28.

Kawase, S. (2012). Mutual Gaze Facilitates synchronisation during piano duo performances. Proceedings of the 12th International Conference on Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music, July 23-28, 2012, Thessaloniki, Greece, 522-526.

Kawase, S. (2013). Gazing behavior and coordination during piano duo performance. Attention, Perception, & Psychophysics, 1-14.

Kawase, S., Nakamura, T., Draguna, M. R., Katahira, K., Yasuda, S., & Shoda, H. (2007). Communication Channels Performers and Listeners Use: A Survey Study. Proceedings of inaugural International Conference on Music Communication Science, 5-7 Dec, Sydney, Australia, 76-79.

Keil, C. (1994). Participatory Discrepancies and the Power of Music. In C. Keil & S. Feld (Eds.), Music Grooves: Essays and Dialogues (pp. 96-108). Chicago: University of Chicago Press.

Keller, P. E. (2007). Musical Ensemble Synchronisation. The inaugural International Conference on Music Communication Science, 5-7 Dec, Sydney, Australia, 80-83.

Keller, P. E. & Appel, M. (2010). Individual differences, auditory imagery, and the coordination of body movements and sounds in musical ensembles. Music Perception, 28, 27-46.

Kendon, A. (2013). When is a movement a ‘gesture’? and other matters concerning the nature and organisation of extra-oral action in utterances. Talk given at Centre for Music & Science, Cambridge, 26 Nov 2013.

Kenny, D. T., & Faunce, G. (2004). The impact of group singing on mood, coping, and perceived pain in chronic pain patients attending a multidisciplinary pain clinic. Journal of Music Therapy, 41(3), 241-258.

Kirschner, S., & Tomasello, M. (2009). Joint drumming: Social context facilitates synchronisation in preschool children. Journal of Experimental Child Psychology, 102, 299-314.

Kirschner, S., & Tomasello, M. (2010). Joint music making promotes prosocial behavior in 4-year-old children. Evolution and Human Behavior, 31(5), 354-364.

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. The Journal of the Acoustical Society of America, 59, 1208-21.

Knight, S. L. (2013). An investigation of passive entrainment, prosociality and their potential roles in persuasive oratory. Unpublished doctoral dissertation, University of Cambridge, UK.

Knoblich, G., Butterfill, S., & Sebanz, N. (2011). Psychological Research on Joint Action: Theory and Data. In B. Ross (Ed.), The Psychology of Learning and Motivation, Vol. 54 (pp. 59-101). Burlington: Academic Press.

Kohler, K. J. (2009). Rhythm in Speech and Language: A New Research Paradigm. Phonetica, 66(1-2), 29-45.

Kokal, I., Engel, A., Kirschner, S., & Keysers, C. (2011). Synchronized Drumming Enhances Activity in the Caudate and Facilitates Prosocial Commitment - If the Rhythm Comes Easily. PloS One, 6(11), e27272.

Kolinski, M. (1973). A Cross-Cultural Approach to Metro-Rhythmic Patterns. Ethnomusicology, 17(3), 494-506.

Konvalinka, I., Vuust, P., Roepstorff, A., & Frith, C. D. (2010). Follow you, follow me: continuous mutual prediction and adaptation in joint tapping. The Quarterly Journal of Experimental Psychology, 63(11), 2220-2230.

Konvalinka, I., Xygalatas, D., Bulbulia, J., Schjødt, U., Jegindø, E. M., Wallot, S., Van Orden, G., & Roepstorff, A. (2011). Synchronized arousal between performers and related spectators in a fire-walking ritual. Proceedings of the National Academy of Sciences, 108(20), 8514-8519.

Koudenburg, N., Postmes, T., & Gordijn, E. H. (2011). Disrupting the flow: How brief silences in group conversations affect social needs. Journal of Experimental Social Psychology, 47(2), 512-515.

Kramer, J. (1988). The Time of Music. New York.

Kreutz, G., Bongard, S., Rohrmann, S., Hodapp, V., & Grebe, D. (2004). Effects of choir singing or listening on secretory immunoglobulin A, cortisol, and emotional state. Journal of Behavioral Medicine, 27(6), 623-635.

Kugler, P. N., & Turvey, M. T. (1987). Information, natural law, and the self-assembly of rhythmic movements. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lakens, D. L. (2010). Movement synchrony and perceived entitativity. Journal of Experimental Social Psychology, 46(5), 701–708.

Lakens, D., & Stel, M. (2011). If they move in sync, they must feel in sync: Movement synchrony leads to attributions of rapport and entitativity. Social Cognition, 29(1), 1-14.

Langton, C. G. (1992). Preface. In C. G. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen (Eds.), Artificial Life II, volume X of SFI Studies in the Sciences of Complexity (pp. xiii–xviii). Redwood City, CA: Addison-Wesley.

Large, E. W. (2000). On synchronizing movements to music. Human Movement Science, 19, 527-566.

Large, E. W. (2010). Neurodynamics of music. In M. Riess Jones, R. R. Fay & A. N. Popper (Eds.), Springer Handbook of Auditory Research, Vol. 36: Music Perception (pp. 201-231). New York: Springer.

Large, E. W., & Jones, M. R. (1999). The dynamics of attending: How people track time-varying events. Psychological Review, 106(1), 119-159.

Large, E. W., & Kolen, J. F. (1994). Resonance and the perception of musical meter. Connection Science: Journal of Neural Computing, Artificial Intelligence and Cognitive Research, 6(2-3), 177-208.

Large, E. W., & Palmer, C. (2002). Perceiving temporal regularity in music. Cognitive Science, 26(1), 1-37.

Launay, J., Dean, R. T., & Bailes, F. (2013). Synchronization can influence trust following virtual interaction. Experimental Psychology, 60(1), 53-63.

Lehmann, J. (1976). Die Kreuzfahrer. Munich: Bertelsmann.

Leman, M. (2012). Musical Entrainment Subsumes Bodily Gestures – Its Definition Needs a Spatiotemporal Dimension. Empirical Musicology Review, 7(1-2), 63-67.

Leman, M., & Naveda, L. (2010). Basic gestures as spatiotemporal reference frames for repetitive dance/music patterns in Samba and Charleston. Music Perception, 28(1), 71-91.

Leman, M., Desmet, F., Styns, F., Van Noorden, L., & Moelants, D. (2009). Sharing musical expression through embodied listening: A case study based on Chinese guqin music. Music Perception, 26(3), 263-278.

Leonard, T., & Cummins, F. (2011). The temporal relation between beat gestures and speech. Language and Cognitive Processes, 26(10), 1457-1471.

Lerdahl, F. (2001). The Sounds of Poetry Viewed as Music. Annals of the New York Academy of Sciences, 930(1), 337-354.

Lerdahl, F., & Jackendoff, R. (1983). A Generative Theory of Tonal Music. Cambridge, MA: MIT Press.

Levinson, S. C. (2006). On the human "interaction engine". In N. J. Enfield & S. C. Levinson (Eds.), Roots of human sociality: Culture, cognition and interaction (pp. 39-69). Oxford: Berg.

Lewin, K. (1952). Field Theory in Social Science. London: Tavistock Publications.

Lewis, D. (1969). Convention. Cambridge, MA: Harvard University Press.

Lewis, J. (2009). As well as words: Congo Pygmy hunting, mimicry, and play. In R. Botha & C. Knight (Eds.), The Cradle of Language, Volume 2: African Perspectives (pp. 236-256). Oxford: Oxford University Press.

Liberman, M. (1975). The Intonational System of English. Unpublished doctoral dissertation, MIT.

Lienard, P., & Boyer, P. (2006). Whence collective rituals? A cultural selection model of ritualized behavior. American Anthropologist, 108(4), 814-827.

Lindholm, C. (1990). Charisma. London: Blackwell.

List, G. (1963). The boundaries of speech and song. Ethnomusicology, 7(1), 1-16.

Locke, D. (1998). Drum Gahu. Tempe, Ariz.: White Cliffs Media.

Lodge, M. (2009). Music historiography in New Zealand. In Blazekovic & Mackenzie (Eds.), Music's intellectual history (pp. 625-652). New York: Répertoire International de Littérature Musicale (RILM). http://www.rilm.org/historiography/lodge.pdf.

Loehr, J. D., Large, E. W., & Palmer, C. (2011). Temporal Coordination and Adaptation to Rate Change in Music Performance. Journal of Experimental Psychology: Human Perception and Performance, 37(4), 1292-1309.

Loersch, C., & Arbuckle, N. L. (2013). Unraveling the mystery of music: Music as an evolved group process. Journal of Personality and Social Psychology, 105(5), 777-798.

Lomax, A. (1968). Folk Song Style and Culture (AAAS, 88). New Brunswick, New Jersey: Transaction Publishers.

Lomax, A. (1982). The Cross-Cultural Variation of Rhythmic Style. In M. Davis (Ed.), Interaction Rhythms: Periodicity in Communicative Behaviour. New York: Human Sciences Press, 149-174.

London, J. (1995). Some Examples of Complex Meters and their Implications for Models of Metric Perception. Music Perception, 13(1), 59-77.

London, J. (2012). Hearing in time: Psychological aspects of musical meter (2nd edn.). Oxford: Oxford University Press.

Lucas, G. (2006). Entrainment and socio-musical interactions in Afro-Brazilian Congado rituals. <http://www.open.ac.uk/Arts/entrainment-network/Entrainment_III_Glaura.pdf>.

Lucas, G., Clayton, M., & Leante, L. (2011). Inter-group entrainment in Afro-Brazilian Congado ritual. Empirical Musicology Review, 6(2), 75-102.

Luck, G., & Sloboda, J. (2008). Exploring the spatio-temporal properties of simple conducting gestures using a synchronization task. Music Perception: An Interdisciplinary Journal, 25(3), 225-239.

Luck, G., & Toiviainen, P. (2006). Ensemble musicians’ synchronization with conductors’ gestures: An automated feature-extraction analysis. Music Perception: An Interdisciplinary Journal, 24(2), 189-200.

Luhmann, N. (2002). Theories of Distinction: Redescribing the Descriptions of Modernity. Stanford, CA: Stanford University Press.

Luhrs, J. (2008). Football chants and ‘blason populaire’: the construction of local and regional stereotypes. In E. Lavric (Ed.), The linguistics of football (pp. 233-244). Tübingen: Narr.

Lumsden, J., Miles, L. K., Richardson, M. J., Smith, C. A., & Macrae, C. N. (2012). Who syncs? Social motives and interpersonal coordination. Journal of Experimental Social Psychology, 48(3), 746-751.

Mac Coinnigh, M. (2013). The Blason Populaire: Slurs and Stereotypes in Irish Proverbial Material. Folklore, 124(2), 157-177.

MacDougall, H. G., & Moore, S. T. (2005). Marching to the beat of the same drummer: the spontaneous tempo of human locomotion. Journal of Applied Physiology, 99(3), 1164-1173.

Maduell, M., & Wing, A. M. (2007). The dynamics of ensemble: the case for flamenco. Psychology of Music, 35, 591-627.

Marais, E. N. (1973). The Soul of the White Ant. Harmondsworth: Penguin.

Margulis, E. H. (2013). Repetition and emotive communication in music versus speech. Frontiers in Psychology, 4(167), 1-4.

Marsh, K. L. (2011). Sociality, from an Ecological, Dynamical Perspective. In G. R. Semin & G. Echterhoff (Eds.), Grounding sociality: Neurons, minds, and culture (pp. 53-82). London: Psychology Press.

Marsh, K. L., Richardson, M. J., Baron, R. M., & Schmidt, R. C. (2006). Contrasting Approaches to Perceiving and Acting With Others. Ecological Psychology, 18(1), 1-38.

Marsh, K. L., Richardson, M. J., & Schmidt, R. C. (2009). Social connection through joint action and interpersonal coordination. Topics in Cognitive Science, 1(2), 320-339.

Martin, G. R. (1986). The eye of a passeriform bird, the European starling (Sturnus vulgaris): eye movement amplitude, visual fields and schematic optics. Journal of Comparative Physiology A, 159, 545-557.

Makris, N. C., Ratilal, P., Jagannathan, S., Gong, Z., Andrews, M., Bertsatos, I., Godø, O. R., Nero, R. W., & Jech, J. M. (2009). Critical population density triggers rapid formation of vast oceanic fish shoals. Science, 323(5922), 1734-1737.

McAuley, J. D., & Henry, M. J. (2010). Modality effects in rhythm processing: auditory encoding of visual rhythms is neither obligatory nor automatic. Attention, Perception, & Psychophysics, 72, 1377-1389.

McDougall, W. (1920). The Group Mind. Cambridge: Cambridge University Press.

McGarva, A., & Warner, R. (2003). Attraction and social coordination: Mutual entrainment of vocal activity rhythms. Journal of Psycholinguistic Research, 32(3), 335–354.

McLeod, N. (1974). Ethnomusicological research and anthropology. Annual Review of Anthropology, 3, 99-115.

McNeill, W. (1995). Keeping Together in Time: Dance and Drill in Human History. Cambridge: Harvard University Press.

Mead, G. H. (1932). The Philosophy of the Present. Chicago: University of Chicago Press.

Mellor, P. A., & Shilling, C. (1998). Durkheim, Morality and Modernity: Collective Effervescence, homo duplex and the sources of moral action. The British Journal of Sociology, 49(2), 193-209.

Menezes Bastos, R. J. de. (1978). A musicológica Kamayurá: para uma antropologia da comunicação no Alto-Xingu. Brasília: Fundação Nacional do Indio.

Merker, B. (2008). Ritual foundations of human uniqueness. In S. Malloch & C. Trevarthen (Eds.), Communicative Musicality (pp. 45-60). Oxford: Oxford University Press.

Miles, L. K., Nind, L. K., & Macrae, C. N. (2009). The rhythm of rapport: Interpersonal synchrony and social perception. Journal of Experimental Social Psychology, 45, 585-589.

Miles, L. K., Nind, L. K., Henderson, Z., & Macrae, C. N. (2010). Moving memories: Behavioral synchrony and memory for self and others. Journal of Experimental Social Psychology, 46(2), 457-460.

Miles, L. K., Lumsden, J., Richardson, M. J., & Macrae, C. N. (2011). Do birds of a feather move together? Group membership and behavioral synchrony. Experimental Brain Research, 211(3-4), 495-503.

Mitchell, S. D. (2012). Emergence: logical, functional and dynamical. Synthese, 185, 171-186.

Moore, A. J., Szekely, T., & Komdeur, J. (2010). Prospects for research in social behaviour. In A. J. Moore, T. Szekely & J. Komdeur (Eds.), Social Behaviour: Genes, Ecology and Evolution (pp. 538-550). Cambridge: Cambridge University Press.

Moore, S. F. (1978; 2nd edn. 2000). Law As Process: An Anthropological Approach. London, Boston: Routledge & K. Paul.

Moore, G. P., & Chen, J. (2010). Timings and interactions of skilled musicians. Biological Cybernetics, 103(5), 401-414.

Moore, S. F., & Myerhoff, B. (Eds.). (1977). Secular Ritual. Amsterdam: Van Gorcum.

Moran, N. (2010). Improvising musicians' looking behaviours: Duration constants in the attention patterns of duo performers. Proceedings of ICMPC11, 565–568.

Moulin, J. F. (1994). Chants of Power: Countering Hegemony in the Marquesas Islands. Yearbook for Traditional Music, 26, 1-19.

Moyle, P. B., & Cech, J. J. (2003). Fishes: An Introduction to Ichthyology (5th edn.). San Francisco: Benjamin Cummings.

Murnighan, J. K., & Conlon, D. E. (1991). The Dynamics of Intense Work Groups: A Study of British String Quartets. Administrative Science Quarterly, 36(2), 165-86.

Nadel, S. F. (1954). Nupe Religion. London: Routledge and Kegan Paul.

Nagy, M., Akos, Z., Biro, D., & Vicsek, T. (2010). Hierarchical group dynamics in pigeon flocks. Nature, 464, 890-4.

Naimpalli, S. (2005). Theory and practice of tabla. Mumbai: Popular Prakashan.

Naveda, L., & Leman, M. (2009). A Cross-modal Heuristic for Periodic Pattern Analysis of Samba Music and Dance. Journal of New Music Research, 38(3), 255-283.

Néda, Z., Ravasz, E., Brechet, Y., Vicsek, T., & Barabási, A.-L. (2000a). The sound of many hands clapping. Nature, 403, 849-850.

Néda, Z., Ravasz, E., Vicsek, T., Brechet, Y., & Barabási, A.-L. (2000b). Physics of the rhythmic applause. Physical Review E, 61(6), 6987.

Nettl, B. (1956). Music in Primitive Cultures. Cambridge, MA: Harvard University Press.

Nettl, B. (1983). The Study of Ethnomusicology: Twenty-Nine Issues and Concepts. Urbana, IL: University of Illinois Press.

Nettl, B. (2000). An ethnomusicologist contemplates universals in musical sound and musical culture. In N. L. Wallin, B. Merker, & S. Brown (Eds.), The origins of music (pp. 463-472). Cambridge, MA: MIT Press.

Nettl, B. (2005). The Study of Ethnomusicology: Thirty-one Issues and Concepts. Chicago: University of Illinois Press.

Newtson, D. (1993). The dynamics of action and interaction. In L. B. Smith & E. Thelen (Eds.), A dynamic systems approach to development: Applications (pp. 241–264). Cambridge, MA: MIT Press.

Niebuhr, O. (2009). F0-Based Rhythm Effects on the Perception of Local Syllable Prominence. Phonetica, 66(1-2), 95-112.

Niwa, H.-S. (1994). Self-organising dynamic model of fish schooling. Journal of Theoretical Biology, 171, 123-136.

Nolan, F., & Asu, E. L. (2009). The Pairwise Variability Index and Coexisting Rhythms in Language. Phonetica, 66(1-2), 64-77.

Nowicki, L., Keller, P. E., & Prinz, W. (2007). The influence of another’s actions on one’s own synchronization with music. Poster presented at the 15th Meeting of the European Society for Cognitive Psychology, Marseille, France.

Olaveson, T. (2001). Collective Effervescence and Communitas: Processual Models of Ritual and Society in Emile Durkheim and Victor Turner. Dialectical Anthropology, 26, 89-124.

Oohashi, T., Kawai, N., Honda, M., Nakamura, S., Morimoto, M., Nishina, E., & Maekawa, T. (2002). Electroencephalographic measurement of possession trance in the field. Clinical Neurophysiology, 113(3), 435-445.

Orloff, A. (1981; transl. R. B. Palmer). Carnival: Myth and Cult. Dallas, TX.

Oullier, O., de Guzman, G. C., Jantzen, K. J., Lagarde, J., & Kelso, J. A. S. (2008). Social coordination dynamics: Measuring human bonding. Social Neuroscience, 3(2), 178-192.

Pacherie, E. (2012). The phenomenology of joint action: self-agency vs. joint-agency. In A. Seemann (Ed.), Joint attention: New Developments (pp. 343-389). Cambridge, MA: MIT Press.

Paladino, M. P., Mazzurega, M., Pavani, F., & Schubert, T. W. (2010). Synchronous multisensory stimulation blurs self-other boundaries. Psychological Science, 21(9), 1202-1207.

Palmer, C. (1989). Mapping musical thought to musical performance. Journal of Experimental Psychology: Human Perception and Performance, 15, 331-346.

Pantaleoni, H. (1987). One of Densmore's Dakota Rhythms Reconsidered. Ethnomusicology, 31(1), 35-55.

Parsons, T. (1991). The Social System. Routledge.

Partridge, B. L. (1981). Schooling. In D. McFarland (Ed.), The Oxford Companion to Animal Behaviour. Oxford: Oxford University Press.

Patel, A. D. (2008). Music, Language, and the Brain. New York: Oxford University Press.

Patel, A. D., Iversen, J. R., Chen, Y., & Repp, B. H. (2005). The influence of metricality and modality on synchronization with a beat. Experimental Brain Research, 163, 226-238.

Patel, A. D., Iversen, J. R., & Rosenberg, J. C. (2006). Comparing the rhythm and melody of speech and music: The case of British English and French. Journal of the Acoustical Society of America, 119, 3034-3047.

Pecenka, N., & Keller, P. E. (2011). The role of temporal prediction abilities in interpersonal sensorimotor synchronization. Experimental Brain Research, 211, 1–11.

Peters, K. (2006). Footpaths to reintegration: war, youth and rural crisis in Sierra Leone. Unpublished doctoral dissertation, Technology & Agrarian Development Group, Wageningen University.

Peters, K. (2011). War and the Crisis of Youth in Sierra Leone. Cambridge: Cambridge University Press.

Pickering, W. S. F. (1984). Durkheim’s Sociology of Religion: Themes and Theories. London: Routledge and Kegan Paul.

Potts, W. K. (1984). The chorus-line hypothesis of manoeuvre coordination in avian flocks. Nature, 309(5967), 344-345.

Praeger, F. (1882–1883). On the fallacy of the repetition of parts in the classical form. Proceedings of the Royal Musical Association, 9, 1-16.

Prögler, J. A. (1995). Searching for swing: Participatory discrepancies in the jazz rhythm section. Ethnomusicology, 39(1), 21-54.

Qureshi, R. (1969). Tarannum: The Chanting of Urdu Poetry. Ethnomusicology, 13(3), 425-468.

Qureshi, R. (2005). Sufi Music of India and Pakistan: Sound, Context, and Meaning. Oxford: Oxford University Press.

Rabinowitch, T.-C., Cross, I., & Burnard, P. (2012). Long-term musical group interaction has a positive influence on empathy in children. Psychology of Music, 1-15.

Randel, D. M. (2003). Gregorian Chant. In D. M. Randel (Ed.), The Harvard Dictionary of Music (pp. 362-6). Harvard: Harvard University Press.

Rappaport, R. A. (1999). Ritual and Religion in the Making of Humanity. Cambridge: Cambridge University Press.

Rasch, R. A. (1988). Timing and Synchronization in Ensemble Performance. In J. A. Sloboda (Ed.), Generative Processes in Music: The Psychology of Performance, Improvisation, and Composition (pp. 70–90). Oxford: Oxford University Press.

Rasch, R. A. (1979). Synchronization in performed ensemble music. Acustica, 43, 121-131.

Reddish, P., Bulbulia, J., & Fischer, R. (2013). Does synchrony promote generalized prosociality? Religion, Brain and Behavior Science, 3-19.

Reisman, K. (1974). Contrapuntal conversations in an Antiguan village. In R. Bauman & J. Sherzer (Eds.), Explorations in the ethnography of speaking (pp. 110-124). Cambridge: Cambridge University Press.

Repp, B. H. (1999a). Control of expressive and metronomic timing in pianists. Journal of Motor Behavior, 31, 145-164.

Repp, B. H. (2005). Sensorimotor synchronization: A review of the tapping literature. Psychonomic Bulletin & Review, 12(6), 969-992.

Repp, B. H. (2011). Tapping in Synchrony With a Perturbed Metronome: The Phase Correction Response to Small and Large Phase Shifts as a Function of Tempo. Journal of Motor Behavior, 43(3), 213-227.

Repp, B., & Keller, P. (2004). Adaptation to tempo changes in sensorimotor synchronisation: Effects of intention, attention, and awareness. Quarterly Journal of Experimental Psychology, 57, 499-521.

Repp, B. H., & Keller, P. E. (2008). Sensorimotor synchronization with adaptively timed sequences. Human Movement Science, 27(3), 423-456.

Repp, B. H., & Penel, A. (2004). Rhythmic movement is attracted more strongly to auditory than to visual rhythms. Psychological Research, 68(4), 252-270.

Reynolds, C. (1987). Flocks, herds, and schools: A distributed behavioural model. SIGGRAPH ’87, 21(4), 25-34.

Rice, T. (1994). May it fill your soul: Experiencing Bulgarian music. Chicago: University of Chicago Press.

Richards, P. (2007). The emotions at war: a musicological approach to understanding atrocity in Sierra Leone. In Perri 6, S. Radstone, C. Squire & A. Treacher (Eds.), Public emotions (pp. 62-84). Basingstoke: Palgrave.

Richardson, D., Dale, R., & Kirkham, N. (2007). The art of conversation is coordination. Psychological Science, 18(5), 407-413.

Richardson, M. J., Marsh, K. L., & Schmidt, R. C. (2005). Effects of visual and verbal information on unintentional interpersonal coordination. Journal of Experimental Psychology: Human Perception and Performance, 31, 62-79.

Roseman, M. (1987). Transcription of Suyá Rainy Season Song. In A. Seeger, Why Suyá Sing: a musical anthropology of an Amazonian people (pp. 88-90). Cambridge: Cambridge University Press.

Sanal, A. M., & Gorsev, S. (2013). Psychological and physiological effects of singing in a choir. Psychology of Music (Online), 1-10.

Schiering, R. (2008). Regional Identity in Schalke Football Chants. In E. Lavric (Ed.), The linguistics of football (pp. 221-232). Tübingen: Narr.

Schögler, B. (2000). Studying temporal co-ordination in jazz duets. Musicae Scientiae, 3(1 suppl), 75-91.

Schmidt, R. C., & O’Brien, B. (1997). Evaluating the dynamics of unintended interpersonal coordination. Ecological Psychology, 9, 189-206.

Schmidt, R. C., Carello, C., & Turvey, M. T. (1990). Phase transitions and critical fluctuations in the visual coordination of rhythmic movements between people. Journal of Experimental Psychology: Human Perception and Performance, 16, 227-247.

Schmidt, R. C., Fitzpatrick, P., Caron, R., & Mergeche, J. (2011). Understanding social motor coordination. Human Movement Science, 30, 834-845.

Schüler, S. (2012). Synchronised Ritual Behaviour. In D. Cave & R. Sachs Norris (Eds.), Religion and the Body: Modern Science and the Construction of Religious Meaning (pp. 81-101). Boston: Brill.

Schweikard, D. P., & Schmid, H. B. (2013). Collective Intentionality. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Summer 2013 Edition). <http://plato.stanford.edu/archives/sum2013/entries/collective-intentionality/>.

Sebanz, N., Knoblich, G., & Prinz, W. (2005). How two share a task: Corepresenting stimulus response mappings. Journal of Experimental Psychology: Human Perception and Performance, 31, 1234-46.

Seeger, A. (1979). What Can We Learn When They Sing? Vocal Genres of the Suyá Indians of Central Brazil. Ethnomusicology, 23(3), 373-394.

Seeger, A. (1987). Why Suyá sing: a musical anthropology of an Amazonian people. Cambridge: Cambridge University Press. See also 2004 paperback edition with accompanying CD, published by University of Illinois Press, Chicago; and 10 hours of coded video footage (peer-reviewed and corrected) on EVIA digital online archive (www.eviada.org).

Seeley, T. (1989). Social foraging in honey bees: How nectar foragers assess their colony’s nutritional status. Behavioral Ecology and Sociobiology, 24, 181-199.

Semin, G. R., & Cacioppo, J. T. (2008). Grounding Social Cognition: Synchronisation, Coordination, and Co-Regulation. In G. R. Semin & E. R. Smith (Eds.), Embodied Grounding: Social, Cognitive, Affective, and Neuroscientific Approaches (pp. 119-147). Cambridge: Cambridge University Press.

Shaffer, L. H. (1984). Timing in solo and duet piano performances. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 36A, 577–595.

Shapiro, L. (2007). The embodied cognition research programme. Philosophy Compass, 2(2), 338-346.

Sheldrake, R. (2003). The Sense of Being Stared At: And Other Aspects of the Extended Mind. London: Arrow.

Sheldrake, R. (2011). The Presence of the Past: Morphic Resonance and the Habits of Nature. London: Icon Books.

Sheldrake, R. (2012). The Science Delusion: Freeing the Spirit of Enquiry. London: Coronet.

Shockley, K., Santana, M. V., & Fowler, C. A. (2003). Mutual interpersonal postural constraints are involved in cooperative conversation. Journal of Experimental Psychology: Human Perception and Performance, 29, 326-332.

Shumway, R. H. (1988). Applied statistical time series analysis. In R. A. Johnson & D. W. Wichern (Eds.), Prentice Hall Series in Statistics. Englewood Cliffs, NJ: Prentice-Hall.

Sidnell, J. (2001). Conversational turn-taking in a Caribbean English creole. Journal of Pragmatics, 33, 1263-1290.

Sidnell, J. (2007). Comparative studies in conversation analysis. Annual Review of Anthropology, 36, 229-244.

Sipper, M. (1995). An Introduction To Artificial Life. Explorations in Artificial Life (special issue of AI Expert), Miller Freeman, San Francisco, CA, 4-8. http://www.cs.unibo.it/~babaoglu/courses/cas00-01/papers/Alife/Intro.pdf

Slobin, M. (1992). Micromusics of the West: A Comparative Approach. Ethnomusicology, 36(1), 1-87.

Sloboda, J. (2004). Exploring the Musical Mind: Cognition, Emotion, Ability, Function. Oxford: Oxford University Press.

Small, C. (1998). Musicking: the Meanings of Performing and Listening. Hanover, NH, and London: Wesleyan University Press.

Snell, R. (1983). Metrical forms in Braj Bhasa verse: the Caurasi pada in performance. In M. Thiel-Horstmann (Ed.), Bhakti in current research, 1979-1982 (pp. 353-83). Berlin: Dietrich Reimer Verlag.

Snyder, J., & Krumhansl, C. L. (2001). Tapping to ragtime: Cues to pulse finding. Music Perception, 18(4), 455-489.

Sommer, R., & Sommer, B. B. (2002). A Practical Guide to Behavioural Research: Tools and Techniques. Oxford: Oxford University Press.

Spiro, N., & Himberg, T. (2012). Musicians and non-musicians adapting to tempo differences in cooperative tapping tasks. In E. Cambouropoulos, C. Tsougras, P. Mavromatis, & K. Pastiadis (Eds.), Proceedings of the 12th International Conference on Music Perception and Cognition and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music, Thessaloniki, Greece.

Spivey, J. M. (2007). The continuity of mind. New York, NY: Oxford University Press.

Staal, F. (1989). Rules without meaning: Ritual, mantras and the human sciences. New York: Peter Lang.

Staal, F. (1993). From meanings to trees. Journal of Ritual Studies, 7, 11-32.

Stephan, K. M., Thaut, M. H., Wunderlich, G., Schicks, W., Tian, B., Tellmann, L., et al. (2002). Conscious and subconscious sensorimotor synchronization - prefrontal cortex and the influence of awareness. NeuroImage, 15, 345–352.

Stewart, N., & Lonsdale, A. (2013). Does singing in a choir improve well-being? A comparative study. Poster presented at Annual Conference of the British Psychological Society’s Division of Clinical Psychology (DCP), York, 4-6 Dec.

Stivers, T., Enfield, N. J., Brown, P., Englert, C., Hayashi, M., Heinemann, T., Hoymann, G., Rossano, F., de Ruiter, J. P., Yoon, K.-E., & Levinson, S. C. (2009). Universals and cultural variation in turn-taking in conversation. PNAS, 106(26), 10587-10592.

Strehlow, T. G. H. (1947). Aranda traditions. Melbourne: Melbourne University Press.

Strogatz, S. H. (2003). Sync: The emerging science of spontaneous order. New York: Hyperion Press.

Sugarman, J. C. (1997). Engendering Song: Singing and Subjectivity at Prespa Albanian Weddings (with accompanying CD). Chicago: University of Chicago Press.

Tannen, D. (1984). Conversational Style: Analyzing Talk among Friends. Norwood, NJ: Ablex.

Taylor, D. S. (1981). Non-native speakers and the rhythm of English. International Review of Applied Linguistics, 19, 219-226.

Teichner, W. H. (1954). Recent studies of simple reaction time. Psychological Bulletin, 51, 128-49.

Thom, R. (1975). Structural Stability and Morphogenesis. Reading, MA: Benjamin.

Thram, D. (2002). Therapeutic efficacy of music-making: Neglected aspect of human experience integral to performance process. Yearbook for Traditional Music, 34, 129-38.

Titze, I. R., & Worley, A. S. (2008). The Human Instrument. Scientific American, 298(1), 94-101.

Toiviainen, P., Luck, G., & Thompson, M. R. (2010). Embodied meter: hierarchical eigenmodes in music-induced movement. Music Perception, 28(1), 59-70.

Tomasello, M. (2008). The Origins of Human Communication. Cambridge, MA: MIT Press.

Tomasello, M., & Rakoczy, H. (2003). What makes human cognition unique? From individual to shared to collective intentionality. Mind & Language, 18(2), 121-147.

Tomatis, A. (2005). The Ear and the Voice (transl. Roberta Prada). Scarecrow Press.

Toner, J., & Tu, Y. (1998). Flocks, herds, and schools: A quantitative theory of flocking. Physical Review E, 58(4), 4828-4858.

Trainor, L. J., Austin, C. M., & Desjardins, R. N. (2000). Is infant-directed speech prosody a result of the vocal expression of emotion? Psychological Science, 11(3), 188-195.

Turk, A., & Shattuck-Hufnagel, S. (2013). What is speech rhythm? A commentary on Arvaniti and Rodriquez, Krivokapić, and Goswami and Leong. Laboratory Phonology, 4(1), 93-118.

Turino, T. (2008). Music as social life: The politics of participation. London: University of Chicago Press.

Turner, V. (1967). The Forest of Symbols. Ithaca: Cornell University Press.

Turner, V. W. (1968). The Drums of Affliction. Oxford: Clarendon Press.

Turner, V. W. (1969). The Ritual Process. London: Routledge and Kegan Paul.

Turner, V. W. (1974). Dramas, Fields, and Metaphors: Symbolic Action in Human Society. Cornell University Press.

Turner, V. W. (1977). Variations on a Theme of Liminality. In S. F. Moore & B. Myerhoff (Eds.), Secular Ritual. Assen: Van Gorcum.

Turner, V. W. (1987). The Anthropology of Performance. New York: PAJ Publications. PDF from http://erikapaterson08.pbworks.com/f/Antrophology%2520of%2520performance(2).pdf

Turvey, M. T. (2005). Theory of brain and behavior in the 21st century: No ghost, no machine. Japanese Journal of Ecological Psychology, 2, 69-79.

Turvey, M. T., & Shaw, R. E. (1995). Toward an ecological physics and a physical psychology. In R. L. Solso & D. W. Massaro (Eds.), The science of the mind: 2001 and beyond. New York, NY: Oxford.

Uhlig, M., Fairhurst, M. T., & Keller, P. E. (2013). The importance of integration and top-down salience when listening to complex multi-part musical stimuli. NeuroImage, 77, 52-61.

Valdesolo, P., Ouyang, J., & DeSteno, D. (2010). The rhythm of joint action: Synchrony promotes cooperative ability. Journal of Experimental Social Psychology, 46, 693-695.

Valdesolo, P., & DeSteno, D. (2011). Synchrony and the social tuning of compassion. Emotion, 11(2), 262-6.

Vallacher, R. R., & Jackson, D. (2009). Thinking inside the box – dynamical constraints on mind and action: Comment on Marsh et al.’s “Toward a radically embodied, embedded social psychology,” this issue. European Journal of Social Psychology, 39, 1226-1229.

Van Noorden, L., & Moelants, D. (1999). Resonance in the Perception of Musical Pulse. Journal of New Music Research, 28(1), 43-66.

Van Olst, J. C., & Hunter, J. R. (1970). Some aspects of the organisation of fish schools. Journal of the Fisheries Research Board of Canada, 27, 1225-1238.

van Ulzen, N. R., Lamoth, C. J. C., Daffertshofer, A., Semin, G. R., & Beek, P. J. (2008). Characteristics of instructed and uninstructed interpersonal coordination while walking side-by-side. Neuroscience Letters, 432(2), 88-93.

Varela, F. G., Maturana, H. R., & Uribe, R. (1974). Autopoiesis: The organisation of living systems, its characterisation and a model. BioSystems, 5, 187-196.

Vesper, C., Butterfill, S., Knoblich, G., & Sebanz, N. (2010). A minimal architecture for joint action. Neural Networks, 23(8), 998-1003.

Vesper, C., van der Wel, R. P., Knoblich, G., & Sebanz, N. (2011). Making oneself predictable: reduced temporal variability facilitates joint action coordination. Experimental Brain Research, 211(3-4), 517-530.

Vickhoff, B., Malmgren, H., Åström, R., Nyberg, G., Ekström, S. R., Engwall, M., Snygg, J., Nilsson, M., & Jörnsten, R. (2013). Music structure determines heart rate variability of singers. Frontiers in Psychology, 4, 1-16.


Vicsek, T., & Zafeiris, A. (2012). Collective motion. Physics Reports, 517(3), 71-140.
Visell, Y. (2004). Spontaneous organisation, pattern models, and music. Organised Sound, 9(2), 151-165.
Vitebsky, P. (1995). The shaman: voyages of the soul from the Arctic to the Amazon. London: Duncan Baird.
von Frisch, K. (1975). Animal Architecture. London: Hutchinson.
Wachsmuth, I. (1999). Communicative rhythm in gesture and speech. In Proceedings of the International Gesture Workshop on Gesture-Based Communication in Human-Computer Interaction, London, UK, 277-289. Berlin: Springer-Verlag.
Wada, Y., Kitagawa, N., & Noguchi, K. (2003). Audio-visual integration in temporal perception. International Journal of Psychophysiology, 50, 117-124.
Wallwork, E. (1985). Durkheim’s Early Sociology of Religion. Sociological Analysis, 46, 201-218.
Westermeyer, P. (2005). Let the People Sing: Hymn Tunes in Perspective. Chicago: GIA Publications.
Whitehouse, H. (2004). Modes of Religiosity: A Cognitive Theory of Religious Transmission. Walnut Creek, California: Rowman Altamira.
Widdess, D. R. (1994). Involving the Performers in Transcription and Analysis: A Collaborative Approach to Dhrupad (with Ritwik Santal and Ashok Tagore). Ethnomusicology, 38(1), 59-80.
Widdess, D. R. (2012). Nepali music and space in ritual. Empirical Musicology Review, 7(1-2), 88-94.
Widdess, R. (2013). Dapha: Sacred Singing in a South Asian City: Music, Performance and Meaning in Bhaktapur, Nepal. Farnham: Ashgate.
Will, U. (2011). Coupling factors, visual rhythms, and synchronization ratios. Empirical Musicology Review, 6(3), 180-185.
Will, U. & Turow, G. (2011). Introduction to entrainment and cognitive ethnomusicology. In G. Turow & J. Berger (Eds.), Music, Science and the Rhythmic Brain. Cultural and Clinical Implications (pp. 3-30). New York: Routledge.


Will, U., Utter, H., Clayton, M., & Leante, L. (2005). Influence of auditory and visual information on the tapping responses to vocal alap. Paper presented at the Entrainment Network Conference, The Ohio State University, Columbus, 5 May, 2005.
Williamon, A., & Davidson, J. W. (2002). Exploring co-performer communication. Musicae Scientiae, 6(1), 53-72.
Wilson, A. D., & Golonka, S. (2013). Embodied cognition is not what you think it is. Frontiers in Psychology, 4(58), 1-13.
Wilson, B. & Dobbelaere, K. (1994). A Time to Chant: The Soka Gakkai Buddhists in Britain. Oxford: Clarendon Press.
Wilson, E.O. (1971). The Social Insects. Cambridge, MA: Harvard University Press.
Wing, A.M., Endo, S., Bradbury, A., & Vorberg, D. (2014). Optimal feedback correction in string quartet synchronization. Journal of the Royal Society Interface, 11.
Winkler, I., Háden, G., Ladinig, O., Sziller, I., & Honing, H. (2009). Newborn infants detect the beat in music. PNAS, 106, 2468-2471.
Woodworth, R. S. (1939). Individual and group behaviour. The American Journal of Sociology, 44(6), 823-828.
Worrall, D. (2004). Editorial: Complex systems in composition and improvisation. Organised Sound, 9(2), 121-122.
Wren, B. A. (2000). Praying Twice: The Music and Words of Congregational Song. Westminster John Knox Press.
Wright, J. K., & Bregman, A.S. (1987). Auditory stream segregation and the control of dissonance in polyphonic music. Contemporary Music Review, 2, 63-92.
Zivotofsky, A. Z., Gruendlinger, L., & Hausdorff, J. M. (2012). Modality-specific communication enabling gait synchronization during over-ground side-by-side walking. Human Movement Science, 31(5), 1268-1285.
Zivotofsky, A. Z., & Hausdorff, J. M. (2007). The sensory feedback mechanisms enabling couples to walk synchronously: an initial investigation. Journal of NeuroEngineering and Rehabilitation, 4, 28-32.


Appendix 1 - Videos of various ethnographic examples

1. Synchronised Mass Football Chants (Various Locations)
http://www.youtube.com/watch?v=PVQllaOyHig
2. Celtic Football Fans doing huddle, Glasgow
http://www.youtube.com/watch?v=YbjccPaO4K0
3. Opera Recitative from Mozart’s Cosi Fan Tutte, England
http://www.youtube.com/watch?v=2u9_N0zQWks
4. Hopi (Native American Indian) Buffalo chant, US (Arizona)
http://www.youtube.com/watch?v=5hRI7tjs8eo
5. Gregorian Chant, Solesmes Abbey (1930), Sarthe, France
http://www.youtube.com/watch?v=sKm54iQ1i-M
6. Playground Handclapping Chant, US
http://www.youtube.com/watch?v=5K-FpmUUc7U
7. Protest Chants, Syria
http://www.youtube.com/watch?v=NDwXN5jwhZA
http://www.youtube.com/watch?v=2A5m3BZHMCs
8. ‘Occupy Wall Street’ protest chant, New York
http://www.youtube.com/watch?v=LpQ-wk0ZWUM
9. Football chants demonstrating ‘Blason Populaire’, England
http://www.youtube.com/watch?v=dHiSlmIVxgo
10. Qawwali Group Singing, England
http://www.youtube.com/watch?v=I40EIjFGoqc
11. Aka pygmy polyphonic singing, Congo
http://www.youtube.com/watch?v=yKLxFmnYO_I
http://www.youtube.com/watch?v=DIJ9A8eV5QE
http://www.youtube.com/watch?v=FkQTEqwTs7Q
12. Paghjella Singing, Corsica
http://www.youtube.com/watch?v=RmrDOn7aVbo
13. Aymara Musicians, Conima, Peru
http://www.youtube.com/watch?v=xdDnDDp7nyA


14. Mbendjele pygmy group singing, Congo Brazzaville
https://soundcloud.com/timlewis77/congo-pygmy-group-singing1
15. Traditional Maasai Rainmaking Chant, Kenya/Tanzania
http://www.youtube.com/watch?v=wbQxr-rheds
16. Barong-Rangda ritual, Bali
http://www.youtube.com/watch?v=b8IJI0VAqJQ
http://www.youtube.com/watch?v=zUdcmGpM2v0
17. Buddhist Chant, Tibet
http://www.youtube.com/watch?v=ZKyB2YURGLw
http://www.youtube.com/watch?v=633eH4yajHE
in Gyantse: http://www.youtube.com/watch?v=EhXVKKoFxFI
in Ladakh, India: http://www.youtube.com/watch?v=0CvWZvoIqws
18. Overtone chanting - Tibet, Mongolia, & England
http://www.youtube.com/watch?v=0xDZbIf7png
http://www.youtube.com/watch?v=UJ3BX2tMj1Y
http://www.youtube.com/watch?v=Wdjt5wjIl-Y
http://www.youtube.com/watch?v=OVJw8Z63DTw
19. Nichiren Buddhist chant
http://www.youtube.com/watch?v=ad4hN3FbwFo
20. Qur’anic recitation
http://www.youtube.com/watch?v=qUr0J0h-VhI
21. Tarannum Solo singing
http://www.youtube.com/watch?v=G56sBfTi0bc
22. Samavedic chant, Tiruvannamalai, India
http://www.youtube.com/watch?v=A0tQt2CS9P4
Vedic chant: http://www.youtube.com/watch?v=qPcasmn0cRU
http://www.youtube.com/watch?v=yIVfpIB8aZY
23. Apatani chant, India
http://www.youtube.com/watch?v=Sdbh_bLm2_A
24. Hiva Chant, Polynesia
http://www.youtube.com/watch?v=1IyCN_Kjqd0
25. Maori Karakia Chant, New Zealand
http://www.youtube.com/watch?v=sIFvaxVPvfs
26. Maori Haka chant, New Zealand
http://www.youtube.com/watch?v=3BoNmpvkavo
27. Gamelan Music, Indonesia
http://www.youtube.com/watch?v=c1AiCTJ9t8g
28. Tiv Swange Dancers, Benue State, Nigeria
http://www.youtube.com/watch?v=3zVZdzyrbzc
29. North Indian Khyäl Performance, Open University Study, UK
http://www.youtube.com/watch?v=rZsKLECKx10
30. Jerusarema Dance, Murehwa, Zimbabwe
http://www.youtube.com/watch?v=zT6AwOZ9g5Q
31. Cante Jondo, Spain
http://www.youtube.com/watch?v=Xpz4s7jtt0s
32. Sephardic Torah Cantillation, Israel
http://www.youtube.com/watch?v=AJSXMauvDYw
33. Raga Singing, India
http://www.youtube.com/watch?v=_HgGzeL9bEE
34. Geisha Singing, Japan
http://www.youtube.com/watch?v=QWEa_xqKxsA
35. Introduction to North Indian concept of tal
http://www.youtube.com/watch?v=gne67t2R91U
36. Schools of Fish
http://www.youtube.com/watch?v=B6M_XgiONoo
http://www.youtube.com/watch?v=cIgHEhziUxU
37. Flash expansion of fish
http://www.youtube.com/watch?v=KasJjuuaCiM
38. A flock of starlings’ response to hawk attack
http://www.youtube.com/watch?v=NI3RceKkegw
http://www.youtube.com/watch?v=_btcPA7ssgc


39. Murmuration of Dunlins
http://www.youtube.com/watch?v=ObDlvBLPxas
40. Murmuration of Starlings
http://www.youtube.com/watch?v=iRNqhi2ka9k
http://www.youtube.com/watch?v=ctMty7av0jc
41. Free Jazz Ensemble playing
http://www.youtube.com/watch?v=b6pm3jipx-o
http://www.youtube.com/watch?v=d0HB8ybKJzo
42. Modern Jazz Quartet Groove
http://www.youtube.com/watch?v=oU5PAZWBvJY
43. Malaysian Choral Speaking
http://www.youtube.com/watch?v=SOmfvKKubnY
44. Finnish Shouting Choir
http://www.youtube.com/watch?v=r4maBitrrnI
http://www.youtube.com/watch?v=xuVNo1R4VkE
45. Recitation of the Nicene Creed, Eastern Orthodox Church
http://www.youtube.com/watch?v=lSoyeFvQTUM
46. Tanpura performance, Open University Study, UK
http://www.youtube.com/watch?v=QG6YAoFWIeY
47. Military Parade, China
http://www.youtube.com/watch?v=ru-xQac_sWw
48. Synchronised Walking Demonstration, Japan
http://www.youtube.com/watch?v=E7cQtbMtODk
49. Nguni, Swazi, Zulu, and Tswana Folk Song and Dance, Mpumalanga, South Africa
http://www.youtube.com/watch?v=9zONDVjJTcA
50. The Blind Musical Flames, Sierra Leone
http://www.youtube.com/watch?v=h3OnVS41gEI
51. Emerging, disappearing, reemerging pulse in group clapping experiment
http://www.youtube.com/watch?v=a_EbtgVD81w


52. Fire-walking ritual, San Pedro Manrique, Spain
http://www.youtube.com/watch?v=9WN1aMfggF8
53. Flamenco Ensemble, Casa Patas, Madrid
http://www.youtube.com/watch?v=BmW2oNDoHWE
54. Performance of a Beethoven String Quartet
http://www.youtube.com/watch?v=im0DEUTFEPE
http://www.youtube.com/watch?v=atEuL0UA0YI
http://www.youtube.com/watch?v=DRgh6yozmR0
55. Conductor-Orchestra relationship
http://www.youtube.com/watch?v=RZkIAVGlfWk
http://www.youtube.com/watch?v=Eo1KHr-b-CA
http://www.youtube.com/watch?v=SJU0lC3iHaY
56. Piano Duo Performance
http://www.youtube.com/watch?v=tq1rZXzjP-I
http://www.youtube.com/watch?v=NZBf9kIhuLI
http://www.youtube.com/watch?v=8xSGJVLtO58

Other videos of interest:

Maori Powhiri (Welcome) chant, New Zealand
http://www.youtube.com/watch?v=1qRRtuZZB-M
Canto a tenore pastoral songs, Sardinia
http://www.youtube.com/watch?v=cWVCMvbGcPA
Kecak ritual singing, Bali
http://www.youtube.com/watch?v=_RC5E3rp8l0
Group Rhythmic Gymnastics, Japan
http://www.youtube.com/watch?v=YCyEr4jRHRY
http://www.youtube.com/watch?v=DHIMS2EAEzQ
Inuit Drum Dance, Greenland
http://www.youtube.com/watch?v=RW0Q6d8MrqI


Appendix 2 - Interview summary for Gregorian Psalmody Study

2.1 Introduction

In Chapter 8 I summarised answers to interviews conducted with eight choristers. This Appendix presents a fuller account of what they said. Each header below refers to a question that I asked each of them in turn; the questions are presented here in a different order to that of the interviews themselves, for the sake of narrative flow. Each section ends with a summary of their responses. Only a selection of the responses is included here, for the sake of space, even though the responses that were not included are interesting too.

2.2 How does psalmody differ from other forms of singing in terms of keeping in time with each other?

I first wanted to find out in general terms how psalm chanting differs from the other types of choral singing that the interviewees were familiar with, in terms of keeping in time with each other. The conductor, whose main experience of Gregorian psalmody comes from singing as a professional lay clerk at Westminster Cathedral, described how the synchronisation is ‘about just getting…the rhythm of the words…an almost hypnotic flow of rhythm…it’s all about keeping an even syllable length and just going like that…very very even all the way, almost robotic’. He suggested that the reason for the even syllable length in Gregorian psalm chanting may have something to do with the sheer volume of text to get through. It may also have something to do with the fact that most people in the contemporary era cannot speak or understand Latin fluently. He described how this particular ‘even’ performance style may just be the house style of Westminster Cathedral, which has been performed for 100 years, but that he had noticed a similar rhythmic approach by monks chanting in the various monasteries he had visited in his life, even if there was no truly ‘authentic’ way of doing it.


Even with Gregorian chant’s fairly clear notation system, Chorister E noted how ‘there’s no set rules, or how long things are supposed to be, you are supposed to come to some sort of agreement with the people you are singing with because…the way it’s written is basically a guideline for monks in the community to be able to chant relatively…round about at the same time, it’s not as if it’s a set rule for all communities’. Chorister F described how ‘there is nothing in front of you that’s necessarily telling you how long you should be dwelling on a note’. On the other hand, Chorister C described how the shared knowledge of the notated rules concerning the stressing of the words means that there is little interpretative variation between the choristers.

In summary, synchronisation in psalm singing differs from other forms of choral singing in that it employs ‘evenness’, or isoperiodicity, for each syllable of text. And although the notation plays a role in determining the rhythm of the chant, it seemed to be important for fellow choristers to negotiate ‘some sort of agreement’ amongst themselves.

2.2.1 Do you feel Gregorian psalmody is best done in speech rhythm or ‘even syllable rhythm’?

In Gregorian chant, there are two extremes of text-setting: syllabic text-setting, where ‘each syllable of text is set to a single note’, and melismatic style, where a melody of several notes is sung on a single syllable of text (Dresher, 2008:42). Gregorian psalmody displays syllabic text-setting for the most part (see Fig. 8.1). The conductor’s response was that even syllable length—i.e. each syllable has the same duration—is the key to singing psalms in Latin in synchrony: ‘very, very even all the way, almost robotic’. He says that such a strategy would be inappropriate if the psalms were in English, because the monosyllabic rhythmic repetition would sound wrong. Chorister F also thought an even syllable rhythm leads to ‘a better product’. Chorister A preferred speech rhythm because she knew Latin, but understood that not everyone was familiar with Latin, and so an even rhythm was safer. Chorister B thought that the conductor was able to combine even syllable length with varied stressing of the words. Chorister D thought that ‘it’s always a balance between the two [speech and even rhythm] because if it’s pure speech rhythm it can also sound quite artificial and kind of like it’s being pushed too much’. Chorister E, who was experienced with Gregorian chant, said ‘my experience has been in these monasteries…they go for a speech rhythm’.


Chorister C made the interesting point that there is no such thing as ‘speech rhythm’ with written-down text because ‘when it’s written down in front of you and you are only thinking about one line at a time’ that is a different scenario from when one speaks, because when speaking one creates natural pauses by thinking about what one is saying, creating the text as one speaks. With written-down text, ‘there’s much less room for manoeuvre, and much less need’ because the words are fixed and pre-composed. Chorister G also made the point that ‘everyone has different ways of speaking’, but he thought that the ideal would be speech rhythm, even though ‘we were doing something more on the variation of even rhythm’.

The answers above suggest that the Gregorian chant was performed more in an ‘even’ than a ‘speech’ rhythm style, i.e. each syllable of text tended to last the same duration. However, according to some, the ideal style of performance was one in which the rhythm of the Latin text was accommodated within the restrictions of an ‘even rhythm’ style.

2.2.2 Which is easier: a fast or slow tempo?

As well as the rhythm of the text, the tempo of the chanting itself seemed to have an influence on the group’s ability to synchronise. As yet, I do not have a theory as to why this is so. The conductor described how the chant should be fast but not too fast, because otherwise people start tripping over the words. Chorister A said that slower was easier because she had to have enough time to go through each thought process consciously, e.g. checking the text against the melody, and how many syllables the text contained. Chorister D also said that for verses with fewer words it was better to go slower, and faster for verses with more words. The others preferred a faster tempo.

From these answers it would seem that there are no hard and fast rules for how fast or slow one should sing Gregorian chant, but that any variance in tempo should reflect the nature of the text being sung at that moment in time.

2.3 Are visual or aural cues more useful for getting ‘in time’ with other people?

The conductor explained that at Westminster Cathedral (referred to as ‘The Drome’) ‘we used to do Vespers…during the week without a director…you wouldn’t even look up sometimes, you’d just hear the breath and you’d very quickly get into the rhythm and it would just happen, you’d look up at the start of each psalm because the cantor would start and then you’d look back down and then you’d just sing’.

In this description of routine professional psalm chanting, aural cues tend to be most prominent. However, when asked whether synchronising psalm chanting has anything to do with vision he replied ‘I think it’s to do with all the senses…hearing, the breath, noticing the way that people move’. He clarified that breathing is more useful heard than seen if the person is standing close to you, but when people are too far away to hear, vision is more important. Chorister C said that hearing someone take a breath is useful for starting, and she thought that a particular person usually leads the breathing, and overall synchronisation, even if they may lead by the ‘tiniest of differences’.

Chorister A suggested that because of the need to look at the notation, she used aural cues more often, and she thought that breathing cues were audible enough for synchronisation purposes. Chorister D corroborated Chorister A’s observations, but added that peripheral vision of another person ‘twitching’ was also useful for starting together. Chorister D, who was deaf in one ear, found it more difficult to keep in time with someone on their deaf side, which suggests that aural cues are very important. Chorister A said that one could anticipate by listening to ‘the preparations you hear in someone’s voice to do with word stresses so you can feel them either accelerating or building towards the crescendo when they arrive on a stressed syllable’. Chorister A liked the choir to be tightly-packed next to each other because ‘physically being able to touch the people next to you was actually really helpful…physically feeling the shoulders of someone…[so] you can feel someone move into something, or through something’. Chorister E thought that the visual cues of ‘gentle body sway’ were useful.

In summary, both aural and visual cues are used. The audible breath seems to be a key cue for synchronisation, but, as breaths become less audible with people that are further away, visual body movements are better cues. Some choristers said body movements of other choristers, even those in close proximity, were important. Touch was, for one chorister, useful for starting singing together.


2.3.1 Which bench formation is best?

All of the visual gestures, and, to some extent, the aural cues too, depend on the right formation or spatial positioning of chanters in relation to each other. The benches that the chanters sat on were placed in different positions during the week of chanting, and so I asked which bench formation each chanter preferred. The opinions varied widely but there was common reasoning: balancing the demands of keeping together with the projection of sound into the building so that people could hear.

Choristers E & G commented that when chanters are in one long line, the people on the ends cannot see each other in their peripheral vision. Chorister G described how in a different performance context, even when the conductor had stood in front of the choir they could not sing in time with each other because they were in a line. Although there had been an orchestra in between the choir and conductor, the fact that even a conductor could not synchronise choristers in a line shows how important it is for people to see each other.

Chorister E’s solution was to join up the line at both ends by having a circle facing inwards, ‘when everyone has peripheral vision of everyone else…plainchant works much better that way’; however, we did not try this arrangement, and arguably it might have been better for the chanters than for those listening to the chant down below in the church, due to reduced sound projection. Chorister F would have preferred ‘two parallel lines [of chanters] facing each other, performing in our own little space…I think we sang quite loudly at times when we didn’t have to…and so…I would have felt more intimate facing my colleagues rather than in a V-shape’.

Chorister D also said that ‘if you’re fully opposite each other then you can read more people’s signals…whereas if you are in a V you can only see the people who are closest to you’. Chorister D also said that increasing the distance between parallel facing choirs makes synchronisation more difficult. Of course, a V-shape is better for projecting the sound out into the auditorium as opposed to facing opposite each other, where choir members are singing more to each other.

Other choristers liked the V-shape: Chorister C preferred the conductor in the middle of the ‘open’ end of the V, and Chorister B preferred a tighter, less obtuse, V-shape, because then you can see more people and are less likely to ‘daydream and stare out of the gallery’. The conductor agreed that the V-shape worked well. However, he said that the formation depends on which space and building you are singing in; in narrow spaces, like the one we were singing in, ‘you sit facing each other’. The conductor’s ideal formation, regardless of the building, was a semicircle where ‘the voices are kind of aiming down the building but…you are not looking away from each other’.

All these various comments demonstrate that although Gregorian chanters need to see each other’s movements and hear each other’s sounds in order to synchronise as a group, they also need to project whatever sound they are making away from themselves towards the congregation.

2.3.2 Is reading the notation and words correctly more important than being aware of the other people?

The importance of being aware of other choristers, as opposed to reading the notation and text correctly, came up a number of times in answers to previous questions, so this question followed naturally. As far as the conductor was concerned, the notated rhythm rules he imposed were simple, relating to only a couple of ways that the text could vary rhythmically, and therefore he felt that for the purpose of synchronising on the group level it was most important to be aware of the other choristers. However, he said it was an individual’s responsibility to read the notation correctly: if someone missed out a syllable or forgot a ‘lengthened’ note they could destabilise the other choristers. He also said that: ‘if you are sight reading anything you’ve got to deal with words, rhythm, and pitch. A lot of people if they are sight reading can do 1, maybe 2 but not 3 of those things. They might sing the right words and the right notes but they’ll slow down or they might sing at the right speed but make lots of mistakes with the words; with sight-reading doing all three at once is rare…’. Chorister F said that ‘my no. 1 thing would always be “I’ve got to get this right” [the notation]…I’m paying particular attention to the longer words because I’m always being tripped up, I actually put little ‘v’s across the tops of the syllables cos sometimes you will read [a word] and you’ll miss out a syllable’. Choristers E & C thought that getting the words right was probably the most important factor, but E also said that you couldn’t say the words in time without at least some connection with the other people. Chorister D thought that in order to know whether it is you or someone else who is right or wrong, the main focus has to be on the notation and the words. Unexpectedly, Chorister A observed that ‘on several occasions during the week somebody went wrong and we all followed because it was better that we were all doing the same thing than…it was strictly correct’; i.e. being aware of and in sync with others takes priority over getting the notation correct. Others suggested it was a combination of both reading the notation correctly and being aware of others.

On balance, from the above responses, it would seem that both reading the notation correctly and being aware of other people are essential for synchronisation. This is a good example of the mutuality between top-down and bottom-up processes in entrainment.

2.3.3 Do you look out of the corner of your eye?

Visual contact with other Gregorian chanters is virtually never direct, but rather usually out of the corner of one’s eye. This is because the complexity of the task of reading the notation means that choristers need to look at their sheet music for nearly all the time they are singing (see Ch. 8.4.1). Chorister G elaborated: ‘getting the point where the note changes and trying to get my head round the bold and italic type and what that means, and when there is extra notes and how to fit them in…it’s half dark and we’ve got a lot of this stuff to get through, and it’s actually quite difficult to read!’. Chorister G also described how it is more difficult to ‘read ahead’ with chant notation, given the quantity of unfamiliar Latin words to process, combined with the way the rhythm varies with the specific text in any given verse. However, these difficulties with the notation may also have been due to the relative unfamiliarity with the notation experienced by the majority of the group, who were trained primarily in the Anglican choral tradition. Chorister F suggested that whilst he mainly focused on the text, at irregular moments in the notation, e.g. a pitch change associated with an italicised or emboldened syllable in the text, his peripheral awareness was heightened and he checked to see if those around him were also aware of the pitch change coming up, and might even over-emphasise the particular syllable so that others were made aware. Chorister F paid particular attention to one of the other choristers whom he was standing next to, and observed that this other chorister had a tendency to ‘dip’ in her body language at points of melodic change. Chorister B said that peripheral vision is ‘not completely necessary if you are quite practised as a group, but I find it very useful to have a little bit of physical gesture within your peripheral vision to help bring everybody in’. Chorister E noticed out of the corner of his eye that when chanting was ‘going well, we tended to have the same body language, having the same sort of movement at the same time’. Chorister A thought that the aural cues were the most essential because all of her visual focus was on the text and notation.

Choristers gave mixed answers about whether they consciously looked out of the corner of their eye, but most said they did to some degree. It seems that peripheral vision is activated most at points of uncertainty within the notation, or at the beginning of phrases.

2.3.4 Are you aware of other people’s breathing influencing when you come in?

Chorister B explained that breathing was the most important cue for starting singing together but ‘from then on…people sort of bounce a bit’. Chorister A said that ‘I think we all breathe slightly louder when we know that we are using it to communicate with each other’. Choristers G & C said breathing is very important. Chorister G said ‘You hear [the breath], even though it is a very faint sound…you pick up on it definitely because it warns you what people are going to do’.

The conductor described how breathing was the primary cue for highly experienced Gregorian chanters who tend to not look up from the sheet music apart from at the beginning of a psalm. He described how ‘you’d just hear the breath and you’d very quickly get into the rhythm and it would just happen’. He described how ‘you can see someone breathe as well as hearing them breathe’, although ‘you can [only] hear them breathe if they are sitting next to you…and that’s one way a choir gets together if everyone hears the person next to them breathe’. It is questionable whether synchronisation works as he describes here, because everyone waiting for the person next to them to breathe would create a ‘domino’-like mechanism, which would pass from person to person, thus possibly reducing the chance of starting in synchrony.

The choristers unanimously agreed that audible breathing was an important, if not the most important, cue for coming in together.


2.4 Are you aware of the signals you give to others?

Previous answers have already hinted that the choristers are aware of the gestures others make to keep everyone together in time, but this question asked whether they were aware of their own gestures. Chorister A said ‘I think we all breathe slightly louder when we know that we are using it to communicate with each other. I think you can often tell if someone next to you is going to go wrong, and…being very clear on particular syllables, if you think they are going to miss one out, can often bring someone back [into sync]’. Chorister A also said she received similar emphasis from those around her, which helped her realise she was about to go wrong, which suggests that other people intentionally corrected her behaviour too. Chorister B also mentioned how she too would signal consciously ‘only when I think they are going wrong!…then I’m aware of the signals I’m giving out because I’m actively trying to get the two of us back into sync, but otherwise no’. Conscious signalling seems to occur at points where ambiguity is likely; for example, Chorister C mentioned that ‘I sort of lean into the changes of notes, and the pauses’.

In terms of the body gestures used, Chorister D described how ‘I do tend to nod my head sometimes, I’m quite conscious of deliberately placing [my head nod with the beat]’. Chorister G was unaware that he moved fairly vigorously until I told him that he did, and admitted that he was more aware of how others are signalling than how he is signalling. Chorister E, who often took a leader role because he was experienced, said that ‘I tend to sway my [sheet] music, in kind of rhythm to the psalms’. Chorister C said that she had become more demonstrative in her own breathing when starting new phrases as the week had gone on.

There seemed to be consensus that choristers were consciously demonstrative in their gestures [i] at moments of timing ambiguity, such as the beginning of phrases; and [ii] at moments when another chorister had got out of sync and needed correcting.

2.5 Do you focus on certain people? Are they part of a hierarchy?

The conductor suggested that power dynamics arguably play less of a role when a group of chanters are all highly experienced, as is the case with Westminster Cathedral Choir, his own choir: ‘when I have sung it has been with experienced groups when you just keep going and people make mistakes occasionally’. However, he agreed that ‘if you are in a group of inexperienced people and there’s a stronger person doing it you will go with them, and often you can hear someone will make a mistake and everyone will follow them’. Chorister A said that ‘if there’s no-one directing then I tend to follow the strongest voice, who is least likely to make a mistake’. Chorister C thought that it is more instinctive to follow a leader rather than pay attention to everyone at the same time, and added that some people are obviously very confident, taking it upon themselves to lead, whereas others do not lead. Chorister G also heard certain voices more than others (particularly the conductor’s voice), suggesting that perceived volume was a factor. Chorister E thought that the person intoning, who was often the conductor, was likely to take charge the most; Chorister E also replied ‘maybe’ when I asked him whether he was a natural leader. About the chorister he sat next to, Chorister F said: ‘she was a strong voice, accurate, and there is something sort of ‘together’ about her, and therefore she’s unlikely to be careless’; he also later commented that she was a professional music critic and therefore he respected her more as a musician.

For others, another chorister’s perceived competence was not as important as their spatial position. For example, Chorister B said that she was ‘really locked into’ whoever she was sitting next to or opposite. Chorister D thought that the most important person for synchronisation was the person who was closest to you, not necessarily a leader; however, this may have been due in part to Chorister D being deaf in one ear, and finding background noise confusing. Chorister D had also sung with the person that sat next to her for years to the extent that she could ‘feel and read what she is going to do more easily’, and admitted to singing infinitesimally after her beat, and even following her mistakes.

It would seem that in terms of who people ‘follow’, some choristers chose who they followed based on the other chorister’s perceived competence, whereas others chose based on how close the other person was in relation to them (in spatial terms).

2.5.1 How important is the conductor for keeping in time?

The most obvious leader within the group was the conductor, so I asked the choristers how important a conductor is for keeping in time in Gregorian chant. Chorister E said ‘I think [the conductor is] vital, he has to give lots of body language for [the chanting] to work’. He also said ‘for people who are less trained in it, the more body language [e.g. breathing and hand movements] you get from the conductor [the better]’. Similarly, Chorister C thought that a conductor’s usefulness depends on ‘the overall confidence and the experience of the group’, and that a conductor instils confidence by being physically demonstrative, particularly at the beginning of verses, which is very important given that ensemble is easily destabilised in psalmody.

Chorister G explained that when the psalm text gets complicated an experienced conductor is particularly useful. However, Chorister G also said that without a conductor, the choristers engaged more and were able to listen to each other in such a way that they were able to ‘somehow’ start together. He also added that this was probably not made possible by each individual counting the beat independently in their minds, instead saying that ‘we’re definitely communicating with each other in some subliminal way to get that next entry in on time’. Similarly, Chorister D thought that conductors can be useful ‘but sometimes it’s easier just for you to feel it as a group…it can get really confusing if you’ve got a conductor, particularly if they’ve got a certain style and way they like to do it’. Chorister D found that it was more important for synchronisation for a group to be familiar with each other than to have a conductor, who plays more of a ‘tweaking’ or ‘polishing’ role. Chorister B also thought that a conductor was not completely necessary for Gregorian chanting if the group is well practised, but that a little physical gesturing from a leader is useful. Chorister A had a more extreme stance: ‘I’d say in some ways it’s actively negative [to have a conductor]…[because] the tendency is to fall into traditional singing patterns, which is to check in every now and then and generally do your own thing. I think when you are singing psalms in a small group, the conscious act of not having a conductor, which requires you and demands you to look around and be far more aware of fellows, and listen more closely, makes for a better psalm singing experience, and a more accurate one’.

The conductor’s opinion was that ‘anyone in charge is going to have some influence if they’ve got an idea of how it should go’. He added that his main top-down conscious inputs were: [i] making sure the tempo is not too fast or too slow; [ii] reminding people ‘to sing even syllables rather than trip over the smaller syllables’; and [iii] finding new ways of explaining how the notation works. However, he said that successful Gregorian chant performance is ‘mostly to do with the fact you [the group] just get used to doing it [together], and I think that’s as it should be’. Indeed, he even said that he felt like he was partly directing for himself, for fear that something might go wrong if he did not.

All in all, there was consensus that the conductor is not essential for group synchronisation in Gregorian chant, but can be helpful, especially if the group is less experienced.

2.6 How would performing psalmody differ in groups of 20 vs. 8 vs. 2 people?

In Ch. 7 I discussed the possibility of a threshold number of musicians in a group beyond which a conductor is required. Here I asked the choristers how psalmody would differ between 20, 8 and 2 people in terms of being together in time. These sizes were chosen because they relate to common sizes of choirs in Gregorian chant.

Chorister F thought that ‘unless you’ve got a very good group, it’s likely to get more laboured the more people are involved…it would start to slow down and get ponderous’. Chorister G also agreed that more people meant less precision because chanting ‘require[s] everyone to be in tune with everybody’s movements, everybody’s vibrations…[and] it is a statistical possibility for that to go wrong more if there are more people singing’. He added that a bigger group could be accurate ‘as long as they can all see and hear enough people around them’. Chorister B also agreed that one effect of a larger group would be that the chant would get slower, and that solo psalmody is freer than group psalm singing because the mistakes are not as obvious when only one person is singing.

Similarly, Chorister D said that with two people ‘there’s more space for disagreement so more obvious if you disagree’, but with four choristers the mistakes are less obvious. Chorister A said: ‘it’s much harder to sing psalms I’d say with 2 people than with 20, because when you’ve got 20 connections…there’s much more flexibility of togetherness, whereas with 2 people if there’s a slight disparity suddenly you notice it…so I think you need to be more accurate the fewer people you have’. Chorister D, however, thought that it takes a lot longer with larger groups to build up the kind of group dynamic to the point when everyone knows what they are doing. The conductor made the point that ‘the more people you have the more mistakes there’ll be. I suppose that’s stating the obvious [is it?]…[although,] the larger the group is…the less effect one person is going to have on it going wrong’.

Chorister C took the view that experience is more important than numbers: ‘I think logically the fewer people the better but if you’ve got 20 people who all know exactly what they are doing vs. 4 people who haven’t sung with each other before who do not know the rubrics of how plainsong psalm singing works, the 20 people will be far more in sync with each other. I think it’s really about knowing how the person next to you sings, and if everyone knows how the person next to them sings then technically it should all be in time’.

Chorister E said something similar: ‘if it was going to be 40, 50, 100 monks there would be no problem [because] monks have to sing every day every day every day every day in order to have that knowledge and understanding of the Gregorian chant [in order] to sing it together at the speed that they sing it…I think it would take years and years and years [for us laypeople]’.

It also matters how voices are arranged within the choir; for example, Chorister A suggested that when the two sides of the choir were split into upper voices and lower voices, pitch ambiguity was reduced because each choir would sing on alternate psalm verses all at the same pitch. This pitch clarity meant that she was freed up to concentrate on rhythm.

The general principle was that inertia increased, resulting in slower tempo, as the number of people in the choir increased, but ‘mistakes’ became less obvious. A couple of choristers made the point that mistakes are more likely within less experienced groups, and therefore experience was possibly more of a decisive factor than the size of the choir.

2.7 Has the chanting changed during the course of the week?

When asked how the process of synchronisation in psalm chanting had evolved during the week, the answers varied but there was a clear sense that it had improved. The conductor felt that by the end of the week ‘people are singing more together, people are listening more, [and] I’m having to explain less’. Chorister F put that down to ‘techniques’ he had developed, such as ‘reading ahead a bit more carefully, [a] bit more pencilled preparation, particularly long words with many syllables’. Chorister C noticed that the evolution of skill had occurred ‘as a group’, with everyone getting used to the speed and rhythm, and the ‘two little clicks between the verses’ had become functional (i.e. imagined rhythmic clicks in the mind). Indeed, Chorister B pointed out that the group’s ability to ‘resync’ after an individual rhythmic mistake had improved, whereas at the beginning of the week the group performance might have been easily derailed. Chorister G thought that practice had improved many different aspects of chanting the psalms, e.g. listening, anticipation, and notation-reading.

There was consensus that chanting had improved during the week, partly due to consolidation of top-down personal strategies and pedagogical instruction, and also due to ‘listening more’ to individuals and dynamically adapting to destabilising mistakes more efficiently.

Appendix 3 - Online access to accompanying digital media

The following link is for online access to the collection of audio extracts and video clips referred to as ‘App. 3’ throughout this thesis:

https://drive.google.com/folderview?id=0B_rCe-bl7cm6VWJFMjlkMTViLXM&usp=sharing

The audio extracts of the singing by the Suyá are included by kind permission of the Suyá people and Anthony Seeger.