The Role of Timing and Intensity

in the Production and Perception of Melody in Expressive Piano Performance

Dissertation zur Erlangung des Doktorgrades der Philosophie

an der Geisteswissenschaftlichen Fakultät

der Karl-Franzens-Universität Graz

eingereicht von

Mag. phil. Werner Goebl

am Institut für Musikwissenschaft.

Erstbegutachter: Univ.-Prof. Dr. Richard Parncutt

Zweitbegutachter: Gastprof. PD Dr. Christoph Reuter

2003


Vienna, August 27, 2003. This manuscript was typeset with LaTeX 2ε.

Abstract

This thesis addresses the question of how pianists make individual voices stand out from the background in a contrapuntal musical context, how they realise this with respect to the constraints of the piano keyboard construction, and finally how much each of the expressive parameters employed by the performers contributes to the perception of particular voices. Three different empirical approaches were used to investigate these questions: a study in the area of piano acoustics investigated the temporal properties of three different grand piano actions; a performance study with a Bösendorfer computer-controlled grand piano examined intensity and onset time differences between the principal voice and the accompaniment; and a series of perception studies looked at the relative effect of asynchrony and intensity variation on the perceived salience of individual tones in musical chords and real music contexts.

First, the temporal behaviour of grand piano actions from different manufacturers was investigated under two touch conditions: once with the finger resting on the key surface (legato touch) and once hitting the keys from a certain distance above (staccato touch). A large amount of measurement data from three grand pianos by different piano makers was gathered with an accelerometer setup monitoring key and hammer movements, as well as by recording the sound signal. Selected tones were played by two pianists with the two types of touch. From these multi-channel recordings of over 4000 played tones, discrete readings such as the onset time of the key movement, hammer–string and key–bottom contact times, the instant of maximum hammer velocity, and peak sound level were obtained. Prototypical functions were determined (and approximated by power curves) for travel times (from finger–key to hammer–string contact), key–bottom times, and the instants of maximum hammer velocity. These varied clearly between the two types of touch, only slightly between the investigated pianos, and not at all between tested keys. However, no effect of touch type was found on peak sound level (dB), indicating that the hammer velocity rather than the touch determined the tone intensity. Furthermore, the measurement and reproduction accuracy of the two computer-controlled grand pianos used (Yamaha Disklavier DC2IIXG, Bösendorfer SE290) was examined with respect to their reliability for performance research.
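The power-curve approximation of travel times can be illustrated with a minimal sketch. The velocity/travel-time pairs below are invented for illustration, not the measured data from this study; the fit itself is an ordinary least-squares regression in log-log space.

```python
import math

def fit_power_curve(velocities, travel_times):
    """Least-squares fit of t = a * v**b, done as a linear fit in log-log space."""
    xs = [math.log(v) for v in velocities]
    ys = [math.log(t) for t in travel_times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Invented readings: slower hammers take longer to reach the strings.
velocities = [0.5, 1.0, 2.0, 3.0, 4.0]          # final hammer velocity (m/s)
travel_times = [120.0, 70.0, 42.0, 31.0, 25.0]  # travel time (ms)
a, b = fit_power_curve(velocities, travel_times)
# The fitted exponent b comes out negative: travel time falls as velocity rises.
```

A negative exponent mirrors the prototypical functions described above: loud (fast-hammer) tones reach the strings sooner than soft ones.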

The second approach was through a performance study in which 22 professional pianists played two excerpts by Frédéric Chopin on a Bösendorfer computer-controlled grand piano. The performance data were analysed with respect to tone onset asynchronies and dynamic differences between melody and accompaniment.


The melody was consistently found to precede the other voices by around 30 ms, confirming findings from previous studies (melody lead). The earlier the onset of a melody tone appeared with respect to the other chord tones, the greater was also its intensity. This evidence supported the velocity artifact hypothesis, which ascribes the melody lead phenomenon to mechanical constraints of the piano keyboard (the harder a tone is hit, the earlier it will arrive at the strings). In order to test this hypothesis, the relative asynchronies at the onset of the keystrokes (finger–key asynchronies) were inferred through the relation of travel time and hammer velocity from the previous study. These key onsets then showed almost no asynchrony between the principal and the other voices. This finding indicated that pianists started the key movements essentially in synchrony; the typical asynchrony patterns (melody lead) were caused by the different sound intensities of the different voices. This relationship was modelled to predict melody lead from intensity differences. It was concluded that melody lead can be largely explained by the mechanical properties of the grand piano action and is not necessarily an independent expressive device applied (or not) by pianists for purposes of expression.

In a third approach, the influence on perceived salience of systematically manipulating the two parameters found in the previous study (relative onset timing and intensity variation) was investigated. In a series of seven experiments, trained musicians judged single tones in dyads and three-tone chords in which the relative onset timing and intensity were systematically manipulated. Two experiments focussed on the threshold beyond which two tones sound asynchronous. With piano tones, this threshold was at 30–40 ms, but changed considerably with the intensity of the two tones. With the earlier tone much louder, dyads with as much as 55 ms of asynchrony were heard as simultaneous by musicians. Either musicians perceive familiar combinations of asynchrony and intensity difference as more synchronous than unfamiliar combinations, or sensitivity to synchrony is reduced in the melody-lead condition by forward masking. The other experiments examined loudness ratings of chord tones (target), with each of the two or three tones simultaneously manipulated in relative timing and intensity by up to ±55 ms and +30/−22 MIDI velocity units. The experiments involved various types of tone (pure, sawtooth, synthesised and real piano) and musical material (dyads, three-tone chords, sequences of three-tone chords, and a real music excerpt by Frédéric Chopin). Generally, loudness ratings depended mainly on relative intensity and relatively little on timing throughout all experiments. Loudness ratings increased with early onsets (anticipation), but only in conditions in which the target tone was hardly heard (equally loud as or softer than the other tones). In these cases, anticipation helped to overcome spectral masking. Melodic streaming of tones in chord progressions enhanced the effect of asynchrony only marginally. The two selected voices of the excerpt by Chopin were perceived as more important when they were either delayed or anticipated, but only in combination with enlarged intensities.

Zusammenfassung

In dieser Arbeit wurde untersucht, in welcher Weise professionelle Konzertpianisten einzelne Stimmen in einem mehrstimmigen musikalischen Kontext herausheben, welche Möglichkeiten und welche Einschränkungen ihnen dabei der moderne Konzertflügel bietet bzw. auferlegt, und welche perzeptuellen Konsequenzen die verwendeten Ausdrucksmittel für die Hörer haben. Diese Grundfragen wurden anhand von drei unterschiedlichen methodischen Ansätzen behandelt.

In einer instrumental-akustischen Studie wurde das Zeitverhalten von Klaviermechaniken dreier unterschiedlicher Hersteller (Yamaha, Steinway, Bösendorfer) unter verschiedenen Anschlagsbedingungen untersucht. Fünf ausgewählte Tasten wurden einmal von der Taste (Legato-Anschlag) und einmal aus der Luft angeschlagen (Staccato-Anschlag). Der Versuchsaufbau umfaßte ein kalibriertes Mikrophon und zwei Akzelerometer, welche die Bewegungen von Taste und Hammer während des Anschlagvorganges registrierten. Mehrkanalaufnahmen von über 4000 gespielten Tönen wurden von einem dafür geschriebenen Computerprogramm automatisiert ausgewertet. Zeitliche Zusammenhänge, wie die Zeitdauer des Anschlagvorganges (vom Beginn der Tastenbewegung bis zum Auftreffen des Hammers auf der Saite), der Moment der höchsten Hammergeschwindigkeit oder der Zeitpunkt, zu dem die Taste den Tastenboden berührt, wurden ermittelt und durch prototypische Exponentialfunktionen angenähert. Unterschiedliche Anschlagsarten veränderten diese Zusammenhänge, wie z. B. die Dauer des Anschlages (travel time), maßgeblich, weit mehr als Hersteller oder Tonhöhe. Es konnte kein Effekt der Anschlagsart auf den Klavierklang beobachtet werden, der unabhängig von der Hammerendgeschwindigkeit wäre. Weiters wurden die Aufnahme- und die Wiedergabepräzision der zwei Reproduktionsflügel (Yamaha Disklavier DC2IIXG, Bösendorfer SE290) in bezug auf ihre Verwendbarkeit in der Performanceforschung getestet.

In einer zweiten Studie, in der 22 Konzertpianisten auf einem Bösendorfer Computerflügel zwei kurze Ausschnitte aus Stücken von Chopin spielten, wurde untersucht, in welcher Weise zeitliche und dynamische Anschlagsdifferenzen zwischen den Melodie- und den Begleittönen zusammenhängen. Melodietöne erklangen typischerweise um die 30 Millisekunden (ms) vor den Begleitstimmen, was Ergebnisse früherer Studien bekräftigt (melody lead). Der starke Zusammenhang zwischen melody lead und Unterschieden in der Dynamik konnte mit der velocity-artifact-Hypothese erklärt werden: der Hammer einer heftig angeschlagenen Taste erreicht die Saiten entsprechend früher und erzeugt früher einen Ton als einer, der schwächer angeschlagen wurde.


Es wurden mithilfe der travel-time-Funktionen die Beginnzeiten der einzelnen Tastenbewegungen (finger–key contact times) ermittelt, die dann keine Asynchronien mehr aufwiesen. Es konnte somit nachgewiesen werden, daß durch Herausrechnen allein dieser zeitlichen Eigenschaft der Klaviermechanik der größte Teil des melody-lead-Phänomens erklärt werden kann, das damit in dieser Form nicht als ein von der Dynamik unabhängiges Ausdrucksmittel bezeichnet werden kann. Ein melody-lead-Modell wurde entwickelt, das anhand der Dynamik der einzelnen Töne das Ausmaß der Ungleichzeitigkeit vorhersagen kann.

Der dritte Ansatz widmete sich den Auswirkungen von Asynchronizität und Dynamikdifferenzierung auf die Perzeption durch musikalisch gebildete Hörer. In einer Serie von sieben Hörexperimenten wurden zwei Hauptaspekte behandelt. Zum einen wurde gefragt, ab wann Musiker zwei beinahe simultane Klänge als ungleichzeitig empfinden, und zum anderen, wie sich speziell die zeitliche Verschiebung zweier Klänge zueinander auf die perzipierte Dynamikempfindung (salience) auswirkt. Dazu wurden als Stimulusmaterial sowohl zwei- und dreistimmige Akkorde als auch Abfolgen von Akkorden und ein kurzes Musikbeispiel von Chopin verwendet. Musikalisch gebildete Personen beurteilten sowohl unterschiedliche Klänge (Sinus, Sägezahn, synthetisiertes und akustisches Klavier) als auch unterschiedliche Tonhöhen, in denen die jeweiligen zu testenden Töne zeitlich bis ±55 ms und in dynamischer Hinsicht bis zu +30/−22 MIDI-velocity-Einheiten manipuliert wurden. Die Ungleichzeitigkeitsschwelle lag mit 30–40 ms etwas höher als in der Literatur berichtet. Sie konnte aber noch wesentlich höher liegen, wenn der frühere Ton zugleich auch um einiges lauter als der andere war. In dieser melody-lead-Situation wurden sogar Ungleichzeitigkeiten von 55 ms als gleichzeitig gehört. Dieses Phänomen wurde einerseits mit der Vertrautheit mit Klavierklängen erklärt (Musiker erkennen Ungleichzeitigkeiten in ungewohnten Kombinationen von Asynchronie und relativer Dynamik leichter) und andererseits mit Maskierungsphänomenen (im speziellen der Nachverdeckung).

Der zweite Aspekt bezog sich darauf, ob ein verfrühter oder verspäteter Akkordton in seiner perzeptuellen Dominanz verändert wahrgenommen wird. Es zeigte sich, daß die beurteilenden MusikerInnen sich hauptsächlich nach der Dynamik der einzelnen Töne orientierten und kaum nach ihrer Asynchronizität. Diese wurde erst relevant, wenn gleichlaute oder wesentlich leisere Töne zu beurteilen waren. Dann ‘entkamen’ verfrühte Töne der spektralen Maskierung und wurden als lauter beurteilt. Wiederholte Akkorde erhöhten den Einfluß von Ungleichzeitigkeit auf die Lautheitsbeurteilung nur unwesentlich (streaming-Effekt). Auch in dem Musikbeispiel konnte ein derartiger Effekt nicht nachgewiesen werden. Antizipation sowie Verzögerung wurden nur im Zusammenhang mit dynamischer Verstärkung als perzeptuell verstärkend bewertet.

Acknowledgements

This work was carried out within the framework of a large-scale research project, “Computer-Based Music Research: Artificial Intelligence Models of Musical Expression”, at the Austrian Research Institute for Artificial Intelligence (Österreichisches Forschungsinstitut für Artificial Intelligence, OFAI), Vienna. This project has been financed through the START programme of the Austrian Federal Ministry for Education, Science, and Culture (Grant No. Y99–INF) in the form of a generous research prize to Gerhard Widmer (http://www.oefai.at/music). The OFAI acknowledges basic financial support from the Austrian Federal Ministry for Education, Science, and Culture and the Austrian Ministry for Transport, Innovation and Technology. Furthermore, the author acknowledges financial support from the European Union for his research visit to the Department of Speech, Music, and Hearing (TMH) at the Royal Institute of Technology (KTH) in Stockholm (Marie Curie Fellowship, HPMT-GH-00-00119-02). Parts of this work were additionally financed through other projects by the European Union: the Sounding Object project (SOb, IST-2000-25287, http://www.soundobject.org) and the MOSART IHP network (HPRN-CT-2000-00115) supported the studies to measure and to record the Bösendorfer grand piano in Vienna.

Special thanks are due to the Bösendorfer company, Vienna, for providing an SE290 grand piano in excellent condition; to Alf Gabrielsson (Department of Psychology, University of Uppsala), who provided a well-maintained Disklavier for experimental use; to the Department of Speech, Music, and Hearing (TMH) of the Royal Institute of Technology (KTH), Stockholm, for providing the accelerometer equipment for the piano action studies; and to the Acoustics Research Institute of the Austrian Academy of Sciences for generously providing recording equipment for the multiple recording sessions at the Bösendorfer grand piano in Vienna (with special thanks to Werner A. Deutsch and Bernhard Laback). Furthermore, I am indebted to Tore Persson and especially to Friedrich Lachnit, who maintained and serviced the two reproducing pianos with endless patience.

At the outset, I want to thank Gerhard Widmer, the leader of the OFAI music group, for his pioneering spirit in guiding our young research group into the adventure of exploring music expression and surrounding topics, while at the same time leaving the necessary freedom for unconventional ideas. It is to his merit that I got the unique opportunity to work as a musicologist and pianist in an artificial intelligence department. I am grateful to my colleagues: to Simon Dixon, for his advice in computer programming and logical thinking (his “Well, isn’t there a better way to do this?” saved me weeks of computation time and intellectual meanders); to Elias Pampalk, especially for implementing the Zwicker loudness model in efficient computer code; to Asmir Tobudic; to Emilios Cambouropoulos, for his ‘pragmatic approach’ towards the use of computers; and to Renee Timmers, for giving critical and thus essential advice on the design and interpretation of psychological experiments and their statistical evaluation. I take this occasion to especially thank Robert Trappl, the head of OFAI, for his endless patience in recruiting research money to allow young researchers to spend their entire energy exclusively on their work in an enjoyable environment, and for his fascination with music.

I would like to express my sincere thanks to my collaborators Roberto Bresin and Alexander Galembo, who shared my fascination with grand pianos and helped me to carry out the time-consuming experimental tests on the inmost functionality of grand piano actions in Sweden and Vienna. Furthermore, I would like to thank Johan Sundberg for enabling my stay as a guest researcher in Stockholm. I take this opportunity to further thank Anders Askenfelt and Erik Jansson for making their expertise and their equipment for monitoring the various aspects of piano acoustics available to me. Moreover, I want to mention Giampiero Salvi, Erwin Schoonderwaldt, and Anders Friberg for stimulating discussions and helpful hints during my stay in Stockholm.

I am grateful to my supervisor Richard Parncutt for his advice in getting focussed on the more essential research questions and for guiding me through the whole process, from designing the listening tests to writing up this thesis in English. I am indebted to Christoph Reuter for examining this book and giving essential final hints. Finally, I wish to thank Oliver Vitouch for his important statistical advice.

At this point, I have to say a huge ‘Thank you!’ to all the participants who shared their exquisite musical expertise, either by performing on the computer-controlled grand piano or by listening to my unpleasant and awkward stimuli without running away immediately.

Last but not least, I thank my parents and my whole family for their support, not only mental, during the last three decades of my education.

Contents

Abstract iii

Zusammenfassung v

Acknowledgements vii

1 Introduction 1

1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Dynamics and the Grand Piano 7

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1.1 The acoustics of the piano . . . . . . . . . . . . . . . . . . . . 7

2.1.2 Measurement of dynamics in piano performance . . . . . . . . 7

2.1.3 Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.2 The piano action as the performer’s interface . . . . . . . . . . . . . . 10

2.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Piano action timing properties . . . . . . . . . . . . . . . . 10

Different types of touch . . . . . . . . . . . . . . . . . . . 12

2.2.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Calibration . . . . . . . . . . . . . . . . . . . . . . . . . 16

Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

Data analysis . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.2.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . 18

Influence of touch . . . . . . . . . . . . . . . . . . . . . . . 18

Timing properties . . . . . . . . . . . . . . . . . . . . . . 21

Travel time . . . . . . . . . . . . . . . . . . . . . . . . 21

Key–bottom contact relative to hammer–string contact 25

Time of free flight . . . . . . . . . . . . . . . . . . . . 26

Comparison among tested pianos . . . . . . . . . . . . 28

Acoustic properties . . . . . . . . . . . . . . . . . . . . . . 30

Rise time . . . . . . . . . . . . . . . . . . . . . . . . . . 30


Peak sound-pressure level . . . . . . . . . . . . . . . . . 31

2.2.4 General discussion . . . . . . . . . . . . . . . . . . . . . . . . 32

2.3 Measurement and reproduction accuracy of computer-controlled grand pianos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.3.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.3.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . 39

Timing accuracy . . . . . . . . . . . . . . . . . . . . . . . 39

Dynamic accuracy . . . . . . . . . . . . . . . . . . . . . . 42

Two types of touch . . . . . . . . . . . . . . . . . . . . . . 43

2.3.4 General discussion . . . . . . . . . . . . . . . . . . . . . . . . 47

2.4 A note on MIDI velocity . . . . . . . . . . . . . . . . . . . . . . . . . 50

3 Bringing Out the Melody in Homophonic Music—Production Experiment 57

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.1.2 Piano action timing properties . . . . . . . . . . . . . . . . . . 59

3.2 Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3.3.1 Materials and participants . . . . . . . . . . . . . . . . . . . . 60

3.3.2 Apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.3.3 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

3.4.1 Relationship between velocity and timing . . . . . . . . . . . . 67

3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3.6 Finger–key contact estimation with alternative travel time functions . 74

3.7 A model of melody lead . . . . . . . . . . . . . . . . . . . . . . . . . 78

4 The Perception of Melody in Chord Progressions 79

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.1.1 Perception of melody . . . . . . . . . . . . . . . . . . . . . . . 80

4.1.2 Perception of isolated asynchronies . . . . . . . . . . . . . . . 81

4.1.3 Intensity and the perception of loudness and timbre . . . . . . 82

4.1.4 Masking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4.1.5 Stream segregation . . . . . . . . . . . . . . . . . . . . . . . . 84

4.2 Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.3 Perception of asynchronous dyads (pilot study) . . . . . . . . . . . . 87

4.3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

4.3.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Participants . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . 88


Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4.3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Perception of tone salience (question 1) . . . . . . . . . . . 89

Temporal order perception (question 2) . . . . . . . . . . 91

4.3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 93

4.4 Perception of dyads varying in tone balance and synchrony . . . . . . 95

4.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

4.4.2 Determination of balance baseline (Experiment I) . . . . . . . 96

Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

Participants . . . . . . . . . . . . . . . . . . . . . . . . 96

Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . 96

Equipment . . . . . . . . . . . . . . . . . . . . . . . . 96

Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 97

Results and discussion . . . . . . . . . . . . . . . . . . . . 97

4.4.3 Perception of tone salience (Experiment II) . . . . . . . . . 98

Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Results and discussion . . . . . . . . . . . . . . . . . . . . 99

4.4.4 Asynchrony detection (Experiment III) . . . . . . . . . . . . . 100

Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

Results and discussion . . . . . . . . . . . . . . . . . . . . 101

4.4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

4.5 Perception of chords and chord progressions varying in tone balance and synchrony (Experiments IV and V) . . . . . . . . . . . . . . . . 105

4.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

4.5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

Participants . . . . . . . . . . . . . . . . . . . . . . . . . . 106

Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

Equipment . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

4.5.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . 109

Effects of intensity balance and asynchrony . . . . . . . . 110

Experiment IV . . . . . . . . . . . . . . . . . . . . . . . 110

Experiment V . . . . . . . . . . . . . . . . . . . . . . . 111

Post-hoc comparisons . . . . . . . . . . . . . . . . . . . 113

Effects of chord, transposition, and voice . . . . . . . . . . 114

Experiment IV . . . . . . . . . . . . . . . . . . . . . . 114

Experiment V . . . . . . . . . . . . . . . . . . . . . . 115

4.5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

4.6 Asynchrony versus relative intensity as cues for melody perception: Excerpts from a manipulated expressive performance (Experiment VI) 118

4.6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

4.6.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Stimuli . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119


Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 122

4.6.3 Results and discussion . . . . . . . . . . . . . . . . . . 123

4.7 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

4.7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 127

4.7.2 Input of the models . . . . . . . . . . . . . . . . . . . . . 127

4.7.3 Results and discussion . . . . . . . . . . . . . . . . . . . 128

Experiments IVa, IVb, and V . . . . . . . . . . . . . . . 128

Intensity . . . . . . . . . . . . . . . . . . . . . . . . . 128

Unsigned asynchrony . . . . . . . . . . . . . . . . . . 129

Signed asynchrony . . . . . . . . . . . . . . . . . . . . 130

Voice . . . . . . . . . . . . . . . . . . . . . . . . . . 130

Experiment VI . . . . . . . . . . . . . . . . . . . . . . . 130

4.8 General discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 132

5 Conclusions 137

Bibliography 144

A Ratings of Listening Tests 163

B Curriculum Vitae 181

Chapter 1

Introduction

1.1 Background

“And here I shall go back to something I said earlier: since the basis of all audible music is singing and since piano literature is full of cantabile, the first and main concern of every pianist should be to acquire a deep, full, rich tone capable of any nuance, with all its countless graduations, vertically and horizontally. An experienced pianist can easily give three or four dynamic nuances simultaneously: for instance

[music example: several dynamic markings notated one above the other, from f down to pp]

to say nothing of using horizontally every possibility inherent in the piano’s tone.” (Neuhaus, 1973, pp. 67–68, Chapter 3: ‘On Tone’, emphasis in original)

Heinrich Neuhaus, while dwelling upon the singing quality in piano performance, refers exclusively to the dynamic shaping of the tones. However, pianists may alter several expressive parameters to “bring out” the melody in a piano piece, to make it cantabile (singable), to give it singing quality, to let the melody stand out from the background. The most obvious strategy is, as mentioned by Neuhaus, to strike the keys of the principal voice with a slightly firmer blow, so that the melody tones simply sound louder, and to colour the accompaniment tones darker and behind the melody.

Other, more subtle ways to make different voices acoustically more distinguishable include articulation and the use of the right pedal. A melody becomes more cantabile when it is played more legato than the other tones, that is, when all tones are connected to each other, the previous key being released only when the next is already depressed (finger legato). Finger legato can also be replaced by using the right pedal, which, in addition to linking tones together by raising the dampers from the strings, also introduces more sympathetic vibrations between all sounding strings, resulting in a more complex sound that lets the melody glow over the accompaniment. To additionally counteract the natural decay of the piano tone (cf. Martin, 1947; Repp, 1997a), the left pedal (of a grand piano) can be used, shifting the piano action sideways so that only two strings of the triple-strung tones are struck by the hammer. This decreases the decay rate of the piano tones and thus increases their effective duration (Weinreich, 1977, 1990).

Alongside the above-mentioned expressive devices, another expressive feature has been investigated: the onsets of melody tones can be anticipated or delayed with respect to the other tones of the same chord in the score. The excessive use of asynchronies as an individual expressive freedom reminds us of old recordings on which renowned pianists often used to play bass notes up to some hundreds of milliseconds earlier than the other tones (e.g. Josef Pembaur and Harold Bauer, see Hartmann, 1932). Moreover, a melody can be played more freely and independently from the accompaniment, which keeps the meter rigidly. This effect is usually called tempo rubato in its earlier sense (Hudson, 1994), a performance practice going back to the Baroque period. However, another effect has been reported in both recent and older literature: melody tones usually sound some 30 ms before the other tones of the same chord (Vernon, 1937; Palmer, 1989, 1996; Repp, 1996a; Goebl, 2001). This effect was called melody lead by Palmer (1996) and is usually accompanied by, and presumably causally related to, differences in tone intensity (Repp, 1996a; Goebl, 2001).

The harder a key is actuated by the pianist’s finger, the faster the hammer will travel to the strings and the earlier a sound will emerge in relation to the beginning of the keystroke. As simple as this physical constraint is, it is just as important for the performing pianist to coordinate the command to the finger to depress a key relative to how hard that key is intended to be hit, so that the resulting sound starts at the desired instant in time. The time interval between the beginning of the key movement and the hammer hitting the strings is referred to as travel time (see Chapter 2). A soft tone (piano) takes around 160 ms to reach the strings, whereas a forte keystroke needs only around 25 ms (Askenfelt and Jansson, 1991). These temporal properties can be modified by changing the regulation of the piano action (Askenfelt and Jansson, 1990a,b); they may also differ slightly among piano manufacturers and action designs. Alongside the close relation between hammer velocity and travel time, it is expected that the way the key is actuated by the player influences this relation considerably.
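The timing consequence of these two quoted figures can be made concrete with a toy sketch. Only the two travel times from Askenfelt and Jansson (1991) are used; the two-entry lookup table and function name are illustrative simplifications, not part of the thesis.

```python
# Travel times quoted from Askenfelt and Jansson (1991): a soft (piano) tone
# needs roughly 160 ms from key onset to hammer-string contact, a forte
# keystroke only about 25 ms.
TRAVEL_TIME_MS = {"piano": 160.0, "forte": 25.0}

def keystroke_start(desired_onset_ms, dynamic):
    """Time at which the key movement must begin for the sound to start on time."""
    return desired_onset_ms - TRAVEL_TIME_MS[dynamic]

# Two chord tones intended to sound together at t = 1000 ms:
soft_start = keystroke_start(1000.0, "piano")  # 840.0 ms
loud_start = keystroke_start(1000.0, "forte")  # 975.0 ms
# The soft tone's key must be set in motion 135 ms before the loud tone's.
```

This is exactly the coordination problem the pianist solves implicitly: the softer keystroke has to be launched well ahead of the louder one for the two sounds to coincide.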

Not only do pianists have to be aware of this temporal peculiarity of the keyboard construction in order to achieve the intended onset timing at varying hammer velocities; it must also be considered for reproducing devices, such as a Bösendorfer SE or a Yamaha Disklavier, how their actions behave under which intensity conditions. These systems are provided with correction maps that allow them to adjust for the different travel times at different hammer (MIDI) velocities. Repp (1996a) reported that the “prelay function” of his Yamaha Disklavier was not operating, so he had the opportunity to measure the travel time interval with respect to MIDI velocity units. He found results similar to those of Askenfelt and Jansson (1991), although expressed in MIDI velocity units rather than in final hammer velocity in meters per second (as in Askenfelt and Jansson, 1991).
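The logic of such a correction map can be sketched as follows. The power-law mapping from MIDI velocity to travel time is a hypothetical placeholder, not the actual internal table of a Disklavier or an SE:

```python
def travel_time_ms(midi_velocity, a=500.0, b=-0.7):
    """Hypothetical power-law map from MIDI velocity to travel time (ms)."""
    return a * midi_velocity ** b

def schedule_commands(notes):
    """notes: list of (notated_onset_ms, midi_velocity) pairs.

    Each note command is issued early by the action's travel time at its
    velocity, so that hammer-string contact lands on the notated onset.
    """
    return [(onset - travel_time_ms(vel), vel) for onset, vel in notes]

chord = [(1000.0, 30), (1000.0, 90)]  # same notated onset, soft vs loud tone
commands = schedule_commands(chord)
# The soft tone's command goes out earlier, because its hammer travels longer.
```

Without such a map (as with Repp’s non-operating “prelay function”), soft reproduced tones would sound systematically late relative to loud ones.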

Palmer (1996) argued that melody leads were largely independent of the dynamic differentiation between voices. She considered note onset asynchronies an expressive device by which pianists bring out individual voices in contrapuntal contexts independently of other expressive cues such as dynamics, articulation, or pedalling. Her conclusions were based on evidence that, e.g., asynchronies were larger in experts' performances than in students', that asynchronies decreased in 'unmusical' performances, and that melody lead grew larger with voice emphasis. However, Repp (1996a) found—with a more detailed methodology—strong relations between dynamic differences between voices and onset differences. The louder a melody note was played (in comparison to the dynamic level of the other chord tones), the earlier it tended to appear (also in comparison to the timing of the other tones). He explained this interrelationship with the 'velocity artifact', referring to the above-mentioned temporal characteristics of the keyboard construction. However, he could not entirely establish the causal relationship, because correlational evidence (as found between dynamic and timing differences) does not prove a causal connection. To him it seemed plausible that “pianists aim for synchrony of finger–key contacts” (Repp, 1996a, p. 3929).

The temporal properties of the grand piano action were described in the literature in fine detail by Askenfelt and Jansson (1990b, 1991, 1992a). However, only exemplary data from a single instrument were reported. In order to estimate how much of the above-mentioned melody lead phenomenon can be accounted for by this temporal behaviour of the keyboard construction, more data have to be gathered from different instruments, and the interaction of travel time and hammer velocity studied in finer detail, also with respect to different ways of depressing the key.

With reliable data about the travel time characteristics of various pianos and key actuation types, it will be possible to infer the asynchronies at finger–key level from note onset asynchronies (corresponding to hammer–string contact time differences), that is, how asynchronously pianists started the keystrokes within a chord. Thus, Repp's above-mentioned hypothesis that pianists aim for synchrony at finger–key level can be verified or rejected (Repp, 1996a).
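The inference amounts to subtracting a velocity-dependent travel time from each sounding onset. The sketch below assumes a hypothetical power-law travel time model (drawn through the two values quoted from Askenfelt and Jansson, 1991) and an invented example chord; neither represents measured data:

```python
import math

def travel_time_ms(hammer_velocity):
    """Hypothetical travel time model through (1 m/s, 160 ms) and (5 m/s, 25 ms)."""
    b = math.log(25.0 / 160.0) / math.log(5.0)
    return 160.0 * hammer_velocity ** b

def finger_key_onset(sounding_onset_ms, hammer_velocity):
    """Estimated start of the key movement, given the sounding (hammer-string) onset."""
    return sounding_onset_ms - travel_time_ms(hammer_velocity)

# Invented example: a loud melody tone sounding 30 ms before a soft
# accompaniment tone (the typical melody lead reported in the literature).
melody_fk = finger_key_onset(0.0, 3.0)    # melody: onset at t = 0, 3.0 m/s
accomp_fk = finger_key_onset(30.0, 1.5)   # accompaniment: 30 ms later, 1.5 m/s
print(f"finger-key asynchrony: {melody_fk - accomp_fk:.1f} ms")
```

Under this toy model, the 30 ms melody lead at hammer–string level corresponds to a melody keystroke that begins after the accompaniment keystroke, which illustrates why the synchrony hypothesis can only be tested against measured travel time functions.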

Melody lead (or lag) may render a tone or voice more salient (prominent) than other tones in a chord, independently of the associated dynamic differences (Parncutt and Troup, 2002, pp. 294–296). An early tone onset will initially not be masked by the other chord tones (Rasch, 1978), and according to Bregman and Pinker (1978), asynchronous onsets enable the auditory system to segregate those tones into different melodic streams (voices). Although the presence of the melody lead phenomenon is likely explained by mechanical constraints of the keyboard construction (Repp, 1996a), its perceptual effects may be (even unconsciously) wanted by the performers, so that efforts to overcome it (i.e., to play dynamically differentiated chords without anticipating the louder tones, or even the opposite) would not be worthwhile, simply because the reason for playing a voice louder (and thus earlier) is to make it stand out from the context, to make it cantabile, to lend it a singing quality.

Apart from the psycho-acoustic relevance of note onset asynchronies, differences in sound intensity and thus in timbre may by themselves entail psycho-acoustic effects on the perception of a complex chordal sonority. A louder melody tone will also impart a singing quality because it becomes more salient in pitch (Terhardt et al., 1982), there will be less beating between a pair of tones with increasing loudness difference (Terhardt, 1974), and the compound timbre will sound less rough (Parncutt and Troup, 2002).

1.2 Outline

The central part of this thesis comprises three large chapters (2–4), each representing a different approach to my research question of melody emphasis: an investigation of the acoustics and instrumental characteristics of the grand piano, a performance study approach, and an experimental evaluation of the perceptual hypotheses regarding the perception of timing and intensity differences in multi-voiced contexts.

In Chapter 2, the acoustics of the piano are discussed with special emphasis on the grand piano action and its typical temporal behaviour in different playing situations. Two prototypical ways of depressing the keys were investigated: with the finger resting on the surface of the key and pressing it down starting from zero key velocity (legato touch), and hitting the key from a certain distance above, thus striking it already at a certain speed (staccato touch). The different behaviour of the grand piano action and the various tone intensities produced with these two kinds of touch were investigated in Section 2.2 (p. 10). Special attention was given to the relationship between the hammer velocity and the time interval between the beginning of the key movement (finger–key contact1) and the sounding tone (hammer–string contact). This function is referred to as the travel time function. Two modern reproducing grand pianos were the subject of investigation in Section 2.3 (p. 36), where the measurement and reproduction reliability and accuracy were determined and evaluated with respect to the usability of such devices for performance research. Section 2.4 (p. 50) briefly discusses the relation between hammer velocity and the sound level or loudness of the resulting tones.

Chapter 3 (p. 57) describes a performance study in which 22 skilled pianists played two short excerpts by Frédéric Chopin (from the Etude op. 10, No. 3 and the second Ballade op. 38). The onset asynchronies between the melody and the accompaniment tones were investigated and compared with the differences in tone intensity (in terms of MIDI velocity). This study used data from a Bösendorfer computer-controlled grand piano (Goebl, 1999b) and has been published as Goebl (2001). The findings from Goebl (2001) were revised by applying the results of the more recent measurements from Chapter 2 (i.e., alternative travel time functions). The adjusted results are reported in Section 3.6 (p. 74).

1 The term finger–key contact may be misleading, because with a legato touch the finger is already resting on the key surface and thus touching it, so the finger–key contact point would be much earlier than the start of the key movement. However, the onset of the key movement is always meant.

Chapter 4 (p. 79) is dedicated to a series of perceptual experiments (mostly with musically trained participants) that investigated the perceptual influence of onset asynchrony on the perceived salience of individual tones. The main questions are: first, what is the threshold for two musical sonorities to be heard as simultaneous or as separate; and second, can anticipation or delay of a tone alter its perceived salience in a chordal context? Seven listening experiments are dedicated to these questions. In a pilot experiment (Section 4.3, p. 87), two equally loud tones with asynchronies of up to ±50 ms were used to investigate the perceived relative loudness of two tones and their order. Different types of tones were used (pure, sawtooth, MIDI-synthesised piano, and real piano) to test whether different attack curves change loudness perception or temporal order identification. Intensity variation was introduced in the next three experiments (Experiments I–III, Section 4.4, p. 95). In Experiment I, participants adjusted the relative level of two simultaneous tones (pure, sawtooth, and piano sound) until they sounded equally loud to them. In Experiment II, they rated the relative loudness of the two tones of dyads whose relative timing and intensity were simultaneously manipulated by up to ±54 ms and ±20 MIDI velocity units. In Experiment III, listeners judged whether or not the stimuli of the previous experiment sounded simultaneous. In the last three experiments, the stimulus material was extended to three-tone piano chords, sequences of three-tone piano chords (Experiment IV and Experiment V, see Section 4.5, p. 105), and an excerpt of a piece by Frédéric Chopin (Experiment VI, Section 4.6, p. 118).


Chapter 2

Dynamics and the Grand Piano

2.1 Introduction

In this chapter, the loudness dimension in expressive piano performance is discussed with respect to the acoustics of the grand piano. Emphasis is given to the temporal behaviour of the grand piano action and its consequences for piano performance.

2.1.1 The acoustics of the piano

The acoustics of the piano is a comparably well-investigated topic. A comprehensive overview can be found in the piano chapter of Fletcher and Rossing's book (Fletcher and Rossing, 1998, pp. 352–398). There is a vast number of detailed studies on the various aspects of the acoustics of the piano, covering all steps involved in sound production: the keyboard and the action (Lieber, 1985; Askenfelt, 1991; Askenfelt and Jansson, 1990a,b, 1991), the hammers (Conklin, 1996a; Giordano and Winans II, 2000), the strings (Askenfelt and Jansson, 1992a; Chaigne and Askenfelt, 1994a,b; Conklin, 1996c; Suzuki, 1986; Weinreich, 1977), hammer–string interaction (Hall, 1986, 1987a,b; Suzuki, 1987; Boutillon, 1988; Hall and Askenfelt, 1988), the soundboard (Conklin, 1996b; Giordano, 1997, 1998a,b), sound radiation (Suzuki, 1986; Bork et al., 1995), and the sound and its decay (Knoblaugh, 1944; Martin, 1947; Nakamura, 1989; Taguti et al., 2002). The differences between grand and upright pianos were investigated by Galembo and Cuddy (1997) and Mori (2000).

2.1.2 Measurement of dynamics in piano performance

The dynamics and the timbre of a single piano tone are controlled by a single parameter: the final hammer velocity. As in many instruments, the intensity of a piano tone and its timbre are closely linked: the louder the tone, the higher its sound level and the more partials are involved, causing a brighter sound. However, already more than one tone, combined with simultaneous use of the two pedals, entails a virtually unlimited variety of possible sounds and timbres, so that investigating the dynamics in piano performance is not easy at all. There are two possibilities for approaching the dynamics of the piano. The first is to directly measure the acoustic output of the piano (the sound), and the second is to measure how that sound is produced.

1. Measuring the sound

   • In the amplitude of the radiated sound, e.g., from recordings.
     – Physical: Sound-pressure level (dB).
     – Perceptual: Loudness level (sone), see Moore (1997) and Zwicker and Fastl (1999).

   • In the amplitude of the string vibrations, cf. Askenfelt (1990); Askenfelt and Jansson (1990b).

2. Measuring the production of sound

   • Movement of the piano hammer, e.g., the (final) hammer velocity (in meters per second) with computer-monitored pianos such as a Bösendorfer SE290 (see Section 2.3, p. 36), Shaffer (1981, 1984), Shaffer et al. (1985), and Shaffer and Todd (1987), or with an optical measurement setup as in Henderson (1936) and Skinner and Seashore (1936).

   • MIDI velocity with any MIDI instrument (e.g., a digital piano).

   • Movement of the piano key, i.e., the continuous key acceleration (cf. Askenfelt and Jansson, 1990a; future computer-monitored pianos might also measure such parameters). A historic approach used smoked paper and a tuning fork to investigate key movement (Ortmann, 1925).

Attempts have been made to relate these various ways of determining the dynamics of piano tones to each other. The relation of hammer velocity and peak amplitude was investigated by Palmer and Brown (1991); the relation of MIDI velocity units and sound level in dB by Friberg and Sundberg (1995) and Repp (1996d). Another study tried to infer the loudness of single piano tones out of multi-voiced chords by measuring the energy of their fundamentals and first overtones (Repp, 1993b), but could not obtain satisfactory results.

The emphasis of performance research in the past two decades was mainly on timing and tempo issues, because tone onsets are easier and more reliably obtainable from music performances. However, there have been several studies focusing on dynamics, either by obtaining data from electronic MIDI instruments, such as digital pianos or other keyboards (Palmer, 1989; Repp, 1995a), from computer-monitored pianos (Repp, 1993b, 1996d; Palmer, 1996; Tro, 1994, 1998, 2000a,b; Riley-Butler, 2001, 2002), or by measuring and analysing the sound signal of recordings (Truslit, 1938; Repp, 1993a; Gabrielsson, 1987; Nakamura, 1987; Kendall and Carterette, 1990; Namba and Kuwano, 1990; Namba et al., 1991; Repp, 1999; Lisboa et al., 2002).


Using MIDI velocity units to investigate the dynamic dimension of music performance bears certain difficulties. The first is that these units are an arbitrary choice of the MIDI instrument's manufacturer to scale the range of possible dynamics to numbers between zero and 127. They are not comparable between instruments (e.g., between a digital piano and a computer-controlled piano, cf. Friberg, 1995). However, within one instrument, MIDI velocity units seem to be able to depict a consistent picture of what the pianist did. In informal experiments with concert pianists on a Yamaha grand piano, Tro (2000a, p. 173) asked pianists to produce repeated tones on one piano key while trying to constantly increase the loudness. The MIDI velocity units output by the device increased almost linearly over a range between 25 and 127 units. On the other hand, even playing back a recorded Bösendorfer SE file on another SE grand piano model will result in obviously distorted dynamic reproduction, due to the different response of the second instrument.

2.1.3 Aims

In this chapter, the above-mentioned problems of the relation between piano mechanics and tone intensity were investigated with an extensive experimental measurement setup. In Section 2.2, the temporal properties of three grand piano actions by different piano manufacturers were investigated with an accelerometer setting. Here, the aim was to provide benchmark functions for performance research and to replicate assumptions from earlier work (see Chapter 3, and Goebl, 2001). Two of the three pianos were computer-controlled. Their recording and reproduction precision was measured in Section 2.3 (p. 36) in order to study the reliability of these instruments for performance research. The relationship between MIDI velocity units and the sound level of the tones produced by a Bösendorfer SE290 computer-controlled grand piano was examined for all 97 tones of the keyboard in Section 2.4 (p. 50).


2.2 The piano action as the performer’s interface

This work was performed at the Department of Speech, Music, and Hearing at the Royal Institute of Technology (KTH/TMH) in Stockholm, Sweden, in close co-operation with Roberto Bresin and Alexander Galembo. Parts of this work have been presented at the Stockholm Music Acoustics Conference (SMAC'03, cf. Goebl et al., 2003).

2.2.1 Introduction

This is an exploratory study on the temporal behaviour of grand piano actions by different piano manufacturers using different types of touch. Large amounts of data were collected in order to determine as precisely as possible the temporal functions of piano actions, such as travel times versus hammer velocity, or key–bottom contact times relative to hammer–string contact.

A pianist is able to bring out a wide range of imaginable facets of expression only by varying the manner and the intensity of actuating the 88 keys of the piano keyboard. Since not only the intensity of the keystroke, but also the precise timing of the onset of the tone produced is crucial to expressive performance, it can be assumed that pianists are intuitively acquainted with the temporal properties of the piano action, and that they take them into account while performing expressively.

The grand piano action is a highly elaborate and complex mechanical interface, whereby the time and the speed of the hammer hitting the strings are controlled only by varying the manner and the force of striking the keys. The movement of the key is transferred to the hammer via the whippen, on which the jack is positioned so that it touches the roller (knuckle) of the hammer shank. During a keystroke, the tail end of the jack makes contact with the escapement dolly (let-off button, jack regulator), causing the jack to rotate away from the roller and thus breaking the contact between key and hammer. From this moment, the hammer travels with no further acceleration to the strings and rebounds from them immediately ('free flight of the hammer'). The roller comes back onto the repetition lever, while the hammer is caught by the back check. For a fast repetition, the jack slides back under the roller when the key is only released half-way, and the action is ready for another stroke. More precise descriptions of the functionality of grand piano actions can be found in the literature (Askenfelt and Jansson, 1990b; Fletcher and Rossing, 1998, pp. 354–358).

Piano action timing properties

The temporal parameters of the piano action have been described in Askenfelt (1990) and Askenfelt and Jansson (1990b, 1991, 1992a). When a key is depressed, the time from its initial position to the bottom contact ranges from 25 ms (forte, or 5 m/s final hammer velocity, FHV) to 160 ms (piano, or 1 m/s FHV; Askenfelt and Jansson, 1991, p. 2385).1 In a grand piano, the hammer impact times (when the hammer excites the strings) are shifted in comparison to key–bottom contact times. The hammer impact time is 12 ms before the key–bottom contact at a piano touch (hammer velocity 1 m/s), but 3 ms after the key–bottom contact at a forte attack (5 m/s; Askenfelt and Jansson, 1990a, p. 43). The timing properties of a grand piano action were outlined by these data, but more detailed data were not available (Askenfelt, 1999).

The timing properties of the piano action can be modified by changing the regulation of the action. Modifications, e.g., in the hammer–string distance or in the let-off distance (the distance of free flight of the hammer after the jack is released by the escapement dolly), affect the timing relation between hammer–string contact and key–bottom contact or the time interval of free flight, respectively (Askenfelt and Jansson, 1990b, p. 57). Greater hammer mass in the bass (Conklin, 1996a, p. 3287) influences the hammer–string contact durations (Askenfelt and Jansson, 1990b), but not the timing properties of the action.

Another measurement was made by Repp (1996a) on a Yamaha Disklavier on which the “prelay function” was not working.2 This gave him the opportunity to measure roughly a grand piano's timing characteristics in the middle range of the keyboard. He measured onset asynchronies at different MIDI velocities in comparison to a note with a fixed MIDI velocity. The time deviations extended over a range of about 110 ms for MIDI velocities between 30 and 100 and were fit well by a quadratic function (Repp, 1996a, p. 3920).
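Such a fit can be reproduced in a few lines. The velocity–delay pairs below are invented for illustration, chosen only to span roughly the 110 ms range quoted above; they are not Repp's data:

```python
import numpy as np

# Invented (MIDI velocity, onset delay in ms) pairs, spanning roughly the
# ~110 ms range reported for velocities between 30 and 100.
midi_velocity = np.array([30, 40, 50, 60, 70, 80, 90, 100])
delay_ms = np.array([114, 76, 50, 33, 21, 13, 7, 4])

# Least-squares quadratic fit of delay as a function of MIDI velocity
coefficients = np.polyfit(midi_velocity, delay_ms, deg=2)
quadratic = np.poly1d(coefficients)

print("fitted coefficients:", np.round(coefficients, 4))
print("predicted delay at MIDI velocity 65:", round(float(quadratic(65)), 1), "ms")
```

The same procedure applied to real measurements would yield the kind of correction map a reproducing piano needs to compensate for velocity-dependent travel times.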

The timing characteristics of electronic keyboards vary across manufacturers and are rarely well documented. Each key has a spring with two electric contacts that define the off-state and the on-state. When a key is depressed, the spring contact is moved from the off-position to the on-position (Van den Berghe et al., 1995, p. 16). The time difference between the breaking of the off-contact and the on-contact determines the MIDI velocity value; the note onset is registered near the key–bottom contact.
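A minimal sketch of this principle follows. The mapping from contact-time difference to velocity, and the timing bounds, are illustrative assumptions, since the actual curves are manufacturer-specific and undocumented:

```python
import math

def midi_velocity_from_contacts(dt_ms, dt_fast=1.0, dt_slow=40.0):
    """
    Map the time between a key's off-contact break and on-contact break
    (dt_ms) to a MIDI velocity in 1..127. Faster key travel (smaller dt)
    yields a higher velocity. The logarithmic shape and the dt_fast/dt_slow
    bounds are assumptions for this sketch, not a documented keyboard curve.
    """
    dt = min(max(dt_ms, dt_fast), dt_slow)
    position = math.log(dt_slow / dt) / math.log(dt_slow / dt_fast)
    return max(1, min(127, round(1 + 126 * position)))

print(midi_velocity_from_contacts(2.0))   # fast keystroke -> high velocity
print(midi_velocity_from_contacts(30.0))  # slow keystroke -> low velocity
```

Whatever the actual curve, only the ordering matters for the argument in the text: the note-on event is derived from two discrete contact times rather than from a continuous velocity measurement.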

There are several attempts to model piano actions (Gillespie, 1992; Hayashi et al., 1999), also with a view to possible applications in electronic keyboard instruments (Cadox et al., 1990; Van den Berghe et al., 1995). Van den Berghe et al. (1995) performed measurements on a grand piano key with two optical sensors for hammer and key displacement and a strain gauge for key force. Unfortunately, they reported an example of their data in only one figure. Hayashi et al. (1999) tested one piano key on a Yamaha grand piano. The key was hit with a specially developed key actuator able to produce different acceleration patterns. The displacement of the hammer was measured with a laser displacement gauge. They developed a simple model and tested it in two touch conditions (with constant key velocity and with constant key acceleration). Their model predicted the measured data accurately for both conditions.

1 Askenfelt and Jansson (1990b) used a Hamburg Steinway & Sons grand piano, model B (7 ft, 211 cm) for their measurements.

2 The “prelay function” compensates for the different travel times of the action at different hammer velocities. In order to prevent timing distortions in reproduction, the MIDI input is delayed by 500 ms. The solenoids (the linear motors moving the keys) are then activated earlier for softer notes than for louder notes, according to a pre-programmed function.

Different types of touch

There has been an ongoing discussion in the literature as to whether it is only the final velocity of the hammer hitting the strings that influences the tone of the piano (pedalling aside), or whether there is an influence of touch, as pianists frequently claim. In other words: is it possible to produce two isolated piano tones, without using the pedal, with identical final hammer velocities but with perceptually different sounds?

The first scientific approach to this question was by Otto Ortmann from the Peabody Conservatory of Music (Ortmann, 1925).3 He approached the “mystery of touch and tone” at the piano through physical investigation. With a piece of smoked glass mounted to the side of a piano key and a tuning fork, he was able to record and to study key depression under different stroke conditions. He investigated various kinds of keystrokes (percussive versus non-percussive, different muscular tensions, and positions of the finger). He found different acceleration patterns for non-percussive touch (the finger rests on the surface of the key before pressing it) and percussive touch (an already moving finger strikes the key). The latter starts with a sudden jerk; thereafter the key velocity decreases for a moment and increases again. During this period, the finger slightly rebounds from the key (or vice versa), then re-engages the key and “follows it up” (p. 23). On the other hand, the non-percussive touch caused the key to accelerate gradually. He found that these different types of touch provide fundamentally different kinds of key control. The percussive touch required precise control of the very first impact, whereas with non-percussive touch the key depression needed to be controlled up to the very end. “This means that the psychological factors involved in percussive and non-percussive touches are different” (p. 23). “In non-percussive touches key resistance is a sensation, in percussive touches it is essentially an image” (p. 23, footnote 1). His conclusion was that different ways of touching the keys produced different intensities of tone, but when the intensity was the same, the quality of the tone must also be the same. “The quality of a sound on the piano depends upon its intensity, any one degree of intensity produces but one quality, and no two degrees of intensity can produce exactly the same quality” (p. 171).

The discussion continued in the 1930s with studies that examined the sound of the piano and identified the hammer velocity as the most important factor (Hart et al., 1934; Seashore, 1937; White, 1930). This technical view does not reduce the conceptual variety of the pianists' opportunities to freely and artistically control, shape, and alter their performances.

3 The discussion, however, was certainly not new at that time; see, e.g., Bryan (1913a) and the lively discussion following this contribution (Wheatley, 1913; Heaviside, 1913; Allen, 1913; Morton, 1913; Bryan, 1913b; Pickering, 1913a; Bryan, 1913c; Pickering, 1913b; Bryan, 1913d).

“It is our opinion that the reduction of the process of controlling the tone from the piano to the process of controlling hammer-velocity does not in any way detract from the beauty of the art, since it shows, among other things, what extreme delicacy of control is called for, and, in turn, to what an extent a great artist is able to bring his command over his mental and physical processes to bear upon the task of obtaining almost infinitesimal varieties of manipulation of a key-board, no one of the 88 members of which can travel through a greater distance than 3/8 inch.” (White, 1930, pp. 364–365)

The other side argued that different types of noise emerge with varying touch (Baron and Hollo, 1935; Cochran, 1931). Baron and Hollo (1935) distinguished between finger noise (Fingergeräusch) when the finger touches the key (which is absent when the finger velocity is zero on touching the key—in our terminology, legato touch), keybed noise (Bodengeräusch) when the key hits the keybed, and upper noises (Obere Geräusche) when the key is released again (e.g., the damper hitting the strings). As another source of noise they mentioned the pianist's foot hitting the stage floor in order to emphasise a fortissimo passage. In a later study, Baron (1958) advocated a broader concept of tone quality, including all kinds of noise (finger–key, action, and hammer–string interaction), which he argued should be included in the tonal characterisation of different instruments (Baron, 1958).

More recent studies investigated the different kinds of noise that emerge when the key is struck in different ways (Askenfelt, 1994; Koornhof and van der Walt, 1994; Podlesak and Lee, 1988). The hammer impact noise (string precursor) arrives at the bridge immediately after hammer–string contact (Askenfelt, 1994) and characterises the attack thump of the piano sound, without which it would not be recognised as such (Chaigne and Askenfelt, 1994a,b). This noise is independent of touch type. The hammer impact noises of the grand piano do not radiate equally strongly in all directions (Bork et al., 1995). As three-dimensional measurements with a two-meter Bösendorfer grand piano revealed, higher noise levels were found horizontally towards the pianist and in the opposite direction, to the left (viewed from the sitting pianist), and vertically towards the ceiling (see also Meyer, 1965, 1978, 1999).

Before the string precursor, another noise component can occur: the touch precursor, present only when the key is hit from a certain distance above (staccato touch; Askenfelt, 1994). It preceded the actual tone by 20 to 30 ms and was much weaker than the string precursor (Askenfelt, 1994). Similar results were reported by Koornhof and van der Walt (1994). The authors called the noise prior to the sounding tone early noise or acceleration noise, corresponding in time to finger–key contact. They performed an informal listening test with four participants. The two types of touch (staccato with the early noise, and legato) could easily be identified by the listeners, but no longer with the early noise removed. No further systematic results were reported (Koornhof and van der Walt, 1994).

The different kinds of touch also produced different finger–key touch forces (Askenfelt and Jansson, 1992b, p. 345). A mezzo forte attack played staccato typically showed 15 N; very loud staccato attacks showed peaks up to 50 N (fortissimo), and very soft touches went as low as 8 N (piano). Playing with legato touch, finger–key forces of about one third of those of staccato attacks were found, usually having a peak when the key touched the keybed. At a pianissimo tone, the force hardly exceeded 0.5 N.

Although measurement tools have improved since the first systematic investigations in the 1920s, no more conclusive results could be obtained as to whether the touch-variant noise components (especially finger–key noise) can be aurally perceived by listeners not involved in tone production.4 It is assumed that the hapto-sensorial feedback to the player influences his/her aural perception of the tone (Askenfelt et al., 1998). The pianist's perception of the tone starts with finger–key contact, while the listener's (aural) perception starts with the excitation of the strings (assuming that other, e.g., visual, cues are avoided). This concurs with Ortmann (1925), who considered the psychological processes involved in the two types of touch to be essentially different (see above).

2.2.2 Method

The present study aimed to collect a large amount of measurement data from different pianos, different types of touch, and different keys, in order to determine benchmark functions for performance research. The measurement setup with accelerometers was the same as used by Askenfelt and Jansson (1991), but the data processing procedure was automated with custom computer software in order to obtain a large and reliable data set. Each of the measured keys was equipped with two accelerometers monitoring key and hammer velocity. Additionally, a microphone recorded the sound of the piano tone. With this setup, various temporal properties (travel time, key–bottom time, time of free flight) and acoustic properties (peak sound level, rise time) were determined and discussed.

Material

Three grand pianos by different manufacturers were measured in this study.

1. Steinway grand piano, model C, 225 cm, situated at the Department of Speech, Music, and Hearing at the Royal Institute of Technology (KTH/TMH) in Stockholm, Sweden. Serial number: 516000, built in Hamburg, Germany, approximately 1992 (this particular grand piano was already used in Askenfelt and Jansson, 1992a).

4 The hammer–string impact noise is part of the piano tone and is certainly heard; however, this noise component cannot be varied independently of hammer velocity.


2. Yamaha Disklavier grand piano DC2IIXG, 173 cm, situated at the Department of Psychology at the University of Uppsala, Sweden. Serial number: 5516392, built in Japan, approximately 1999 (the Mark II series was issued in 1997 by Yamaha; personal communication with Yamaha Germany, Rellingen).

3. Bösendorfer computer-controlled grand piano SE290, 290 cm, situated at the Bösendorfer company in Vienna; internal number: 290–3, built in Vienna, Austria, 2000. The Stahnke Electronics (SE) system dates back to 1983 (for more information on its development, see Roads, 1986; Moog and Rhea, 1990), but this particular grand piano was built in 2000. The same system used to be installed in an older grand piano (internal number 19–8974, built in 1986, used, e.g., in Chapter 3), but was transferred into a newer one for reasons of instrumental quality.

Immediately before the experiments, the instruments were tuned, and the piano action and—in the case of the computer-controlled pianos—the reproduction unit were serviced. At the Disklavier, this procedure was done by a specially trained Yamaha piano technician; at the Bösendorfer company, the company's SE technician took care of this work. The Steinway grand has been regularly maintained by a piano technician of the Swedish National Radio.

Equipment

The tested keys were equipped with two accelerometers: one mounted on the key5

and one on the bottom side of the hammer shank.6 The accelerometer setting (seeFigure 2.1) was the same as used in Askenfelt and Jansson (1991). Each of theaccelerometers was connected with an amplifier7 with a hardware integrator inside.Thus, their output was velocity in terms of voltage change. A sound-level meter (OnoSokki LA–210) placed next to the strings of that particular key (approximately 10-cm distance) picked up the sound. The velocities of the key and the hammer as wellas the sound were recorded on a multi-channel digital audio tape (DAT) recorder(TEAC RD–200 PCM data recorder) with a sampling rate of 10 kHz and a wordlength of 16 bit. The DAT recordings were transferred onto computer hard disk intomulti-channel WAV files (with a sampling frequency of 16 kHz).8 Further evaluationof the recorded data was done in Matlab programming environment with routinesdeveloped by the author for this purpose.

5Bruel & Kjær accelerometer type 4393. Mass without cable: 2.4 g; serial number 1190913.

6ENDEVCO accelerometer model 22 PICOMIN. Mass without cable: 0.14 g; serial number 20845.

7Bruel & Kjær charge amplifier type 2635.

8Using an analogue connection from the TEAC recorder to a multi-channel sound card (producer: Blue Waves, formerly Loughborough Sound Images; model PC/C32 using its four-channel A/D module) on a PC running the Windows 2000 operating system.

16 Chapter 2. Dynamics and the Grand Piano

Figure 2.1: A Bosendorfer grand piano action with the SE sensors sketched. Additionally, the placement of the two accelerometers (on the key and on the hammer shank) is shown. (Figure generated with computer software by the author. Piano action drawing by Bosendorfer, with permission from the company.)

Calibration

The recordings were preceded by calibration tests to establish the measured units. The accelerometer amplifiers output AC voltages corresponding to particular measured units (in our case, meters per second) depending on their setting, e.g., 1 V per m/s for the key accelerometer. To calibrate the connection between the TEAC DAT recorder and computer hard disk, different voltages (between −2 and +2 V DC) were recorded onto the TEAC recorder and measured in parallel with a volt meter. The recorded DC voltages were transferred to computer hard disk as described above, and these values were compared with those measured by the volt meter. They correlated highly (R2 = 0.9998), with a scale factor slightly above 2. Before each recording session, the microphone was calibrated with a 1-kHz test tone produced by a sound-level calibrator,9 in order to obtain dB values relative to the hearing threshold.
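The calibration regression described above can be sketched as follows. This is an illustrative Python reconstruction (the original evaluation used Matlab routines by the author); the voltage readings below are invented for the example, not the original measurements.

```python
# Sketch of the DC calibration: sample values recovered from disk are
# regressed against the parallel voltmeter readings to obtain the scale
# factor of the TEAC-to-disk transfer chain.

def linear_fit(x, y):
    """Ordinary least-squares fit y = k*x + d; returns (k, d, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    k = sxy / sxx
    d = my - k * mx
    ss_res = sum((yi - (k * xi + d)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot if ss_tot else 1.0
    return k, d, r2

# hypothetical voltmeter readings (V) and the values recovered from disk
voltmeter = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
recorded = [-0.99, -0.50, -0.248, 0.001, 0.251, 0.498, 1.0]

k, d, r2 = linear_fit(recorded, voltmeter)
print(f"calibration factor: {k:.3f}, R^2 = {r2:.4f}")
```

With the invented values, the recovered factor lies slightly above 2, matching the relation reported for the actual transfer chain.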

Procedure

Five keys distributed over the whole range of the keyboard were tested: C1 (MIDI note number 24), G2 (43), C4 (60), C5 (72), and G6 (91).10 The author and his colleague (RB) served as pianists to perform the recorded test tones. Each key was struck at as many different dynamic levels (hammer velocities) as possible, with two different kinds of touch: once with the finger resting on the surface of the key (legato touch), and once hitting the key from above, so that the finger already had a certain speed at impact (staccato touch).

Parallel to the accelerometer setting, the grand pianos recorded these test tones

9Bruel & Kjær sound-level calibrator type 4230; test tone: 94 dB at 1 kHz.

10Only three keys were tested on the Steinway piano (C1, C5, G6).

2.2. The piano action as the performer’s interface 17

with their internal devices onto computer hard disk (Bosendorfer) or floppy disk (Disklavier). For each of the five keys, both players played between 30 and 110 individual tones in each type of touch, so that a sufficient amount of data was recorded. Immediately after each recording of a particular key, the recorded file was reproduced by the grand piano, and the accelerometer data were recorded again onto the multi-channel DAT recorder. The recordings took place in May 2001 (Steinway, Stockholm), June 2001 (Yamaha, Uppsala), and January 2002 (Bosendorfer, Vienna). For the Steinway, 608 individual attacks were recorded, for the Yamaha Disklavier 932, and for the Bosendorfer 1023.

Data analysis

In order to analyse the three-channel data files, discrete measurement values had to be extracted from them. Several instants in time were defined as listed below and determined automatically with the help of Matlab scripts prepared for this purpose by the author.

1. The hammer–string contact was defined as the moment of maximum deceleration (minimum acceleration) of the hammer shank (hammer accelerometer), which corresponded well to the physical onset of the sound, and conceptually with the 'note on' command in the MIDI file. In mathematical terms, the hammer–string contact was the minimum of the first derivative of the measured hammer velocity.11

2. The finger–key contact was defined as the moment when the key started to move. It was obtained by a simple threshold procedure applied to the key velocity track. In mathematical terms, it was the moment when the (slightly smoothed) key acceleration exceeded a certain threshold, which varied relative to the maximum hammer velocity. Finding the correct finger–key point was not difficult for staccato tones (they typically showed a very abrupt initial acceleration). However, automatically determining the correct moment for soft legato tones was sometimes more difficult and needed manual adaptation of the threshold. When the automatic procedure failed, it failed by several tens of milliseconds, an error easy to discover in explorative data plots.

3. The key–bottom contact was the instant when the downward travel of the key was stopped by the keybed. This point was defined as the maximum deceleration of the key (MDK). In some keystrokes, the MDK was not the actual keybed contact, but a rebound of the key after the first key–bottom contact. For this reason, the time window for searching the MDK was restricted to 7 ms before and 50 ms after hammer–string contact. The time window

11This measurement was also used to find the individual attacks in a recorded file: all accelerations below a certain value were taken as onsets. The very rare silent attacks were not captured by this procedure, nor were some very soft attacks.


was iteratively modified depending on the maximum hammer velocity until the correct instant was found. The indicator MDK was especially clear and unambiguous when the key was depressed in a range of medium intensity (see Figures 2.2 and 2.3).

4. The maximum hammer velocity (in meters per second) was the maximum value in the hammer velocity track before hammer–string contact.

5. An intensity value was derived by taking the maximum energy (RMS) of the audio signal immediately after hammer–string contact, using an RMS window of 10 milliseconds.
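The extraction of these instants can be sketched compactly. The following is an illustrative Python reconstruction (the original routines were written in Matlab by the author); the signal values, threshold, and sampling rate are invented for the example, and the key-side measures work analogously on the key tracks.

```python
# Sketch of measurements 1-4 defined above, applied to synthetic traces.
FS = 10_000  # samples per second, as in the DAT recordings


def diff(x):
    """First difference, used as a discrete derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def hammer_string_contact(hammer_vel):
    """Index of maximum hammer deceleration (minimum of the derivative)."""
    acc = diff(hammer_vel)
    return min(range(len(acc)), key=acc.__getitem__)


def finger_key_contact(key_acc, threshold):
    """First sample where (smoothed) key acceleration exceeds the threshold."""
    for i, a in enumerate(key_acc):
        if a > threshold:
            return i
    return None


def key_bottom_contact(key_vel, hs, fs=FS):
    """Maximum key deceleration (MDK) within -7..+50 ms around hammer-string."""
    acc = diff(key_vel)
    lo = max(0, hs - int(0.007 * fs))
    hi = min(len(acc), hs + int(0.050 * fs))
    return min(range(lo, hi), key=acc.__getitem__)


def max_hammer_velocity(hammer_vel, hs):
    """Maximum hammer velocity before hammer-string contact."""
    return max(hammer_vel[:hs + 1])


# synthetic hammer velocity: rises, peaks, then an abrupt stop at the strings
hammer_vel = [0.0, 0.2, 0.6, 1.2, 1.9, 2.4, 2.6, 2.5, 0.3, -0.4, -0.1]
hs = hammer_string_contact(hammer_vel)
print(hs, max_hammer_velocity(hammer_vel, hs))
```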

To inspect the recorded key and hammer velocity tracks and the sound signal, an interactive tool was created that displays one keystroke at a time in three panels, one above the other. The user could step to the next or previous keystroke, zoom in and out, and change the display from velocity to acceleration or displacement. Screen shots of this tool are shown below (see Figure 2.2 and Figure 2.3). The data were checked for errors with the help of this tool.

2.2.3 Results and discussion

Influence of touch

To illustrate the difference between the two types of touch recorded (legato and staccato), one example of each is shown in Figure 2.2 and Figure 2.3. These two examples have similar maximum hammer velocities. The left-hand panels show velocity, those on the right acceleration. Lines indicate finger–key ("fk," blue dashed line), hammer–string ("hs," red solid line), and key–bottom contact times ("kb," green dash-dotted line).

In the legato attack (Figure 2.2, with the finger resting on the key surface before depressing it), the key accelerated smoothly and almost constantly (about 8 ms before hammer–string impact there was a brief interruption in the movement, which could be due to the escapement of the jack).

The staccato attack (Figure 2.3) showed a sudden acceleration at the beginning, whereas the hammer started to move upwards with a certain time delay: the parts of the piano action were compressed by the strong initial impact, and only after the hammer's inertia was overcome did it move up towards the strings. After this initial impulse, the key almost stopped moving. Shortly before hammer–string impact, it accelerated again, but did not reach its original speed. The acceleration of the key showed two negative peaks, the second of which indicated the moment of key–bottom contact. In some very strong attacks, the first negative peak (maximum deceleration) could surpass the second. For this reason, the key–bottom finding procedure had to be restricted to a certain time window around hammer–string contact (see Section 2.2.2). Independently of the type of touch, the hammer–string contact always corresponded to the minimum of the hammer acceleration (middle right panel).

Figure 2.2: A legato attack played at middle C (C4, 60) on the Yamaha grand piano. Key velocity (upper left panel), key acceleration (upper right panel), hammer velocity (middle left), hammer acceleration (middle right), and the sound signal are displayed. The dashed lines (blue) indicate finger–key contact ("fk"), the solid lines (red) hammer–string contact ("hs"), and the dotted lines (green) represent key–bottom contact ("kb"). (Panel annotations: hs−fk: 45.9 ms; kb−hs: −1.9 ms; maxHv: 2.654 m/s; SPL: 98.33 dB.)

Figure 2.3: A staccato attack played at middle C (C4, 60) on the Yamaha grand piano (annotations as in Figure 2.2). (Panel annotations: hs−fk: 28.3 ms; kb−hs: 1.5 ms; maxHv: 2.552 m/s; SPL: 97.41 dB.)

Figure 2.4: Travel times (from finger–key to hammer–string contact) against maximum hammer velocity for the three grand pianos (three panels), different types of touch (legato: "lg," staccato: "st," and reproduction by the piano: "rp"), and different keys (from C1 to G6, see legend of upper panel; only C1, C5, and G6 were measured on the Steinway). In the middle panel, travel time data is plotted as reported by Hayashi et al. (1999, p. 3543), for constant key speed (solid line with dots) and for constant key acceleration (solid line). In the bottom panel, the solid line depicts the timing correction curve (TCC) of the older Bosendorfer grand piano (t = 89.16 h^−0.570, used in Goebl, 2001), the dash-dotted line that of the newer grand piano (t = 84.27 h^−0.562).

The key reached the keybed shortly before the hammer–string contact (2 ms) with the legato touch, but 1.5 ms after the hammer–string contact with the staccato touch. The whole attack process (from finger–key to hammer–string contact) took 46 ms with the legato touch, but only 28 ms with the staccato touch, although similar hammer velocities were produced.

The two attacks displayed in Figures 2.2 and 2.3 sounded indistinguishable to the author (listening informally to the material); their difference in hammer velocity was evidently negligibly small. In some staccato attacks played by one of the two pianists, a clear touch noise of the finger nail hitting the key surface was perceivable in the samples. This noise was absent in the legato keystrokes of that pianist; in these tones, the difference between legato and staccato touch was evident. We have to bear in mind here that the microphone was very close to the strings, a position in which an audience would never sit in a concert.12 An example of such a staccato tone with nail noise is displayed in Figure 2.19 (p. 46). In the sound signal, the first noisy activity starts shortly after the finger touched the key. It is interesting that the touch noise was so clearly audible in some samples. Was it transmitted through the piano construction to the microphone or simply through the air? Systematic listening tests have yet to be performed to discuss the perception of the present samples more conclusively; this remains a topic for future investigation with this material.

Timing properties

The different types of touch result in different acceleration patterns, as illustrated above. Hence, the timing properties of the piano action change with the type of touch. In this section, we discuss some typical measures: travel time, key–bottom time relative to hammer–string impact, hammer–string contact duration, and the time of free flight of the hammer.

Travel time  The time interval between finger–key contact and hammer–string impact is defined here as the travel time.13 The travel times of all recorded tones are plotted in Figure 2.4 against maximum hammer velocity, separately for the different types of touch (indicated by colour), the different keys (denoted by symbol), and the three grand pianos (different panels). The present data were generally congruent with the findings of Askenfelt and Jansson (1991).

12However, in some professional recordings the microphones are placed very close to the piano, so that such finger–key noises become clearly perceivable.

13This terminology might be misleading, because "time" refers to a point in time, whereas here a duration is meant. Terms like "travel time" and "time of free flight" were formed in analogy to the term "rise time," which is commonly used in the acoustic literature (see, e.g., Truax, 1978).


Table 2.1: Power curves of the form t = a · h^b fitted to the travel time data, separately for the types of touch (legato, staccato, reproduction) and the pianos. t stands for travel time and h for maximum hammer velocity (see Figure 2.4).

              legato                   staccato                 repro
Steinway      t = 98.57 h^−0.7147      t = 65.19 h^−0.7268
              R2 = 0.983               R2 = 0.959
Yamaha        t = 89.41 h^−0.5959      t = 57.43 h^−0.7748      t = 63.38 h^−0.7228
              R2 = 0.965               R2 = 0.969               R2 = 0.990
Bosendorfer   t = 89.96 h^−0.5595      t = 58.39 h^−0.7377      t = 60.90 h^−0.7731
              R2 = 0.939               R2 = 0.968               R2 = 0.992

Some basic observations can be drawn from this figure. The two pianists were able to produce much higher hammer velocities on all three pianos with a staccato attack (beyond 7 m/s), whereas with a legato attack the maximum hammer velocities hardly exceeded 4 m/s. There was a small trend towards higher hammer velocities at higher pitches (due to the smaller hammer mass, see Conklin, 1996a). The highest velocities were obtained at G6 on the Yamaha and the Steinway, but at middle C on the Bosendorfer. The lowest investigated key (C1) showed slightly lower maximal hammer velocities than the fastest attacks (loudest attacks on the Steinway: C1: 6 m/s versus G6: 6.6 m/s; on the Yamaha: C1: 5.6 m/s versus G6: 6.8 m/s; on the Bosendorfer: C1: 5.3 m/s versus G6: 5.8 m/s and C4: 6.7 m/s). Since the keys were played by human performers, this variability between keys could be due to the human factor.

The travel times ranged from 20 ms to around 200 ms (up to 230 ms on the Steinway) and showed clearly different patterns for the two types of touch. The travel time curves were independent of pitch, although lower keys have much greater hammer mass than those in the high register (Conklin, 1996a).

The data plotted in Figure 2.4 were approximated by power curves of the form t = a · h^b, separately for the type of touch ("lg," "st," and "rp") and the three pianos. The resulting fits are listed in Table 2.1. From these numbers, we learn that the travel time curves of the reproducing systems ("rp") resembled the staccato curves more than the legato curves. A staccato touch needed less time to transport the hammer to the strings than a legato touch, which accelerated the key (and thus the hammer) smoothly. The travel times were more spread out when the tones were played legato, indicating a more flexible control of touch in this way of actuating the keys (also reflected in the lower R2 values in Table 2.1). On the Steinway, the staccato data showed higher variability, almost as high as the legato data.
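A power-curve fit of this form reduces to linear regression in log-log space. The following Python sketch illustrates the idea (the original fits were computed in Matlab); the data points are invented, sampled exactly from the reported Yamaha legato curve, so the fit recovers its coefficients.

```python
import math


def fit_power(h, t):
    """Fit t = a * h**b by least squares on log t = log a + b * log h."""
    lx = [math.log(v) for v in h]
    ly = [math.log(v) for v in t]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - b * mx)
    return a, b


# invented hammer velocities (m/s) with travel times (ms) taken exactly
# from the reported Yamaha legato curve t = 89.41 * h**-0.5959
h = [0.25, 0.5, 1.0, 2.0, 4.0]
t = [89.41 * v ** -0.5959 for v in h]

a, b = fit_power(h, t)
print(f"t = {a:.2f} * h^{b:.4f}")
```

For real, noisy measurements the same regression yields the R2 values listed in Table 2.1 alongside the coefficients.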

The Bosendorfer reproducing system (see Chapter 2.3) uses a timing correction similar to the Yamaha Disklavier's "prelay function" (cf. Repp, 1996a) to correct for the different travel times of tones of different intensity. For the tones to sound at the required instants in time, the system has to instruct its solenoids, which push the key upwards at its back end, to start acting earlier for a softer than for a louder tone. For this purpose, the SE system recalculates the timing characteristics for each key individually by running a calibration program on demand. Among other parameters, the calibration function records the time interval between the key sensor response (2–3 mm below the key's resting position) and the hammer–string contact (as measured by one of the two trip points at the hammer sensor; for detailed functionality see Chapter 2.3) for seven final hammer velocities (0.32, 0.50, 0.80, 1.28, 2.00, 3.20, and 5.12 m/s) and all 97 keys (the Bosendorfer Imperial 290-cm grand piano has nine additional keys in the bass). This data matrix is stored in internal system memory (EEPROM X2816AP). The content of this hardware chip of the SE system in Vienna was transferred into a file twice: once for the older piano (19–8974, measured in 1999) and once for the newer piano (290–3, measured in 2002). The calibration matrices (timing correction matrices, TCM) of the older Bosendorfer (used in Goebl, 2001, cf. Chapter 3) and of the newer grand piano (used in the present study) are plotted in Figure 2.5 and Figure 2.6, respectively.

Figure 2.5: The timing correction matrix (TCM) for the SE built into the older Bosendorfer grand piano (19–8974) as measured in 1999. Each of the seven lines represents measurements for a particular final hammer velocity (as labelled on the right-hand side).

The matrices contained irregularities from both the piano action and the electronic playback system. Since the playback system was identical in the two measurements and only the piano changed, we can assume that differences between the two matrices were due to the different piano actions (the newer grand piano also possesses a slightly re-designed action; personal communication with Bosendorfer).

What can be seen from these data is that travel time does not depend on hammer mass, which becomes much larger in the bass. In the TCM of the newer grand piano,


Figure 2.6: TCM for the same SE system (as in the previous figure) built into a newer Bosendorfer grand piano (290–3, measured in 2002). The bass register with the wrapped strings crossing the middle-register strings ranges from C0 (12) to C#2 (37).

the transition from the bass register (with the wrapped strings crossing the middle strings) to the lower middle register can be seen. The strings in the bass register are positioned some centimeters higher than the other strings, so that these keys need to be regulated slightly differently from the rest. Nevertheless, this register change was not obvious in the TCM of the older piano.

These data represent measurements originally collected not for scientific purposes but to internally calibrate a reproducing system. Apart from providing prototypical travel time data (used in Goebl, 2000), the developer of the system (W. Stahnke) did not offer any more specific information on the calibration data. It must be assumed that it also reflects properties of the electronic equipment, or even that it cannot be interpreted at all. Due to this interpretational uncertainty, only the data averaged over the 97 keys was taken into consideration.

The power curves fitted to the averaged (seven) data points of the two TCMs are called 'timing correction curves' (TCC) in Goebl (2001). They are plotted onto the Bosendorfer data in Figure 2.4. It is evident that both curves were very similar to each other and to the curve obtained from the legato touch. This was somewhat surprising: the Bosendorfer SE typically generates a staccato-like travel time curve at reproduction, yet its measured timing correction curve resembled the legato pattern more than the staccato pattern. Nevertheless, the functions used in earlier work (Goebl, 2000, 2001) were replicated with the present measurement


Figure 2.7: Key–bottom contact times relative to the moment of hammer–string contact, separately for the three grand pianos (different panels), five keys (different markers), and types of touch (colour). Negative values indicate key–bottom contacts before, positive values contacts after hammer–string.

setup. The impact of different power curve approximations on the interpretation of the results found by Goebl (2000, 2001) is studied and discussed in detail in Section 3.6, p. 74.

The travel time function of the Yamaha Disklavier was also compared to data reported in the literature. In the middle panel of Figure 2.4, travel time data is plotted as printed in Figure 26 of Hayashi et al. (1999).14 The author transferred the graph of this figure into discrete data with the help of a ruler in order to compare their findings with the present data. The graph with keystrokes at constant speed (solid line with dots in the middle panel of Figure 2.4) was similar to the staccato data; the graph with keystrokes at constant acceleration more closely resembled the legato type of touch in the Disklavier's data.

Key–bottom contact relative to hammer–string contact  Figure 2.7 displays the key–bottom contact times relative to hammer–string contact (kbrel = kb − hs). Negative values indicate key–bottom contacts before hammer–string contact, positive values key–bottom contacts after the hammer hits the strings (see the overview display in Figure 2.9a, p. 29). The keybed was reached by the key up to 35 ms after hammer–string contact in very soft tones (up to 39 ms on the Bosendorfer) and as early as 4 ms before it in very strong keystrokes. This finding coincides with Askenfelt and Jansson (1990a,b, see Section 2.2.1), but since much softer tones were measured in the present study (hammer velocities as low as 0.1 m/s), the key–bottom times extended further after hammer–string contact.

However, the different types of touch behaved quite differently. Keystrokes produced in a legato manner tended to reach the keybed earlier than keystrokes hit in a staccato manner. This was especially evident for the Bosendorfer and the Yamaha, but not for the Steinway. Askenfelt and Jansson (1992b, p. 345) stated that the interval between key–bottom and hammer–string contact varies only marginally between legato and staccato touch. This statement evidently refers to one of their earlier studies (Askenfelt and Jansson, 1990b), in which the investigated grand piano was also a Steinway.15

14Hayashi et al. (1999) used "the 11th key" of a Yamaha grand piano, model C7.

Table 2.2: Power functions of the form kb = a · h^b + c fitted to the data from Figure 2.7, separately for the types of touch and the different pianos. (kb is the key–bottom time; h the maximum hammer velocity.)

              legato                          staccato                        repro
Steinway      kb = 19.09 h^−0.3936 − 12.3     kb = 59.57 h^−0.1131 − 51.19
              R2 = 0.794                      R2 = 0.738
Yamaha        kb = 14.63 h^−0.4158 − 11.05    kb = 10.15 h^−0.6825 − 3.743    kb = 16.2 h^−0.1639 − 12.47
              R2 = 0.933                      R2 = 0.893                      R2 = 0.836
Bosendorfer   kb = 11.59 h^−0.4497 − 9.983    kb = 13.96 h^−0.3559 − 10.15    kb = 10.09 h^−1.108 − 2.085
              R2 = 0.855                      R2 = 0.698                      R2 = 0.942

Power functions were fitted to the data depicted in Figure 2.7, separately for the types of touch and the different pianos; they are listed in Table 2.2. Since the data to be fitted also contain negative values on the y axis, power functions of the form kb = a · h^b + c were used. The data spread out more than in the travel time curves (reflected in smaller R2 values) and showed considerable differences between the types of touch, except for this Steinway, where touch did not visibly divide the data. Recall that these data apply to specific instruments and depend strongly on their regulation, so that generalisation to other instruments may be problematic.
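Because of the additive offset c, this fit cannot be done by simple log-log regression. One possible approach (a sketch, not the author's original Matlab procedure; a library routine such as SciPy's curve_fit would serve equally well) is a grid search over the exponent b with linear least squares for a and c. The data points below are invented, sampled from the reported Yamaha legato fit.

```python
def fit_offset_power(h, kb, b_grid):
    """For each candidate b, solve kb = a*x + c with x = h**b; keep the best."""
    best = None
    for b in b_grid:
        x = [v ** b for v in h]
        n = len(x)
        mx = sum(x) / n
        my = sum(kb) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, kb)) / sxx
        c = my - a * mx
        err = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, kb))
        if best is None or err < best[0]:
            best = (err, a, b, c)
    return best[1:]


# invented points on the reported Yamaha legato curve kb = 14.63*h^-0.4158 - 11.05
h = [0.2, 0.5, 1.0, 2.0, 4.0]
kb = [14.63 * v ** -0.4158 - 11.05 for v in h]

b_grid = [-0.30 - 0.002 * i for i in range(120)]  # exponents -0.300 .. -0.538
a, b, c = fit_offset_power(h, kb, b_grid)
print(f"kb = {a:.2f} * h^{b:.3f} + {c:.2f}")
```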

Askenfelt and Jansson (1990b) considered key–bottom times to be haptically felt by pianists and thus important for the vibrotactile feedback in piano playing. Temporal differences of the order of 30 ms are in principle beyond the temporal order threshold (Hirsh, 1959), but these time differences may be perceived subconsciously, perhaps as the response behaviour of a particular piano. In particular, the different key–bottom behaviour for the different kinds of touch might be judged by pianists as part of the response behaviour of the action (Askenfelt and Jansson, 1992b). In addition to their shorter travel time, staccato tones reach the keybed later relative to hammer–string contact, so that the tone appears even earlier, and thus louder and more direct, than in a legato keystroke of comparable intensity.

Time of free flight  In order to estimate the time interval between the moment the jack made contact with the escapement dolly (after which the hammer travels without any further

15Askenfelt and Jansson (1990b) used a Steinway Model B, #443001, built in Hamburg in 1975.


Figure 2.8: Time of free flight of the hammer. Time intervals between the point of maximum hammer velocity and hammer–string contact are plotted against maximum hammer velocity.

acceleration towards the strings) and the sound, the time interval between the point of maximum hammer velocity and hammer–string contact was calculated (the 'time of free flight'). These intervals are plotted in Figure 2.8 against maximum hammer velocity. Power curves were approximated for these data as well (as listed in Table 2.3). The action of this Yamaha grand piano showed two different behaviours for staccato touch at medium intensities (between around 1 and 2 m/s): the maximum hammer velocity occurred at two different instants (Figure 2.8, middle panel). This was accounted for by two separate curve fits (Table 2.3).
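The measurement itself is straightforward once the two instants are known. A minimal Python sketch (the original computation used the author's Matlab routines; the velocity trace and sampling rate below are invented):

```python
FS = 10_000  # samples per second, as in the DAT recordings


def time_of_free_flight_ms(hammer_vel, hs, fs=FS):
    """Milliseconds from the hammer-velocity maximum to hammer-string contact."""
    peak = max(range(hs + 1), key=hammer_vel.__getitem__)
    return (hs - peak) * 1000.0 / fs


# synthetic trace: the hammer accelerates, peaks 30 samples (3 ms) before
# string contact, then coasts (free flight) while decelerating slightly
vel = [0.01 * i for i in range(100)] + [0.99 - 0.001 * i for i in range(30)]
hs = len(vel) - 1  # hammer-string contact at the end of the trace

tff = time_of_free_flight_ms(vel, hs)
print(f"time of free flight: {tff} ms")
```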

With escapement, the pianist loses control over the tone. The point of maximum hammer velocity coincided well with the escapement point for medium and hard keystrokes, but was sometimes considerably earlier for soft keystrokes. On this Steinway, the time of free flight was almost zero (below two milliseconds) beyond hammer velocities of 2 m/s; on this Bosendorfer, the free flight times went below 2 ms at around 2.5 m/s. The same was true for the legato tones and part of the staccato tones on the Disklavier. However, the other cluster of staccato tones showed comparatively larger times of free flight up to velocities of about 4 m/s (see Figure 2.8). This early moment of maximum hammer velocity might be ascribed to the hammer lifting off from the jack, because of the strong initial force of a staccato attack, even before the moment of escapement.

There was a tendency on all three measured pianos for staccato tones to have longer free flight times (that is, earlier instants of maximum hammer velocity) than legato tones. The differences are of the order of some milliseconds (e.g., at 0.5 m/s they were 11, 17, and 15 ms for this Steinway, the Yamaha, and the Bosendorfer, respectively, according to the curve fits in Table 2.3). Moreover, the legato free flight times did not exceed 30 ms on this Steinway, barely did so on this Bosendorfer, but did so considerably (around 25 data points) on this Yamaha, whereas the staccato data ranged up to 80 ms and more


Table 2.3: Power functions of the form f = a · h^b fitted to the data from Figure 2.8, separately for the types of touch and the different pianos. (f is the time of free flight; h the maximum hammer velocity.)

              legato                  staccato                       repro
Steinway      f = 3.73 h^−1.6850      f = 6.76 h^−1.7710
              R2 = 0.9763             R2 = 0.7139
Yamaha        f = 3.72 h^−1.7850      f_earlier = 18.08 h^−1.2880    f = 4.04 h^−1.1410
                                      f_later = 6.71 h^−2.1490
              R2 = 0.9331             R2 = 0.9664 / 0.9514           R2 = 0.9623
Bosendorfer   f = 5.04 h^−1.1870      f = 7.64 h^−1.83               f = 17.6 h^−1.866
              R2 = 0.983              R2 = 0.9864                    R2 = 0.9895

in all three pianos.

These findings have interesting implications for piano playing. The longer times of free flight with staccato touch suggest that legato touch allows closer control of the attack than a sudden keystroke from above, because the pianist remains connected to the hammer longer and thus controls its acceleration longer. Moreover, with staccato touch the pianist might lose contact with the hammer, through the hammer lifting off from the jack, even before the jack is released at the escapement regulator. However, we have to bear in mind that the instant of maximum hammer velocity can be quite different from the moment of escapement for very soft tones; in other words, a pianist can also willingly decelerate the key until escapement, so that an early instant of maximum hammer velocity might also be due to a hesitating keystroke. Further evaluation of the data (e.g., determining the moment of escapement) would clarify these questions; this remains for future investigation.

Moreover, the earlier the hammer reaches its maximum velocity, the more energy it loses on its travel towards the strings, and the larger the difference between the maximum velocity and the velocity at which the hammer hits the strings. Playing from the key surface is therefore also a more economical way of playing the piano. The very early hammer velocity maxima of the Yamaha's staccato tones16 reflect particularly uneconomical and uncontrolled ways of attack. This conclusion coincides with suggestions from the piano teaching literature (e.g., Gat, 1965), where legato touch is considered more economical and to produce less noise during the attack process.

Comparison among tested pianos In Figure 2.9a, all power curve approxima-tions as reported above (cf. Table 2.1, 2.2, and 2.3) are plotted in a single display,separately for the type of touch (panels) and the three tested piano actions (linestyle) against the time (in seconds) relative to the hammer–string contact. The

16We found around 20 such tones on our Bosendorfer and 3 on our Steinway.

2.2. The piano action as the performer’s interface 29


Figure 2.9: Temporal properties of the three tested grand piano actions. Power curve approximations (cf. Tables 2.1, 2.2, and 2.3) for finger–key contact time, instant of maximum hammer velocity (max. HV), and key–bottom contact time (right) for the three pianos (line style) and the two types of touch (panels), (a) relative to hammer–string contact and (b) relative to finger–key contact. In (b), the different instants in time (instant of maximum hammer velocity, hammer–string contact, key–bottom contact) become visually barely distinguishable.

30 Chapter 2. Dynamics and the Grand Piano

temporal differences between extremes in intensity were largest for the finger–key times and smallest for the key–bottom times. The differences between the curves of the pianos by different manufacturers were small compared to the differences introduced by different touch. The finger–key curve of the Steinway action was the leftmost except for loud legato tones, and its key–bottom curve was the rightmost of the three actions. Thus, the Steinway action needed more time for the attack operation than the other two pianos, except for very loud legato tones. The most striking difference between the tested piano actions was the early curve of the hammer velocity maxima on the Disklavier (see Table 2.3, p. 28), which was around 20 ms earlier than the other curves.

In Figure 2.9b, the same curves are plotted relative to finger–key contact. Although the different curves for maximum hammer velocity, hammer–string, and key–bottom contact are hard to distinguish in this display, it makes clear how close together these three points in time are in comparison to the start of the key acceleration.

These data apply only to the tested instruments; temporal behaviour changes considerably with regulation (see Dietz, 1968; Askenfelt and Jansson, 1990b), and we do not know how different the temporal properties of other instruments by these three manufacturers would be. Changes in regulation (hammer–string distance, let-off distance) resulted in changes of the key–bottom timing and of the time interval of the hammer's free flight, respectively, of up to 5 ms at a medium intensity (Askenfelt and Jansson, 1990b, pp. 56–57). The differences between the piano actions in the present data are approximately of the same order.17

It can be concluded that the temporal behaviour of the tested piano actions by different manufacturers was similar. However, no definitive conclusions can be drawn as to whether these (comparably small) differences in temporal behaviour are crucial for a pianist's estimation of a piano's quality, or whether they apply also to other instruments by these manufacturers.

Acoustic properties

Rise time The hammer–string contact was defined as the conceptual onset of a tone, which corresponds closely to the physical onset. From perceptual studies we know that the perceptual onset of a tone might be slightly later than its physical onset, depending on the rise time of the tone (Vos and Rasch, 1981a,b). In this paragraph, the rise time characteristics of the pianos were investigated with respect to pitch and intensity. For this purpose, the time interval between hammer–string contact and the maximum of the energy of the sound18 was defined as the rise time of the piano tone.

17Note that all three pianos were maintained and regulated by professional technicians before the measurements, so that all pianos were in concert condition for the tests.

18The RMS was calculated with a fixed window of 10 ms.
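The rise-time measurement just described (time from hammer–string contact to the maximum of an RMS envelope computed with a fixed 10 ms window) can be sketched as follows. The signal below is a synthetic stand-in with a fast attack and slow decay, not one of the study's recordings, and the function name is illustrative only.

```python
import numpy as np

def rise_time_ms(signal, fs, onset_sample, window_ms=10.0):
    """Time (ms) from the physical onset (hammer-string contact)
    to the maximum of an RMS envelope with a fixed window length."""
    win = int(fs * window_ms / 1000.0)
    seg = np.asarray(signal, dtype=float)[onset_sample:]
    # mean of squared samples over a sliding window (hop = 1 sample)
    mean_sq = np.convolve(seg ** 2, np.ones(win) / win, mode="valid")
    rms = np.sqrt(mean_sq)
    return 1000.0 * int(np.argmax(rms)) / fs

# Synthetic stand-in for a piano tone: sharp attack, exponential decay
fs = 44100
t = np.arange(int(0.2 * fs)) / fs
envelope = (1.0 - np.exp(-t / 0.004)) * np.exp(-t / 0.1)
tone = envelope * np.sin(2 * np.pi * 440.0 * t)
rt = rise_time_ms(tone, fs, onset_sample=0)
```

With the study's real recordings, the onset sample would be taken from the measured hammer–string contact rather than assumed to be zero.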


Figure 2.10: Rise times of piano tones (from hammer–string contact to the maximum RMS energy of the sound signal) against maximum hammer velocity, separately for three pianos, five pitches, and two types of touch.

The rise times for the three pianos are plotted in Figure 2.10. The rise times ranged between 4 and 23 ms. The data grouped according to pitch, but did not change systematically with intensity (i.e., louder tones did not develop faster). The lower tones needed up to 25 ms to reach their maximum, whereas high pitches achieved maximum energy already after 5 ms. For some soft tones, there was a tendency towards slightly shorter rise times in comparison to louder tones of the same pitch.

Since the rise times were invariant over the whole dynamic range, the perceptual onsets will not change with tone intensity either; however, they will be later the lower the pitch. For performance research this implies that the onsets measured by computer-monitored instruments (which correspond well with the physical onset of the tone) have to be delayed for lower pitches. According to the present data, these differences will be at most of the order of 10 ms. Small as they are, such differences might not be crucial for the analysing researcher, but they are essential for automatic transcription systems.
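As a sketch, such a pitch-dependent onset correction could interpolate between rise times of the order reported above (roughly 25 ms for the lowest tones down to about 5 ms for high ones). The table values below are approximate illustrative readings, not the measured data, and the function name is hypothetical.

```python
# Approximate rise times (MIDI note number, rise time in ms) of the
# order suggested by Figure 2.10; illustrative values only.
RISE_MS = [(24, 25.0), (43, 15.0), (60, 10.0), (72, 7.0), (91, 5.0)]

def perceptual_onset_ms(physical_onset_ms, midi_note):
    """Delay a measured (physical) onset by a linearly interpolated
    rise time to approximate the perceptual onset."""
    notes = [n for n, _ in RISE_MS]
    rises = [r for _, r in RISE_MS]
    if midi_note <= notes[0]:
        rise = rises[0]
    elif midi_note >= notes[-1]:
        rise = rises[-1]
    else:
        i = next(k for k in range(1, len(notes)) if midi_note <= notes[k])
        n0, n1 = notes[i - 1], notes[i]
        r0, r1 = rises[i - 1], rises[i]
        rise = r0 + (r1 - r0) * (midi_note - n0) / (n1 - n0)
    return physical_onset_ms + rise
```

An automatic transcription system could apply such a shift per note; for human analysis the correction is, as argued above, usually below relevance.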

Peak sound-pressure level The peak sound-pressure level (in dB) for all tones is plotted against the maximum hammer velocity in Figure 2.11, separately for the pianos, pitch, and type of touch. The microphone position was always very close to the strings (about 10 cm distance). Different pitches showed slightly different curves, with a tendency for the lower pitches to have lower peak sound-pressure levels. There was no effect of type of touch: the same hammer velocity resulted in equal sound level independently of the type of touch. Only for the very soft tones on the Bösendorfer did the same maximum hammer velocity result in different sound levels according to the type of touch. In these cases, the maximum hammer velocity was considerably higher than the speed at which the hammer touched the strings for staccato tones, but not for legato tones.


Figure 2.11: Peak sound-pressure level (dB) against maximum hammer velocity (m/s) fordifferent pianos, pitch, and type of touch.

2.2.4 General discussion

This study provided benchmark data on the temporal properties of three different grand pianos under two touch conditions (legato and staccato). Prototypical functions were obtained for travel times, key–bottom times, and the instants of maximum hammer velocity by fitting power curves to the measured data. The temporal properties varied considerably between types of touch, only marginally between pianos, and not at all between the different tested keys. The latter was not surprising, since piano technicians generally aim to adjust a grand piano action so that all keys show similar and consistent behaviour over the whole range of the keyboard. The tested pianos were all maintained and tuned by highly skilled technicians before the experiments.
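The power-curve fitting mentioned above can be sketched in a few lines: a model of the form t = a·v^b becomes a straight line in log-log space and can be fitted by ordinary least squares. The data points below are illustrative values of roughly the reported order of magnitude (soft tones taking far longer than loud ones), not the measured data.

```python
import numpy as np

def fit_power_curve(velocity, travel_ms):
    """Least-squares fit of travel_ms = a * velocity**b in log-log space."""
    b, log_a = np.polyfit(np.log(velocity), np.log(travel_ms), 1)
    return float(np.exp(log_a)), float(b)

# Illustrative points: max. hammer velocity (m/s) vs. travel time (ms)
v = np.array([0.5, 1.0, 2.0, 4.0])
tt = np.array([160.0, 95.0, 56.0, 33.0])
a, b = fit_power_curve(v, tt)
predicted_ms = a * 1.0 ** b   # predicted travel time at 1 m/s
```

The negative exponent b expresses that travel times shrink steeply as hammer velocity grows, which is the qualitative shape of the curves in Figure 2.9.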

Different ways of actuating the keys produced different ranges of hammer velocity. Very soft tones could only be achieved with a legato touch, and extremely loud attacks only with staccato touch. Playing from the keys (legato) did not allow hammer velocities beyond 4–5 m/s, so for some very loud intensities hitting the keys from above was the only possible means. Better tone control was achieved through legato touch, because the time of free flight was shorter than in a staccato keystroke. Additionally, depressing a key in a legato manner caused less touch noise, which is usually regarded as a desirable aesthetic goal in piano playing and teaching (cf. e.g., Gat, 1965).

The two types of touch (in the present terminology, legato and staccato) represent two poles of a variety of different ways to actuate the piano key (i.e., late versus early acceleration, hesitating in between, or accelerating from the escapement point). It must be assumed that a professional pianist will (even unconsciously) be able to produce many gradations of touch between legato and staccato.

The travel times and the key–bottom times changed considerably with the intensity of key depression. A soft tone may take over 200 ms longer from the first actuation


by the pianist's finger to sound production compared to a very sudden fortissimo attack. Moreover, travel times and key–bottom times changed considerably with touch. A staccato tone needed around 30–40 ms less from finger–key to hammer–string contact than a legato tone with the same hammer velocity. These findings were not surprising, but the performing artist has to anticipate these changes in temporal behaviour while playing in order to achieve the desired timing of the played tones. Before playing a tone, the pianist has to estimate how long the keystroke will take, not only for the desired dynamic level, but also for the intended way of actuating the key. These complex temporal interactions between touch, intensity, and tone onset are dealt with and applied by the pianist unconsciously; they are established over years of intensive practising and extensive self-listening. Musical situations immediately come to mind in which loud chords tend to come early with pianists at beginner or intermediate level, or in which crescendo passages tend to accelerate in tempo as well: each keystroke is performed with a harder, and thus quicker, blow in order to achieve the crescendo, but the time intervals between the finger impulses are not correspondingly lengthened.

For the pianist, a keystroke starts kinaesthetically at finger–key contact (the acceleration impulse by the finger) and ends at key bottom, but it starts aurally (for pianist and audience) at, or immediately after, hammer–string contact. Typical dynamics of piano performances at intermediate dynamic levels (as measured in Chapter 3) fall between 40 and 60 MIDI velocity units (0.7 to 1.25 m/s), and thus typical travel times lie between 80 and 108 ms, varying by as much as about 30 ms. For such keystrokes, the key–bottom times fall between 3.5 and 0.5 ms before hammer–string contact, a range of 3 ms. It can be assumed that at such moderate intensity levels (and with a default touch which is likely to be rather legato), the changes in travel times due to varying intensity might not be directly relevant for the player, since they are small and at the threshold of perceivability. Nevertheless, they are sufficiently large to produce the typical melody lead (see Chapter 3).

In that typical dynamic range, key–bottom times are even more unlikely to be perceived by the pianist separately from the sound (hammer–string contact), since those temporal differences are there of the order of 1–4 ms. However, the differences between key bottom and hammer–string contact can be up to 40 ms in extreme cases, which is of the order of, or just beyond, just noticeable differences (Askenfelt and Jansson, 1992b, p. 345). As Figure 2.9b makes visually evident, the travel times are far larger than the time differences between the other readings (maximum hammer velocity, hammer–string, key–bottom), so it can be assumed that the pianist (especially in the dynamic middle range) only senses two points in time: the start of the keystroke (finger–key) and its end, which coincides with the beginning of the sound.

Although the piano hammer cannot be controlled any more after the jack has been rotated away by the escapement dolly, pianists still apply force to the key at the keybed. In piano education, pianists are usually made aware of that fact. Nevertheless, pianists continued to press down the key although it had already arrived at key bottom (which is certainly after the jack's escapement). This effect was far stronger for amateur


pianists than for expert performers (Parlitz et al., 1998). Experts stopped applying pressure immediately after the key hit the keybed, whilst amateurs continued to apply force. The immediate reduction of force saves energy and allows relaxation and preparation for the next keystroke.

Furthermore, sensorimotor feedback is considered a most important factor for pianists, not only for judging the action's response, but also for judging the piano's tone (Galembo, 1982, 2001). In an extended perception experiment, Galembo (1982) asked a dozen professors from the Leningrad Conservatory of Music to rate the instrumental quality of three grand pianos under different conditions. The participants agreed that the Hamburg Steinway grand piano was superior, followed by the Bechstein grand piano, while a grand piano from the Leningrad piano factory received the lowest quality judgement. In different discrimination tasks, the participants were not able to distinguish between the instruments only by listening to them when played by another person behind a curtain (although all claimed to be able to). But they could very well discriminate between the instruments when they played them blindfolded, or blindfolded with their ears blocked (Galembo, 1982, 2001). This study implied that the hapto-sensoric feedback of the piano action to the playing pianist is crucial for the estimation of instrumental quality.

Another important factor influencing the hapto-sensoric feedback sensed by the pianist is the room acoustics (Bolzinger, 1995). The same piano action might feel easy to handle in a reverberant room, whilst it feels intractable and tiring in a room without any reverberation. Similarly, the timbre of the instrument might be judged differently with changing room acoustics. A pianist is usually not able to separate the influence of the room acoustics from the properties of the instrument, and directly attributes the room acoustics to instrumental properties (Galembo, 2001).

The reported temporal properties of the piano actions were derived from isolated piano tones (without pedal), such as virtually never occur in piano performances. For a new keystroke, the key does not necessarily have to come back to its resting position: due to the double repetition feature of modern grand piano actions, the hammer is captured by the check and the repetition lever is stopped by the drop screw (Askenfelt and Jansson, 1990b). When the key is released approximately half way (of the approximately 10 mm touch depth), the jack can slide back underneath the roller and another keystroke can be performed. This point is usually some 2–4 mm below the key surface. For such keystrokes, the key can travel only 6–8 mm, so the travel times are expected to be shorter than in a legato key depression from the key's resting position. For such repeated keystrokes, it would also be impossible to determine a finger–key contact point in time.

The different kinds of touch present in the study sometimes displayed portions of noise that stemmed from the finger–key interaction and were clearly perceivable. Especially in the staccato tones of one of the playing pianists (RB), the nail hitting the key was audible in the samples and visible in the wave form of the sound. Although this issue was not investigated systematically here (controlled listening


tests and comprehensive analyses of the noise portions in the sound would be required), these findings coincided with results from the literature (Askenfelt, 1994; Baron, 1958; Baron and Hollo, 1935; Koornhof and van der Walt, 1994; Podlesak and Lee, 1988). However, it is argued here that the type of touch influences the pianist more through kinaesthetic feedback, through the different times at which the tone is to be expected after hitting the keys, and through the different motor efforts involved, than through the manifold emerging noises, which fade away at a certain distance from the piano (Bork et al., 1995).

Another interesting issue with respect to the reported data is whether there is a

relationship between the actions' temporal properties and the instrumental quality of the tested grand pianos. In the author's opinion as a pianist, the Steinway was qualitatively superior to the other two investigated grand pianos, although the Bösendorfer was a high-standard concert grand piano. The small Yamaha baby grand was the least interesting instrument, partly due to its size. However, all pianos were of a mechanically high standard, and they were well maintained and tuned. The most convincing feature of the Steinway was (in the author's opinion), apart from its clear tone, the extremely precise action that allowed subtle control over virtually every aspect of touch and tone. It is assumed here that one of the most important features of a 'good' piano is a precise and responsive action.

In the data reported above, some differences between the pianos could be observed that might influence the subjective judgement of instrumental quality. The Steinway showed (1) no difference between types of touch in the shape of the travel time functions; (2) no difference between types of touch in the key–bottom times; (3) short time intervals of free flight (already around zero for keystrokes beyond a hammer velocity of 1.5 m/s, compared to around 2.5 m/s for the Bösendorfer and above 3 m/s for the Yamaha). Moreover, the Disklavier showed many very early hammer velocity maxima at velocities between about 1 and 2 m/s, the Bösendorfer some, and the Steinway almost none.

Although further evaluative investigations would be required to state any hypotheses on the relation between the temporal behaviour of grand piano actions and instrumental quality more conclusively, it seems likely that behaviour that is constant across types of touch, together with late hammer velocity maxima, is crucial for precise touch control and a subjectively positive appreciation of instrumental quality.


2.3 Measurement and reproduction accuracy of computer-controlled grand pianos

This section examined the precision of the two reproducing pianos used in the previous section in order to provide benchmark data for performance research on how reliable these devices are. Parts of this work have already been published (Goebl and Bresin, 2001). A slightly modified version of this section will appear in the Journal of the Acoustical Society of America (Goebl and Bresin, 2003a) and was presented at the Stockholm Music Acoustics Conference (SMAC'03, cf. Goebl and Bresin, 2003b).

2.3.1 Introduction

Current research in expressive music performance deals mainly with piano interpretation, because obtaining expressive data from a piano performance is easier than, e.g., from string or wind instruments. Pianists are able to control only a few parameters on their instrument: the tone19 onsets and offsets, the intensity (measured as the final hammer velocity), and the movements of the two pedals.20

Computer-controlled grand pianos are a practical device for picking up and measuring these expressive parameters and, at the same time, provide a natural and familiar setting for pianists in a recording situation. Two systems are most commonly used in performance research: the Yamaha Disklavier (Behne and Wetekam, 1994; Palmer and Holleran, 1994; Repp, 1995b, 1996c,a,d, 1997b; Juslin and Madison, 1999; Bresin and Battel, 2000; Timmers et al., 2000; Riley-Butler, 2001, 2002) and the Bösendorfer SE system (Palmer, 1996; Bresin and Widmer, 2000; Goebl, 2001; Widmer, 2001, 2002a,b). Some studies made use of various kinds of MIDI keyboards, which do not provide a natural playing situation for a classical concert pianist because they have a different tactile and acoustic response (e.g., Palmer, 1989; Repp, 1994).

Both the Disklavier and the SE system are integrated systems (Coenen and Schafer, 1992), which means that they are permanently built into a modern grand piano. They are based on the same underlying principle: they measure and reproduce movements of the piano action, above all the final speed of the hammer before it touches the strings. These devices are not designed for scientific purposes, and their precise functionality is unknown or not revealed by the companies. Therefore, exploratory studies on their recording and playback precision are necessary in order to examine the validity of the collected data.

Both devices have sensors at the same places in the piano action (see Figure 2.1

19The onset of a sounding tone is very often called a “note onset,” following the MIDI world's terminology. In this paper, the terms “tone” and “note” are used synonymously, since we are not talking about musical notation.

20The middle or sostenuto pedal only prolongs certain tones and is not counted as an individual expressive parameter.


on page 16). There is a set of shutters mounted on each of the hammer shanks.21

This shutter interrupts an infrared light beam at two points just before the hammer hits the strings: the first time approximately 5 mm before hammer–string impact, the second time when the hammer crown just starts to contact the strings. These two points in time yield an estimate of the final hammer velocity (FHV). In the case of the Disklavier, no further information about how these data are processed was obtainable. On the Bösendorfer, the time difference between these two trip points is called (by definition) the inverse hammer velocity (IHV) and is stored as such in the internal file format. Since the counter of this infrared beam operates at 25.6 kHz, the final hammer velocity (in metres per second) is FHV = 128/IHV (Stahnke, 2000; Goebl, 2001, p. 572). The timing of the trip point closer to the strings is taken as the note onset time, which has a resolution of 1.25 ms. It seems that the Disklavier uses the same measuring method for hammer velocity and note onset, but as the company does not distribute any more specific details, this is only speculation. The MIDI files of the Disklavier provided 384 MIDI ticks per 512 820 µs (as defined in the tempo command in the MIDI file), thus a theoretical timing resolution of 1.34 ms.

A second set of sensors is placed under the keys to measure when the keys are

depressed and released. Again, the exact use of this information in the Disklavier cannot be reconstructed, but the Bösendorfer uses it for releasing the keys correctly (note offsets) and for reproducing silent tones (when the hammer does not reach the strings). The Disklavier used in this study does not reproduce any silent notes at all.

The data picked up by the internal sensors are stored in the Disklavier on an

internal floppy drive or externally by using the MIDI out port. The SE system is linked via a special cable plugged into an ISA card of a personal computer running MS-DOS. Internal software controls the recording. The information is stored in standard MIDI format on the Disklavier, and in a special file format on the Bösendorfer (each recording comprises a set of three files with the extensions “.kb” for keyboard information, “.lp” for the loud (right) pedal, and “.sp” for the soft (left) pedal). Although the SE file data are encrypted, the contents of the files can be listed with the supplied software and used for analysis.
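The SE's velocity estimate described above can be reconstructed from the quoted constants: a counter running at 25.6 kHz over the roughly 5 mm between the two trip points gives 0.005 × 25 600 = 128, hence FHV = 128/IHV. A sketch (the exact trip-point spacing is an inference from the numbers in the text, not an official specification):

```python
# Sketch of the SE hammer-velocity estimate; the ~5 mm trip-point
# spacing is inferred from the text, not an official specification.
COUNTER_HZ = 25_600         # infrared-beam counter frequency
TRIP_DISTANCE_M = 0.005     # approx. distance between the two trip points

def final_hammer_velocity(ihv_counts):
    """FHV in m/s from the inverse hammer velocity (counter ticks
    between the two shutter trip points): 0.005 * 25600 = 128."""
    return TRIP_DISTANCE_M * COUNTER_HZ / ihv_counts

def midi_tick_resolution_ms(ticks_per_tempo=384, tempo_us=512_820):
    """Theoretical Disklavier MIDI timing resolution in ms,
    from the tempo command quoted above."""
    return tempo_us / ticks_per_tempo / 1000.0
```

For example, a hammer that needs 64 counter ticks (2.5 ms) for the final stretch would be travelling at 128/64 = 2 m/s.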

The reproduction is carried out with linear motors (solenoids) placed under the back of each key. The cores of the coils of the Disklavier have a length of approximately 7 cm, whereas those of the SE system are at least double that length or more. Pedal measurement and reproduction is not discussed in the present study.

Only a few studies provide systematic information about the precise functionality of these devices. Coenen and Schafer (1992) tested five different reproduction devices (among them a Bösendorfer SE225 and a Yamaha Disklavier grand piano, DG2RE) on various parameters, but their goal was to evaluate their reliability for compositional use; their main focus was therefore on the production mechanism. They determined practical benchmark data such as scale speed, note repetition, note density (the maximum number of notes that can be played simultaneously), minimum and maximum length of tones, and pedal speed. In their tests, the integrated systems (Disklavier, SE) generally performed more satisfactorily than the systems built into an existing piano (Autoklav, Marantz pianocorder). The Bösendorfer, as the most expensive device, had the best results in most of the tasks. Bolzinger (1995) performed some preliminary tests on a Yamaha upright Disklavier (MX-100 A), but his goal was to measure the interdependencies between the pianist's kinematics, the performance, and the room acoustics. With his Disklavier, he had the opportunity to play back files and simultaneously record the movements of the piano with the same device using the MIDI out port. In that way, he very easily obtained a production–reproduction matrix of MIDI velocity values, showing linear reproducing behaviour only between approximately 30 and 85 MIDI velocity units (Bolzinger, 1995, p. 27). On the Disklavier in the present study, this parallel playback and recording was not possible. Maria (1999) developed a complex methodology to perform meticulous tests on a Disklavier (DS6 Pro), but no systematic or quantitative measurements have been reported so far.

21On the Disklavier, the hammer shutter is mounted closer to the fixed end of the hammer shank, whereas the SE has its shutter closer to the hammer (as displayed in Figure 2.1).

The focus of this study lies on the recording and reproducing accuracy of two computer-controlled grand pianos with respect to properties of the piano action (hammer–string contact, final hammer velocity) and properties of the sounding piano tone (peak sound-pressure level). In addition, we report the correspondence between physical sound properties and their representation as measured by the computer-controlled pianos (MIDI velocity units), in order to provide a benchmark for performance research (see also Palmer and Brown, 1991; Repp, 1993b).

Another issue discussed in the following is the timing behaviour of the grand piano action in response to different types of touch and their reproduction by a reproducing piano. Selected keys distributed over the whole range of the keyboard were depressed by pianists with many degrees of force and with two kinds of touch: with the finger resting on the surface of the key (legato touch), and with an attack from a certain distance above the keys (staccato touch). These kinds of touch are described in Askenfelt and Jansson (1991).

2.3.2 Method

The two computer-controlled grand pianos (the Yamaha Disklavier and the Bösendorfer SE290), the experimental setup, and the procedure were the same as in Section 2.2.2. Immediately before the experiments, both instruments were tuned, and the piano action and the reproduction unit were serviced. In the case of the Disklavier, this was done by a specially trained Yamaha piano technician; at the Bösendorfer company, the company's SE technician took care of this work.

This method delivered (1) the precise timing (onset) and dynamics of the original recording, (2) the internally stored MIDI file of the Disklavier or its equivalent on the SE device, and (3) the precise timing and dynamics of the reproduction.

For data analysis, only a few of the discrete readings obtained in Section 2.2.2 were

[Figure 2.12 plot area. Linear fits: y = −0.053·x + 1.715, R² = 0.572 (Yamaha Disklavier); y = −0.143·x − 1.381, R² = 0.942 (Bösendorfer SE290).]

Figure 2.12: Timing delays (ms) as a function of recorded time (s) between the original recording and the MIDI file as recorded by the computer-controlled grand pianos, for two types of touch: legato (“lg”) and staccato (“st”). Negative values indicate that an onset in the MIDI file was earlier than in the original recording. The straight lines are linear fits of the whole data.

used: the hammer–string contact corresponding to the ‘note on’ time in the MIDI file, the maximum hammer velocity, the peak sound-pressure level, and the MIDI velocity value as stored in the recorded MIDI files (or in the internal file format of the Bösendorfer SE system).

The onset differences between the original recording and the MIDI file, and those between the original recording and its reproduction, were calculated.22 Since the three measurements (original recording, MIDI file, and reproduction) were not synchronised in time by the measurement procedure, their first attacks were defined as being simultaneous. Care was taken that the first tones were always loud attacks in order to minimise the synchronisation error, since the timing error was smaller the faster the attack. If there were soft attacks at the beginning, the files were synchronised at the first occurring loud attack (hammer velocity over 2 m/s, or 77 MIDI velocity units).
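The synchronisation and delay computation just described can be sketched as follows; the onset lists and velocities are hypothetical examples, and the function name is illustrative only.

```python
def align_and_delays(original, other, loud=2.0):
    """Synchronise two onset streams at the first loud attack and
    return per-note delays in ms (other minus original; negative
    values mean the other stream's onset came too early).
    Each stream is a list of (onset_seconds, hammer_velocity_m_s)."""
    def first_loud_onset(stream):
        for onset, velocity in stream:
            if velocity >= loud:
                return onset
        return stream[0][0]          # fall back to the first attack
    shift = first_loud_onset(original) - first_loud_onset(other)
    return [1000.0 * ((t_other + shift) - t_orig)
            for (t_orig, _), (t_other, _) in zip(original, other)]

# Hypothetical data: a constant offset plus small per-note errors
orig = [(0.000, 2.5), (0.500, 1.0), (1.000, 3.0)]
midi = [(0.010, 2.5), (0.512, 1.0), (1.008, 3.0)]
delays = align_and_delays(orig, midi)
```

By construction, the first loud attack gets a delay of zero; all remaining delays are measured relative to it, exactly as in the procedure above.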

2.3.3 Results and discussion

Timing accuracy

In Figure 2.12, the note onset delays of the MIDI file relative to the original recording are plotted against the recorded time, separately for the two pianos. It is evident that both MIDI files show a constantly decreasing delay over time.

22delay_MIDI = MIDI onset − original onset; delay_repro = reproduced onset − original onset.

[Figure 2.13 plot area. Polynomial fits: y = 0.00115·x² − 0.239·x + 11.620, R² = 0.397 (Yamaha Disklavier); y = 8.419·10⁻⁶·x³ − 0.00257·x² + 0.275·x − 8.615, R² = 0.693 (Bösendorfer SE290).]

Figure 2.13: The residual timing error (ms) between the MIDI file and the original recording as a function of MIDI velocity, as recorded by the computer-controlled pianos. Again, negative values indicate onsets that are too early in the MIDI data in comparison to the original file. The data were approximated by polynomial functions.

This constant timing error in the MIDI file was larger for the SE system than for the Disklavier. The origin of this systematic timing error is as yet unknown, but it is likely that the internal counters of the systems (in the case of the SE system, a personal computer) do not operate at exactly the intended frequency, probably due to a rounding error. This drift over time was small (0.0053% or 0.014%, respectively) and is negligible for performance research (tempo changes of that order are far below just-noticeable differences, cf. Friberg, 1995). But when such a device has to play in time with, e.g., an audio tape, the synchronisation error will already be perceivable after some minutes of performing.
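The size of this effect follows directly from the slopes of the linear fits in Figure 2.12; as a back-of-the-envelope check (simple arithmetic, not part of the study's analysis):

```python
def accumulated_drift_ms(slope_ms_per_s, seconds):
    """Delay accumulated by a constant clock-rate error,
    given the drift slope in ms of delay per second of recording."""
    return slope_ms_per_s * seconds

# Slopes of the linear fits in Figure 2.12 (ms of delay per second)
disklavier_5min = accumulated_drift_ms(-0.053, 5 * 60)
se290_5min = accumulated_drift_ms(-0.143, 5 * 60)
```

After five minutes, the SE system's MIDI file would be roughly 43 ms ahead of the original (the Disklavier's about 16 ms), which is indeed within the range of perceivable asynchrony.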

To illustrate the recording accuracy without this systematic error, the residual timing error (the difference between the fitted lines and the data) is plotted in Figure 2.13 against the recorded MIDI velocity, separately for the two pianos.23 In an earlier conference contribution, a different normalisation method was applied to the same Disklavier data (see Goebl and Bresin, 2001). The variance was larger for the Disklavier than for the SE system (Yamaha mean: 1.4 ms, standard deviation (s.d.): 3.8 ms; Bösendorfer mean: 0.2 ms, s.d.: 2.1 ms), but for both pianos the residual timing error showed a trend with respect to the loudness of the recorded tones. The Disklavier tended to record softer tones later than louder ones; the SE showed the opposite trend, but to a smaller extent and with much less variation (Figure 2.13). The data in Figure 2.13 were approximated by polynomial curves;

23For the SE system, the final hammer velocity needs to be mapped to MIDI velocity values by choosing a velocity map. In the present study, a logarithmic map was always used: MIDI velocity = 52 + 25 · log2(FHV).
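This logarithmic map is easy to sketch in code. The function names below are illustrative, not part of either system's software; only the formula itself comes from the footnote.

```python
import math

def fhv_to_midi_velocity(fhv):
    """Map final hammer velocity FHV (m/s) to a MIDI velocity,
    per the logarithmic map: MIDI velocity = 52 + 25 * log2(FHV)."""
    return round(52 + 25 * math.log2(fhv))

def midi_velocity_to_fhv(midi_velocity):
    """Invert the map: recover the final hammer velocity (m/s)."""
    return 2 ** ((midi_velocity - 52) / 25)

print(fhv_to_midi_velocity(1.0))  # 1 m/s maps to MIDI velocity 52
print(fhv_to_midi_velocity(2.0))  # doubling the speed adds 25 units: 77
```

Rounding to integer MIDI velocities discards some resolution, which is one reason extended formats with more than 7 bits per velocity (mentioned later in Section 2.3.4) are attractive.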

2.3. Measurement and reproduction accuracy 41

[Figure 2.14 panels: Yamaha Disklavier II and Bösendorfer SE290; x axis: MIDI velocity (0–120); y axis: delay of reproduction (ms, −30 to 30); legend: C1 (24), G2 (43), C4 (60), C5 (72), G6 (91); reproduced tones (“rp”)]

Figure 2.14: Timing delays (ms) between the original and its reproduction by the computer-controlled piano. (No systematic trend had to be removed.)

the formulas are printed there. The R² values were different for the two pianos. The Disklavier's approximation explained barely 40% of the variance, while for the SE system it was about 70%. The Disklavier's curve fit indicated a larger erroneous trend in recording, and, in addition, it possessed larger variability around that curve.

The timing delays between the original recording and its reproduction are plotted in Figure 2.14 separately for the two pianos. The systematic timing error of the recording was not observed, so the display against recorded time (as in Figure 2.12) was not required. Evidently, the error in recording was cancelled out by the same error in reproduction. The difference between the two systems became most evident in this display. While the reproduced onsets of the Disklavier differed by as much as +20 and −28 ms (mean: −0.3 ms, s.d.: 5.5 ms) from the actually played onset, the largest timing error of the SE system rarely exceeded ±3 ms, with a tendency for soft notes to come up to 5 ms too soon (mean: −0.1 ms, s.d.: 1.3 ms). Interestingly, the recording accuracy of the SE system was lower than its reproduction accuracy. Obviously, its internal calibration function successfully aimed at absolutely precise reproduction. It could also be that the SE takes the first trip point (5 mm before the strings) as the note onset, but calibrates itself correspondingly to overcome this conceptual mistake. However, this assumption is contradicted by information obtained from the SE's developer, Wayne Stahnke (Stahnke, 2000; see also Goebl, 2001).

42 Chapter 2. Dynamics and the Grand Piano

[Figure 2.15 panels: Yamaha Disklavier II and Bösendorfer SE290; x axis: maximum hammer velocity (m/s) original (0–7); y axis: maximum hammer velocity (m/s) reproduced (0–7); legend: C1 (24), G2 (43), C4 (60), C5 (72), G6 (91); lg st]

Figure 2.15: The maximum hammer velocity (m/s) as played by the pianists (x axes) and reproduced by the computer-controlled pianos (y axes). (The diagonal line indicates ideal reproduction.)

Dynamic accuracy

The second of the investigated parameters was dynamics, which was measured in terms of the speed of the hammer hitting the strings (m/s) or peak sound-pressure level (dB). We defined the hammer velocity to be the maximum hammer velocity (see above), since this value was easy to obtain automatically from the recorded hammer track. Usually, this value corresponded very well with the velocity of the hammer when starting to touch the strings (final hammer velocity), but especially for soft notes the maximum hammer speed was larger than the hammer speed at the strings. In this case, the time between the escapement (when the hammer loses physical connection to the key, that is, when the jack is catapulted away by the escapement dolly; for more detail see Askenfelt and Jansson, 1990b; Goebl et al., 2003) and hammer–string contact can be as long as 100 ms or more. The actual final hammer velocity was hard to determine from the hammer accelerometer measurements, but the computer-controlled devices measure an average velocity over the last 5 mm of the hammer's travel to the strings (approximately the last 10% of that distance).
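A minimal sketch of such an averaging measurement, assuming the hammer position is available as evenly sampled data; the sampling rate, the total travel distance, and the function name are illustrative assumptions, not the internals of either system.

```python
import numpy as np

def final_hammer_velocity(position_mm, fs, travel_mm=50.0, window_mm=5.0):
    """Average hammer speed (m/s) over the last `window_mm` of its travel.

    position_mm: hammer position samples in mm (0 = rest, travel_mm = strings)
    fs: sampling rate in Hz (assumed value, for illustration only)
    """
    pos = np.asarray(position_mm, dtype=float)
    start = np.argmax(pos >= travel_mm - window_mm)  # sample entering the window
    end = np.argmax(pos >= travel_mm)                # sample of string contact
    dt = (end - start) / fs                          # elapsed time (s)
    return (pos[end] - pos[start]) / 1000.0 / dt     # mm -> m, per second

# Synthetic keystroke: hammer moving at a constant 2 m/s (= 2 mm/ms)
fs = 10_000                          # assumed 10 kHz sampling
t = np.arange(0, 0.03, 1 / fs)       # 30 ms of motion
pos = 2000.0 * t                     # position in mm
print(final_hammer_velocity(pos, fs))  # approximately 2.0
```

A real hammer decelerates after escapement, which is exactly why this windowed average near the strings differs from the maximum velocity discussed in the text.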

In Figure 2.15, the reproduced maximum hammer velocity is plotted against the original maximum hammer velocity. It becomes evident that the Disklavier's solenoids were not able to reproduce hammer speeds above a certain limit. This limit varied slightly between keys; e.g., the G6 (with less hammer mass than hammers at a lower pitch) was accelerated up to 3.5 m/s, whereas a C1 (with a comparatively heavy hammer) only up to 2.4 m/s. On the SE system, this ceiling effect was not so evident, and there was no obvious effect of pitch as there was for the Disklavier. Especially in very loud staccato tones, the first impact of the finger hitting the key resulted in a very high peak hammer velocity, which decreased significantly until hammer–string


[Figure 2.16 panels: Yamaha Disklavier II and Bösendorfer SE290; x axis: peak SPL (dB) original (50–120); y axis: peak SPL (dB) reproduced (50–120); legend: C1 (24), G2 (43), C4 (60), C5 (72), G6 (91); lg st]

Figure 2.16: Peak sound level (dB) as measured in the tones performed by the pianists (x axes) and reproduced by the computer-controlled pianos (y axes).

contact. The solenoid was not able to reach this high peak hammer velocity (and is not programmed to do so), but it aimed to reproduce the measured final hammer velocity properly (see also Figure 2.18). In this light, the maximum hammer velocity did not seem to be a proper measure. Instead, the peak sound-pressure level (dB) was considered (see Figure 2.16).

This display compares acoustic properties of the played tones with their reproduction (peak SPL in dB, Figure 2.16). Here, the SE system revealed a much more precise reproducing behaviour over the whole dynamic range than the Disklavier. In the latter, the dynamic extremes flattened out: soft tones were played back too loudly and very loud tones too softly.

In Figure 2.17, the relation between MIDI velocity units and peak sound-pressure level is displayed separately for the recording (a) and its reproduction (b). On both instruments, different pitches exhibited different curves. The higher the pitch, the louder the radiated sound at the same MIDI velocity. The reproduction panel (Figure 2.17b) reflects the reproducing limitations of the Disklavier already shown in Figure 2.16.

Two types of touch

Examples of a legato attack (Disklavier, see Figure 2.18) and a staccato attack (SE, see Figure 2.19) are shown in order to demonstrate the reproducing behaviour of the computer-controlled pianos. In these figures, instantaneous key velocity and hammer velocity are plotted together with the sound signal. In Figure 2.18, the left side shows a legato attack as played by one of the authors, with its smooth acceleration; the right side shows its reproduction by the Disklavier. The Disklavier always hit the key in a staccato


[Figure 2.17 panels (a) and (b): Yamaha Disklavier II and Bösendorfer SE290; x axis: MIDI velocity (0–120); y axis: SPL (dB, 60–110); legend: C1 (24), G2 (43), C4 (60), C5 (72), G6 (91); (a) lg st, (b) rp]

Figure 2.17: Peak sound-pressure level (dB) against MIDI velocity as recorded by the computer-controlled pianos. The upper panels show legato touch (“lg”) and staccato touch (“st”) as played by the pianist (a); the lower panels display the reproduction (“rp”) by the computer-controlled pianos (b).


[Figure 2.18 panels: key velocity (m/s), hammer velocity (m/s), and amplitude against time (ms), with fk/kb/hs markers. Left (original): hs−fk 36.8 ms, kb−hs −3.0 ms, maxHv 3.765 m/s, SPL 101.13 dB. Right (reproduction): hs−fk 25.9 ms, kb−hs 1.6 ms, maxHv 2.794 m/s, SPL 98.53 dB.]

Figure 2.18: A forte attack (C4, MIDI note number 60) played by one pianist (left panel) ‘from the key’ (legato touch), and its reproduction by the Yamaha Disklavier (right). The upper panels plot key velocity, the middle panels hammer velocity, and the bottom panels the sound signal. The three lines indicate the finger–key contact (start of the key movement, “fk,” left dashed line), the key–bottom contact (“kb,” dotted line), and the hammer–string contact (“hs,” solid line).

manner, with an abrupt acceleration at the beginning of the attack. The parts of the piano action were compressed before its inertia was overcome and the hammer started to move upwards. The solenoid's action resulted in a shorter travel time (the time between finger–key contact (“fk”) and hammer–string contact (“hs”) is 26 ms instead of 37 ms, see Figure 2.18, upper panels). The travel time difference between production and reproduction was even larger for very soft keystrokes. This could be one reason why soft notes appear earlier in the reproduction by the Disklavier than louder notes.

In this particular attack, the difference in peak hammer velocity was clearly audible. When the (final) hammer velocities became similar, the two sounds became indistinguishable, independently of how they were produced (legato, staccato, or reproduced), as informal listening to the material suggests. Systematic listening tests have to be performed in future work. Furthermore, we cannot tackle here the old controversy as to whether it is only hammer velocity that determines the sound of an isolated piano tone (White, 1930; Hart et al., 1934; Seashore, 1937), or whether the pianist can alter the piano tone with a specific type of touch, so that there are further influencing factors such as the various types of noise emerging from the piano action and the pianist's interaction with it (Baron and Hollo, 1935; Baron, 1958; Podlesak


[Figure 2.19 panels: key velocity (m/s), hammer velocity (m/s), and amplitude against time (ms), with fk/kb/hs markers. Left (original): hs−fk 16.1 ms, kb−hs −5.4 ms, maxHv 5.792 m/s, SPL 107.71 dB. Right (reproduction): hs−fk 18.4 ms, kb−hs −1.6 ms, maxHv 5.390 m/s, SPL 107.70 dB.]

Figure 2.19: A fortissimo attack (C4, MIDI note number 60) played by one pianist (left panel) from a certain distance above the key (staccato touch), and its reproduction by the Bösendorfer SE grand piano (right). The upper panels plot key velocity, the middle panels hammer velocity, and the bottom panels the sound signal. The three lines indicate the finger–key contact (start of the key movement, “fk,” left dashed line), the key–bottom contact (“kb,” dotted line), and the hammer–string contact (“hs,” solid line).

and Lee, 1988; Askenfelt, 1994; Koornhof and van der Walt, 1994). In the context of touch, the author considers the hapto-sensoric feedback from the piano to the player as crucial. Through this feedback, the specific touch of one keystroke might influence how the performer is able to play the next keystroke.

A very loud staccato attack is plotted in Figure 2.19, with the original, human attack on the left and its reproduction by the Bösendorfer SE on the right. The point of maximum hammer velocity was 6 ms before hammer–string contact in the original recording, but only 2.5 ms in the reproduction. Although the reproduced maximum hammer velocity was lower (4.6 m/s instead of 5.6 m/s), the reproduced peak SPL was slightly higher than that of the original sound (109.63 dB instead of 108.92 dB). The human player accelerated the key so abruptly that the hammer reached its highest speed well before hitting the strings and, of course, lost energy during its free flight to the strings. Since the reproducing solenoid could not accelerate the key in the same abrupt way as the human player, the hammer reached maximum speed later, and, in this example, the machine performed with less energy loss than the human player.


2.3.4 General discussion

In this study, we measured the recording and reproducing accuracy of two computer-controlled grand pianos (Yamaha Disklavier, Bösendorfer SE) with an accelerometer setup in order to determine their precision for piano performance research. Both devices showed a systematic timing error over time, most likely due to a rounding error in the system clock (the internal hardware in the Disklavier, a common personal computer in the SE). With this linear error removed, the Bösendorfer had a smaller (residual) timing error than the Disklavier, but both exhibited a certain trend with respect to the loudness of the tones. The Disklavier tended to record soft tones too late, whereas the SE had the tendency to record soft tones too early. But within these tendencies, the SE was more consistent. During reproduction, the superior performance of the Bösendorfer became even more evident: its timing error was smaller than during recording, whereas the Disklavier increased in variance in comparison to its recording.

The important point for performance research is the recording accuracy of these systems. Apart from the systematic error that only marginally affects the measured tempo value (0.0053% or 0.014%, respectively), the residual timing error (Figure 2.13) was considerably large for the Disklavier and smaller for the Bösendorfer. The measurement precision can be improved by subtracting these trends using the polynomial curve approximations as displayed in Figure 2.13.
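Such a trend correction can be sketched as follows. The (MIDI velocity, timing error) pairs below are synthetic stand-ins for the measured data of Figure 2.13; only the idea of fitting a cubic polynomial and subtracting it is taken from the text.

```python
import numpy as np

# Synthetic (MIDI velocity, timing error in ms) pairs standing in for
# the measured data; the coefficients loosely echo the fitted curve.
rng = np.random.default_rng(0)
velocity = rng.uniform(10, 120, 500)
true_trend = 8.4e-6 * velocity**3 - 2.6e-3 * velocity**2 + 0.27 * velocity - 8.6
error_ms = true_trend + rng.normal(0.0, 2.0, velocity.size)

# Fit a cubic polynomial, as done for the curves in Figure 2.13.
coeffs = np.polyfit(velocity, error_ms, deg=3)
trend = np.polyval(coeffs, velocity)

# Corrected errors: subtract the fitted trend from the raw values.
residual = error_ms - trend

# R^2 of the fit: proportion of variance explained by the trend.
ss_res = np.sum(residual**2)
ss_tot = np.sum((error_ms - error_ms.mean())**2)
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.2f}, residual s.d. = {residual.std():.2f} ms")
```

The same subtraction applied to measured onset times would remove the velocity-dependent recording bias before computing asynchronies.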

To examine reproducing accuracy in the loudness dimension, we used the maximum hammer velocity and the peak sound-pressure level as measures. Maximum hammer velocity did not correspond to the velocity measures captured by the sensors of the two systems. Considering the peak sound levels of the sounding signal, both devices recorded with similar precision. However, the Disklavier system could not reproduce very loud tones properly, most likely due to its smaller solenoids. The lower the pitch (and thus the greater the hammer mass), the lower was the maximum sound-pressure level of the Disklavier's reproduction. The reproduction of soft notes was also limited (very soft notes were played back somewhat louder by the Disklavier), because the tested Disklavier prevented very soft tones from being silently reproduced by means of a minimum velocity matrix, adjustable through the internal control unit. It was also due to this function that the Disklavier was not able to reproduce silent notes, a crucial feature especially for music of the 20th century. The Bösendorfer exhibited linear reproducing behaviour over the whole dynamic range (from 60 to 110 dB SPL).

Another, and indeed very important, criterion of recording and reproducing capability was not investigated here: the two pedals.24 The use of the right pedal has not been investigated extensively to date (apart from Repp, 1996b, 1997b). We did

24We are talking only of the right and the left pedal of grand pianos, since the middle pedal (the sostenuto pedal) only varies the tone length of certain keys depressed during its use, which is recorded and reproduced by simply holding down the corresponding keys for the same time this pedal was depressed.


not have any hypotheses of how pedal recording and reproducing accuracy should be approached. This item remains for future work.

Both the Disklavier and the SE system are based on the same underlying principle: to measure and reproduce the movement of the piano action (and the pedals), in particular the final speed of the hammer before touching the strings. This principle is fundamentally different from what a performing artist does when playing expressively. The artist controls his/her finger and arm movements in order to realise a certain mental image of the sound by continuously listening to the resulting sound and by feeling the hapto-sensory feedback of the keys (Galembo, 1982, 2001). In this way, the performer is able to react to differences in the action, the voicing, the tuning, and the room acoustics, to mention just a few variables that have a certain influence on the radiated sound. A reproducing piano, on the other hand, aims to reproduce a certain final hammer velocity independently of whether room acoustics, tuning, or voicing have changed since the recording. Even if the reproduction takes place on the same piano and immediately after the recording, the tuning might not be the same anymore, and the mechanical reproduction, as good as it might be, does not result in a performance sounding identical to what the pianist played before. This obvious limitation of such devices becomes most evident when a file is played back on a different piano or in a different room. In particular, if the damping point (the point of the right pedal travel where it starts to prevent the strings from freely oscillating) differs on another piano, the reproduction could sound too blurred (too much pedal) or too “dry” (too little pedal).

One possible solution to this problem could be a reproducing device with “ears”; in other words, the piano should be able to control its acoustical output via a feedback loop through a built-in microphone. If put into a different room, the device could check the room acoustics, its pedal settings, and its current tuning and voicing before the playback starts, much the same as a pianist warming up before a concert. Such a system would require a representation of loudness or timbre other than MIDI velocity, indicating at what relative dynamics a certain note was intended to sound in a pianist's performance.

As the present study was planned to investigate the usefulness of the two devices in question for performance research, we have to consider the obtained results in the light of practical applications. Although the Bösendorfer is the older system, it generally performed better. The disadvantage of the Bösendorfer is its price, around double the price of a grand piano of that size. Moreover, the SE system is no longer produced; only about 35 units were sold around the world, and very few to academic institutions (such as Ohio State University at Columbus, USA, or the Hochschule für Musik at Karlsruhe, Germany).25 The Disklavier, on the other hand, is a consumer product, its price level generally cheaper than the Bösendorfer (depending on the type of system), and it is therefore more likely to be obtained by an institution.

25The SE system was recently completely re-engineered and was expected to be available commercially from the Bösendorfer company by mid-2002 (Dain, 2002).


The Disklavier measured in this study was certainly not the top model of the Yamaha corporation. Since then, Yamaha has issued the Mark III series and the high-end series called Pro (e.g., the special Pro2000 Disklavier). The latter series uses an extended MIDI format (with a velocity representation using more than 7 bits) and additional measures such as key release velocity to reproduce the way the pianist released a particular key. It can be expected that these newer devices perform significantly better than the tested Mark II grand piano. Since these more sophisticated devices were not available to the authors, or were too far away from the accelerometer equipment, which was too costly to transport, this has to remain a subject for future investigations.

This study examined the reliability of computer-controlled pianos for performance research. It showed that not all of the data output by such devices can be blindly relied on. Although the timing data in particular can be listed with around one millisecond of precision, this seemingly high accuracy has to be interpreted by the researcher with caution. Differences of ±10 ms with an effect of tone intensity (as found for the recording accuracy of the examined Disklavier) might blur performance data considerably, so that, e.g., a study on tone onset asynchronies as reported in Chapter 3 with a Bösendorfer SE would not have delivered reliable results with a Disklavier such as the one examined in the present study. Strictly speaking, however, each model (e.g., an upright Disklavier) has to be measured and examined individually before its specific accuracy can be determined for the purpose of performance studies.


2.4 A note on MIDI velocity

When a hammer hits the strings with a certain velocity at a certain pitch, it produces a tone with a certain intensity. The same hammer velocity with a different hammer at an adjacent pitch will produce a tone with similar loudness, but it will still not sound equally loud. These differences are due to a slightly different regulation of the action, to a different density of the hammer felt, and to different resonances of the strings, the soundboard, and the room acoustics. Although piano technicians try to maintain action, hammers, and tuning so that adjacent tones show similar behaviour in sound quality and touch, and so that the whole keyboard exhibits consistent behaviour, total equality of tones is impossible to achieve on an acoustic musical instrument.

In Figure 2.11 (p. 32) and in Figure 2.17 (p. 44), the same hammer velocity resulted in different peak sound levels at different pitches. Repp (1997a) measured the peak sound level of every second tone in a range of 5 octaves (from C2 to C7) produced by a Yamaha Disklavier baby grand piano26 with 5 different MIDI velocities (from 20 to 100). He found large unsystematic variability from one tone to the next, with slightly higher intensity in the middle register (Repp, 1997a, p. 1880, Fig. 2). These data were similar to measurements from an earlier study of his (Repp, 1993b).

In order to obtain a complete picture of the peak sound level behaviour of a grand piano, the Bösendorfer grand piano used in Chapter 3 was examined over the whole range of the keyboard. Computer-generated files instructed the SE system to produce tones of MIDI velocities between 10 and 110 in steps of 2 units for all of its 97 keys (4947 tones in total).27 Each tone lasted for 300 ms and was followed by silence of variable length (longer in the bass, shorter in the middle, longer again in the treble where the strings are no longer damped). In order to avoid immoderate warming of the linear drives of the reproducing system, the tones were arranged so that the pause between two attacks was kept as long as possible. The microphones (two AKG CK91 positioned in an ORTF setup28) were placed beside the grand piano at the open lid, about 1.5 meters from the strings, and connected to a digital audio tape recorder (Tascam DA–P1 DAT, set to a sampling frequency of 44100 Hz, 16-bit word length, stereo). The recordings were transferred digitally to computer hard disk into WAV files using a “Creative SB live! 5.1 Digital” soundcard and analysed with the help of Matlab scripts. The signal was transformed into its sone representation with an implementation of Zwicker's loudness model (Zwicker and Fastl, 1999) by Elias Pampalk (similar approaches were used in Pampalk et al., 2002, 2003).29 The audio

26Yamaha Disklavier grand piano, Mark II. An exemplar of the same series was used in the present experiments; see Sections 2.2 and 2.3.

27The Bösendorfer Imperial 290 cm grand piano features 9 additional keys in the bass, so that the lowest key is the C0 (see grey keys in Figure 2.20).

28The two microphones are spaced approximately the same distance apart as the two human ears, at an angle of 120 degrees.

29Another implementation of Zwicker's model was used by Langner et al. (1998). However, that implementation could not be used here due to copyright restrictions.

2.4. A note on MIDI velocity 51

[Figure 2.20 panels (a) and (b): Bösendorfer SE 290−3 (Jan 7, 2002); x axis: pitch (C0–C8, MIDI note numbers 12–108, 440 Hz marked); y axis: (a) peak sound level (dB, −48 to 0, averaged), (b) peak loudness (sone, 0–30, averaged); lines of equal MIDI velocity from 10 to 110.]

Figure 2.20: Lines of equal MIDI velocity against pitch measured at the Bösendorfer SE 290–3, once in terms of dB peak sound level (a), once in terms of sone peak loudness (b). MIDI velocity ranged from 10 to 110 in steps of 2 units. The data were averaged over the two channels of the recording.


signal was converted into the frequency domain and bundled into critical bands according to the Bark scale. After determining spectral and temporal masking effects, the loudness sensation (sone) was computed from the equal-loudness levels (according to Terhardt, 1979), which in turn were calculated from the sound-pressure level in decibels (dB–SPL). The present sone implementation deviates from those used in Pampalk et al. (2002, 2003) in that the calculation of equal-loudness contours was replaced with a model by Terhardt (1979). The loudness envelope was sampled at 11.6 ms intervals according to the window size and sampling rate used (1024 samples at 44100 samples per second with 50% overlap). The onsets were determined from the loudness curve automatically by a simple threshold procedure.30 Peak sound level values (in dB) and peak loudness values were taken for each onset separately for the two channels.
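The threshold procedure can be sketched as follows; the loudness curve here is synthetic, the function name is illustrative, and the 0.2-sone increment criterion is the one quoted in footnote 30.

```python
import numpy as np

def detect_onsets(loudness, threshold=0.2, frame_s=0.0116):
    """Return onset times (s) from a loudness curve sampled every
    `frame_s` seconds: an onset is a frame-to-frame increment larger
    than `threshold` sone, not directly preceded by another one."""
    inc = np.diff(np.asarray(loudness, dtype=float))
    rising = inc > threshold
    # keep only the first frame of each run of rising increments
    starts = rising & np.r_[True, ~rising[:-1]]
    return np.flatnonzero(starts) * frame_s

# Synthetic loudness curve (sone): silence, a tone with decay, a second tone
loudness = [0, 0, 0, 3, 5, 4, 3, 2, 1, 1, 6, 7, 5]
print(detect_onsets(loudness))  # two onsets, at frames 2 and 9
```

The frame spacing of 11.6 ms matches the window size and overlap stated above; silent tones never cross the increment threshold and are therefore skipped, as described in the footnote.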

Of the nominally produced 4947 tones, 266 were detected as silent (when the hammer was too slow to hit the strings) or missing.31 The results are displayed in Figure 2.20 in terms of lines of equal MIDI velocity against pitch. In the upper panel, the intensity is plotted in terms of dB peak sound level,32 in the lower one in terms of sone (in both panels, the data were averaged over the two channels of the recording). Every fifth line (every tenth MIDI velocity value) is printed black to ease orientation in the figure.

The individual pitches showed considerably different peak sound levels at the same MIDI velocity. No systematic trend over the keyboard could be observed, but the lines of equal MIDI velocity always ran parallel; virtually no crossing of lines occurred. This indicates that the intrinsic properties of a given pitch were consistent over the whole dynamic range.

In Figure 2.20b, the sone representation showed a less regular picture. The lines crossed often; the order of MIDI velocity units did not always correspond to the order of sone values. There was a trend over the keyboard for low pitches to show smaller sone values than higher pitches. This trend can be explained by the nature of Zwicker's loudness model, which adds up the individual loudness per frequency band (Bark). The higher the pitch, the more energy appears in the higher frequency bands, and thus the higher the overall sone values become. The sudden peak in the highest octave is likely due to a drop in the equal-loudness contours between around 2700 and 3700 Hz, reflecting the heightened sensitivity of the ear in that region.

However, these facts alone cannot explain the shape of the present representation. Going back to the data of the individual channels, only one of the two channels showed this trend over pitch. Since the two microphones

30An onset was defined as a loudness increment larger than 0.2 sone. This simple definition worked stably over the whole range of the keyboard and robustly differentiated between onsets and silent tones.

31Between the C#2 (26) and the Eb (28), all tones below MIDI velocity 40 were missing due to a tape error (90 tones).

32Not calibrated to a reference hearing threshold, and thus in terms of negative level values relative to the sound file's maximum amplitude.


[Figure 2.21: Bösendorfer SE 290−3 (Nov 6, 2001 & Jan 7, 2002); x axis: pitch C4–C6 (60–84, 440 Hz marked); y axis: peak sound level (dB, −48 to 0); lines of equal MIDI velocity 20, 40, 60, 80; four traces: Jan 7, 2002 ch. 1/2 and Nov 6, 2001 ch. 1/2.]

Figure 2.21: Selected lines of equal MIDI velocity (20, 40, 60, and 80 units) for two channels of two recording sessions (Nov 6, 2001 and Jan 7, 2002) plotted against pitch (from C4 to C6).

pointed in different directions during recording, it is likely that this trend (the higher, the louder) was due to microphone position, sound radiation, and room acoustics.

In the present recording, one microphone pointed more towards the treble strings than the other. The microphone pointing to the treble strings captured more of the direct sound, including all high-frequency noise components, especially from the strings close to it, while the other captured more indirect sound after reflections from the walls. This might explain why the channel from the microphone pointing towards the treble strings showed an increase of peak loudness over pitch.

In addition to that, the values derived from the signal represent only peak sound-level values or peak loudness values. As loudness perception integrates over time during an interval of approximately half a second (cf., e.g., Hall, 2002, p. 119), the overall energy of a single tone might not increase over pitch as displayed in Figure 2.20b.

Repp (1997a) found that the variation in peak sound level over the keyboard changed significantly with microphone position. To replicate this finding, the two channels of the recording accomplished on January 7, 2002 were compared with an earlier recording performed on November 6, 2001 on the same Bösendorfer SE grand piano.33 In the latter recording session, tones from C4 (60) to C6 (84) were

33The recording equipment was identical to that of the recording session on January 7, 2002. The microphones were placed closer to the strings (approximately 1 meter from the strings at


Table 2.4: Mean correlation coefficients between (eight) lines of equal MIDI velocity (from 20 to 90 in steps of 10 units) of the four sources (two channels of two recording sessions). The displayed coefficients were averaged over 21 (auto-correlation) or 64 coefficients.

                 7.Jan’02 Ch.1   7.Jan’02 Ch.2   6.Nov’01 Ch.1   6.Nov’01 Ch.2
  7.Jan’02 Ch.1         0.9546          0.5221         −0.1023          0.0543
  7.Jan’02 Ch.2                         0.9708          0.1690          0.2601
  6.Nov’01 Ch.1                                         0.8604          0.0759
  6.Nov’01 Ch.2                                                         0.8023

produced with all MIDI velocities ranging from 20 to 90 units, again with each tone lasting for 300 ms. Selected lines of equal MIDI velocity (20, 40, 60, and 80 MIDI velocity units) from this recording are compared to the more recent recording in Figure 2.21.

The lines of the different sources did not show parallel behaviour. A tone which had a peak in one channel did not necessarily have a peak in another. To quantify the relations of the four sets of lines to each other, all lines of equal MIDI velocity (from 20 to 90 in steps of 10 units) from the four sources (two channels of two recording sessions) were correlated with each other. The result was a correlation matrix of 32 by 32 coefficients. The mean coefficients for each source are listed in Table 2.4. The mean correlations of the eight lines of equal MIDI velocity with lines of their own group showed high correlation coefficients, whereas no other combination yielded a significant correlation coefficient (except the two channels of the 2002 recording).
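Computing such a correlation matrix and the per-source means can be sketched as follows; the data here are synthetic (two sources sharing one intensity pattern, one source with an independent pattern), standing in for the measured lines of equal MIDI velocity.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pitches, n_lines = 25, 8          # pitches per line, lines per source

# Two sources sharing the same intensity pattern over pitch (as if
# recorded from the same microphone position) and one independent source.
pattern_a = rng.normal(0.0, 1.0, n_pitches)
pattern_b = rng.normal(0.0, 1.0, n_pitches)
src1 = pattern_a + rng.normal(0.0, 0.3, (n_lines, n_pitches))
src2 = pattern_a + rng.normal(0.0, 0.3, (n_lines, n_pitches))
src3 = pattern_b + rng.normal(0.0, 0.3, (n_lines, n_pitches))

lines = np.vstack([src1, src2, src3])   # 24 lines x 25 pitches
corr = np.corrcoef(lines)               # 24 x 24 correlation matrix

def mean_between(i, j):
    """Mean correlation between the lines of source i and source j."""
    return corr[i*n_lines:(i+1)*n_lines, j*n_lines:(j+1)*n_lines].mean()

print(mean_between(0, 1))  # high: sources share the same pattern
print(mean_between(0, 2))  # low: independent patterns
```

This mirrors the structure of Table 2.4: high mean correlations within and between sources that share a microphone position, and near-zero correlations otherwise.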

This finding suggests that the digitised samples of the recorded sound exhibited a consistent intensity pattern when recorded from exactly the same position, but may show a totally different pattern with another microphone position. No ultimate conclusions can be drawn from the peak sound levels of these recordings other than that the intrinsic intensity response of a given piano cannot be derived from them, at least not with these methods. Nevertheless, for the purpose of the study reported in Sections 4.4 and 4.5, the peak sound level for samples from only a single source (one channel) was sufficiently reliable.

The perceived dynamics of piano tones can be partly independent of their sound level (Parncutt and Troup, 2002). Although changes in dynamics usually result in changes both in loudness and in timbre, listeners may rely more on the timbral information in order to compensate for loudness differences caused by varying distance to the source or varying recording levels. Imagine a piano played by someone in the room next to you. Although the level is not as loud as if you were in the room in which the piano is played, you will be able to tell how loud the pianist played (e.g., fortissimo or mezzo forte). Similarly, when listening to a piano recording on a stereo system, you can turn up and down the



volume and still hear (possibly after a short moment of adaptation) exactly what dynamics, what timbral intensity the piano was played with.

To overcome the above-mentioned problems of inferring the dynamic level of a piano

from loudness information derived from recorded samples, a perceptual scale of piano dynamics is suggested here that includes timbral models of the piano tone for each pitch over the whole dynamic range, in order to deduce intensity information independently of the sound level of the signal. However, the author is aware of the problems arising from the sympathetic vibration of more than one piano tone at a time and from the use of the pedals. For this reason too, such a scale must remain a topic for future investigation.
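A small numerical demonstration of why a timbral feature, unlike sound level, survives changes of playback volume: the spectral centroid of a signal is unchanged when the whole signal is scaled. The "piano tone" below is a hypothetical toy signal (fundamental plus decaying partials), not a model of an actual piano; a real perceptual scale would need full timbral models per pitch, as proposed above.

```python
import numpy as np

def spectral_centroid(x, sr):
    """Amplitude-weighted mean frequency of the magnitude spectrum (Hz)."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return np.sum(freqs * mag) / np.sum(mag)

sr = 44100
t = np.arange(sr) / sr
# Toy "piano tone": 220 Hz fundamental plus upper partials with 1/k amplitudes.
tone = sum((1.0 / k) * np.sin(2 * np.pi * 220 * k * t) for k in range(1, 9))

c_loud = spectral_centroid(tone, sr)
c_soft = spectral_centroid(0.05 * tone, sr)  # same tone at much lower volume
print(c_loud, c_soft)  # identical: the centroid ignores overall level
```

Scaling cancels in the ratio, so the centroid carries only spectral-shape (timbral) information; a tone actually played louder would have stronger upper partials and hence a higher centroid.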


Chapter 3

Bringing Out the Melody in Homophonic Music—Production Experiment

This chapter reports research already published in Goebl (2000, 2001). As reported in the recent literature on piano performance research, the melody—as the most important voice—is not only played louder, but also around 30 ms earlier than the other voices (melody lead). This effect is generally associated with, and presumably causally related to, differences in hammer velocity between the melody and the accompaniment (velocity artifact). The velocity artifact explanation implies that pianists initially strike the keys in synchrony; it is only the different velocities that make the hammers arrive at the strings at different points in time.

Two pieces by Frédéric Chopin were performed on a Bösendorfer computer-controlled grand piano (SE290) by 22 skilled pianists. The performance data were investigated with respect to the relative tone-onset timing (tone-onset asynchrony) and the dynamic differences between the melody tones and the accompaniment. Furthermore, this study examined the asynchronies at the beginning of the key movement (finger–key). These asynchronies were estimated through calculation: Goebl (2000, 2001) used information from an internal memory chip of the Bösendorfer SE system, in which the system stores internal calibration measurements of how long the hammer of each key needs to travel from its resting position to the string contact, in relation to the final hammer velocity, which is also measured. This information was extracted with the help of the SE developer Wayne Stahnke, who, however, never confirmed this interpretation of the data.

Since the first publication, the timing properties of the piano action, and especially the travel time functions, have been studied in detail with an extended measurement setup (as reported in Chapter 2). The results from Goebl (2000, 2001) were adjusted with these more recent travel time functions and are reported in Section 3.6 (p. 74).



3.1 Introduction

Simultaneous notes in the printed score (chords) are not played strictly simultaneously by pianists. An emphasised voice is not only played louder, but additionally precedes the other voices, typically by around 30 ms; this phenomenon is referred to as melody lead (Hartmann, 1932; Vernon, 1937; Palmer, 1989, 1996; Repp, 1996a). It is still unclear whether this phenomenon is part of pianists' deliberate expressive strategies and used independently from other expressive parameters (Palmer, 1996), or whether it is mostly due to the timing characteristics of the piano action (velocity artifact, Repp, 1996a) and thus a result of the dynamic differentiation of the different voices. Especially in chords played by the right hand, high correlations between hammer velocity differences and melody lead times (between melody notes and accompaniment) seem to confirm this velocity artifact explanation (Repp, 1996a).

The data used in previous studies, derived mostly from computer-monitored pianos, represent asynchronies at the hammer–string contact points. The present study examined asynchrony patterns at the finger–key contact points as well. These finger–key asynchronies represent what pianists initially do when striking chords. If the velocity artifact explanation is correct, the melody lead phenomenon should disappear at the finger–key level: pianists would strike the keys almost simultaneously, and it would be only the different dynamics (velocities) that produce the typical hammer–string asynchronies (melody lead).

3.1.1 Background

In considering note onset asynchronies, one has to differentiate between asynchronies that are indicated in the score (arpeggios, appoggiaturas) and asynchronies that are performed but not specially marked in the score. The latter come in two kinds: (1) the melody precedes the other voices by about 30 ms on average (melody lead), or (2) the melody lags behind the other voices. Asynchronies of the second type occur mainly between the two hands and usually show much larger timing differences (over 50 ms). A typical example is a bass note played clearly before the melody (melody lag or bass lead), which is well known from old recordings of piano performances but has been observed in contemporary performances too (Palmer, 1989; Repp, 1996a). Asynchronies of the first type are common within one hand (especially within the right hand, as the melody is often the highest voice), but may also occur between the hands.

Note asynchronies have been studied since the 1930s, when Hartmann (1932) and the Seashore group (Vernon, 1937) conducted the first objective investigations of piano performances. Hartmann used piano rolls as a data source and found mostly asynchronies of the second type. Vernon (1937) differentiated between asynchronies within one hand and asynchronies between the hands. For the former he observed melody lead (type 1), whereas the latter mainly showed bass note anticipation (type 2).


In the recent literature, Palmer (1989, 1996) and Repp (1996a) have studied the melody lead phenomenon. Palmer (1989) used electronic keyboard recordings to analyse chord asynchronies among other issues. Six pianists played the beginnings of the Mozart Sonata K. 331 and of Brahms' Intermezzo op. 117/1 ("Schlaf sanft, mein Kind..."). The melody led by about 20 to 30 ms on average; this effect decreased for deliberately 'unmusical' performances and for melody voices in the middle of a chord (Brahms op. 117/1). In a second study, melody lead was investigated exclusively (Palmer, 1996). Six pianists played the first section of Chopin's Prelude op. 28/15 and the initial 16 bars of Beethoven's Bagatelle op. 126/1 on a Bösendorfer computer-monitored grand piano (SE290, as in the current study). Again, melody lead was found to increase with intended expressiveness, with familiarity with a piece (the Bagatelle was sight-read and then repeated several times), and with skill level (expert pianists showed a larger melody lead than student pianists).

In another study published at the same time, partly with the same music, Repp (1996a) analysed 30 performances by 10 pianists of the whole Chopin Prelude op. 28/15, a Prelude by Debussy, and Träumerei by Schumann on a Yamaha upright Disklavier. To reduce random variation, Repp averaged over the three performances produced by each pianist. He then calculated timing differences between the (right-hand) melody and each other voice, so that asynchronies within the right hand and between the hands could be treated separately. He argued that melody lead could be explained mostly as a consequence of dynamic differences between melody and accompaniment: dynamic differences (differences in MIDI velocity) were positively correlated with timing differences between the melody and each of the other voices, and these correlations were generally higher for asynchronies within the right hand than for those between the hands.

Palmer (1996) also computed correlations between melody lead and the average hammer velocity difference between melody and accompaniment, but her correlations were mostly non-significant. In her view, the anticipation of the melody voice is primarily an expressive strategy that is used independently from other performance parameters such as intensity, articulation, and pedal use. In a perception test, listeners had to identify the intended melody in a multi-voiced piece by rating different artificial versions: one with intensity differences and melody lead, one with melody lead only, and one without any such differences. Melody identification was good for the original condition (melody lead and intensity differences), but the results in the melody-lead-only condition did not differ much from those in the neutral condition, especially for non-pianist listeners. Only pianist listeners showed some success in identifying the intended melody from melody leads alone. A condition with intensity differences only was not included (Palmer, 1996, p. 47).

3.1.2 Piano action timing properties

The temporal properties of the piano action were explained in detail in Section 2.2.1 (p. 10) and will not be discussed here again.


3.2 Aims

Almost nothing is known about asynchronies at the finger–key level, because none of the instruments used for acquiring performance data measure this parameter. However, to clarify the origin of melody lead, it is important to consider exactly these finger–key asynchronies. When pianists stress one voice in a chord, do they hit the keys asynchronously, or do their fingers push the keys down at the same time but with different velocities, so that the hammers arrive at the strings at different points in time?

To examine this question, it is necessary to determine the finger–key contact times. One possibility might be to observe finger–key contacts with a video camera or with special electronic measurements at the keyboard. In this study, the finger–key contacts were inferred from the time the hammer takes to travel from its resting position to the strings at different final hammer velocities (timing correction curve). With the help of this function, the finger–key contacts could be accurately estimated; moreover, the size of the expected melody lead effect in milliseconds could be predicted from the velocity differences between the voices, assuming simultaneous finger–key contacts.

3.3 Method

3.3.1 Materials and participants

The Etude op. 10, No. 3 (first 21 measures, Figure 3.1) and the Ballade op. 38 (initial section, bars 1 to 45, Figure 3.2) by Frédéric Chopin were recorded on a Bösendorfer SE290 computer-monitored concert grand piano1 by 22 skilled pianists (9 female and 13 male).2 They were professional pianists, graduate students, or professors at the Universität für Musik und darstellende Kunst (University of Music and Performing Arts) in Vienna. They received the scores several days before the recording session, but were nevertheless allowed to use them during the recording. Their average age was 27 years (the youngest was 19, the oldest 51). They had received their first piano lesson at six and a half years of age on average and had received piano instruction for a mean of 22 years (standard deviation = 7); 8 of them had already finished their studies, and about half of them played more than 10 public concerts per year.

After the recording, the pianists were asked to play the initial 9 bars of the Ballade in two additional versions: first with a particularly emphasised highest voice (voice 1, see Figure 3.2), and second with an emphasised third voice (the lowest voice in the upper stave, also played by the right hand, see Figure 3.2). The

1This grand piano is situated at the Bösendorfer company in Vienna (4th district, Graf-Starhemberggasse 14); it has the internal Bösendorfer number 19–8974 and was built in August 1986 (only pianos that are sold outside the company receive serial numbers).

2The recordings were made between January 13 and February 9, 1999 (see Goebl, 1999a,b).


Figure 3.1: Frédéric Chopin. Beginning of the Etude in E major, op. 10, No. 3. The numbers against the note heads are voice numbers (soprano: 1, ..., bass: 7). (Score prepared with computer software by the author following the Paderewski Edition.)

purpose of these special versions was to investigate how pianists change the melody lead and the dynamic shaping of the voices when they are explicitly advised to emphasise one particular voice.

All performance sessions were recorded onto digital audio tape (DAT), and the performance data from the Bösendorfer grand piano were stored on a PC's hard disk. The performances were consistently of a very high pianistic and musical level.3 At the end of the session, the participants filled in a questionnaire. The pianists were not paid for their services.

3All recordings can be downloaded in MP3 format from http://www.ai.univie.ac.at/~wernerg.


Figure 3.2: Frédéric Chopin. The beginning of the second Ballade op. 38 in F major. The voices are numbered as in Figure 3.1, but the highest voice number is now 5 for the bass. (Score prepared with computer software by the author following the Henle Urtext Edition.)


Figure 3.3: The timing characteristics of a grand piano action: the hammer travel times (ms) as a function of final hammer velocity (m/s). This timing correction curve (TCC) was fitted to average data derived from an EPROM chip of the Bösendorfer SE system (cf. Figure 2.5 on p. 23). The y axis represents the time interval between the finger–key contact times (measured 2–3 mm below the key surface) and the hammer–string contact times.

3.3.2 Apparatus

To provide accurate performance data, a Bösendorfer SE290 Imperial computer-monitored concert grand piano was used.4 The precise functionality of the Bösendorfer SE system is described in Section 2.3.

3.3.3 Procedure

Note onsets and the hammer velocity information were extracted from the performance data. These data were matched to a symbolic score in which each voice was individually indexed, beginning with 1 for the highest voice5 (see Figures 3.1 and 3.2). Wrong notes (substitutions) and missing notes (deletions) were marked as such. The rate of missing or wrongly played notes was very low: over all pianists, 0.43% for the Etude (of ntotal = 9988), 0.69% for the Ballade (of ntotal = 16082), and 1.75% for the two repeated versions of the Ballade (of ntotal = 5764).6

Timing differences and hammer velocity differences between the first voice (melody) and each other voice were calculated separately for all nominally simultaneous

4“SE” stands for Stahnke Electronics; 290 indicates the length of the piano in cm.

5The lowest voice played by the right hand was called 3. If there were three simultaneous notes in the right hand, the middle one was labeled 2. The highest voice played by the left hand was indexed 4, the bass line 5 in the Ballade and 7 in the Etude. Voices 5 and 6 in the Etude occurred only in measures 16 and 17. In the Ballade, there was only one chord (bar 19) with three simultaneous notes in the left hand; here, the two higher notes were labeled 4, the bass 5.

6Additional notes (insertions) that were so soft (or silent) that they did not disturb the performance and were apparently not perceived as mistakes were not counted as errors. In the Etude we observed 181 such notes over the 22 performances (+1.8%), in the Ballade 189 (+1.17%). Similar observations were made by Repp (1996c).


Figure 3.4: Finger–key times calculation procedure. A typical example of a four-voiced chord with melody lead (at hammer–string level, closed circles), and the estimation of its finger–key contact times (open circles) according to the TCC (see Figure 3.3).

events in the score. All missing or wrong notes, as well as chords marked in the score as arpeggio (Ballade) or as appoggiatura (Etude), were excluded.7 The finger–key contact times were calculated for each note by subtracting from the hammer–string impact time the corresponding travel time, which was determined by the TCC (see Figure 3.3). From this, finger–key asynchronies were calculated, again between voice 1 and all other voices, separately for all nominally simultaneous events in the score. The calculation procedure is sketched in Figure 3.4.
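The subtraction step can be sketched as follows. The inverse-power form of the TCC and its parameters `a` and `b` are illustrative assumptions only (the actual TCC was fitted to Bösendorfer SE calibration data), and the chord below is a made-up example resembling Figure 3.4, not measured data:

```python
import numpy as np

def travel_time_ms(fhv, a=89.0, b=0.57):
    """Assumed timing correction curve (TCC): hammer travel time in ms as an
    inverse power of final hammer velocity (m/s). Parameters are illustrative."""
    return a * fhv ** (-b)

# A hypothetical four-voiced chord: hammer-string onsets (ms) and final
# hammer velocities (m/s); voice 1 (the melody) arrives first and loudest.
hs_onset = np.array([6950.0, 6980.0, 6985.0, 6988.0])
fhv      = np.array([0.99, 0.55, 0.49, 0.47])

# Estimated finger-key contact: hammer-string time minus travel time.
fk_onset = hs_onset - travel_time_ms(fhv)

# Asynchronies of voices 2..4 relative to the melody (positive = melody lead).
hs_async = hs_onset[1:] - hs_onset[0]
fk_async = fk_onset[1:] - fk_onset[0]
print(hs_async, fk_async)
```

Because the softer accompaniment hammers travel longer, the 30–38 ms melody lead at hammer–string level shrinks to near zero at the estimated finger–key level, which is exactly the pattern the velocity-artifact hypothesis predicts.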

3.4 Results

Figure 3.5 shows the mean velocity profiles (top graphs) as well as the mean asynchrony profiles (bottom graphs) of the 22 performances of the Ballade and the Etude, together with their overall averages. All pianists played the first voice consistently louder than the other voices; none of them chose another voice to be played as the loudest. The velocity levels of the individual voices were fairly constant in the performances of the Ballade, so averaging over all notes in a voice made sense. In the performances of the Etude, the dynamic climax in bar 17 caused a strong increase in the velocity values. Therefore, in Figure 3.5 the section from bar 14 to 18 was averaged separately and was not included in the overall average. Again, the first voice clearly showed the highest velocity values.

The two bottom graphs in Figure 3.5 show the hammer–string and the finger–key asynchrony profiles for the two pieces. The thicker lines with the standard deviation bars represent the average of the mean asynchrony profiles of the 22 performances (thin lines without symbols).

In the hammer–string domain, the melody preceded the other voices, as expected, by about 20–30 ms. In the Ballade, the asynchrony profiles of the individual performances were very similar to each other, and the melody lead was slightly greater relative to the left-hand voices than to the right-hand voices. The individual chord

7The excluded events for the Etude were ([bar number].[relative position in the bar]): 7.75, 8.25, and 21.0; for the Ballade: 18.5, 20.5, 40.0, and 45.0.


Figure 3.5: The individual and mean final hammer velocity (FHV) and asynchrony profiles (with standard deviation bars) of 22 performances for the Etude (left-hand panel) and the Ballade (right). In the top panels, the mean intensity values by pianist and voice are plotted; the thicker lines with squares indicate the average across pianists. In the Etude, bars 14–18 are averaged separately. The profiles at the bottom show the averaged timing delays of the voices relative to voice 1. Solid lines represent hammer–string (“hs”) asynchronies, dashed lines inferred finger–key (“fk”) asynchronies. The horizontal bars are standard deviations, computed across individual performers.


profiles for the Etude showed more variability among pianists, especially in the left hand, where the bass voice (7) tended to lead for some pianists (for an example, see below).

The asynchronies at the finger–key level (Figure 3.5, dashed lines, average with circles) were consistently smaller than those at the hammer–string level. In particular, the melody lead within the right hand was reduced to about zero, whereas the left hand tended to lead the right hand. Two repeated-measures analyses of variance (ANOVAs) on the average melody leads for each voice in each performance, with type of asynchrony (hammer–string vs. finger–key) and voice (2 to 5 in the Ballade, 2 to 7 in the Etude) as within-subject factors, conducted separately for the two pieces, showed significant main effects of type of melody lead and significant interactions between type and voice.8

A clear outlier was Pianist 3, who played the melody 40–70 ms before the accompaniment, as shown in Figure 3.6. This was a deliberate strategy that Pianist 3 habitually uses to emphasise the melody; in personal communication, he confirmed this habit and called it a personal speciality. His finger–key profiles still showed a melody lead of about 20 ms and more. A similar but smaller tendency was shown by two other pianists. This finding suggests that melody lead can be applied deliberately and used as an expressive device—in addition to a dynamic differentiation—to highlight the melody. We argue here that, when melody lead is used as a conscious expressive device, it should be observable at the finger–key level. This strategy seems to be fairly rare.

The results of the two emphasised versions of the first 9 bars of the Ballade are shown in Figure 3.7. In the top graphs, the mean intensity values are plotted by voice. In the first-voice version (top left graph), the emphasised voice was played louder than in the normal version (mean FHV 1.28 m/s versus 1.01 m/s), while the accompaniment maintained its dynamic range. The melody lead increased up to 40 to 50 ms (Figure 3.7, bottom left graph).

When the third voice was emphasised, that voice was played loudest (at about 1.12 m/s FHV on average), with the melody somewhat attenuated (0.84 m/s) and the other voices as usual (top right graph). The third voice led the first voice by about 20 ms, while the left hand lagged by about 40 ms (Figure 3.7). Thus, when pianists are asked to emphasise one voice, they play this voice louder, and the timing differences change correspondingly.

The first nine bars of the (normal version of the) Ballade were compared with these two special versions (Ballade 1st voice, Ballade 3rd voice) with regard to hammer velocity and melody lead. A repeated-measures ANOVA on the average hammer velocities of each voice in each performance with instruction (normal, 1st, 3rd) and

8The repeated-measures ANOVA for the Ballade: significant effect of type [F(1, 21) = 718.2, p < .001], no significant effect of voice [F(3, 63) = 1.2, p > .05], and a significant interaction between type and voice [F(3, 63) = 112.3, p < .001]; for the Etude: significant effects of type [F(1, 21) = 603.9, p < .001] and voice [F(5, 105) = 5.59, p < .002], and an interaction between type and voice [F(5, 105) = 34.83, p < .001].


Figure 3.6: The asynchrony profiles of pianist 3 (with standard deviation bars) at hammer–string contact (solid lines with triangles) and finger–key contact (dashed lines with circles).

voice (1–5) as within-subject factors was conducted. Significant effects of instruction [F(2, 21) = 4.98, p < .05] and voice [F(4, 84) = 466.2, p < .001], and a significant interaction between instruction and voice [F(8, 168) = 88.58, p < .001], indicate that the pianists changed the dynamic shaping of the individual voices significantly. Another repeated-measures ANOVA was conducted on the melody leads averaged for each voice in each performance, again with instruction (normal, 1st, and 3rd) and voice (2–5) as within-subject factors. It showed significant effects of instruction [F(2, 42) = 114.41, p < .001] and voice [F(3, 63) = 24.12, p < .001], and an interaction between instruction and voice [F(6, 126) = 31.29, p < .001].

3.4.1 Relationship between velocity and timing

In general, the larger the dynamic differences, the greater the melody lead. The velocity differences between the first voice and the other notes were negatively correlated with the timing differences. The mean correlation coefficients across the 22 pianists are shown in Table 3.1a, separately for each piece


Figure 3.7: Average velocity and asynchrony profiles of the 22 individual performances in the Ballade’s emphasized melody conditions. On the left-hand side, the first voice was emphasized, on the right, the third voice. The solid lines indicate hammer–string (“hs”) contacts, dashed lines finger–key (“fk”) contacts.


Table 3.1: (a) Mean correlation coefficients, with standard deviations (s.d.), between melody lead and final hammer velocity differences across 22 pianists. nmax indicates the maximum number of note pairs that went into the computation of each correlation (missing notes reduced this number in some individual performances). #r∗∗ indicates the number of highly significant (p < 0.01) individual correlations (#r∗∗max = 22). (b) The mean correlation coefficients, with standard deviations (s.d.), between observed and predicted melody lead across 22 pianists, and the number of highly significant (p < 0.01) correlations of the pianists (#r∗∗).

               Etude                Ballade             Ballade 1st voice    Ballade 3rd voice
          right h.  left h.    right h.  left h.     right h.  left h.    right h.  left h.
(a)
nmax         126      103         181      269           29       58          29       58
mean       −0.45    −0.15       −0.42    −0.31        −0.55    −0.29       −0.73    −0.53
s.d.        0.12     0.20        0.13     0.12         0.17     0.22        0.14     0.17
#r∗∗          21        2          22       20           16       12          22       18

(b)
nmax         126      103         181      269           29       58          29       58
mean        0.66     0.34        0.58     0.50         0.72     0.55        0.79     0.63
s.d.        0.10     0.23        0.13     0.13         0.17     0.22        0.11     0.13
#r∗∗          22       14          22       22           21       21          22       22

and for right-hand (within-hand) and left-hand (between-hand) comparisons.9

The within-hand coefficients were substantially higher than the between-hand coefficients. This suggests a greater independence between the hands than between the fingers of a single hand. Especially in the Etude, almost all of the between-hand coefficients were non-significant (with the exception of two pianists). The coefficients for the special versions were slightly higher than those for the ‘normal’ version.

These correlation coefficients assume a linear relationship between the melody leads and the velocity differences. However, the expected effect resulting from the timing properties of the piano action (velocity artifact) does not follow a linear but rather an inverse power relation (see Figure 3.3). To test for the presence of this effect in the data, the observed timing differences were correlated with the timing differences predicted by the TCC (Table 3.1b). These correlations were generally higher than the correlations between timing differences and final hammer velocity differences. Eighty-seven out of 88 individual coefficients were highly significant for the right

9 The negative sign of the correlation coefficients stems from the way the timing and velocity differences are calculated and has no relevance for the interpretation of the data: the onset time of the corresponding melody note (t_1) is subtracted from the onset time of each accompanying note (t_n), i.e. t_n − t_1, so melody lead is positive. The velocity differences are calculated analogously as v_n − v_1, which yields negative values. Therefore, the correlation coefficients between melody leads and velocity differences are negative, whereas the coefficients between observed and predicted melody leads are positive.

70 Chapter 3. Production of Melody

hand. This result shows that the connection between melody lead and intensity variation is explained even better by the velocity artifact than by a linear correlation, as assumed in previous studies (Palmer, 1996; Repp, 1996a).
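The correlation check described above can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis code: only the TCC power curve t = 89.16·v^(−0.570) comes from the text, while the example onset and velocity values are invented.

```python
import math

def travel_time(fhv, a=89.16, b=-0.570):
    """Hammer travel time in ms as a function of final hammer velocity (m/s), per the TCC."""
    return a * fhv ** b

def predicted_melody_lead(fhv_accomp, fhv_melody):
    """Predicted asynchrony (ms): the slower accompaniment hammer arrives later."""
    return travel_time(fhv_accomp) - travel_time(fhv_melody)

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient (no external libraries)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example data: melody fhv fixed at 2.0 m/s, accompaniment softer.
accomp_fhv = [0.5, 0.8, 1.0, 1.2, 1.5]
predicted = [predicted_melody_lead(v, 2.0) for v in accomp_fhv]
observed = [30.0, 21.0, 16.0, 11.0, 7.0]  # hypothetical measured melody leads (ms)
print(round(pearson_r(observed, predicted), 3))
```

With such data, the observed leads correlate strongly with the inverse-power predictions even though a straight-line fit against velocity differences would be less exact, mirroring the argument above.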

Some of the individual left-hand correlation coefficients between observed and predicted melody lead were non-significant in the Etude, but not in the Ballade or in the special versions (Table 3.1b). This reflects not only the general trend of larger between-hand asynchrony variability, but also the large bass anticipations—the type 2 asynchronies mentioned above—played by some pianists, who clearly struck some bass notes earlier. To illustrate these bass anticipations, the beginning of the Etude performed by pianist 5 is shown in Figure 3.8. In the bottom panel of Figure 3.8, five bass leads can be observed: two are quite small (bars 6 and 7, about 35–40 ms), two are somewhat larger (bars 2 and 8, about 75 ms), and one is huge (bar 9, 185 ms). All bass leads are even larger in the finger–key domain (see Figure 3.8, open symbols). In this example, most of the large bass leads occur at metrically important events. These bass leads are clearly perceivable and often exceed the range of the melody leads.

3.5 Discussion

In this study, a large, high-quality set of performance data was analysed. In addition to measuring asynchronies at the hammer–string impact level, we estimated by calculation the asynchronies at the start of the key acceleration (finger–key level). The hypothesis that melody lead occurs as a consequence of dynamic differentiation was supported in three ways.

1. The consistently high correlations between hammer–string asynchronies and dynamic differences show the overall connection between melody lead and velocity difference. The more the melody is dynamically separated from the accompaniment, the more it precedes it. These findings replicate those of Repp (1996a).

2. In addition, the estimated finger–key asynchronies show that, with few exceptions, the melody lead phenomenon disappears at the finger–key level. Pianists start to strike the keys almost synchronously, but the different velocities cause the hammers to arrive at the strings at different points in time.

3. With the help of the timing correction curve (TCC), melody lead was predicted in milliseconds. The correlations between this predicted and the observed melody lead were even higher than those between velocity differences and melody lead. Differences in hammer velocity account for about half of the variance in asynchronies in the data; the remaining variance could be due to deliberate expression or to motor noise.


Figure 3.8: The dynamic profiles (top panel) and the note onset asynchronies (bottom panel) for the first bars of the Etude op. 10, No. 3 as performed by pianist 5. In the top panel, the final hammer velocity (FHV, in m/s) is plotted against nominal score time (bars), with each voice plotted separately; the melody is played clearly more loudly than the other voices. The bottom panel shows the time delay (ms) of each note relative to the onset time of the corresponding melody note (voice 1). Closed symbols represent hammer–string asynchronies; open symbols the estimated finger–key contact times.


The findings of this study are consistent with the interpretation of Repp (1996a; the velocity-artifact explanation) rather than with that of Palmer (1989, 1996), who regarded melody lead as being produced independently of other expressive parameters (e.g. dynamics, articulation). Of course, it remains true that melody lead can help a listener to identify the melody in a multi-voiced musical environment. Temporally offset elements tend to be perceived as belonging to separate streams (stream segregation; Bregman, 1990), and spectral masking effects are diminished by asynchronous onsets (Rasch, 1978, 1979, 1988). But in the light of the present data, perceptual segregation is not the main reason for melody lead. The temporal shift of the melody is primarily a result of the dynamic differentiation of the voices, but both phenomena have similar perceptual results, namely the separation of melody from accompaniment.

Nevertheless, pianists clearly played asynchronously in some cases: some bass notes were struck before the melody. Bass leads were usually around 50 ms and extended up to 180 ms in some cases. These distinct anticipations seem to be produced intentionally, although probably without immediate awareness. Such bass leads are well documented in the literature, not only as a habit of an older generation of pianists, but also in the performances of some of today's pianists (Palmer, 1989; Repp, 1996a).

The case of pianist 3 suggests that pianists can deliberately enlarge the melody lead if they wish to do so. In this case, melody lead was observable even in the finger–key domain. However, it does not seem possible for pianists to differentiate the voices of a chord dynamically without producing melody lead in the hammer–string domain; at least, no example in the present data demonstrated this.

In the examples of deliberately produced asynchronies (bass lead and enlarged melody lead), the extent of the asynchrony usually exceeded 30 ms. Such asynchronies may be regarded as a deliberate expressive device under the direct control of the pianists. According to the pianists in this study, they were produced in a somewhat subconscious way (personal communication with the pianists), but the pianists reported a general awareness of the use of these asynchronies and said that they could suppress them if they wanted to. However, the use of the ‘normal’ melody lead that was produced by all pianists was unconscious: pianists reported that they emphasise one voice by playing it louder, but not earlier (as was also reported by Palmer, 1989, p. 335).

The asynchronies in the finger–key domain were computed using a timing correction curve, which gives the time interval from key press to hammer–string impact as a function of final hammer velocity. The key shutter reacts when the key is depressed by about 2 to 3 mm (the touch depth of the key is usually about 9.5 mm, varying slightly across pianos; Askenfelt and Jansson, 1991, p. 2383). Thus, to be precise, the finger–key domain represents the points in time at which the keys are depressed by 2 to 3 mm. However, almost nothing is known about how keys are accelerated and released in reality. Very precise acceleration measurements by Van den Berghe et al. (1995, p. 17) show that keys are sometimes not released entirely, especially in repetitions. The modern piano action has a double repetition feature that allows a second strike without the key being released entirely. If the system measured onsets close to the zero position, some onsets would not be detected as such. Nevertheless, the level of 2 to 3 mm below zero still gives a good impression of the asynchronies at the start of a key acceleration. For more accurate statements about played and perceived onset asynchronies in piano performance, acceleration measurements at different points in the piano action would have to be evaluated.

This study was concerned with the particular properties of the piano. Other

keyboard actions (harpsichord, organ) may have similar timing properties as far as the key itself is concerned (a key that is depressed faster reaches the keybed earlier than a slower one), but their actions respond differently because of their different ways of producing sound: the harpsichord plucks the strings, and on the organ a pipe valve is opened or closed. Moreover, they do not allow continuous dynamic differentiation as the piano does, so performers may choose timing as a means to separate voices. There is also a difference in repertoire: homophonic textures, like the Chopin excerpts used in this study, are seldom found in the harpsichord or organ repertoires.

According to Vladimir Horowitz, when accenting a tone within a chord one should “raise the whole arm with as little muscular effort as possible, until the fingers are between three and five inches above the key. During the up and down movements of the arm, prepare the fingers by placing them in position for the depression of the next group of notes and by holding the finger which is to play the melody-note a trifle lower and firmer than the other fingers which are to depress the remaining keys of the chord.” (Eisenberg, 1928).10 This would suggest that an asynchrony at the key is intended, but Horowitz goes on: “The reason for holding the finger a trifle lower is only psychological in effect; in actual practice, it isn’t altogether necessary. Experience shows that in the beginning it is almost impossible to get a student to hold one finger more firmly than the others unless he is also permitted to hold it in a somewhat different position from the others. Holding it a little lower does not change the quality or quantity of tone produced and does not affect the playing in any way but it does put the student’s mind at greater ease” (Eisenberg, 1928). Like the pianists in the present study, Horowitz is aiming at intensity differences here, not at differences in timing: “The finger which is held a trifle lower and much firmer naturally strikes the key a much firmer blow than do the more relaxed fingers which do not overcome the resistance of the key as easily as does the more firmly held finger. The tone produced by the key so depressed is therefore stronger than the others” (Eisenberg, 1928). These quotes suggest that Horowitz was either unaware of the consequences of his recommendation for onset synchrony, or that he did not regard onset asynchrony as an important goal.

10 This article may be found at http://users.bigpond.net.au/nettheim/horo28.htm.


3.6 Finger–key contact estimation with alternative travel time functions

The timing correction curve (TCC) used in Section 3.3.3 (p. 63) was replicated by extensive measurements on the same grand piano in Section 2.2.3 (p. 21). The present section presents finger–key chord profiles inferred through the travel time approximations obtained in Section 2.2.3.

In Figure 3.9, the three travel time approximations listed in Table 2.1 (p. 22) are plotted together with the TCC used in Section 3.3.3 (t = 89.16 · h^(−0.570); see Figure 3.3, p. 63 and Figure 2.4, p. 20). It is evident that the TCC and the legato curve fit were very similar to each other, as were the staccato and the reproduction curve fits (also reflected in the coefficients in Table 2.1).

It was surprising that the TCC obtained from the internal calibration function coincided better with the legato approximation than with the staccato or the reproduction curve fits. The internal sensor below the key reacts at about 2 mm key depression (see Figure 2.1, p. 16); thus the registered TCC would be expected to show travel times shorter than those measured with the accelerometer setup, where the finger–key contact was determined as the beginning of the key movement, i.e. at 0 mm key depression. However, since there is some uncertainty in obtaining and interpreting the TCC from the Bösendorfer's internal calibration mode, the finger–key approximations displayed in Figures 3.5 and 3.7 were re-calculated with the curve approximations of the legato and the staccato data, using the same procedure as in Section 3.3.3 (see also Figure 3.4, p. 64).
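The re-calculation just described, subtracting a velocity-dependent travel time from each hammer–string onset to estimate the finger–key time, can be sketched as follows. Only the coefficient pairs of the power-curve fits come from the text (Table 2.1); the onset times, velocities, and function names are invented for illustration.

```python
# Power-curve fits t = a * v**b for the hammer travel time (ms),
# as reported in the text for the four approximations.
FITS = {
    "TCC":          (89.16, -0.570),
    "legato":       (89.96, -0.5595),
    "staccato":     (58.39, -0.7377),
    "reproduction": (60.90, -0.7731),
}

def finger_key_time(hs_time_ms, fhv, fit="TCC"):
    """Estimate the finger–key contact time by subtracting the travel time
    for the given final hammer velocity (m/s) from the hammer–string onset."""
    a, b = FITS[fit]
    return hs_time_ms - a * fhv ** b

# A hypothetical chord: melody at 2.0 m/s, accompaniment at 0.9 m/s,
# hammer–string onsets at 0 and 25 ms (i.e. a 25-ms melody lead).
for fit in FITS:
    melody_fk = finger_key_time(0.0, 2.0, fit)
    accomp_fk = finger_key_time(25.0, 0.9, fit)
    print(fit, round(accomp_fk - melody_fk, 1))
```

With any of the four fits, the residual finger–key asynchrony of this hypothetical chord is much smaller than the 25-ms hammer–string melody lead, which is the pattern the re-calculated profiles show.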

Figure 3.9: Comparison of different travel time approximations, plotting travel time (ms) against hammer velocity (m/s). Displayed are the timing correction curve (TCC) as used in Section 3.3.3 and Goebl (2001), and the three power curves fitted to the legato, staccato, and reproduction data reported in Section 2.2.3 (see Table 2.1, p. 22). The TCC and the legato curve are almost identical, as are the staccato and the reproduction curves.


Figure 3.10: Grand average asynchrony profiles for the Etude (left) and the Ballade (right) for 22 pianists, showing asynchrony (ms) for each voice. The profiles at hammer–string level (“hs,” diamonds with solid line) and finger–key level (“fk,” circles with dotted line) are identical to those depicted in Figure 3.5 (p. 65). The profile with squares and a dash-dotted line represents finger–key times inferred with the power function of the staccato data; the profile with asterisks and a solid line shows those inferred with the legato data (see Figure 3.9).

In Figure 3.10, the grand average chord profiles at finger–key level inferred through the legato and the staccato curve approximations are displayed together with those already shown in Figure 3.7 (p. 68), separately for the Ballade and the Etude. Although there were considerable differences between the legato and the staccato curves (see Figure 3.9), the finger–key profiles were very similar to those inferred through the TCC. The two emphasised melody conditions of the first nine bars of the Ballade also showed the same behaviour with the two alternative travel time approximations (Figure 3.11); there too, the ‘old’ and the ‘new’ finger–key profiles coincided well. Thus, the basic findings of Goebl (2001) could be replicated here.

For completeness, the correlation coefficients between observed and predicted melody lead shown in Table 3.1 (p. 69) were also re-calculated separately for melody leads predicted through the different alternative travel time functions (legato, staccato, reproduction; see Figure 3.9). The mean correlation coefficients


Figure 3.11: Grand average asynchrony profiles for the emphasised melody conditions of the Ballade for 22 pianists, showing asynchrony (ms) for each voice. On the left-hand side, the first voice was emphasised; on the right, the third voice. The profiles at hammer–string level (“hs,” diamonds with solid line) and finger–key level (“fk,” circles with dotted line) are identical to those plotted in Figure 3.7 (p. 68). The profile with squares and a dash-dotted line represents finger–key times inferred with the power function of the staccato data; the profile with asterisks and a solid line shows those inferred with the legato data (see Figure 3.9).

across 22 performances, with standard deviations (s.d.), and the numbers of highly significant correlation coefficients are listed in Table 3.2, separately for the different travel time functions. The maximum number of pairs of observed and predicted melody leads differed among the 22 performances (due to missing or wrong notes), but was identical across the different travel time functions and is therefore listed only once in the table.

The results of the four different calculations are similar. One more coefficient was highly significant for the Etude's left hand with the staccato and reproduction travel time functions (15 instead of 14), while for the left hand of the Ballade's first-voice version the TCC yielded 21 significant coefficients against 20 for the other three functions. These minute differences do not affect the evidence given by the data or the conclusions drawn in Section 3.5 (p. 70).


Table 3.2: The mean correlation coefficients, with standard deviations (s.d.), between observed and predicted melody lead across 22 pianists, and the number of highly significant (p < 0.01) correlation coefficients of the 22 performances (#r∗∗), separately for the different travel time approximations. The TCC data are identical to Table 3.1b (p. 69). nmax indicates the maximum number of note pairs that went into the computation of each correlation (missing or wrong notes reduced this number for some individual performances).

                      Etude             Ballade           Ballade 1st voice   Ballade 3rd voice
                  right    left     right    left     right    left       right    left
                  hand     hand     hand     hand     hand     hand       hand     hand

nmax               126      103      181      269       29       58         29       58

TCC
     mean         0.66     0.34     0.58     0.50     0.72     0.55       0.79     0.63
     s.d.         0.10     0.23     0.13     0.13     0.17     0.22       0.11     0.13
     #r∗∗           22       14       22       22       21       21         22       22

Legato
     mean         0.66     0.34     0.58     0.50     0.73     0.55       0.79     0.63
     s.d.         0.10     0.23     0.13     0.13     0.17     0.22       0.11     0.13
     #r∗∗           22       14       22       22       21       20         22       22

Staccato
     mean         0.67     0.34     0.58     0.50     0.72     0.56       0.79     0.63
     s.d.         0.10     0.23     0.13     0.13     0.17     0.22       0.11     0.13
     #r∗∗           22       15       22       22       21       20         22       22

Reproduction
     mean         0.67     0.34     0.58     0.50     0.72     0.56       0.79     0.63
     s.d.         0.10     0.23     0.13     0.13     0.17     0.22       0.11     0.13
     #r∗∗           22       15       22       22       21       20         22       22

Thus, the findings derived through the debatable TCC in Sections 3.1–3.5 and Goebl (2001) could be put on firm ground with data obtained from an entirely different source, involving no uncertain procedural steps.


3.7 A model of melody lead

This section briefly describes a model of melody lead according to the velocity-artifact hypothesis, which assumes that melody lead occurs exclusively because of the different intensities of the tones in a chord. The model further assumes that the pianists start to depress the different keys of a chord simultaneously (at the finger–key level). Different intensities of the keystrokes result in different travel times, so the hammers arrive at the strings at different points in time; the faster a key is depressed, the earlier its hammer arrives at the strings (see Section 2.2.3, p. 21).

This model of melody lead simply takes approximations of the hammers' travel times and infers tone onset asynchronies (melody leads) from the different intensities of the chord tones.

Taking the TCC measured by the Bösendorfer system's calibration function as the cause of melody lead (ml, in milliseconds), with the final hammer velocity of the melody (fhv_1, in metres per second) and of an accompanying tone (fhv_n), melody lead is predicted by

ml = 89.16 · fhv_n^(−0.570) − 89.16 · fhv_1^(−0.570).    (3.1)

In this work, the mapping between MIDI velocity units and final hammer velocity (m/s) as measured by the Bösendorfer system was chosen to be

MIDIvel = 52 + 25 · log2(fhv),    (3.2)

thus the model with MIDI velocity units as input (MIDI_1 for the melody, MIDI_n for the softer accompaniment) is

ml = 89.16 · (2^((MIDI_n − 52)/25))^(−0.570) − 89.16 · (2^((MIDI_1 − 52)/25))^(−0.570).    (3.3)

The alternative travel time functions alter only the coefficients of the power-curve fit of the model. For completeness, they are listed below (cf. Table 2.1, p. 22).

1. With the curve fitted to the Bösendorfer legato data:

   ml = 89.96 · (2^((MIDI_n − 52)/25))^(−0.5595) − 89.96 · (2^((MIDI_1 − 52)/25))^(−0.5595)    (3.4)

2. With the curve fitted to the Bösendorfer staccato data:

   ml = 58.39 · (2^((MIDI_n − 52)/25))^(−0.7377) − 58.39 · (2^((MIDI_1 − 52)/25))^(−0.7377)    (3.5)

3. With the curve fitted to the Bösendorfer reproduction data:

   ml = 60.90 · (2^((MIDI_n − 52)/25))^(−0.7731) − 60.90 · (2^((MIDI_1 − 52)/25))^(−0.7731).    (3.6)
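The four model variants can be collected into a single function. This is a sketch: only the coefficient pairs and the velocity mapping of Eq. (3.2) come from the text, while the function and variable names are mine and the example note pair is invented.

```python
def fhv_from_midi(midi_vel):
    """Invert Eq. (3.2), MIDIvel = 52 + 25 * log2(fhv), to get fhv in m/s."""
    return 2 ** ((midi_vel - 52) / 25)

# (a, b) coefficients of the travel time power curve t = a * fhv**b
COEFFS = {
    "TCC":          (89.16, -0.570),
    "legato":       (89.96, -0.5595),
    "staccato":     (58.39, -0.7377),
    "reproduction": (60.90, -0.7731),
}

def melody_lead(midi_melody, midi_accomp, fit="TCC"):
    """Predicted melody lead (ms) for one melody/accompaniment note pair."""
    a, b = COEFFS[fit]
    travel = lambda v: a * fhv_from_midi(v) ** b
    return travel(midi_accomp) - travel(midi_melody)

# Hypothetical example: melody at MIDI velocity 70, accompaniment at 55.
print(round(melody_lead(70, 55), 1))
```

For equal velocities the prediction is zero, and the louder the melody relative to the accompaniment, the larger the predicted lead, as Eqs. (3.3)–(3.6) imply.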

Chapter 4

The Perception of Melody in Chord Progressions

This part discusses the perceptual side of bringing out the melody in piano performance. From the performance studies in Chapter 3, we learned that pianists play the voice intended to be heard prominently not only louder, but also slightly before the accompaniment (melody lead). In this chapter, this phenomenon is approached from the listener's perspective. The main question is to clarify the influence of relative asynchrony and of variation in tone intensity balance on the perceived salience of different voices in artificial music stimuli and in real music. A further interest of this chapter is whether asynchronies as small as those typically played by pianists are detected as such by listeners.

In the pilot experiment (Section 4.3, p. 87), two equally loud tones with asynchronies of up to ±50 ms are used to investigate the perceived loudness of the two tones (question 1) and their perceived order (question 2). Different types of tones are used (pure, sawtooth, MIDI-synthesised piano, and real piano) to test whether different attack curves change loudness perception or temporal order identification. Variation in the balance of the chord tones is added in the next series of three experiments (Experiments I–III, Section 4.4, p. 95). In Experiment I, participants adjust the relative level of two simultaneous tones (pure, sawtooth, and piano sound) until they sound equally loud. In Experiment II, they rate the relative loudness of the two tones of dyads whose relative timing and intensity are systematically manipulated by up to ±54 ms and ±20 MIDI velocity units. In Experiment III, listeners judge whether or not the stimuli of the previous experiment sound simultaneous. In another series of three experiments, the stimulus material is extended to three-tone piano chords and sequences of three-tone piano chords (Experiments IV and V, Section 4.5, p. 105), and to an excerpt of a piece by Chopin (Experiment VI, Section 4.6, p. 118).



4.1 Introduction

A pianist can “bring out” a melody tone—that is, increase its perceptual salience—either by depressing the key more quickly or by varying its timing relative to the accompaniment. Melody tones typically sound some 30 ms before the other tones of a chord (melody lead; Palmer, 1989, 1996; Repp, 1996a; Goebl, 2001); this effect is generally associated with, and presumably causally related to, differences in hammer velocity between the melody and accompaniment (velocity artifact; Repp, 1996a; Goebl, 2001; see Chapter 3). Independently of why pianists introduce these asynchronies, several perceptual

effects are generally referred to in order to explain the psycho-acoustic relevance of these asynchronies.

• Masking. A voice anticipated by several milliseconds avoids being (at least partly) masked by the other tones of a chord (Rasch, 1978; see also Section 4.1.4, p. 83).

• Streaming. Auditory events are more likely to be grouped into simultaneities if their onsets are synchronous, and into separate melodies if their onsets are asynchronous (Bregman and Pinker, 1978; Bregman, 1990; Palmer, 1996; see also Section 4.1.5, p. 84).

Apart from the psycho-acoustic effects of asynchrony that will be studied in the following, the intensity differences between melody and accompaniment found in piano performance may themselves entail psycho-acoustic effects. The louder melody voice takes on a singing quality, because its pitch becomes more salient (cf. Terhardt et al., 1982). Since the roughness of a beating pair of pure tones falls rapidly with increasing amplitude difference (cf. Terhardt, 1974), the timbre of the whole sonority can be expected to become less rough (Parncutt and Troup, 2002, p. 291).

4.1.1 Perception of melody

A wide range of psychological and psycho-acoustic research is summarised under the heading “perception of melody”. Its main focus lies on the various fundamental principles involved when human listeners perceive melodies, ranging from, e.g., pitch processing (Burns, 1999) and interval categorisation (Plomp et al., 1973) to the recognition of pitch contour and memorisation (Bharucha, 1983; Dowling, 1990; Watkins, 1985). A comprehensive overview was provided by Deutsch (1999b). Music-theoretic approaches to melody perception were provided, e.g., by Meyer (1973), Lerdahl and Jackendoff (1983), and Narmour (1990). The present study is not concerned with all aspects of melody perception; our focus lies on the perceptual salience of individual melodic lines in multi-voiced musical textures and on how it changes when their relative intensity and relative timing are varied.


Different voices in a multi-voiced musical texture exhibit different attentional properties. In the classic-romantic repertoire, the highest voice is very often also the melody (Palmer and Holleran, 1994). Thus, there is a perceptual advantage for the highest-pitched voice (DeWitt and Samuel, 1990) and a disadvantage for middle voices (Huron, 1989; Huron and Fantini, 1989).

Evidence also comes from the error studies of Palmer and van de Sande (1993) and Repp (1996c): pianists made fewer mistakes in the melody voice than in the accompaniment. Harmonically related errors occurred more frequently in the middle voices (Palmer and van de Sande, 1993) and are less likely to be detected there by listeners (Palmer and Holleran, 1994).

4.1.2 Perception of isolated asynchronies

Much research has been conducted on the perception of asynchronies, especially in the context of speech perception. Studies in the psychoacoustic literature used exclusively artificial sounds (pure and complex tones, clicks, bursts); to my knowledge, there is no study of the perception of asynchronies with typical musical stimuli such as musical instrument tones (for an overview, see Hirsh and Watson, 1996). The two basic questions are (1) what is the temporal threshold beyond which two almost simultaneous sounds are perceived as asynchronous (asynchrony detection threshold), and (2) from what amount of asynchrony can the correct order of two sounds be perceptually determined (temporal order threshold, TOT; cf. Pastore et al., 1982).

The asynchrony detection threshold for tones is very small and lies at the limit of the human auditory system in general (auditory acuity; Green, 1971). Two clicks (presented to the same ear) were heard as two sounds rather than a single sound at temporal differences as small as about 2 ms (Wallach et al., 1949). Under extreme conditions, this threshold was found to be even smaller: two clicks differing in amplitude could be discriminated with an asynchrony of 0.2 to 1 ms (Henning and Gaskell, 1981), and, as an extreme case, a 0.01-ms (= 10 µs) asynchrony could be detected with 0.01-ms clicks (Leshowitz, 1971). The detection of asynchronous onsets and offsets of individual partials in harmonic complexes was studied by Zera and Green (1993a,b, 1995); listeners were more sensitive to onsets than to offsets, with thresholds of the order of 1 to 2 ms.

The second question concerns the correct detection of the temporal order of two stimuli. In an often-cited study, Hirsh (1959) found the temporal order threshold (TOT) to lie between 15 and 20 ms for pure tones with rise times of the order of 20 ms. Similar thresholds were obtained with stimuli of different pitch and timbre (clicks, noise). His assumption that this threshold is independent of the acoustic nature of the sound was invalidated by Pastore et al. (1982), who specifically tested the effects of stimulus duration and rise time. They found that the longer the (common) stimulus durations (10–300 ms), the higher the TOTs (4–12 ms); and likewise, the longer the rise times (0.5–100 ms), the higher the TOTs (4–23 ms). The TOT for the condition most similar to a piano tone (300 ms common duration, 25 ms


rise time; see Chapter 2) was approximately 13 ms. This threshold corresponds to findings from speech perception, where 20 ms was said to be sufficient to tell the correct order of two stimuli (Rosen and Howell, 1987). Under special conditions, this threshold was found to be even smaller: a TOT of 2 ms was experimentally validated with two tones of 2 ms duration (at 1000 and 2000 Hz, respectively; Wier and Green, 1975), and with three tones differing in frequency (Divenyi and Hirsh, 1974).

All of the studies reported above dealt with artificial stimuli, mostly pure tones. In real musical situations, these thresholds can be expected to be far too low. Handel (1993, p. 214) stated that time differences of up to 30–40 ms are perceived as simultaneous, though as beginning at different times (with no perceived order), while from 40–80 ms the tones appear asynchronous and one seems to precede the other. Reuter (1995, pp. 31–34) reported that the perceptual time smear (Reuter, 1995, p. 33) lies around 30–80 ms, indicating an integration time of the ear below which events are grouped into one percept (see also Meyer-Eppler, 1949; Winckel, 1952; Roederer, 1973). Similarly, Huron (2001) emphasises that onset differences in real music can be considerably greater than the TOT of 20 ms and still give the impression of a single onset; in his opinion, sounds with gradual attack characteristics will not be heard separately (especially in reverberant environments) until they are more than 100 ms apart (Huron, 2001, p. 39).

4.1.3 Intensity and the perception of loudness and timbre

Sound intensity is a physical quantity measured by instruments in terms of sound level in decibels (dB), whereas loudness, as a psycho-acoustic measure, refers to what a human listener senses when exposed to a certain sound intensity. Listeners' differing perception of individual pure tones is reflected in the equal-loudness contours (Fletcher and Munson, 1933; Moore, 1997; Zwicker and Fastl, 1999; Yost, 2000; ISO standard 226:1987). The subjective measure is the loudness level, measured in phons (going back to Barkhausen; cf. Zwicker and Fastl, 1999, p. 160): a 40-dB 1-kHz pure tone has a loudness level of 40 phons. Another way to measure loudness is the sone scale: one sone corresponds to the loudness of a 40-dB 1-kHz pure tone, and the same tone perceived as twice as loud has about 50 phons (50 dB SPL), or 2 sones.
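The phon–sone relationship sketched above follows, for levels above about 40 phon, the textbook doubling rule: each 10-phon increase doubles the loudness in sones. A minimal sketch of this standard approximation (not a formula derived in this thesis):

```python
def sones(phons):
    """Approximate loudness in sones for a loudness level above ~40 phon,
    using the standard rule: +10 phon doubles the loudness."""
    return 2 ** ((phons - 40) / 10)

for level in (40, 50, 60):
    print(level, "phon ->", sones(level), "sone")
```

This reproduces the example in the text: a 40-phon tone has a loudness of 1 sone, and the same tone perceived as twice as loud (50 phons) has 2 sones.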

Since in real music we almost never hear isolated pure tones, several approaches have tried to find loudness measures for complex signals. There were two main methods of adding up loudness values across frequency bands (Hartmann, 1998, p. 73): Stevens (1961) used 26 one-third-octave bands from 40 to 12500 Hz, while Zwicker's approach (Zwicker and Fastl, 1999, pp. 220–238) was similar but based on the idea of summing neural excitation in critical bands.

These models have only been tested on steady-state sounds; in the real world of music, they would be best evaluated with organ sounds (Hall, 1993). There have been attempts to apply them to real audio data (Langner et al., 2000; Pampalk


et al., 2002; Rauber et al., 2002). Such implementations took temporal and spectral masking into account, as well as the equal-loudness contours. These approaches were also used to analyse expressive performance (Langner and Goebl, 2002; Dixon et al., 2002a,b; Langner and Goebl, in press).

In addition, attempts have been made to connect the psycho-physical measures derived from artificial stimuli with listeners’ ratings of real music. Loudness estimation of artificial stimuli (pure tones, noise) and real music (in this case pop music) was approximately proportional to their sound level (Fucci et al., 1997, 1999). No significant difference between the sounds (artificial versus real music) was found. However, depending on the experimental conditions, there were considerable differences in loudness estimation according to the content of the music presented to the listeners. Loudness estimation varied with preference for musical style (Fucci et al., 1993; Hoover and Cullari, 1992) and peer group (Fucci et al., 1998). Loudness estimation of artificial and real stimuli can be described by power functions similar to those that relate the subjective magnitude of loudness to the physical magnitude of intensity, but the slopes of the functions varied with stimulus condition and musical skill (Geringer et al., 1993).

In acoustic instruments, loudness cannot be varied independently of timbre. On the piano especially, both tone intensity and timbre are controlled by a single parameter: hammer velocity (see Chapter 2). The louder a tone gets, the more partials it contains and thus the brighter the tone colour of the sound becomes (Hall and Askenfelt, 1988; Hall, 2002, pp. 187–194, esp. p. 190).

4.1.4 Masking

Masking is a common effect in everyday life. A loud sound prevents a softer one from being heard: you cannot hear what your friend says to you while a loud truck is passing by. Similarly, this effect is always present in music. A typical example is chamber music, where the piano tends to be too loud and prevents the singer or the violinist from being properly heard by the audience (for an anecdotal example, cf. Moore, 1979).

In psycho-acoustic terms, masking refers to the same notion, but in a more detailed and elaborate way. There are two types of masking: spectral and temporal. Spectral masking operates essentially only within critical bands (Moore, 1997; Zwicker and Fastl, 1999). At moderate to high sound levels, a masker tone disrupts tones with higher frequencies more than tones with lower frequencies (Zwicker and Fastl, 1999, p. 68). A masker also distorts the sensation level (the level below which tones are not perceived due to masking) over time (temporal masking, Zwicker and Fastl, 1999, pp. 78–103). After a loud tone or noise, the sensation level remains for several tens of milliseconds as high as during that sound and then fades away continuously (post-masking or forward masking). Surprisingly, a similar effect can be observed in the opposite order: a tone before a loud masker can also be hidden (pre-masking or backward masking). This effect lasts only a few milliseconds (Zwicker and Fastl, 1999, p. 78).

84 Chapter 4. Perception of Melody

The exact amount of masking in real music stimuli such as a piano chord cannot be determined with any of the existing models, because the multiple interactions between the various partials of the sounds, which additionally change constantly over time, do not allow precise predictions. In computerised models used for sound compression (such as the MP3 file format) or loudness calculation, masking effects are implemented in a simplified manner.

Spectral masking between voices is reduced when one voice is shifted temporally by some tens of milliseconds away from the rest, as Rasch (1978) confirmed with a tone detection study that used complex artificial signals.

4.1.5 Stream segregation

The theory of auditory scene analysis (Bregman, 1990) describes the processes involved when human perception puts individual frequencies together into units (tones) or groups according to various principles. These are similar to the principles found in visual perception (Gestalt psychology) and include proximity, similarity, and good continuation in time, pitch, and timbre (for an overview, see Deutsch, 1999a). They apply when a listener hears a four-voiced fugue and perceives the four voices as separate streams, and also when multiple voices are heard in a Bach solo Sonata or Partita (implied polyphony). Another interesting example in which these principles can be played with is the Alberti bass (typical bass figures of the 18th century, e.g., ||:C–G–E–G:||). When this figure is played very slowly, one voice is perceived. As the tempo increases, the tones split into separate streams of C–E and G–G (fission, cf. van Noorden, 1975). When the tempo increases further, all tones almost merge into a single percept. Perceptual grouping might also be controlled by loudness and timbre as well as by the timing of the four tones.

Asynchrony in simultaneously occurring events is used for the grouping or segregation of streams in order to determine sound sources in auditory scene analysis (Bregman, 1990). In their ABC experiment, Bregman and Pinker (1978) used three pure sounds in a cycle: one (A) alternated with the other two (B and C), which occurred simultaneously and were thus fused into one complex sound (when they had simultaneous onsets and offsets). Two different streaming interpretations were possible: (1) in the simultaneous condition, B and C usually grouped together (one hears A and a complex sound B–C), or (2) if the frequencies of A and B were close, they could be heard as a stream and C as a separate event. When one of the two simultaneous tones (C) was moved in time (so that its onset and offset no longer coincided with those of B), the second interpretation became more likely. Two perceptual effects were competing with each other (sequential integration and spectral/simultaneous integration, Bregman, 1990, p. 30). Bregman argued further with the Old-Plus-New Heuristic (Bregman, 1990, p. 222), which roughly means that a sound component presented earlier is attributed to that earlier source and filtered out of the new sound.


In parallel to this experiment, Rasch (1979, 1988) argued that the asynchronies observed in ensemble performances, which are of the same order as the melody lead, enable listeners to track the voices distinctly. Along the same lines, Huron (1993) showed that J. S. Bach maximises onset asynchrony in the written score of his two-part inventions in order to optimise the perceptual salience of the individual voices and to make every single voice distinctly audible.


4.2 Aims

The basic questions I address in the following are listed here. They can be split into two blocks: the first refers to the perception of asynchrony, the other to the perception of the salience of a tone or voice.

1. Perception of tone asynchronies

• At what amount of asynchrony is a listener able to tell the temporal order with certainty?

• Which asynchronies are detected as such, and in what intensity combinations? It is hypothesised that typical patterns like the melody lead (a louder voice is also early) are so common that they are not perceived as being asynchronous, in comparison to less familiar combinations of relative timings and intensity differences.

• Does the perception of asynchrony depend on the type of signal it is presented with (pure or complex artificial sounds versus real piano sounds)?

2. Perception of tone salience

• The role of shifting a tone back and forth in time relative to the other voice(s). Does this change the perceived loudness/salience of that tone?

• Is there a difference in the perceived salience between an anticipated tone and a delayed one? Could it be that delay attenuates a tone’s salience?

• Does a possible effect of asynchrony vary with the types of sound involved (pure, complex, or real piano tones)?

• What is the influence of variation in the tone intensity balance of chords on the perceived salience of a particular tone?

• Is relative intensity the more important perceptual cue in comparison to relative onset asynchrony?

• Is the position in a chord (upper, middle, lower tone) relevant to the loudness perception of a particular tone?

• Does streaming enhance the effect of asynchrony and order in comparison to the perception of single tone combinations?


4.3 Perception of asynchronous dyads (pilot study)

This section describes a pilot study on the perception of equally loud dyads that were systematically manipulated in their tone onset synchrony. This work was presented at the 2001 Society for Music Perception and Cognition meeting at Queen’s University, Kingston, Ontario, Canada (Goebl and Parncutt, 2001).

4.3.1 Background

As discussed in Section 4.1.2, two different tasks must be distinguished in the perception of asynchronous onsets. One is the temporal order threshold (Hirsh, 1959; Hirsh and Watson, 1996), which lies around 20 ms. The other is the detection task: whether or not two stimuli (tones, clicks) begin together. This threshold can go down to a few milliseconds, depending on the kind of stimulus.

The first aim of this pilot experiment was to estimate the temporal order threshold for dyads of different tone types. It was hypothesised that different timbres or attack characteristics (from pure tones to real piano sounds) strongly influence the perception of asynchronies and the temporal sensitivity of the listener, in such a way that the temporal order threshold decreases the more artificial the stimulus becomes.

As reported in Chapter 3, the perceptual effects of an anticipated voice in multi-voiced musical contexts may include spectral and temporal masking as well as streaming. The second aim of this pilot study was to examine whether the perceived salience of a particular tone in a dyad varies with its relative onset timing. The research question was whether an anticipated tone is heard by listeners as more prominent than the same tone presented in synchrony with the other tone, even when the asynchronies are very small (below 20 ms). Melody leads (anticipation) were more common than lags (see Chapter 3). The effect of direction (anticipation versus delay) was also investigated here. Does anticipation increase the perceptual salience of a tone and delay attenuate it, or does asynchrony affect perceived salience independently of direction? If temporal masking were an important factor (cf. Section 4.1.4, p. 83), it would have to be hypothesised that a delayed tone is masked more by the earlier tone and thus receives a lower perceptual salience than an anticipated one. As a last aspect of this pilot, the effect of tone type was examined. Pure tones were expected to entail masking effects to a lesser extent than complex tones and real piano tones.

4.3.2 Method

Participants

The 19 participants were aged between 22 and 37 years. They were divided into two groups according to the duration of playing and the regular study of a musical instrument: 10 were classified as musically trained, with 10 to 21 years of musical instruction (average 16 years), and 9 as musically untrained, with zero to 7 years of


Figure 4.1: The two intervals (octave and seventh) used for the pilot experiment.

playing an instrument (average 3 years). Half of the 10 musicians indicated piano as their main instrument; the others comprised one each of guitar, flute, oboe, and violin, and one composer (who regarded the computer as his instrument). There were 12 male and 7 female listeners. The testing took place in June 2001; seven participants were tested in Vienna, the other 12 in Stockholm.

Stimuli

The test design resulted in 88 stimuli: 4 tone types × 2 intervals × 11 asynchronies. The four tone types were pure, harmonic complex with 16 partials (−6 dB per octave), MIDI-synthesised piano, and recordings from a computer-monitored grand piano. The pair of tones in each trial spanned an interval of an octave or a major seventh. The lower note was always C5 (525 Hz), the higher note C6 (1050 Hz) or B5 (991 Hz, see Figure 4.1). Asynchronies varied from −50 ms to 50 ms, in 10-ms steps. A negative sign means that the upper tone was before the lower, a positive sign that the lower was before the upper. The tone durations ranged from 300 to 400 ms so that the overlap of the two tones was constant at 350 ms.
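The stimulus geometry can be sketched as follows. This is an illustrative reconstruction in Python (with NumPy), not the original Matlab script: it assumes that the later tone lasts exactly the 350-ms overlap and that the earlier tone is extended at its start so both tones end together; the function and parameter names are mine.

```python
import numpy as np

FS = 44100  # sampling rate in Hz (assumed)

def pure_tone(freq, n_samples, fs=FS, ramp_ms=10.0):
    """Sine tone with raised-cosine on/off ramps to avoid clicks."""
    t = np.arange(n_samples) / fs
    y = np.sin(2 * np.pi * freq * t)
    r = int(ramp_ms / 1000 * fs)
    env = np.ones(n_samples)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(r) / r))
    env[:r] = ramp
    env[n_samples - r:] = ramp[::-1]
    return y * env

def make_dyad(asynchrony_ms, f_lower=525.0, f_upper=1050.0,
              overlap_ms=350.0, fs=FS):
    """Pure-tone dyad with a given onset asynchrony.

    Negative asynchrony: the upper tone starts first; positive:
    the lower tone starts first (sign convention of the pilot study).
    """
    shift = int(round(abs(asynchrony_ms) / 1000 * fs))
    n_overlap = int(round(overlap_ms / 1000 * fs))
    f_early, f_late = ((f_upper, f_lower) if asynchrony_ms < 0
                       else (f_lower, f_upper))
    early = pure_tone(f_early, n_overlap + shift, fs)
    late = np.concatenate([np.zeros(shift),
                           pure_tone(f_late, n_overlap, fs)])
    return early + late  # both tones end on the same sample
```

Stepping `asynchrony_ms` through −50, −40, …, 50 for each interval reproduces the 11 timing conditions of one tone type.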

Equipment

The MIDI-synthesised tones were created using a software synthesiser (Timidity) playing back 22 MIDI files (2 intervals × 11 asynchronies) created with a Matlab script. The MIDI velocity of each of the two notes was arbitrarily set to 80. The acoustic piano stimuli were recorded on a computer-controlled Bosendorfer playing back the same files transferred into the Bosendorfer file format. Two AKG (CK91) microphones (placed approximately one meter from the strings in a 6 by 6 meter room) brought the signal to a Tascam DA–P1 DAT recorder.1 The stimuli were transferred digitally to the hard disk of a PC using a “Creative SB live! 5.1 Digital” soundcard and stored in WAV format (16-bit, 44.1 kHz, stereo). The pure and the complex harmonic (sawtooth) tones were generated with the same computer software. Their loudness was adjusted by the author so as to sound approximately as loud as the MIDI-synthesised tones. The stimuli were presented to the participants via headphones. All signals were presented diotically (same signal in each ear), except the acoustic piano tones, which were stereo. The experiment was controlled by a computer program that had been developed for this purpose by the author in a Matlab environment.

1The recordings took place in January 2001 at the Bosendorfer company in Vienna.


Figure 4.2: Average answers to the first question (“Which tone is more prominent?”, 1 = upper tone, 0 = lower tone) as a function of asynchrony, separately for the four types of sound (different lines) and for musicians (left panel) and non-musicians (right panel). The grand average is plotted with a solid line and diamonds. The horizontal lines indicate the range of results not significantly different from chance according to the χ2 distribution; they are plotted separately for the four tone types (dashed) and for the grand average (solid). Negative asynchronies indicate that the upper tone was before the lower, and vice versa.

Procedure

The participants were asked to judge the 88 stimuli on two separate occasions, each with a two-alternative forced choice (2AFC) paradigm. In the first block, they were asked “Which tone is more prominent?”; in the second, the question was “Which tone is earlier?” In both blocks the possible answers were “the upper” or “the lower.” The stimuli were presented in random order within each block. Participants could repeat each stimulus as often as they liked until they were sure about their answer. The question “Which tone is earlier?” was asked after “Which tone is more prominent?” to prevent listeners from guessing that the experiment was about the effect of asynchrony on loudness. After the whole session, a short questionnaire was filled in. The session lasted about 20 minutes. The participants were not paid for their services in this pilot study.
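The two-block presentation scheme described above can be sketched as follows (an illustrative reconstruction, not the original test program; names are mine):

```python
import random

PROMINENCE = "Which tone is more prominent?"
ORDER = "Which tone is earlier?"

def presentation_order(stimuli, seed=None):
    """Return the two question blocks of the session. Each block presents
    all stimuli in a fresh random order; the prominence question always
    comes first so listeners cannot guess that the study concerns the
    effect of asynchrony on loudness."""
    rng = random.Random(seed)
    blocks = []
    for question in (PROMINENCE, ORDER):
        order = list(stimuli)
        rng.shuffle(order)
        blocks.append((question, order))
    return blocks
```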

4.3.3 Results

Perception of tone salience (question 1)

The mean ratings on the first question (“Which tone is more prominent?”) are plotted in Figure 4.2 separately for musically trained (musicians) and untrained participants (non-musicians), as well as by type of tone (pure, complex, MIDI-synthesised, and real piano). Additionally, the grand average (also across tone type) is shown


Figure 4.3: Mean ratings of the first question (“Which tone is more prominent?”, 1 = upper tone, 0 = lower tone) displayed separately by interval (two lines with squares and diamonds) and by group (two panels: musicians left, non-musicians right).

(Figure 4.2). The two horizontal lines indicate the boundaries beyond which the ratings are significantly different from chance (50%) according to the χ2 distribution.2

The complete rating data are listed in Table A.1, p. 164.

It is evident that there was no striking trend in either direction. The rated salience was invariant over asynchronies for both groups. Therefore, there was also no effect of order: anticipation and delay were rated equally. The ratings of the non-musicians showed considerable differences between types of tone. The upper tone was always favoured in the sawtooth sound, the lower one in the Bosendorfer sound, a trend that was also present in the musicians’ ratings, but barely beyond the boundaries of significance. It could be that participants preferred the higher tone in the sawtooth sound because the two tones merged into a single percept and they rated the brightness of the dyad.

A log-linear analysis was performed on the frequency tables with timbre (4), interval (2), musical skill (2), and timing (11) as design variables and rating (2) as the response variable. The k factors suggested mainly two-way interactions; the best-fitting

2With n cases, k ratings of “1” (fo = k) and n − k ratings of “0” are observed, while fe = n/2 ratings of “1” and of “0” are expected by chance. The χ2 value for one degree of freedom at the 95% level is χ2(1;95%) = 3.84. A rating is called significant when it differs significantly from chance, that is, when

\chi^2 = \sum_{j=1}^{2} \frac{(f_o(j) - f_e(j))^2}{f_e(j)} = \frac{(k - n/2)^2}{n/2} + \frac{(n - k - n/2)^2}{n/2} > 3.84. \qquad (4.1)

Take the left panel of Figure 4.2 as an example. There are n = 20 ratings (10 musicians and 2 intervals) for each asynchrony and tone type. Either 15 or 5 ratings (of “1”) would be significantly different from 10, that is, 0.75 or 0.25, respectively. For the grand average over the four tone types, n becomes 80, so the boundaries of significance are 0.61 and 0.39.
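The chance boundaries used in this footnote follow directly from the criterion above; a small Python sketch (the function name is mine):

```python
import math

def chance_boundaries(n, chi2_crit=3.84):
    """Smallest and largest counts of '1'-ratings (out of n) that differ
    significantly from chance: the criterion chi^2 > chi2_crit with
    fe = n/2 reduces to |k - n/2| > sqrt(chi2_crit * n / 4)."""
    d = math.sqrt(chi2_crit * n / 4.0)
    upper = math.floor(n / 2.0 + d) + 1  # first significant count above n/2
    lower = math.ceil(n / 2.0 - d) - 1   # first significant count below n/2
    return upper, lower
```

For n = 20 this gives 15 and 5 (proportions 0.75 and 0.25); for n = 80 it gives 49 and 31 (about 0.61 and 0.39).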


model included a significant (p < 0.01) main effect of musical skill and two-way interactions between rating and timing and between rating and interval. Three-way interactions were not favoured by this model. These findings support the splitting of the participants into two groups (musicians versus non-musicians), as done in Figure 4.2. Separate log-linear models on the data split by skill gave a similar picture: in each case, the interactions between rating and either interval or timbre were significant (p < 0.01).

The results plotted separately for the two intervals are shown in Figure 4.3. The log-linear model always emphasised the effect of interval on the rating. The upper tone of the octave was always rated as more prominent, whereas for the seventh the two tones were rated approximately equally prominent, with a small tendency towards the lower tone. An explanation for this effect could be that the partials of the lower tone of the octave increase the salience of the higher tone.

Temporal order perception (question 2)

The second question (“Which tone is earlier?”) had a correct answer. As a negative sign of the asynchrony indicated that the upper tone was before the lower, the correct answer there was “1”; similarly, a positive sign entailed “0” as the correct answer. The rating results as well as the correct answers are plotted in Figure 4.4 separately by type of tone and skill. The complete rating data are listed in Table A.1, p. 164. The difference between the two groups was striking: while musicians could hardly answer correctly above chance for asynchronies smaller than 40 ms, non-musicians responded essentially at random.

A log-linear analysis with timbre (4), interval (2), musical skill (2), and timing (11) as design variables and rating (2) as the response variable was performed on the whole data set. This model yielded only skill as a significant factor. For this reason, the analysis was performed separately on the two groups (musicians and non-musicians). These models found interactions between rating and interval and between rating and type of tone. Only the musicians showed an effect of asynchrony. This again confirms that the task was simply too difficult for non-musicians.

Musicians could report the correct order at and beyond ±40 ms, with slightly better results for lower-upper patterns (Arpeggio). There were some small differences between the types of tone. For example, the acoustic piano tones were heard correctly already at −30 ms, and the complex tones were judged as asynchronous at ±10 ms, correctly in the condition with the higher tone leading and falsely in the other.

In Figure 4.5, the ratings are plotted separately for the two intervals and the two groups. This display also makes it evident that non-musicians could not perform the task correctly; their ratings did not go beyond chance. Musicians again showed a considerable effect of interval for this question. Answers were notably correct in the Arpeggio condition (lower voice preceding the upper) when the interval was an


Figure 4.4: Averaged answers to the second question (“Which tone is earlier?”, 1 = upper tone, 0 = lower tone) by amount of asynchrony, displayed separately for tone types (lines) and musical skill (musicians left, non-musicians right). The correct answer is indicated by a dotted line.

Figure 4.5: Mean ratings of the second question (“Which tone is earlier?”, 1 = upper tone, 0 = lower tone) displayed separately by interval (two lines with squares and diamonds) and by group (two panels: musicians left, non-musicians right).


octave (diamonds in Figure 4.5), or in the opposite condition when the interval was a seventh (squares in Figure 4.5).

Tillmann and Bharucha (2002) used the detection of a 50-ms asynchrony as an indicator of harmonic relatedness in chords consisting of three voices. Their participants performed significantly better (that is, detected the 50-ms asynchrony more correctly) with a harmonically related prime than with an unrelated one.

The present results are hard to reconcile with these findings. The octave was not rated generally better; on the contrary, its asynchrony was detected more correctly only in the Arpeggio condition, and the seventh showed the opposite behaviour. Musicians cannot hear the order at ±20 ms, so they guess on the basis of other cues. What these other cues might be can only be conjectured here. Participants might tend to mix up the two tasks and rate the tone that sounded less important to them as earlier (which is the upper tone in the case of the seventh, and rather the lower in the octave).

4.3.4 Discussion

This pilot study comprised two questions on dyads with varying onset asynchrony, type of tone, and interval. Although the results obtained were not conclusive due to the limited number of participants, some preliminary findings should be pointed out here. The most fundamental one was that non-musicians evidently could not judge the stimulus material with sufficient precision. Therefore, only musically trained participants were involved in the remaining experiments (see below).

We found no consistent effect of relative onset timing on the perceived salience of a tone. Furthermore, there was no effect of order, regardless of the tone type: a delayed attack was considered to have the same prominence as an early attack. This casts doubt on the frequently encountered tacit assumption in the music (and especially piano) performance literature that the first onset is perceived as more salient. The different types of tone were judged slightly differently, especially by the non-musicians: the upper tone was rated more prominent with the sawtooth sound, the lower tone with the real piano sound.

Regarding the second question (“Which tone is earlier?”), listeners only consistently reported the correct order of the two onsets for asynchronies greater than about 30–40 ms, again regardless of whether the higher or the lower tone began first. Identification of order barely improved as the sounds became more artificial: the threshold was around 30 ms for pure tones and 40 ms for real piano tones. These thresholds are substantially larger than those reported in the psycho-acoustic literature (see Section 4.1.2), but consistent with findings reported by Handel (1993) and Reuter (1995). Such large temporal order thresholds suggest that melody leads are not heard as asynchronous by listeners. However, it is still to be expected that, although listeners could not tell the correct order below ±40 ms, they do hear those asynchronies as starting at different times, albeit as a single musical event.

To draw firmer conclusions from this pilot study and to be able to interpret the statistical test results, more participants would have been required. The χ2 test


is not applicable for expected frequencies lower than 5 (Bortz, 1999, p. 170). In our case, this would require at least 10 subjects in each sub-group (which was the number of musicians, but not of the non-musicians).

The timing precision of the acoustic piano tones depends on the precision of the Bosendorfer computer-controlled piano. According to the results reported in Section 2.3.3 (Figure 2.14 on p. 41), the mean timing error of the reproduction is of the order of 3 ms. Still, sympathetic resonances between the two tones could change their loudness and thus bias the ratings. To overcome this and to control timing precision, the stimuli used in the following experiments were created by adding together digital recordings of individual piano tones.


4.4 Perception of dyads varying in tone balance and synchrony

This section reports a set of three experiments conducted in a single test session. This work was first presented at the 7th International Conference on Music Perception and Cognition at the University of New South Wales in Sydney, Australia (Goebl and Parncutt, 2002).

4.4.1 Introduction

In a preliminary experiment on the perception of harmonic dyads (see Section 4.3, and Goebl and Parncutt, 2001), we found no significant difference between the perceptual prominence of a delayed and an anticipated higher tone. In that pilot study, we used two tones of equal intensity. We also found that musically trained participants could report the correct order of two (equally intense) stimuli at asynchronies exceeding about ±40 ms, irrespective of tone complexity.

In piano performance, however, anticipation of an emphasised voice usually occurs in parallel with an increase in intensity of that particular voice, especially when the voices fall within one hand. In the present experiments, we are therefore interested in how the perceptual prominence of individual tones changes when relative timing as well as the intensity balance of the tones is varied.

In the present study, we used three different types of tone (pure, complex, piano). We first asked which relative dynamic level of the tones of a harmonic major-sixth dyad produced an impression of equal loudness or salience (balance adjustment, Experiment I). This was done separately for each listener and for each of the three tone types. Using these data as a baseline, we then investigated the relative perceptual salience of the tones of a harmonic dyad in which relative timing and relative intensity were varied systematically (Salience perception, Experiment II). Finally, we investigated listeners’ sensitivity to asynchrony in the context of variation in the tone intensity balance of the dyads (Asynchrony detection, Experiment III).

The two artificial sounds (pure and sawtooth) were included in the experiment in order to better control the psycho-acoustic effect of masking. Two pure tones as far apart in pitch as a major sixth should not mask each other, while the partials of the two complex tones will fall within the same critical bands and therefore partly mask each other. Hence, the salience of the upper complex tone should be rated higher with increasing asynchrony, because it is no longer masked by the lower complex tone as it was in synchrony.

Moreover, it is hypothesised that masking changes listeners’ sensitivity to asynchrony. A loud tone coming early will prevent a softer, later-arriving tone from being heard as beginning later. The opposite condition (a soft tone followed by a louder one) will be perceived as more asynchronous.


4.4.2 Determination of balance baseline (Experiment I)

This experiment determined, for each participant, the relative dynamic level at which the two tones of a major-sixth dyad sounded equally loud.

Method

Participants The 26 participants were aged between 23 and 32 years. All were musicians who had been playing their instrument regularly for an average period of 18.9 years. Twenty-three of them had studied their instrument at a post-secondary level for an average period of 8.3 years. They comprised 15 pianists, 5 violinists, 1 singer (a tenor), 3 cellists, 1 double bass player, and 1 composer (who regarded the computer as his main instrument).

Stimuli Three tone types were used: pure, sawtooth with 16 partials (−6 dB per octave), and piano. To avoid uncontrolled asynchronies and sympathetic vibrations, harmonic dyads of piano tones were created by digital superposition of individual monophonic tone recordings. The MIDI velocity values ranged from 79/31 (higher/lower tone) to 31/79, in increments of ±2 units (79/31, 77/33, 75/35, etc.). The nominal equality was thus 55/55, a typical mezzo forte. The amplitudes of the pure and sawtooth stimuli were similar to those of the recorded piano sounds.3
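The 25 balance conditions can be enumerated as complementary velocity pairs; a one-line sketch (the variable name is mine):

```python
# Higher-tone velocity runs from 79 down to 31 in steps of 2; the
# lower-tone velocity mirrors it, so every pair sums to 110.
velocity_pairs = [(hi, 110 - hi) for hi in range(79, 29, -2)]
```

This yields 25 pairs from (79, 31) to (31, 79), with the nominally equal condition (55, 55) in the middle.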

Five different dyads were presented to the participants, each spanning the musical interval of a major sixth. Three of them comprised piano tones: B4 and G#5 (approx. 494 and 831 Hz), C5 and A5 (523 and 880 Hz), and Db5 and Bb5 (554 and 932 Hz), respectively.4 The other two dyads were synthetic; one comprised two pure tones, the other two sawtooth tones. In both cases the (fundamental) frequencies were 523 and 880 Hz, corresponding to C5 and A5.

Equipment The acoustic piano tones were played on a computer-controlled Bosendorfer SE290 (at every MIDI velocity between 20 and 90)5 and recorded with AKG (CK91) microphones (placed approximately one meter from the strings, in a 6 by 6 meter room) onto a Tascam DA–P1 DAT recorder.6 They were transferred digitally to the hard disk of a PC using a “Creative SB live! 5.1 Digital” soundcard and stored in WAV format (16-bit, 44.1 kHz, stereo). The pure and synthetic complex tones were generated using Matlab software. During the experiment, all sounds

3The relationship between key velocity (in MIDI velocity units) and peak sound level (in dB) for the 1700 single tones played on the Bosendorfer SE290 was approximated by the expression: −77.2 + 26.1 · log10(MIDI velocity) + 5.3 · [log10(MIDI velocity)]^2.

4 The given frequency values are calculated and correspond to equal temperament with A4 at 440 Hz. The actual frequency of the lowest partial of each tone will be slightly different from these values due to inharmonicity and pitch shifts.

5 In this study, the relation between MIDI velocity and hammer velocity (in meters per second) at the Bosendorfer system was set to be: MIDI velocity = 52 + 25 · log2(hammer velocity).
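The mapping in Footnote 5 is easily inverted, which is convenient when converting between the two velocity scales; a Python sketch (function names are illustrative):

```python
import math

def midi_from_hammer(hammer_velocity: float) -> float:
    """MIDI velocity assigned to a final hammer velocity (m/s),
    per the mapping in Footnote 5: 52 + 25 * log2(v)."""
    return 52 + 25 * math.log2(hammer_velocity)

def hammer_from_midi(midi_velocity: float) -> float:
    """Inverse mapping: hammer velocity (m/s) for a given MIDI velocity."""
    return 2 ** ((midi_velocity - 52) / 25)

# A hammer velocity of 1 m/s corresponds to MIDI velocity 52,
# and every doubling of hammer velocity adds 25 MIDI units:
assert midi_from_hammer(1.0) == 52
assert abs(midi_from_hammer(2.0) - 77.0) < 1e-9
```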

6 The recording took place on November 6, 2001 at the Bosendorfer company in Vienna (the same recording as in Section 2.4 on p. 50).

4.4. Perception of dyads manipulated in tone balance and synchrony 97

[Figure 4.6 appears here: two panels (a, b) of mean intensity differences for the five dyad types (C5/A5 pure, C5/A5 sawtooth, B4/G#5 piano, C5/A5 piano, Db5/Bb5 piano); vertical axes in MIDI velocity units (a) and dB (b); n = 26.]

Figure 4.6: Experiment I. Intensity difference between simultaneous tones judged to be equally loud, averaged across 26 participants. Error bars denote 95% confidence intervals of the means. Vertical axes: (a) relative to MIDI velocity or equivalent; (b) in dB peak sound level. In each case, a positive value means that the higher tone was more intense than the lower at equal loudness.

were played back via the same sound card and Sennheiser HD 25–1 headphones (diotic presentation: same signal in each ear). The experiments were controlled by a computer program that had been developed in a Matlab environment.

Procedure In each trial, participants adjusted the level of two simultaneous tones relative to each other until they sounded equally loud. Five trials were presented in a random order that differed from one participant to the next. The relative intensities of the two tones at the start of each trial were also selected at random, from 25 possibilities. Participants first adjusted the relative level of the tones in relatively large increments of ±6 MIDI units (i.e., one tone became 6 units louder while the other became 6 units softer, so that the difference in MIDI velocities changed by 12 in each step). In the second block, the five stimuli were repeated in the same order and adjusted in smaller steps of ±2 MIDI units. Participants were asked to go past the point of equal loudness and return to it from the other side before going on to the next dyad. Each stimulus could be adjusted and repeated for an indefinite period. To test reliability, the entire procedure (coarse followed by fine tuning) was repeated. If the mean difference between the results for the first and second block was larger than 6 MIDI velocity units, a third block was run; this happened for 5 of the 26 participants. After all three experiments were completed, a questionnaire was filled in. The whole session lasted between 30 and 70 minutes. Participants were paid 20 Euro for their services.
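The logic of the coarse and fine adjustment blocks can be summarised in a few lines; a hypothetical Python sketch (function names are invented, and the reliability criterion is read here as a mean absolute difference):

```python
def apply_step(upper: int, lower: int, direction: int, step: int):
    """One adjustment step: one tone gains `step` MIDI velocity units while
    the other loses them, so the upper-lower difference changes by 2 * step.
    direction = +1 makes the upper tone louder, -1 the lower tone."""
    return upper + direction * step, lower - direction * step

def needs_third_block(block1, block2, criterion=6.0):
    """Reliability rule from the text (as read here): a third block is run
    if the mean absolute difference between the two blocks' settings
    exceeds 6 MIDI velocity units."""
    diffs = [abs(a - b) for a, b in zip(block1, block2)]
    return sum(diffs) / len(diffs) > criterion

# One coarse step from the nominal 55/55 balance:
assert apply_step(55, 55, +1, 6) == (61, 49)  # difference changed by 12
```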

Results and discussion

The means over all 26 participants’ median adjustments are plotted in Figure 4.6a against the (equivalent) MIDI velocity differences between the two tones (see Footnote 3, p. 96). The complete rating data of all participants are listed in Table A.2, p. 168. A positive difference on the y axis indicates that, at

98 Chapter 4. Perception of Melody

equal loudness, the higher tone had greater intensity or MIDI velocity. The data initially suggest that for all three piano dyads and for the sawtooth dyads, the higher tones had considerably higher levels than the lower tones at equal loudness (salience). Piano tones with the same hammer velocities do not necessarily have the same peak SPL.7 For example, B4 on our piano samples was always 6 dB more intense than G#5 played with the same MIDI velocity.8 Once the data have been adjusted to account for this (Figure 4.6b), the sound level differences in the piano samples disappeared. Only for the sawtooth sounds was there a significant difference in SPL (of about 6 dB) at equal subjective loudness.

The effect cannot be accounted for by the Fletcher–Munson loudness contours for pure tones, which would predict just the opposite tendency. Instead, the effect may be accounted for by masking between the higher partials. Since lower pure tones generally mask higher pure tones more than the reverse (Moore, 1997), higher harmonic complex tones may need to have greater SPL to be perceived as equally loud as simultaneously sounding lower complex tones with identical temporal and spectral envelopes. The masking effect in the piano tones may have been less prominent due to the spectral and temporal variability of the amplitudes of the partials, and because the spectral slope of each tone depends on both intensity and register. The greater spread in the data for the sawtooth tones is consistent with comments on the final questionnaire to the effect that the sawtooth sounds were the hardest of the three tone types to judge, presumably due to unfamiliarity.

4.4.3 Perception of tone salience (Experiment II)

Method

Equipment and participants were the same as in the previous experiment. Each of the three tone types (pure, sawtooth, and piano) was presented in five intensity combinations and with five degrees of asynchrony, resulting in 3 × 5 × 5 = 75 stimuli. The intensity combinations were +20/−20, +10/−10, 0/0, −10/+10, and −20/+20 MIDI velocity units, relative to the median levels judged to be equally loud in the previous experiment; the baselines were maintained separately for each tone type and for each participant. The asynchronies were −54, −27, 0, 27, and 54 ms (where a negative value indicates that the higher tone began before the lower tone). Regardless of whether the onset was synchronous or asynchronous, the tones always sounded together for a total of 350 ms, and faded out simultaneously.
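The full factorial design can be enumerated directly; a Python sketch of the 75-stimulus set:

```python
from itertools import product

TONE_TYPES = ["pure", "sawtooth", "piano"]
# MIDI-velocity offsets (upper/lower tone) relative to each listener's
# equal-loudness baseline from Experiment I:
INTENSITY_COMBOS = [(20, -20), (10, -10), (0, 0), (-10, 10), (-20, 20)]
# Negative values: the higher tone starts before the lower tone.
ASYNCHRONIES_MS = [-54, -27, 0, 27, 54]

STIMULI = list(product(TONE_TYPES, INTENSITY_COMBOS, ASYNCHRONIES_MS))
assert len(STIMULI) == 3 * 5 * 5 == 75
```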

The chosen time differences were typical of melody leads in piano performance. The velocity artifact hypothesis (Repp, 1996a; Goebl, 2001) is based on the simple

7 The peak dB value of a piano sample was the maximum value of an RMS-smoothed sound signal. The window size was 10 ms.
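The peak-level measure of Footnote 7 amounts to the maximum of a running RMS curve; a sketch with NumPy (the exact smoothing used originally is not specified beyond the 10 ms window, so the rectangular window here is an assumption):

```python
import numpy as np

def peak_level_db(signal, fs=44100, window_ms=10.0):
    """Maximum of a moving-RMS curve (10 ms window, as in Footnote 7),
    expressed in dB relative to full scale."""
    n = max(1, int(round(fs * window_ms / 1000)))
    # Running mean of the squared signal via a cumulative sum:
    csum = np.concatenate(([0.0], np.cumsum(np.asarray(signal, float) ** 2)))
    mean_sq = (csum[n:] - csum[:-n]) / n
    return 10 * np.log10(mean_sq.max())

# Sanity check: a steady full-scale sine has an RMS of 1/sqrt(2),
# so its peak level comes out close to -3 dB.
t = np.arange(0, 0.5, 1 / 44100)
sine_level = peak_level_db(np.sin(2 * np.pi * 440 * t))
```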

8 The peak dB values for the same key and the same hammer velocity change strongly with microphone position. The lines of equal MIDI velocity in the second channel of our piano recordings showed a quite different picture from the first channel (for a similar discussion see Repp, 1997a, p. 1880, and Section 2.4, p. 50).


[Figure 4.7 appears here: three panels (pure, sawtooth, piano) of rated salience of the upper tone (scale 1–7) against asynchrony (−54 to 54 ms), one line per intensity combination (+20/−20 to −20/+20); heading: “Which of the two tones is louder?”; n = 25, with reduced n for some pure-tone conditions.]

Figure 4.7: Experiment II. Mean relative loudness ratings over 25 participants as a function of tone type and asynchrony. Rating scale: 1, lower tone much louder; 4, tones equally loud; 7, upper tone much louder. The five lines in each panel correspond to the 5 intensity combinations (MIDI velocity of upper tone relative to lower tone); the three panels correspond to the three tone types (pure, sawtooth, piano). Error bars indicate 95% confidence intervals of the means across participants.

observation that the faster a piano key is depressed, the earlier the hammer arrives at the strings. In a typical modern grand piano, when two tones are struck simultaneously from the key surface, a higher tone that is 20 MIDI velocity units louder than the lower tone typically sounds about 27 ms before the lower tone (cf. the model for melody lead, Section 3.7, p. 78).
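As a rough linear reading of the figure just quoted (20 MIDI units of velocity difference corresponding to about 27 ms of hammer-arrival lead), one can sketch the implied lead in code; this straight-line slope is an illustrative assumption, not the melody-lead model of Section 3.7:

```python
def implied_lead_ms(velocity_difference: float) -> float:
    """Hammer-arrival lead (ms) of the louder tone for a given
    MIDI-velocity difference, assuming the linear slope suggested by
    the quoted figure: 20 units -> about 27 ms (approximation only)."""
    return velocity_difference * 27.0 / 20.0

assert implied_lead_ms(0) == 0.0
assert abs(implied_lead_ms(20) - 27.0) < 1e-9
```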

Participants indicated which of the two tones sounded louder on a scale from 1 (lower tone much louder) to 7 (higher tone much louder). Equal loudness was indicated by 4. After 13 practice stimuli, the 75 stimuli were presented in a random order that was varied from one listener to the next. Each stimulus could be repeated as often as desired before deciding on a rating.

Results and discussion

In the final questionnaire, participants indicated that this experiment was the most difficult of the three. One participant’s results had to be excluded, because he could not perform the task at all (as he indicated in the questionnaire).

The mean ratings are plotted in Figure 4.7, separately for tone type (panels), intensity combination (lines), and asynchrony (x axes).9 The complete rating data

9 Due to a programming mistake by the author, the −54 and −27 ms conditions of the pure tones were omitted and the 0 ms condition was presented three times for the first 17 participants (also in Exp. III). The modified n values are specially indicated in Figure 4.7 and Figure 4.8; the different


are listed in Table A.3, p. 169.

A repeated-measures analysis of variance was performed on the ratings with

timbre (pure, sawtooth, piano), asynchrony (5-fold), and intensity (5-fold) as within-subject factors and instrument (piano, other instrument) as a between-subject factor.10 It revealed no significant difference in the ratings between pianists and musicians with another main instrument [F(1, 23) = 2.17, p = 0.154]. There were significant effects of timbre [F(2, 46) = 6.18, εG.G. = 0.90, padj = 0.0057], asynchrony [F(4, 92) = 3.54, εG.G. = 0.83, padj = 0.0153], and intensity [F(4, 92) = 525.26, εG.G. = 0.46, padj < 0.001]. The two-way interactions between timbre and asynchrony [F(8, 184) = 2.95, εG.G. = 0.65, padj = 0.0140] and between timbre and intensity [F(8, 184) = 22.36, εG.G. = 0.46, padj < 0.001] were significant, whereas the interaction asynchrony × intensity was not [F(16, 368) = 1.85, εG.G. = 0.51, padj = 0.0692]. The three-way interaction timbre × asynchrony × intensity was also not significant [F(32, 736) = 1.07, εG.G. = 0.35, padj = 0.3868].

Regarding timbre, the range of judgements was smallest for pure tones and

largest for the piano tones, suggesting that the difference in salience between two simultaneous tones depends on the number of audible harmonics in each tone (probably consistent with the masking hypothesis advanced above).

To evaluate whether anticipation or delay of the tones changed their loudness

judgement, linear contrasts were performed on the asynchronies (+1, +1, 0, −1, −1), separately for each intensity combination and for the sawtooth and piano sounds.11

The results of these contrasts are listed in Table 4.1. Only three intensity conditions showed significantly louder ratings for anticipation in comparison to delay. Surprisingly, none of the piano tones showed such results.

To conclude, in the case of dyads, perceived loudness was primarily controlled

by the loudness of the tones presented. Relative timing did not help to change the ratings. Only for the sawtooth sounds was there an advantage for anticipated tones in comparison to delayed ones. More complex sounds were easier to judge with respect to their loudness than pure tones.

4.4.4 Asynchrony detection (Experiment III)

The psychoacoustic literature initially suggests that listeners can easily distinguish synchronous from asynchronous dyads: the temporal order threshold (which tone came first?) is around 20 ms (Hirsh, 1959), and the threshold for asynchrony detection (were the tones synchronous?) can be as low as 2 ms (Zera and Green,

lengths of the error bars depicting 95% confidence intervals reflect this.

10 Missing data were introduced by the programming mistake mentioned in Footnote 9. They comprised two timing conditions (−54, −27 ms) of the pure tones for the first 17 participants, thus 17 × 5 × 2 = 170 ratings, or 8.7% of the data. For the ANOVA, the missing data were interpolated from the ratings of the other participants.

11 The linear contrasts were not performed for the pure tones due to the missing data in the anticipation conditions (see Footnote 9).


Table 4.1: Experiment II. Linear contrasts of asynchrony (+1, +1, 0, −1, −1) between anticipation (−54 and −27 ms) and delay (+27 and +54 ms), separately for each intensity combination and two timbre conditions (sawtooth and piano).

              Sawtooth                 Piano
Intensity     F         p              F        p
+20/−20       12.0181   0.0021**       0.7830   0.3854
+10/−10        5.4335   0.0289*        1.4839   0.2355
 0/0          17.5392   0.0004**       1.3567   0.2561
−10/+10        3.1571   0.0888         0.1867   0.6697
−20/+20        3.7097   0.0666         1.1382   0.2971

1993b). But in (piano) music, where tones have overlapping spectra and unequal loudness (so that one tone is masked by the other), the thresholds are higher. This experiment set out to measure these higher thresholds, while additionally manipulating the intensity balance of the tones.

It was expected that different intensity combinations would clearly influence the asynchrony detection threshold: when a louder tone is followed by a softer tone, the relative timing difference has to be larger for listeners to detect it than in the opposite condition.

Method

The procedure, participants, and stimuli were identical to Experiment II. The only difference was the question: the participants were asked whether or not the two tones were simultaneous, in a 2AFC (two-alternative forced choice) paradigm.

Results and discussion

The results are plotted in Figure 4.8 in terms of relative frequencies (ranging from 0 to 1). The expected responses were “yes” (1) for synchrony, and “no” (0) for the four asynchronous conditions. The correct answers are sketched in Figure 4.8 by grey dashed lines. The dotted lines mark the range in which observed frequencies are not significantly different from chance according to the χ2 test (cf. Section 4.3.3, p. 89).12 The complete rating data are listed in Table A.4, p. 170.
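The dotted chance boundaries can be reconstructed; a sketch assuming a one-degree-of-freedom χ² test against p = 0.5 at α = .05 (critical value 3.841), which matches the description here, though Section 4.3.3 should be consulted for the exact test used:

```python
import math

def chance_bounds(n: int, chi2_crit: float = 3.841):
    """Range of observed 'yes' proportions not significantly different
    from chance (p = 0.5). For a two-cell chi-square with expected
    counts n/2: chi2 = 2 * (o - n/2)**2 / (n/2) <= chi2_crit."""
    half = n / 2
    delta = math.sqrt(chi2_crit * half / 2)
    return (half - delta) / n, (half + delta) / n

low, high = chance_bounds(26)
# For n = 26, proportions between about 0.31 and 0.69 stay within chance.
```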

The synchronous dyads were reliably recognised for the two artificial tone types, independent of relative intensity. But for the piano tones, asynchronous dyads were often heard as synchronous when the louder tone preceded the softer tone (melody lead). This effect also appeared for the synthetic tones; it was weakest for the pure tones and strongest for the piano sounds. For instance, the +20/−20 condition (first row in Figure 4.8) at −27 ms (sawtooth and piano) was perceived as simultaneous

12 The different n values, due to the missing data of the two anticipated conditions with pure tones (see Footnote 9, p. 99), warp those lines.


[Figure 4.8 appears here: a 5 × 3 grid of panels (rows: intensity combinations +20/−20 to −20/+20; columns: pure, sawtooth, piano) showing rated simultaneousness (0–1) against asynchrony (−54 to 54 ms); heading: “Are the two tones simultaneous?”; n = 26, with reduced n for some pure-tone conditions.]

Figure 4.8: Experiment III. Mean ratings over 26 participants as a function of asynchrony. The answers could be “yes” (1) or “no” (0). Different rows of panels show the different intensity combinations (relative MIDI velocity upper/lower tone), separate columns of panels the three tone types. The grey dashed lines in the background indicate the ‘correct’ answer; the two dotted lines denote the boundaries within which the observed frequencies are not significantly different from chance. The missing data (reflected in the different n values in the pure tone condition, left column) of the two anticipated conditions at the pure tones warp those lines.


by around 70% of the participants (significantly different from chance), whereas at +27 ms it was heard as asynchronous by almost everyone. This asymmetry was also found at the opposite intensity combination (−20/+20 MIDI velocity units, bottom row) for sawtooth and piano sounds, as well as for the +10/−10 condition with piano sounds.

Two possible explanations may be advanced for these asymmetries. The first involves familiarity with piano music: either listeners are insensitive to melody lead in piano performance (due to overexposure), or the listeners in this experiment only noticed, and hence correctly identified, asynchrony when a musically unfamiliar combination of relative loudness and timing was presented. Those participants who were also pianists might additionally have associated the characteristic sound of melody lead with the (kinesthetic) sensation of fingers simultaneously striking the key surface. The second explanation is more psychoacoustic in nature: the effect could be due to forward masking. A louder, anticipated tone masks a softer tone by forward masking, which is stronger than backward masking and attenuates the following softer tone for about the same period of time as typical melody leads span (some tens of milliseconds; Zwicker and Fastl, 1999). Simultaneous masking applies especially among the partials of complex tones spanning typical musical intervals, consistent with the finding that the observed asymmetry is stronger for complex than for pure tones.

4.4.5 Conclusion

These three experiments investigated the perception of dyads manipulated systematically in tone balance and synchrony. The two main questions were (1) whether the relative timing of the tones of the dyads changed their perceptual salience and (2) whether the detection of asynchronies was influenced by an imbalance of tone intensities.

Since the variation of tone balance was new in this series of experiments, it was important to find out what tone intensities engender the subjective impression of equal loudness in two-tone sonorities. Participants adjusted a tone balance to sound equally loud when the individual tones were equal in sound level rather than equal in terms of MIDI velocity.

The perceived salience of the tones was primarily determined by their relative intensity. Relative timing did not change the ratings, except for sawtooth sounds, where anticipated tones received slightly higher salience ratings than delayed tones. Loudness ratings were clearer with more complex sounds: participants used a smaller range of the rating scale with pure tones, but almost the whole range with real piano sounds.

Asynchronies were generally well detected when the two tones were at least 27 ms apart in time. However, there was a strong effect of relative intensity. Asynchronies as large as 54 ms were rated randomly (participants could not tell whether they were synchronous or not) when the first tone was louder than the second. These intensity


combinations (early and loud) corresponded to the typical melody-lead situation. Either the participants were so familiar with these combinations of asynchrony and imbalance that they detected asynchronies only in unfamiliar combinations of relative timing and intensity, or, due to temporal masking, the onset of the weaker second tone was masked by the first, louder tone. These new insights into the perception of asynchronous onsets with variations in loudness seem to explain why pianists are largely unaware of the melody lead (Parncutt and Holming, 2000).

4.5. Perception of asynchronous and unbalanced tones in chords 105

4.5 Perception of chords and chord progressions varying in tone balance and synchrony (Experiments IV and V)

The following three experiments are an extension and continuation of the previous experiments (Goebl and Parncutt, 2002; see Experiments I–III). They use chords instead of dyads (Exp. IV), sequences of chords (Exp. V), and real music (Exp. VI) as stimuli. They were included in a single test session. Experiments IV–VI will be presented at the 5th Triennial ESCOM conference in Hanover, Germany (Goebl and Parncutt, 2003).

4.5.1 Introduction

In these experiments, we again investigated how asynchrony enhances the perceptual salience of individual voices with respect to changes in relative intensity. Findings from the previous experiments supported the hypothesis that changes in intensity were the dominating factor and that onset asynchrony had only a marginal influence on the perceptual salience of a given tone of a chord. However, these experiments used dyads as stimuli. Real music contexts may well involve more than two voices at a time (e.g., four voices as in the excerpts of Chapter 3, see p. 60). The perceptual attention of listeners varies with pitch position in a chord. As reported in Section 4.1.1 (p. 80), there is empirical evidence that outer voices (soprano or bass) receive greater perceptual attention than inner voices, and that, additionally, the highest voice, in the classic-romantic repertoire mostly the melody, generally enjoys a perceptual advantage (Huron, 1989; DeWitt and Samuel, 1990; Palmer and van de Sande, 1993; Palmer and Holleran, 1994; Repp, 1996c).

In the following two experiments (IV and V), we used piano triads instead of dyads. We asked musicians to judge the perceived loudness of a particular tone in a triad that was simultaneously manipulated in relative onset timing and intensity balance by up to ±55 ms and +30/−22 MIDI velocity units. Experiment V had the same design as Experiment IV, except that it used sequences of chords instead of isolated chords. Each chord was repeated five times, giving the impression of a short musical unit in 4/4 metre. The loudness of the individual voices had to be rated as before. With this design, we were able to test whether streaming (introduced by a temporal shift of one voice) changed the perceived salience of a given voice. Moreover, we were able to test whether the direction of the asynchrony (melody lead versus melody lag) had any influence on the perceived salience. If streaming influenced the salience of an individual voice, we would expect different results than in Experiment IV.

Four research questions were addressed here. First, it was investigated whether findings from the previous experiments (tone imbalance far more important than asynchrony) could be replicated with three-tone chords and three-tone chord progressions. Second, the influence of vertical position in the chord on perceptual salience was examined. Third, whether streaming, as introduced by the use of chord progressions, enhances the perceptual salience of an individual voice. And fourth, whether the direction of relative timing is crucial for perceptual salience (e.g., anticipation enhances perceptual salience while delay attenuates it).

4.5.2 Method

Participants

Experiments IV, V, and VI were included in a single test session. The 26 musically trained listeners comprised 17 pianists and 9 other instrumentalists (violin, violoncello, and acoustic guitar). They had been playing their instruments regularly for an average of 17.9 years (s.d. = 5.5 years). Twenty-three of them had studied their instruments at post-secondary level for an average period of 7.2 years (s.d. = 4.1 years). Their ages ranged from 19 to 35 with an average of 26.5 (s.d. = 4.5).

Stimuli

Two chords consisting of three tones were used. The two upper tones spanned an interval of a major sixth as in the previous experiments. The bottom tone was either a major sixth or a fifth below the middle tone, so that one chord resulted in a major triad (second inversion) and the other in a minor triad (root position). These two chords also appeared randomly in transpositions one semitone higher and lower. The two chords and their transpositions are shown in Figure 4.9. The transpositions were arranged randomly so that every transposition occurred equally often.

In each chord, one tone (the target) was shifted in time and varied in intensity relative to the other two. The five asynchronies were −55, −27, 0, 27, and 55 ms. The five intensity combinations were [target tone/other two tones]: 30/−12, 15/−5, 0/0, −12/5, and −22/12 MIDI velocity units relative to a medium intensity of 50 MIDI velocity units. These intensity combinations were chosen so that the differences in velocity imply the above-named asynchronies according to the velocity artifact (see Section 3.3.3, p. 63, and the model for melody lead, Section 3.7, p. 78). In addition, the pairs of velocities were intended to produce a sonority whose loudness remains approximately constant over the five combinations. The target tone could occur in any of the three voices of the chord (upper, middle, or lower). The listeners’ attention was directed to the target by a priming tone which started

[Musical notation appears here.]

Figure 4.9: Experiments IV and V. The two chord combinations and their transpositions one semitone upwards and downwards.


[Musical notation appears here.]

Figure 4.10: Experiment V. Example of a possible stimulus with a primer for the middle voice. The inter-onset interval of a quarter note is 300 ms (quarter note = 200 beats per minute).

1200 ms before the tested chord and sounded for 600 ms. The intensity of the primer was always constant at a medium intensity of 50 MIDI velocity units.

The test design was therefore: 2 chords (randomly transposed one semitone up

or down) × 3 target voices × 5 intensities × 5 asynchronies = 150 stimuli for each block of the experiment.

The stimuli for Experiment V were the same as in the previous experiment:

2 chords (randomly transposed one semitone up or down) × 3 target voices × 5 intensities × 5 asynchronies. Each chord was repeated five times (see Figure 4.10). The inter-onset interval was 300 ms, except that the last chord came 330 ms after the previous event in order to give the impression of a 4/4 metre. Each chord was faded out shortly before the new chord started (21 samples at a 44100 Hz sampling rate). The last chord sounded its full length (uncut). The sequence of chords sounded fairly natural, like five portamento chords linked together without pedal. Again, the listeners’ attention was guided by an acoustic primer which came 1200 ms before the stimulus, lasted for 600 ms, and was held at a constant intensity of 50 MIDI velocity units. The time interval between the primer and the first chord was equal to the time interval between the first and the last chord (as represented in standard music notation in Figure 4.10).
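The timing of the chord sequence can be written out explicitly; a small Python sketch of the onset grid and the fade length (all values taken from the text):

```python
FS = 44100  # sampling rate (Hz)

def chord_onsets_ms(n_chords: int = 5, ioi_ms: int = 300, last_ioi_ms: int = 330):
    """Onset times of the repeated chords: 300 ms inter-onset intervals,
    except that the final chord follows 330 ms after the previous one."""
    onsets = [0]
    for i in range(1, n_chords):
        onsets.append(onsets[-1] + (last_ioi_ms if i == n_chords - 1 else ioi_ms))
    return onsets

assert chord_onsets_ms() == [0, 300, 600, 900, 1230]

# The fade-out between successive chords lasted 21 samples,
# i.e. just under half a millisecond:
fade_ms = 21 / FS * 1000
```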

Equipment

The piano sounds were taken from acoustic recordings of tones produced by the Bosendorfer SE290.13 Each of the 97 keys was played with MIDI velocities from 10 to 110 in steps of two MIDI velocity units (see Footnote 14); all tones had a duration of 300 ms in the file. The sounds were transferred digitally onto computer hard disk15 and stored in WAV format (16-bit, 44.1 kHz, stereo).

As a result of the adjustment experiment in Section 4.4.2 (p. 96), the intensity of

the recorded piano samples was referred to in terms of peak sound level in decibels and not in terms of MIDI velocity units. The mean relation between MIDI velocity units and dB peak sound level of the 1700 recorded tones from the Bosendorfer was

13 The Bosendorfer SE290 played back a computer-generated file in the Bosendorfer “triple” file format. The recordings were performed on January 7, 2002 at the Bosendorfer company in Vienna using a TASCAM digital audio tape recorder (DA–P1) and AKG (CK91) stereo microphones with ORTF positioning (see also Section 2.4 on p. 50). For the present study, only the first channel of this recording was included.

14 Using the same velocity map as in Experiments I–III (see Chapter 4.4): MIDI velocity = 52 + 25 · log2(hammer velocity).

15 Using the digital input of a “Creative SB live! 5.1 Digital” soundcard.


Figure 4.11: Screen shot of the graphical user interface used for Experiments IV and V.For the participants, the experiments were numbered starting with I. In this figure thefirst repetition is displayed (Ia).

approximately

pSL = −77.19 + 26.11 · log10(MIDI vel) + 5.277 · [log10(MIDI vel)]²     (4.2)

(see Section 4.4.2, p. 96). For each stimulus tone, the sample that was closest in peak sound level was chosen out of the pool of recorded sample tones. Since the peak sound level increment between one sample and the next louder (or softer) sample was comparatively small (see Figure 2.20 on p. 51), the introduced rounding error was negligible. All tones were added up on hard disk to avoid sympathetic resonances. The sound samples were all faded out after 500 ms (fade-out time 15 ms) so that the overall duration of each stimulus did not exceed 600 ms.
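The closest-sample lookup can be sketched as follows, using Equation 4.2 for the level of each recorded velocity; the pool here merely mirrors the description above (velocities 10–110 in steps of two) and is not the original implementation:

```python
import math

def peak_sound_level_db(midi_velocity: float) -> float:
    """Fitted peak sound level (Equation 4.2)."""
    lv = math.log10(midi_velocity)
    return -77.19 + 26.11 * lv + 5.277 * lv ** 2

# Level of every recorded sample velocity (10-110 in steps of two):
POOL = {v: peak_sound_level_db(v) for v in range(10, 111, 2)}

def closest_sample(target_db: float) -> int:
    """MIDI velocity of the recorded sample whose peak level is closest
    to the requested level."""
    return min(POOL, key=lambda v: abs(POOL[v] - target_db))

# Requesting the level of an unrecorded odd velocity picks a neighbour:
assert closest_sample(peak_sound_level_db(51)) in (50, 52)
```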

Procedure

A graphical user interface was designed for Experiments IV, V, and VI by the author; a screen shot of it is depicted in Figure 4.11. The computer program guided the users through the experiment. They received instructions and a short training session before each task.


Each stimulus was preceded by an acoustic primer indicating the pitch of the target tone. At the same time, the chord was presented in musical notation with an arrow pointing at the target tone. In these two experiments, the participants were asked: “How loud does the target tone/voice sound to you (in comparison to the other tones)?” They answered by clicking on seven radio buttons representing a 7-point scale from 1, “very much softer,” through 4, “equally loud,” to 7, “very much louder” (see Figure 4.11). Each stimulus could be repeated as often as the listeners wished (using the “Play again” button) until they were sure about their judgement. The next stimulus was played when the “Next” button was pressed.

A short training session familiarised the participants with the stimulus material and the graphical user interface. In the training session, only extreme cases (very loud, very soft) were presented to the participants. They were supervised by the computer, so that they had to revise their rating when it was opposite to the expected one (e.g., when the target tone was played very softly and they rated it “very much louder”). This feedback loop was introduced to ensure that the participants rated the correct tone in the chord. They had to rate 10 stimuli ‘correctly’ before they could proceed to the actual experiment.
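The supervised training amounts to a simple gate; a hypothetical Python sketch (the function names and the handling of revised ratings are invented for illustration):

```python
def rating_consistent(rating: int, target_was_loud: bool) -> bool:
    """On the 7-point scale (4 = equally loud), a very loud target
    should be rated above 4 and a very soft one below 4."""
    return rating > 4 if target_was_loud else rating < 4

def trials_until_passed(trials, criterion: int = 10) -> int:
    """Number of training trials presented until `criterion` consistent
    ratings have been given (inconsistent ratings do not count here)."""
    correct = presented = 0
    for rating, target_was_loud in trials:
        presented += 1
        if rating_consistent(rating, target_was_loud):
            correct += 1
            if correct == criterion:
                break
    return presented

assert trials_until_passed([(7, True)] * 10) == 10
```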

The user interface played the stimuli in random order and stored the stimulus information and the participants’ ratings on hard disk. The stimuli were presented diotically (same signal in each ear)16 via headphones (Sennheiser HD 25–1).

The participants completed two blocks of Experiment IV, one before and one after Experiment V. The order of experiments within the session was thus IVa, V, IVb, VI. After having finished the whole listening test, they filled in a questionnaire indicating their age and musical training and giving feedback about the test. The whole listening test took between 60 and 90 minutes. The participants were paid 15 Euros for their services.

4.5.3 Results and discussion

The participants had to rate the 150 stimuli three times: once for Exp. IVa, once for Exp. V, and a third time for Exp. IVb. Thus, they became more and more familiar with the stimulus material. This effect is reflected in the duration of each block and in the number of repetitions of each stimulus.

The participants typically needed about 24 minutes for the first block of Experiment IV, 19 minutes for Experiment V, 18 minutes for the repetition of Exp. IV, and 8 minutes for the last experiment. They heard each stimulus more than twice in the first round of Experiment IV, fewer than twice in its repetition and in Exp. V, and only slightly more than once (the minimum possible) in Experiment VI (see Figure 4.12).17

16 This holds for the stimuli of Experiments IV and V. The stimuli of Experiment VI were presented in stereo (as recorded from the SE grand piano); see Section 4.6, p. 118.

17 Two separate repeated-measures analyses of variance (ANOVAs) on the average number of



Figure 4.12: Experiments IV–VI. Number of stimulus repetitions averaged over all stimuli and participants (left panel) and average duration (right panel), separately for the four experimental blocks. The experimental blocks are sorted in the order of their appearance in the listening test session.

Participants arrived at their judgements faster in Experiment V and were then trained to cope faster (and presumably more accurately) with the second block of Experiment IV. This agrees with oral feedback from the participants: 12 of them indicated that Experiment V was easier, that the repetition of Exp. IV was easier than its first block, but “exhausting” as well, and that Exp. VI was “relaxing” and “a relief” after the artificial stimuli (see Section 4.6, p. 118). We can therefore assume that the participants completed Exp. IVb with higher skill, after being trained by the repeating chords of Exp. V, but probably also with some noise in the answers due to fatigue and decreasing concentration.

Effects of intensity balance and asynchrony

Experiment IV To investigate the main issues of this experiment—the effects of voice position, asynchrony, and relative intensity on the perceptual salience—two repeated-measures ANOVAs were performed separately on the ratings of the two repetitions of Experiment IV, with voice (upper, middle, and lower), asynchrony (5-fold), and intensity (5-fold) as within-subject factors and instrument (pianist, non-pianist) as a between-subject factor. The data for these ANOVAs are listed in Table A.5, p. 177 and Table A.6, p. 178. There was no significant difference between the ratings of the pianists and the ratings of performers of other instruments ([F (1, 24) = 3.68, p = 0.067], [F (1, 24) = 0.12, p = 0.73] for IVa and IVb, respectively), nor did any of the interactions between instrument and the other variables reach significance. Familiarity with piano sounds can therefore be excluded as having an influence on the perception of the loudness of piano sounds.

repetitions and on the average duration by experiment (4a, 4b, 5, and 6) revealed significant effects of experiment [F (3, 23) = 19.28, p < 0.001 and F (3, 23) = 89.93, p < 0.001, respectively]. Bonferroni post-hoc tests confirmed that all means differed significantly from each other except those of Experiments 4b and 5 (for both dependent variables). These two variables are related: the more repetitions a listener wishes to hear, the longer it takes to finish the experiment.

4.5. Perception of asynchronous and unbalanced tones in chords 111

Effects of voice [F (2, 48) = 59.18, εG.G. = 0.86, padj < 0.001], [F (2, 48) = 43.08, εG.G. = 0.97, padj < 0.001], asynchrony [F (4, 96) = 11.48, εG.G. = 0.84, padj < 0.001], [F (4, 96) = 6.52, εG.G. = 0.86, padj < 0.001], and intensity [F (4, 96) = 495.69, εG.G. = 0.81, padj < 0.001], [F (4, 96) = 349.51, εG.G. = 0.45, padj < 0.001] were significant in both blocks of the experiment (values for IVa and IVb, respectively). The five intensity combinations were rated as expected: the louder the target tone, the louder it was rated. The lowest voice was generally rated higher than the other voices. The range of all ratings was larger when the target tone was in the highest voice. The range of ratings became smaller in the repetition of this experiment.

The two-way interactions were also all significant,18 even when corrected according to the Greenhouse-Geisser correction for violations of sphericity.

The three-way interaction of voice × asynchrony × intensity was also significant for both blocks of Experiment IV [F (32, 768) = 3.82, εG.G. = 0.39, padj < 0.001], [F (32, 768) = 3.72, εG.G. = 0.33, padj < 0.001]. These interactions are plotted separately for the two blocks in Figure 4.13a/b.

Experiment V As in the previous experiment, a repeated-measures ANOVA on the ratings of Experiment V was conducted with voice (upper, middle, lower), asynchrony (5-fold), and intensity (5-fold) as within-subject factors and instrument (pianist, non-pianist) as a between-subject factor. Again, there was neither a significant effect of instrument [F (1, 24) = 3.37, p = 0.079], nor were any of the interactions between instrument and the other factors significant. Thus, the participants did not rate differently depending on whether they played the piano or another instrument.

The effects of voice [F (2, 48) = 107.72, εG.G. = 0.96, padj < 0.001], asynchrony [F (4, 96) = 33.63, εG.G. = 0.57, padj < 0.001], and intensity [F (4, 96) = 639.15, εG.G. = 0.52, padj < 0.001] were all significant, as well as the 2-way interactions between the repeated-measures factors.19 Again, the three-way interaction of voice × asynchrony × intensity was significant [F (32, 768) = 3.57, εG.G. = 0.35, padj < 0.001]. It is plotted in Figure 4.14.

All effects were similar to the previous experiment. In contrast to Exp. IV, the range of ratings was larger in all voices, although the lowest voice still did not receive

18There were significant two-way interactions of voice × asynchrony [F (8, 192) = 4.80, εG.G. = 0.66, padj < 0.001], [F (8, 192) = 5.98, εG.G. = 0.63, padj < 0.001], voice × velocity [F (8, 192) = 14.53, εG.G. = 0.60, padj < 0.001], [F (8, 192) = 10.23, εG.G. = 0.59, padj < 0.001], and asynchrony × velocity [F (16, 384) = 2.93, εG.G. = 0.58, padj < 0.001], [F (16, 384) = 4.08, εG.G. = 0.57, padj < 0.001] (always for IVa and IVb, respectively).

19There were significant 2-way interactions of voice × asynchrony [F (8, 192) = 2.50, εG.G. = 0.57, padj = 0.040], voice × velocity [F (8, 192) = 15.67, εG.G. = 0.48, padj < 0.001], and asynchrony × velocity [F (16, 384) = 6.37, εG.G. = 0.48, padj < 0.001].



Figure 4.13: Experiment IVa/b. Mean ratings over 26 participants, separately for the two blocks of the experiment (a/b), different voices (panels), intensity combinations (different markers), and asynchronies (x axes). The error bars denote confidence intervals of the means at the 95% level. The asterisks between adjacent temporal events indicate a significant difference between them according to Bonferroni post-hoc tests (∗ p < 0.05, ∗∗ p < 0.01).



Figure 4.14: Experiment V. Mean ratings over 26 participants, separately for different voices (panels), intensity combinations (different markers), and asynchronies (x axis). Error bars denote 95% confidence intervals.

very soft ratings. The effect of asynchrony seems to yield more interpretable results; this effect is therefore discussed in the following.

Post-hoc comparisons In order to evaluate whether the temporal effects in the data reflected significant trends, post-hoc tests were performed according to Bonferroni (on the three-way interactions of the three repeated-measures ANOVAs reported above). Significant differences between temporally adjacent conditions are indicated in Figure 4.13 and Figure 4.14 with asterisks (∗ p < 0.05, ∗∗ p < 0.01). Only a few of the adjacent timing conditions differed significantly from each other, so no conclusive interpretations can be drawn from these tests.

In order to test whether anticipation (−55 and −27 ms) changed ratings in comparison to delay (+27 and +55 ms), these asynchrony conditions were linearly contrasted with each other (asynchrony weights: +1, +1, 0, −1, −1), separately for each intensity combination in each voice and each block of the two experiments (5 × 3 × 3 = 45 contrasts). The results of these contrasts are listed in Table 4.2.
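Such a within-subject linear contrast can be sketched in a few lines. The following is an illustrative reconstruction, not the original analysis code: the function name, the data layout, and the t²-based F computation are assumptions. Each participant's ratings in the five asynchrony conditions are weighted by (+1, +1, 0, −1, −1), and the resulting per-participant contrast scores are tested against zero with a one-sample t-test; the squared t statistic is an F statistic with 1 and n − 1 degrees of freedom.

```python
import math

# Contrast weights: anticipation (+1) versus delay (-1), synchrony ignored (0).
WEIGHTS = {-55: 1, -27: 1, 0: 0, 27: -1, 55: -1}

def contrast_F(ratings_by_participant):
    """ratings_by_participant: list of dicts mapping asynchrony (ms) to one
    participant's mean rating for a single voice/intensity cell.
    Returns (F, df) for the anticipation-vs-delay contrast, computed as the
    squared one-sample t statistic of the per-participant contrast scores
    against zero (a standard way to test a within-subject contrast)."""
    scores = [sum(WEIGHTS[a] * r for a, r in p.items())
              for p in ratings_by_participant]
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t * t, n - 1
```

A positive mean contrast score indicates that anticipated tones were rated louder than delayed ones, matching the direction of the trend reported below.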

In Experiment IV, 5 and 6 of these contrasts were significant (IVa and IVb, respectively); in Experiment V, this number increased to 10. Thus, delayed tones tended to sound softer than anticipated tones of equal asynchrony and intensity. This trend was stronger in the streaming experiment (V). However, the effect was quite inconsistent. In two cases, the post-hoc comparisons revealed significant effects in the opposite direction: in Experiment IVa, middle voice (38/55), and in Experiment IVb, upper voice (50/50), the +27 ms condition was rated significantly louder than the corresponding simultaneous condition.


Table 4.2: Linear contrasts of asynchrony (+1, +1, 0, −1, −1) between anticipation (−55 and −27 ms) and delay (+27 and +55 ms), separately for each intensity combination in each voice and each block of the two experiments (IVa/b and V).

            Upper voice          Middle voice          Lower voice
Intensity    F        p           F        p            F        p

Exp. IVa
80/38        0.2279   0.6479      0.0660   0.7994       3.3539   0.0795
65/45        0.3179   0.5781      1.9761   0.1726       7.9464   0.0095∗∗
50/50        0.0667   0.7983     17.0903   0.0004∗∗     0.0020   0.9649
38/55        4.1516   0.0528      0.0238   0.8786       7.8970   0.0097∗∗
28/62       10.2152   0.0039∗∗    0.2282   0.6372       6.1162   0.0209∗

Exp. IVb
80/38        0.0156   0.9005      6.3255   0.0190∗      0.0064   0.9368
65/45        6.5128   0.0175∗     0.2971   0.5907       0.0612   0.8067
50/50        1.2158   0.2811      9.1419   0.0059∗∗     0.1064   0.7471
38/55        9.4499   0.0052∗∗    5.2994   0.0303∗      0.6748   0.4195
28/62       23.9241   0.0000∗∗    3.6664   0.0675       0.6839   0.4164

Exp. V
80/38        4.9993   0.0349∗     2.9992   0.0961       2.5730   0.1218
65/45        8.9578   0.0063∗∗   23.0551   0.0000∗∗    15.7706   0.0006∗∗
50/50       10.1842   0.0039∗∗    1.7277   0.2011      32.0123   0.0000∗∗
38/55       36.2793   0.0000∗∗   34.1422   0.0000∗∗     3.1109   0.0905
28/62       10.4704   0.0035∗∗   10.5604   0.0034∗∗     0.2784   0.6026

Effects of chord, transposition, and voice

Experiment IV The previous section examined the influence of tone balance, asynchrony, and position in the chord on perceptual salience. In this section, the effects of the two types of chords, the three transpositions, and the position of the target tone within the chord on the listeners’ ratings are investigated.

These independent variables were introduced to check for intensity effects of the individual samples involved in these experiments. Since in this (and the next) experiment the tones were chosen from the pool of sampled sounds with respect to their peak sound level (dB), and not according to the MIDI velocity that produced them, it was evaluated here whether the participants rated the loudness of the target tones more with respect to peak sound levels or more with respect to MIDI velocity values. In Figure 2.20 on p. 51, some tones showed (partly considerably) higher sound levels at all dynamic levels compared to others. If a particular sample (A) showed a considerably lower peak sound level for the same MIDI velocity than an adjacent one (B), listeners could rate it (A) louder, because the sample involved in the test was produced by a higher MIDI velocity value than the sample from the adjacent note (B).

A repeated-measures analysis of variance was conducted on the ratings with



Figure 4.15: Experiment IV. Perceived salience (ratings) as an interaction of voices (lines), chords (x axes), and transpositions (panels). Ratings are averaged over the two repetitions of Experiment IV. Error bars denote 95% confidence intervals.

repetition (IVa, IVb), target voice (upper, middle, lower), chord (major, minor), and transposition (−1, 0, +1) as within-subject factors.20 It revealed significant effects of voice [F (2, 50) = 61.1, εG.G. = 0.93, padj < 0.001],21 chord [F (1, 25) = 56.99, εG.G. = 1.00, p < 0.001], and transposition [F (2, 50) = 10.74, εG.G. = 0.84, padj < 0.001], but no significant effect of repetition [F (1, 25) = 0.05, εG.G. = 1.00, p = 0.83]. The interaction of interest between voice, chord, and transposition was highly significant [F (4, 100) = 18.04, εG.G. = 0.89, padj < 0.001]. It is plotted in Figure 4.15.

The three independent variables—chord, transposition, and target voice—interacted significantly, indicating that participants on average perceived the individual tones of the tested sonorities as differently loud. Participants did not rate the two repetitions of this experiment differently. The lower voice was always rated loudest, except for the minor chord transposed one semitone upwards; in that condition it was heard as equally loud as the others. It is not clear whether the different ratings were context effects (e.g., attention attracted to the highest tone in the context at transposition +1) or due to different subjective intensities of the different piano samples.

Experiment V Similarly to the previous experiment, the effects of voice, chord, and transposition were evaluated for Experiment V. A repeated-measures ANOVA was performed on the ratings with target voice (upper, middle, lower), chord (major,

20The other factors (intensity and asynchrony) were averaged out for this analysis in order to reduce the degrees of freedom.

21The adjusted p values were computed according to the Greenhouse-Geisser correction. The corrected degrees of freedom are not reported.



Figure 4.16: Experiment V. Perceived salience (ratings) as an interaction of voices (lines), chords (x axes), and transpositions (panels). Error bars denote 95% confidence intervals.

minor), and transposition (−1, 0, +1) as within-subject factors.22 It revealed significant effects of voice [F (2, 50) = 90.44, εG.G. = 0.92, padj < 0.001], chord [F (1, 25) = 4.97, εG.G. = 1.00, p = 0.035], and transposition [F (2, 50) = 11.14, εG.G. = 0.82, padj < 0.001]. As in Experiment IV, the interaction between voice, chord, and transposition was highly significant [F (4, 100) = 18.4, εG.G. = 0.61, padj < 0.001]. It is plotted in Figure 4.16. The results were very similar to the previous experiment. Participants gave different average loudness ratings to the different tones of the chords, although the samples of Experiments IV and V were identical (see above).

In both Figure 4.15 and Figure 4.16, the lower voice generally received louder ratings, as reported earlier. But chord 2 (the minor chord) in particular was rated softer when it was transposed one semitone upwards. This trend might be due to the attribution of intensity: the lower voice of chord 2 transposed one semitone upwards (note number 58) showed a higher peak sound level than the lower voice of chord 1 at the same transposition (note number 56). This tendency was reflected in the data; that is, the lower voice in chord 2 was rated softer than the lower voice in chord 1 (both at transposition +1). On the other hand, although the upper and the middle voices were the same tones in the two chords (again at transposition +1), they were rated considerably differently in Experiment V (Figure 4.16). This could be an effect of chord type, explainable by the stability of the chord tone within the chord (once the third, once the fifth). Since these findings were not the main focus of this study, they are not evaluated further here.

To summarise this examination of the effects of chord, transposition, and voice: participants rated differently along these independent variables. The effect of voice

22The other factors (intensity and asynchrony) were averaged out for this analysis in order to reduce the degrees of freedom.


(lower voice received higher salience) replicated the analyses performed above. The effects and interactions of chord and transposition cannot be entirely explained here. One possible explanation is the attribution of intensity through peak sound level instead of MIDI velocity. Especially for chord 2, transposition +1, lower voice, a possible explanation can be found in Figure 2.20, p. 51 (see above). The question of whether MIDI velocities or peak sound levels better describe the intensities of recorded sound samples is further examined and discussed in Section 4.7, p. 127.

4.5.4 Conclusion

To conclude, the main cue for the perceived loudness of a tone or voice was intensity; the effect of temporal shifting was relatively small and inconsistent. Synchrony became relevant only when intensity was absent as a cue (voices equally loud) or when the target tone or voice was very soft. In the latter case, anticipation helped to overcome the spectral masking that occurred when the tones were simultaneous. Lower tones or voices were generally rated higher than upper voices. This finding might also be explained by spectral masking: lower tones mask higher tones more than vice versa. There was a small effect of the direction of asynchrony: an early tone received slightly higher salience ratings than a delayed one. The musicians’ ratings did not depend on whether the piano was their main instrument. The already small temporal effects (of Experiment IV) were marginally reinforced through streaming in Experiment V.


4.6 Asynchrony versus relative intensity as cues for melody perception: Excerpts from a manipulated expressive performance (Experiment VI)

4.6.1 Background

This is the last of a series of experiments on the perceptual salience of individual voices in multi-voiced musical contexts. The aim here was to test and replicate previous findings in a real music situation. Manipulated performance files with an artificially added pedal track were played back on the Bosendorfer computer grand piano and presented to listeners via headphones. They had to judge the relative loudness of two selected voices that were varied in intensity (in terms of MIDI velocity units) and asynchrony (shifted back and forth in time) as in the previous experiments.

This experiment was inspired by and designed similarly to Palmer’s fourth experiment (Palmer, 1996, Exp. 4, pp. 46–51). Palmer used the theme of the last movement of Beethoven’s piano sonata op. 109. She tested four melodic interpretations (exaggerated lower voice, lower voice, upper voice, and exaggerated upper voice) of this Beethoven theme performed by one pianist. The four melodic interpretations resulted in increases in melody lead and intensity of the two voices in question: bass and soprano (see Palmer, 1996, p. 41). She presented these four versions to musically trained listeners in three conditions: (1) with all of these cues removed (without timing and intensity), (2) with timing only (intensity removed), and (3) with timing and intensity (original performances). She did not test a condition with intensity variations only (timing removed), which would possibly have affected her findings significantly. The listeners indicated which of the two voices in question was the melody as intended by the performer on a 6-point scale from 1 (“Very sure it’s the upper voice”) to 6 (“Very sure it’s the lower”). She found no difference between conditions 1 and 2, except for expert pianists, who detected the condition with melody lead only (2) slightly better than the neutral one (1).

In the current experiment, all combinations of asynchrony (melody lead and lag) and intensity differences were examined. From the previous experiments, we learned that asynchrony hardly changed the perceptual salience of a voice, regardless of whether it was shifted forwards or backwards in time. The hypothesis in this experiment was the same: temporally shifting the melody is only a minor cue for detecting it as the melody (most salient voice), while differences in intensity (timbre) are the main criterion. A short excerpt (around 20 seconds) of a piece by Chopin was used for the experiment. In addition to the melody (the highest voice), a middle voice was manipulated, that is, a voice that is not particularly emphasised in a normal musical interpretation (see also Chapter 3, p. 57). The artificial versions were computed from a single performance, with all other cues such as articulation, pedalling, and expressive timing held constant over all stimulus conditions.

4.6. Asynchrony versus relative intensity as cues for melody perception 119


Figure 4.17: Average velocity profile (top panel) and inter-onset intervals (IOIs, bottom panel) of the melody (first voice) of the initial bars of Chopin’s Ballade op. 38 as performed by Pianist 5, against score time in bars (see Section 3.3.1, p. 60). These profiles served as baselines for the artificially generated performances of Experiment VI. The numbers in the score against some note heads indicate voice numbers. (Note that the IOI graph just displays the time intervals between pairs of adjacent melody tones in milliseconds, without any correction regarding their nominal length in the score.)

4.6.2 Method

Stimuli

The first 9 bars of Chopin’s Ballade, op. 38 (F major) were chosen as the test excerpt. As the two possible melodic interpretations to be tested, the first and the third voice were selected (see voice numbering in Figure 4.17). To avoid overly artificial-sounding performances, one expressive performance of this piece was taken and modified in order to control the experimental conditions. The timing profile stemmed from the expressive timing of the melody (highest voice) of Pianist 5’s performance of that piece.23 The intensity profile was calculated from the dynamic profile of the melody of the same performance (in terms of MIDI velocity units), but reduced in loudness by half of the average distance between melody and accompaniment. These profiles are plotted in Figure 4.17. They served as the baseline from which the test stimuli were calculated and then played back on the Bosendorfer computer-controlled grand piano.

23The recording session is described in detail in Section 3.3.1, p. 60. This performance was chosen because it was highly rated in informal listening tests by several musically trained listeners.



Figure 4.18: Average dynamics of the first 9 bars of the Ballade performed by 22 pianists (see Chapter 3), separately for the different voices. Error bars denote standard errors of the means. The mapping between MIDI velocity units and (final) hammer velocity (m/s) is as in Section 4.4.2, p. 96.

The two designated melodies were shifted back and forth in time; but unlike in the previous experiments, the two melodies were only increased in loudness, not decreased. The increments in MIDI velocity were obtained from measurements of 22 expressive performances of that piece (see Chapter 3, p. 57). The average dynamic levels in MIDI velocity units are plotted in Figure 4.18, separately for the different voices (compare with Figures 3.5, p. 65 and 3.7, p. 68, respectively). The melody (upper voice) in the ‘normal’ condition (without any specific instruction) was played 12 MIDI units louder than the middle voices (voices 2 and 3), and 24 MIDI units louder in the emphasised condition. When the middle voice (voice 3) was to be played strongly emphasised, it was played only 20 MIDI units louder than the middle voices. The left hand (voices 4 and 5) was always about 10 MIDI units softer than the middle voices.

According to these data, the following loudness combinations were chosen for the melody voice (0, +12, +24 MIDI velocity units) and for the middle voice (0, +10, +20). The accompaniment (voices 4 and 5) was constantly set to −10 MIDI velocity units. All these velocity values were relative to the expressive loudness profile of Pianist 5 (see Figure 4.17). In parallel, the timing was calculated relative to the timing profile displayed in Figure 4.17. The manipulations began at the beginning of the second bar (the opening unison octaves were not manipulated).

Thus, the experimental design was as follows: 2 voices (upper and middle) × 3 loudness combinations (0, +12/+10, +24/+20 MIDI velocity units) × 5 asynchronies (−55, −27, 0, 27, 55 ms) = 30 combinations. To reduce the number of stimuli, only orthogonal and diagonal combinations of asynchrony and intensity were included, resulting in 22 combinations (see Figure 4.19). The two combinations without asynchrony and loudness variation were identical, but were left in the design for symmetry reasons. The duration of one stimulus was 22 seconds.

Figure 4.19: Experiment VI: Test design schema. Two melodic interpretations: upper voice (triangles) and middle voice (circles), three velocity levels of the ‘melodies’, and five asynchronies. Only combinations on the axes and their diagonals are included in the experiment. The grey area sketches the test design of Palmer’s 4th experiment (Palmer, 1996). However, her design is not directly comparable to the present one, because she had 4 interpretations × 3 conditions.
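One way to enumerate these conditions is sketched below. This is an illustrative reconstruction under the assumption that "diagonal" means the magnitude of the asynchrony step matches the velocity level, which reproduces the counts stated in the text (11 conditions per voice, 22 in total); all names are made up for this sketch.

```python
from itertools import product

ASYNC_MS = [-55, -27, 0, 27, 55]       # asynchrony steps, indices -2..+2
VEL_STEPS = {"upper": [0, 12, 24],     # velocity increments per voice
             "middle": [0, 10, 20]}

def design(voice):
    """Conditions on the axes (either factor at its neutral level) plus the
    diagonals (|asynchrony index| equal to the velocity level index)."""
    conds = set()
    for (ai, asyn), (vi, vel) in product(enumerate(ASYNC_MS),
                                         enumerate(VEL_STEPS[voice])):
        a_idx = ai - 2                 # centre the asynchrony index on 0
        if asyn == 0 or vel == 0 or abs(a_idx) == vi:
            conds.add((asyn, vel))
    return sorted(conds)

stimuli = [(v, c) for v in VEL_STEPS for c in design(v)]
# 11 conditions per voice -> 22 stimuli in total
```

Under this rule, off-diagonal cells such as a +27 ms shift combined with the larger +24 MIDI-unit increment are the ones excluded from the full 30-cell factorial.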

The note duration of each note was set to 75% of the corresponding IOI. To give an impression of legato throughout the whole excerpt, an artificial pedal track was added to the computed MIDI files. The pedal was programmed to be released with the onset of each chord and re-depressed again sufficiently (150 ms) before the corresponding note off, except where the harmony remained constant (see Figure 4.20). The locations of pedal changes were determined by the author and occurred in parallel with changes in harmony. This kind of pedalling is called syncopated or legato pedalling (Repp, 1997b). The individual pedal changes (represented in MIDI control values between 0 and 100)24 were modelled by a sine curve from 1/2π to 3/2π for a pedal press and from 3/2π to 5/2π for a pedal release, within a time period of 160 or 240 ms per change, respectively. This time period was informally taken from expressive

24In standard MIDI, the conceptual range of the right pedal is from 0 (released) to 127 (fully depressed). On the Bosendorfer system, it has 256 steps from 0 to 255. In our case, it was sufficient to depress the pedal up to a value of 100.



Figure 4.20: Piano roll display of an excerpt of a stimulus with the first voice shifted forwards in time (−55 ms) and increased in velocity (+24 MIDI velocity units, represented here by darker colour). The individual MIDI velocity values are printed on top of each tone. The continuous line indicates the artificial (right) pedal track, where 0 means released and 127 fully depressed.

performances of the same piece.
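The sine-curve model of the pedal changes can be sketched as follows. This is an illustrative reconstruction, not the original code; the function name, the 10 ms step size, and the mapping onto 0–100 control values are assumptions. The phase runs from 1/2π to 3/2π for a press and from 3/2π to 5/2π for a release, so a press ramps the value smoothly from 0 up to 100 and a release back down.

```python
import math

def pedal_ramp(press, duration_ms, step_ms=10, max_value=100):
    """Return MIDI control values for one pedal change, sampled every
    step_ms over duration_ms. The sine argument runs over half a period
    starting at pi/2 (press) or 3*pi/2 (release); the value is mapped so
    that sin = +1 -> 0 (released) and sin = -1 -> max_value (depressed)."""
    start = math.pi / 2 if press else 3 * math.pi / 2
    n = duration_ms // step_ms
    return [round(max_value * (1 - math.sin(start + math.pi * i / n)) / 2)
            for i in range(n + 1)]

press_curve = pedal_ramp(press=True, duration_ms=160)    # 0 -> 100
release_curve = pedal_ramp(press=False, duration_ms=240)  # 100 -> 0
```

The half-sine gives an S-shaped ramp with zero slope at both ends, avoiding the audible abruptness of a linear pedal jump.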

Asynchrony and intensity variation always started in the second bar, so that the introductory octaves on C remained unchanged in all conditions. The generated MIDI files were converted into the Bosendorfer file format triple (“.kb,” “.lp,” and “.sp” files) and played back on that device.25 The acoustic recording was accomplished with the identical setup and equipment as in Section 4.5.2. The microphones were placed at an imagined player’s position in front of the keyboard.

Procedure

The experiment was carried out with the same graphical user interface as in the previous experiments (see Figure 4.21). After a short training period in which the participants became familiar with the stimuli (3 stimuli had to be rated ‘correctly’ before proceeding to the experiment), they heard the 22 stimuli in random order with the same headphones as above (Sennheiser HD 25–1), but in full stereo quality. Due to the ORTF recording technique, a very elaborate spatial impression emerges in the listener’s mind. The participants saw the music score of the Chopin excerpt with the two voices (voices 1 and 3) marked in colour (red and blue). They were asked to judge the prominence of the two voices by answering the question: “Which voice attracts your attention more?” Answers ranged from 1 (“very much the lower one”) to 7 (“very much the upper one”) via 4 (“the two melodic interpretations sound equally important to me”). The background colour of the text boxes varied

25The recording took place on January 9, 2003 at the Bosendorfer company in Vienna on the 290–3 SE grand piano.


Figure 4.21: Experiment VI: Screen shot of the graphical user interface used for this experiment (the numbering in the experimental session differed from the numbering in this thesis). The two voices are indicated by colour (red, blue); the rating scale correspondingly converged between these two colours.

according to the colours used for indicating the two voices. Each stimulus could be repeated an unlimited number of times.

4.6.3 Results and discussion

As reported in Section 4.5.3 (see also Figure 4.12, p. 110), the participants indicated that this experiment was the easiest, due to its comparatively naturally sounding stimulus material. They needed 8:10 minutes (s.d. = 2:15 minutes) to accomplish it, while repeating each stimulus 1.23 times on average (s.d. = 0.22). However, two participants (cello, piano) found this experiment to be the most difficult of the test session.

The mean ratings are plotted in Figure 4.22 separately for the two voices (panels),

three intensity combinations (different markers), and five asynchronies (x axes). The asterisks next to the error bars indicate significant differences from the neutral rating according to t-tests for single means (∗ p < 0.05, ∗∗ p < 0.01). The conditions with the two voices equally loud were rated significantly lower than a rating of



Figure 4.22: Experiment VI. Mean ratings over 26 participants, separately for the different voices (panels), intensity combinations (different markers), and asynchronies (x axes). Error bars denote confidence intervals at the 95% level. The asterisks next to the error bars mark significant differences from a rating of 4 (“The two melodic interpretations sound equally important to me”) according to t-tests for single means (∗ p < 0.05, ∗∗ p < 0.01). Results of Bonferroni post-hoc tests between adjacent asynchrony conditions are marked either by asterisks or ‘n.s.’ (non-significant). Non-adjacent asynchrony conditions are marked only if they were significant (here, only in the right panel).

four (“They sound equally important to me”) in the simultaneous conditions and when the middle voice appeared earlier than the upper. As expected, all the other intensity conditions differed significantly from a rating of four. It is unclear whether the middle voice was in fact louder than the upper voice (even though the two were equally loud in terms of MIDI velocity units) or whether musically trained participants expected the upper voice (melody) to be louder in this musical context and therefore considered the middle voice to be louder when this expectation was violated.

The intensity combination with the two voices equally loud was tested in a combined two-way repeated-measures ANOVA with asynchrony (5) and voice (2) as within-subject factors and instrument (pianist versus non-pianist) as a between-subjects factor. There was a significant effect of voice [F (1, 24) = 25.6, εG.G. = 1.00, p < 0.001] and a significant interaction between voice and asynchrony [F (4, 69) = 8.36, εG.G. = 0.91, padj < 0.001], but no significant effect of asynchrony alone [F (4, 96) = 2.04, εG.G. = 0.85, padj = 0.1065], nor of instrument [F (1, 24) = 1.8, p = 0.1923].26 Post-hoc tests revealed that all differences in rating

26None of the (2-way or 3-way) interactions between instrument and asynchrony or voice gained statistical significance: voice × instrument [F (1, 24) = 3.61, p = 0.0696], asynchrony × instrument [F (4, 96) = 0.61, p = 0.6507], voice × asynchrony × instrument [F (4, 96) = 0.21, p = 0.9318].


in the first voice did not differ significantly from each other. In the middle voice, only the −55 ms and the −27 ms conditions showed a significant difference from the +27 ms condition, and the −55 ms from the +55 ms condition (as sketched in Figure 4.22, right panel). None of the adjacent asynchrony conditions differed significantly in the middle voice.

As in the previous experiments, linear contrasts were calculated on anticipation and delay (asynchrony: +1, +1, 0, −1, −1) for the two voices separately. They both showed significant effects. The upper voice [F (1, 24) = 6.48, p = 0.0178] and the middle voice [F (1, 24) = 36.18, p < 0.001] showed significantly different ratings between anticipation and delay. In both voices, anticipation enhanced and delay attenuated the loudness ratings when the two voices were equally loud.
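Such a linear contrast can be computed per participant and tested against zero with a one-sample t-test. The following is a minimal sketch of that procedure, not the analysis code actually used in the thesis; the function name and any example data are ours:

```python
from math import sqrt

def contrast_t(per_subject_means):
    """One-sample t statistic for the linear contrast (+1, +1, 0, -1, -1)
    over the five asynchrony levels (-55, -27, 0, +27, +55 ms), i.e.
    anticipation minus delay, computed per participant and tested
    against zero (df = n - 1).

    per_subject_means: one 5-element list of mean ratings per participant.
    """
    weights = (1, 1, 0, -1, -1)
    scores = [sum(w * m for w, m in zip(weights, subj))
              for subj in per_subject_means]
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean / sqrt(var / n)
```

A positive t indicates that anticipation received higher ratings than delay, matching the direction of the effect reported above.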

Due to the reduced experimental design, four repeated-measures analyses of variance were conducted on the ratings with asynchrony (3) as within-subject factor and instrument as a between-subjects factor, separately for the intensity combinations +12/+10 and +24/+20. The results of the two ANOVAs concerning the upper voice conditions (see Figure 4.22, left panel) yielded effects of asynchrony,27 but no effects of instrument.28 Bonferroni post-hoc tests showed the delayed voices to be rated significantly higher than the simultaneous conditions, but not the opposite (see Figure 4.22): the anticipation of the melody was not rated significantly differently from the simultaneous condition.

On the other hand, the results of the two ANOVAs concerning the middle voice conditions (Figure 4.22, right panel) yielded neither significant effects of asynchrony,29 nor any effects of instrument.30

If the middle voice was equally loud in terms of MIDI velocity units, it tended to be rated as more prominent than the (equally loud) upper voice. Its anticipation could enhance its prominence only in comparison to the delayed conditions. But if the hammer velocity of the middle voice was increased, the already small effects of asynchrony disappeared. Then it was only the loudness of an already dominant middle voice that controlled its perceptual salience.

During the setup of the experiment, the author followed average measurement results of several recordings (see Figure 4.18, p. 120). While listening to the stimuli as played back by the Bosendorfer, the left hand always sounded very loud. This

27Effects of asynchrony for the +12 condition [F (2, 23) = 3.64, p < 0.05] and for the +24 condition [F (2, 23) = 3.6, p < 0.05].

28Effect of instrument in the +12 condition [F (1, 24) = 0.35, p = 0.56] and in the +24 condition [F (1, 24) = 0.45, p = 0.51]. Only in the +12 MIDI velocity units condition did the interaction between asynchrony and instrument reveal that pianists rated the simultaneous condition slightly lower than the others in comparison to the non-pianists.

29Effects of asynchrony for the +10 condition [F (2, 23) = 1.02, p = 0.38] and for the +20 condition [F (2, 23) = 0.14, p = 0.87].

30Effect of instrument in the +10 condition [F (1, 24) = 2.03, p = 0.17] and in the +20 condition [F (1, 24) = 0.55, p = 0.47]. Only in the +10 MIDI velocity units condition was there a significant interaction between asynchrony and instrument [F (2, 23) = 4.22, p < 0.05], but post-hoc tests did not disclose any significant effects to interpret.


corresponded also to what participants reported in personal communication after the experiment or in the questionnaire. They sometimes expected the upper voice to sound even louder, whereas they found the middle voice always too dominant and the left hand too strong. This was surprising considering that the middle voice was emphasised only in steps of 10 MIDI velocity units instead of the steps of 12 units used in the upper voice. In addition, the ratings were not at all asymmetric (e.g., they did not go only down to 2 but up to 7). Outer voices tended to receive greater perceptual attention (Palmer and van de Sande, 1993; Palmer and Holleran, 1994; Repp, 1996c), but they also required expressive emphasis to fulfil the perceptual expectations of musically trained listeners. Evidence was given here only for an upper voice; the behaviour and expectations for a bass voice might be very different.

The loudness of all the voices followed the average velocity profile at a fixed distance (in terms of MIDI velocity units). A human performer is certainly more flexible in the timbral shaping of the individual voices than this coarse algorithm. The pedal, used continuously in the stimuli, might also be responsible for reinforcing the lower voices more than the upper ones. It might be interesting for future performance-rendering approaches that lower voices need much less emphasis than those of higher pitch, especially when the right pedal is involved.

It can be summarised here that the effects of asynchrony were small compared to the effects of intensity. When intensity was missing as a cue, anticipation could lead to a slightly enhanced perception of a voice (in our data more in the middle than in the upper voice), but when the voices were played louder, asynchrony became a minor cue (especially in the middle voice).

But still, there were effects of asynchrony in the data. (1) When the two voices were equally loud, anticipation increased the ratings significantly in comparison to delay in both the upper and the middle voice. (2) In contrast to findings of the previous experiments, delay was taken as a cue for attraction more than anticipation in conditions with an emphasised upper voice. The typical melody lead condition (−27 ms and +12 MIDI velocity units in the upper voice) was not rated significantly differently from the simultaneous condition, but the corresponding delay was! No analogous effect was observed in the middle voice.

This experiment took the opposite way of obtaining the stimulus material from Palmer (1996). She reduced original expressive performances by a professional pianist by excluding particular cues stepwise. It can be assumed that other cues such as articulation and pedalling also varied with the different melodic interpretations and thus also served as cues for melody detection. In the present study, the ‘expressive’ cues to be tested were added to a prototypical expressive baseline (derived from one professional pianist). With the present procedure, it was possible to exclude all other possibly influencing factors such as articulation and pedalling. Nevertheless, small trends of timing could be found in the present study.


4.7 Model

4.7.1 Introduction

In this section, we compared the results of the last three experiments, which featured different experimental designs, in order to develop a comprehensive theory to explain the data. The final question of this study is to evaluate the relative influence of each of the varied expressive cues on the listeners’ ratings. To this end, multiple regression models were fitted to the data of Experiments IVa, IVb, V, and VI separately. The stimuli of Experiments IV and V shared the same design; Experiment VI was slightly different. The models to be developed in this section assumed that the independent variables related directly to the perception of salience by the participants as represented in their ratings.

4.7.2 Input of the models

The input of the models consisted of the independent variables by which the stimuli were created and the responses by the participants. For Experiments IVa, IVb, and V, voice (1: upper, 2: middle, 3: lower), (signed) asynchrony (−55, −27, 0, 27, 55 ms), and intensity (five levels from 1: 28/62 to 5: 80/38 MIDI velocity units, see Figure 4.13, p. 112 and Figure 4.14, p. 113) were the relevant variables by which the stimuli were created. One research question was whether or not anticipation of tones had the same effect on their perceived salience as delay. Therefore, another independent variable was introduced that referred to the amount of asynchrony independently of direction (unsigned asynchrony); it was the absolute value of asynchrony (55, 27, 0 ms).

The intensity values in these experiments were chosen from the database of sound samples with respect to their peak sound level (dB) and not according to the MIDI velocity that produced a given tone. This decision was based on the results of Experiment I, where listeners adjusted piano tone pairs to equal loudness that were roughly equal in peak sound level.31 In order to evaluate how much the hammer velocities contributed to the listeners’ ratings, the (absolute) MIDI velocities of the rated tones were introduced as an alternative independent variable of dynamics (MV, with values between 0 and 127). The two independent variables referring to the dynamics of the stimuli were examined in separate models per experiment.

Multiple regression models were fitted separately onto the ratings32 of Experiments IVa, IVb, and V with voice, asynchrony, unsigned asynchrony, and either

31In the stimulus material, a certain velocity combination was sometimes composed of tones produced by slightly different MIDI velocities on different pitches; e.g., 50/50/50 MIDI velocity units corresponding to −17.6/−17.6/−17.6 dB–pSPL were realised at f#–d#–b (chord 1, transposition −1, see Figure 4.9, p. 106) with 56/64/68 MIDI velocity units, at g–e–c with 72/58/68, and at ab–f–db with 84/66/78.

32The ratings were averaged across the two chord types, but not across participants, to include the entire between-subject variance.


intensity, or MIDI velocity (MV) as predictor variables, resulting in six separate regression models for these experiments. For Experiment VI, the model design was slightly different. Voice (1: upper, 2: middle), signed asynchrony (as before), unsigned asynchrony (as before), and intensity (1: 0, 2: +12/+10, 3: +24/+20 MIDI velocity units, see Figure 4.22, p. 124) served as independent variables.33 The ratings of Experiment VI (Ro), ranging from 1 (“the lower voice attracts my attention more”) to 7 (“the upper”), with 4 (“they both sound equally loud to me,” see Figure 4.21, p. 123), were modified for the model so that perceived equality of the two voices was zero on the rating scale and the maximum perceived attraction to a specific voice was 3, irrespective of voice [Rating_new = abs(Rating_old − 4)]. This modification was intended to account for the asymmetry of the rating scale.
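The folding of the bipolar rating scale described above can be written as a one-line transformation (the function name is ours):

```python
def fold_rating(rating):
    """Fold the bipolar 1-7 scale about its neutral midpoint (4):
    0 = the two voices sound equally important, 3 = maximal attraction
    to either the upper or the lower voice."""
    return abs(rating - 4)
```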

Thus, the models had the form

    Rating = I + Σ(i=1…4) Bi · Vi,        (4.3)

with I being the intercept, Bi the individual coefficients for the independent variables as listed in Table 4.3, and Vi the variables as described above.34
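Fitting a model of this form amounts to ordinary least squares over the predictor columns. The sketch below is a generic implementation of that technique, not the thesis’s actual analysis code; the function name and the example data in the test are ours:

```python
def fit_ols(predictors, ratings):
    """Ordinary least squares for Rating = I + sum_i Bi * Vi.

    predictors: one row [V1, ..., Vk] per observation; ratings: list of
    observed ratings.  Returns [I, B1, ..., Bk] by solving the normal
    equations (X'X) b = X'y with Gaussian elimination."""
    rows = [[1.0] + list(r) for r in predictors]  # prepend intercept column
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(k)]
    for col in range(k):                          # forward elimination
        piv = max(range(col, k), key=lambda i: abs(A[i][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for i in range(col + 1, k):
            f = A[i][col] / A[col][col]
            for j in range(col, k):
                A[i][j] -= f * A[col][j]
            c[i] -= f * c[col]
    b = [0.0] * k                                 # back substitution
    for i in reversed(range(k)):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b
```

The unstandardised coefficients B reported in Table 4.3 are the direct output of such a fit; the standardised β values additionally rescale each coefficient by the ratio of predictor to criterion standard deviation.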

4.7.3 Results and discussion

The results of the multiple regression models with intensity as the fourth independent variable are listed in Table 4.3a, the results for those models with MIDI velocity in Table 4.3b.

Experiments IVa, IVb, and V

Intensity All models for Experiments IVa, IVb, and V managed to explain more than 76% of the rating data. In these six models, most of the variance was explained by the two intensity variables (with β values of 0.87–0.90, while the largest β value of the other independent variables was 0.11). The models involving MIDI velocity as the fourth independent variable showed slightly higher R2s for Exp. IVa and Exp. V by comparison to those involving intensity (although the difference might be insignificant), but almost equal values in Exp. IVb. Thus, in Exp. IVa and V, the MIDI velocity numbers explained the results slightly better than a simple numbering of the five intensity conditions. This was not true for the model fitted onto the ratings of Exp. IVb. However, the two loudness variables were highly correlated with each other (r = 0.97∗∗), suggesting that attributing intensity once by peak sound level and once by MIDI velocity value did not make a great difference to the results.

33In this experiment, intensity was controlled by MIDI velocity anyway, so no alternative variable had to be introduced.

34Experiments IVa, IVb, and V were modelled once with intensity and once with MIDI velocity as the fourth independent variable.


Table 4.3: Results of the multiple regression models fitted onto the rating data of Experiments IVa/b, V, and VI (see p. 112 and 113). The fourth independent variable was either intensity (a) or MIDI velocity (b). Highly significant independent variables (p < 0.01) are indicated by ∗∗.

(a)
                        Exp. IVa             Exp. IVb             Exp. V               Exp. VI
                        N = 1950             N = 1950             N = 1950             N = 572
                        F(4,1945) = 1861.0   F(4,1945) = 1607.7   F(4,1945) = 2278.4   F(4,567) = 185.7
                        p < 0.001            p < 0.001            p < 0.001            p < 0.001
                        R2 = 0.793           R2 = 0.768           R2 = 0.824           R2 = 0.567

   Vi                       β        B           β        B           β        B           β        B
   Intercept                     0.6768∗∗             1.0718∗∗            −0.0074∗∗            −0.7178∗∗
   1 Voice (a)          0.1104   0.2119∗∗    0.1130   0.1977∗∗    0.1138   0.2492∗∗    0.1293   0.2587∗∗
   2 Signed asynchrony −0.0373  −0.0015∗∗   −0.0339  −0.0015∗∗   −0.0792  −0.0037∗∗   −0.0533  −0.0014
   3 Unsigned asynch.   0.0312   0.0024∗∗    0.0189   0.0013      0.0259   0.0022∗∗    0.0932   0.0043∗∗
   4 Intensity (b)      0.8822   0.9774∗∗    0.8681   0.8469∗∗    0.8968   1.1341∗∗    0.7315   0.8785∗∗

(a) For Exp. IVa, IVb, and V: 1, 2, 3; for Exp. VI: 1, 2.
(b) For Exp. IVa, IVb, and V: 1, 2, 3, 4, 5; for Exp. VI: 1, 2, 3.

(b)
                        Exp. IVa             Exp. IVb             Exp. V
                        N = 1950             N = 1950             N = 1950
                        F(4,1945) = 2031.5   F(4,1945) = 1562.3   F(4,1945) = 2499.8
                        p < 0.001            p < 0.001            p < 0.001
                        R2 = 0.807           R2 = 0.763           R2 = 0.837

   Vi                       β        B           β        B           β        B
   Intercept                    −0.3412∗∗             0.0238              −1.3233∗∗
   1 Voice              0.0964   0.1849∗∗    0.1108   0.1938∗∗    0.1086   0.2379∗∗
   2 Signed asynchrony −0.0096  −0.0004     −0.0071  −0.0003     −0.0632  −0.0029∗∗
   3 Unsigned asynch.   0.0395   0.0030∗∗    0.0321   0.0022∗∗    0.0104   0.0009
   4 MIDI velocity      0.8907   0.0579∗∗    0.8656   0.0532∗∗    0.9043   0.0698∗∗

These results cannot suggest that participants tended to rate the loudness of the stimuli more according to their underlying MIDI velocity, because the differences in the R2 values were too small to draw any conclusions. Moreover, the model of Experiment IVb did not show larger R2 values for MIDI velocities. The evidence of Experiment I, where participants adjusted the perceptual equal-intensity baseline of two simultaneous tones according to their peak sound levels, is still more convincing. Nevertheless, it can be concluded that in all four models intensity explained the major part (86–90%) of the ratings.

Unsigned asynchrony The models of Experiment IV revealed unsigned asynchrony as a significantly contributing variable in four out of six models, independently of whether the target tone was before or after the other tones. This indicated that the further the target tone was shifted away from the chord, the louder it was rated (according to the positive sign of the B values). The importance of this variable changed considerably between models. With MIDI velocity, it was significant for Exp. IVa/b, but not for Exp. V, while with intensity this picture was different (Exp. IVa and V significant, Exp. IVb not). This inconsistent behaviour across the two intensity variables meant that no definite conclusion is possible for this independent variable.

Signed asynchrony Asynchrony was significant in all models that included intensity as independent variable, but explained only a very small portion of the ratings (3–7%, see Table 4.3). The negative sign of β denoted an increase in the perceived salience for anticipated target tones, and an attenuation for delayed ones. The slope of this effect was very small: at −55 ms it altered the ratings by −0.083 in Exp. IVa/b and by −0.204 in Exp. V. The models with MIDI velocity as dynamic reference exhibited asynchrony as significantly contributing in Exp. V, but not in Exps. IVa/b. Although the two model approaches delivered diverse results, they were not contradictory. Both approaches confirmed that streaming, as introduced in Experiment V, increased the effect of relative timing of the target tone, though only to a very small degree.

Voice The position of the target tone in the chordal context played a slightly more important role. In the models of Experiments IVa, IVb, and V, the lower voice received louder ratings (e.g., the ratings of the lowest voice were higher by 0.42 in Exp. IVa and by 0.5 in Exp. V in comparison to the highest voice for the first set of models). This finding coincided with the masking hypothesis: lower tones tend to mask higher tones more than in the opposite direction, so that the higher tones are perceived as softer than they would be when presented alone, assuming that the tones were equally loud.

Experiment VI

The model fitted onto the data of Experiment VI also favoured intensity as the most important contributing variable (with a β value of 0.73 versus 0.13 for voice). Moreover, voice as well as unsigned asynchrony were important predictors of the listeners’ ratings. The independent variable voice had a positive B value. This indicated that with voice = 1 (upper voice) the term 0.2587 enters the prediction and with voice = 2 (middle voice) the term 0.5174, with the effect that the other three (or at least the other two significant) independent variables had a greater (about 25%) impact on the ratings in the middle voice than in the upper voice. This could mean that listeners are more sensitive to the same amount of expressive change in relative timing and intensity in the middle voice than in the upper voice, where they usually expect these expressive variations.

The positive coefficient for unsigned asynchrony denoted that, irrespective of direction, asynchrony helped to attract listeners’ attention to a particular voice. The further away the voice was from the accompaniment, the more attractive it was rated by the participants. However, the B value of this effect was small. The regression model of this experiment did not specify signed asynchrony as a significantly contributing independent variable. This result was somewhat surprising, because especially in a realistic musical context the effect of streaming and asynchrony was expected to be more prominent than in the preceding experiments (cf. Section 4.6, p. 118, as well as Palmer, 1996, Exp. 4).


4.8 General discussion

This chapter focussed on two basic questions: the first investigated what amount of asynchrony can be detected as such by listeners and whether this threshold depends on the type of tone involved. The second tackled the influence of relative onset timing on the perceptual salience of particular tones in chordal musical textures. In a series of three experimental sessions with a total of seven experiments, these questions were investigated under various conditions. Asynchrony perception was tested in the pilot experiment and Experiment III, while the other question of the pilot experiment, as well as Experiments II, IV, V, and VI, focussed on salience perception.

Listeners could tell the correct order of two tones of a piano dyad for asynchronies greater than 30–40 ms. This threshold decreased marginally the more artificial the stimuli became (synthesised piano, sawtooth, pure tones, cf. Section 4.3, p. 87). This threshold was somewhat larger than found in other studies with artificial stimuli (e.g., Hirsh, 1959). The more striking result came from Experiment III (Section 4.4.4, p. 100), where the two tones of the dyads were also manipulated in intensity. In this experiment, participants tended to perceive the two tones as synchronous even with asynchronies as large as 55 ms, provided that the earlier tone was also louder. This asymmetry was stronger for more complex sounds (sawtooth, real piano) than for pure tones.

It is still unclear whether masking phenomena (i.e., forward masking) and/or familiarity with piano sounds were responsible for this asymmetry in the ratings. Listeners noticed and detected asynchrony only in unfamiliar combinations of relative intensity and asynchrony (e.g., an early and soft tone). But if familiarity with piano sounds were important for this effect, why did the non-pianists in this study not rate these stimuli significantly differently, since they were expected to have less acquaintance with piano sounds and typical combinations of loudness and timing? The other explanation is more psychoacoustic in nature: listeners assign simultaneity to early–loud combinations simply because they cannot, or can only hardly, hear the onset of the second and weaker tone. This effect is best explained by forward masking (Zwicker and Fastl, 1999), in which a louder tone attenuates the hearing threshold of a following (softer) tone within a time interval that is comparable to the asynchrony of typical melody leads (some tens of milliseconds). However, masking phenomena are too complex, especially in real piano sounds, to be predicted by existing models for the present stimulus material. Therefore, it was hard to state any more precise assumptions about the extent of temporal and spectral masking that occurred in the present stimuli.

To conclude, in the light of this experiment the melody lead phenomenon, where the onsets of a more intense melody temporally precede the onsets of the softer accompaniment, has to be interpreted differently. Recall that Hirsh (1959) found asynchronies of the order of 20 ms to be easily perceived as asynchronous; even the order of the two tones could be determined by the listeners at that threshold. With the results of Experiment III, it now seems evident that asynchronies of the order of some 30 ms (as the melody lead phenomenon typically exhibits) are not heard as asynchronous by musically trained listeners.

The second question focussed on how perceived salience may be altered by relative onset asynchrony. Five experiments were exclusively dedicated to this question. Their stimulus material became acoustically and musically more and more complex: it developed from equally loud dyads, over dyads with intensity variation, three-tone chords, and sequences of three-tone chords, to a short excerpt of a piece by Chopin. The fundamental result of all five experiments was the same: asynchrony had only a small and inconsistent effect on the perceived salience of the tone or voice in question. There were some effects in some experiments, but sometimes they contradicted results found in the preceding experiments. Generally, the perceived salience depended primarily on differences in the relative intensity of the tones, while asynchrony altered it only marginally and sometimes inconsistently.

Effects of relative onset timing became relevant when the target tones or voices were softer than the other chord tones, so that they were masked by them. Masking occurred when the tones were simultaneous and when the softer target tone came late. In these cases, early tones (anticipation) helped to overcome masking. This masking explanation was also consistent with the finding that the perceived salience was greatest (thus masking attenuation lowest) for a lower target tone and weakest for a higher target tone (a lower tone masks a higher one more than in the opposite direction; cf. also the linear contrasts between anticipation and delay as listed in Table 4.2, p. 114).

In general, anticipation of tones slightly increased their salience ratings while delay attenuated them. This trend was reflected also in the models fitted onto the rating data. The asymmetry between early and late was greater, and thus the slope of the line of best fit steeper, in Experiment V than in Experiment IV (see Table 4.3, p. 129). This evidence is explained by the chord repetitions introduced in Experiment V. This finding was consistent with the streaming hypothesis as advanced by Bregman and Pinker (1978); however, its effect was small relative to the effect of intensity.

On the basis of these explanations, it was surprising that this effect did not become stronger with the real music excerpt by Chopin in Experiment VI. There, delay as well as anticipation tended to enhance perceptual salience about equally (effect of unsigned asynchrony). It might be that in a real music situation other performance parameters, such as articulation or pedalling, play an important role in specifying a melody voice. Thus, asynchrony would help to perceptually separate the voices, but only in conjunction with the other performance parameters.

The stimuli for this last experiment were deliberately based on a single expressive performance by one pianist, that is, all other performance parameters were held constant over the stimulus conditions (see Section 4.5.2, p. 106). This procedure was opposite to Palmer’s approach, which removed individual performance parameters (timing, intensity) from different expressive performances with different melodic intentions (cf. Palmer, 1996, Exp. 4). She found that pianists could identify the correct melodic intention slightly better than non-pianists with the intensity cues removed from the stimulus performance (timing only). Since in her stimuli expressive parameters such as articulation and pedalling changed over conditions, this weak effect does not necessarily depend on relative onset timing.

The present study, with a design focussed exclusively on the investigated parameters, showed that relative timing alone did not have a consistent effect on perceived salience. It follows that asynchrony might play an important role in expressing the musical intentions of performers only in combination with the other performance parameters mentioned above (in order to explain the results of Palmer, 1996).

One difficulty in the present experiments was the issue of how to measure the loudness of piano sounds. An intensity baseline adjustment experiment (Experiment I, p. 96) showed that participants perceived simultaneous tone pairs as equally loud when they were equal in their peak sound level, but not in their original MIDI velocities. As the measurements described in Section 2.4 revealed, the connection between MIDI velocities and the peak sound levels of the tones recorded at those MIDI velocities varied strongly with pitch. This finding, together with the results of Experiment I, led to the decision to select the samples for Experiments IV and V according to their peak sound levels. This had the consequence that a certain intensity combination showed different MIDI velocity values at different pitches. Results from the regression models suggest that the absolute MIDI velocity values explained the results as well as a simple numbering of the intensity combinations did.

This issue requires more profound perceptual and acoustic investigation. It is still unclear why the peak sound level changes so much from one tone to the next and why it changes so strongly with microphone position. Furthermore, it needs to be examined how listeners perceive intensity at the piano and how this perception relates to acoustic measures. A possible solution would be to introduce a perceptual model of intensity perception that involves intensity as well as timbral information prototypical for the piano sound. It might be that the effects of asynchrony were so fragile that they were obscured by the above-mentioned uncertainty of assigning intensity to the stimulus material.

As the questions regarding perceived tone salience were formulated in terms of “how loud” particular tones sounded to the listeners, it might be that listeners rated loudness only and tried to cancel out any effects of asynchrony. Some participants indicated after the experiments that it was easy to assign loudness ratings to the various tones as long as they sounded together. So, for future research it might be better to ask “how transparent,” “how distinct,” or “how singing” individual voices sound. Another approach could be to ask how many tones can be detected out of a four- or five-tone chord (cf., e.g., DeWitt and Crowder, 1987).

These points raise the question of whether alterations in relative onset timing result in an enhanced salience of a particular voice or in an increased salience of more than one voice at a time (corresponding to the concept of multiplicity, that is, the number of tones simultaneously noticed in a chordal sonority, Parncutt, 1989, p. 92). A parallel increase of the individual saliences of the voices could lead to a more transparent sonority in which each voice can be tracked distinctly by a listener (Rasch, 1979, 1988). This coincides with Huron’s finding that J. S. Bach maximised onset asynchrony in his polyphonic compositions in order to enhance the perceptual salience of the individual voices (Huron, 1993, 2001).


Chapter 5

Conclusions

This thesis addressed the question of how pianists make individual voices stand out from the background in a contrapuntal musical context, how they realise this with respect to the constraints of piano keyboard construction, and how much each of the expressive parameters used by the performers contributes to the perception of these particular voices. These basic questions were approached from three different methodological directions, represented as three major parts (Chapters 2–4) of this thesis.

First, in a piano acoustics study, a vast amount of data was gathered from three grand pianos produced by different piano makers. These data were collected with an accelerometer setup monitoring key and hammer movements under different touch conditions. This study explored the relationship between the duration of the keystroke (travel time) and the dynamics of the produced tone (in terms of hammer velocity). This relation reflects a simple mechanical constraint: the faster a key is depressed, the shorter the hammer’s travel time to the strings, and thus the louder the produced sound. This basic relation (travel time in ms versus hammer velocity in m/s) was approximated by a power curve and used for the analysis of data collected in the second approach (the melody lead study).
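A power-curve approximation of this kind can be obtained by linear least squares in log-log coordinates. The sketch below illustrates the general technique, not the thesis’s actual fitting procedure; the function name and the coefficient values in any example data are ours:

```python
from math import exp, log

def fit_power_curve(hammer_velocities, travel_times):
    """Fit travel_time = a * velocity ** b by linear least squares in
    log-log space (log t = log a + b * log v).  Returns (a, b); for the
    relation described in the text, b is expected to be negative
    (faster keystrokes mean shorter travel times)."""
    xs = [log(v) for v in hammer_velocities]
    ys = [log(t) for t in travel_times]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = exp(my - b * mx)
    return a, b
```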

The temporal characteristics (travel times, key–bottom contact times, instants of maximum hammer velocity) of the measured grand piano actions varied only marginally among the investigated pianos, hardly at all between different keys, but greatly with the type of touch. When a tone of a certain intensity (hammer velocity) is played, it takes around 30–40 ms less time to produce a sound when the key is hit from a certain distance above (staccato touch) than with a keystroke "from the keys" (legato touch). This finding demonstrates the complexity of what pianists need to be (even unconsciously) aware of when aiming for a desired expressive timing with tones of different intensities and types of touch. Other findings confirmed assertions from the piano education literature. Depressing the keys from the key surface reduces finger–key noise and thus produces a cleaner sound (cf. Gat, 1965). Moreover, legato touch allows closer control of the tone, because the instants of maximal hammer velocity, corresponding closely to the points in time


when the hammer loses contact with the jack, are later (and thus the time intervals of free flight are shorter) relative to a staccato touch of the same intensity. On the other hand, very loud tones can only be achieved with keystrokes from a certain distance above the keys (staccato touch).

Although these studies on the temporal behaviour of three different grand piano actions delivered a huge amount of data, the research is still in progress. A preliminary attempt was made to infer subjective judgements of playability from the temporal behaviour of the grand piano actions. To reach more definite conclusions on the quality of pianos and on how piano actions should be adjusted, further investigations have to be performed. Research of this kind has not adequately integrated the vast knowledge of piano makers and piano technicians (see, e.g., Dietz, 1968) into the research process. Further studies should take advantage of their extensive experience and (sometimes anecdotal) knowledge in order to examine the effective impact of their work (tuning, intonation, regulation) on the playability of grand pianos. For example, Askenfelt and Jansson (1990b, 1991, 1992a) worked in close co-operation with well-regarded piano technicians from the National Swedish Radio and the Swedish Academy of Music. Research in instrumental acoustics, performance research, and the wide empirical expertise of piano makers and technicians should no longer be separate knowledge areas, but interacting fields of interest that mutually benefit from each other. Researchers in instrumental acoustics could systematically investigate the processes of piano tuning, action regulation, and hammer intonation by examining the effect of each individual adjustment made by a technician. Some of the technicians' explanations of craft habits might turn out to be questionable or untrue; others might be confirmed. Such co-operation could lead to a better and more complete understanding of the complex processes involved in expressive piano performance.

The second approach of this thesis was a performance study in which 22 professional pianists played two excerpts by Frédéric Chopin on a Bösendorfer computer-controlled grand piano. The performance data were analysed with respect to tone onset asynchronies and dynamic differences between the principal voice and the accompaniment. The melody was found to precede the other voices by around 30 ms, confirming findings from previous studies (melody lead, cf. Vernon, 1937; Palmer, 1989, 1996; Repp, 1996a). The earlier a melody tone appeared relative to the other chord tones, the louder it was. This evidence supported the velocity artifact hypothesis (Repp, 1996a), which ascribed the melody lead phenomenon to mechanical constraints of the piano keyboard (the louder a tone is played, the earlier the hammer arrives at the strings). In order to test this hypothesis, the relative asynchronies at the onset of the keystrokes (finger–key asynchronies) were inferred through the travel time–hammer velocity relation from the previous study. The inferred key onset differences between the principal voice and the other voices showed almost no remaining asynchrony. This finding suggests that the pianists started the key movements basically in synchrony; the typical asynchrony patterns (melody lead) were caused by the different sound intensities of the different voices. It was concluded that melody lead can be largely explained by the mechanical properties of the grand piano action, rather than being an independent expressive device that is applied (or not) by pianists for purposes of expression (Palmer, 1996).
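The inference from hammer-level asynchronies back to finger–key asynchronies can be sketched as follows, again with an illustrative power-curve approximation (the coefficients are placeholders, not the fitted values from Chapter 2):

```python
def travel_time_ms(v, a=89.2, b=-0.57):
    """Illustrative power-curve approximation of key travel time (ms)
    as a function of final hammer velocity v (m/s)."""
    return a * v ** b

def finger_key_asynchrony_ms(hammer_asynchrony_ms, v_melody, v_accomp):
    """Infer the asynchrony at the start of the key movement from the
    measured asynchrony at the strings (negative = melody early).
    Since key onset = hammer-string contact - travel time, the onset
    difference is the hammer difference minus the travel-time difference."""
    return hammer_asynchrony_ms - (
        travel_time_ms(v_melody) - travel_time_ms(v_accomp)
    )

# A 30 ms melody lead at the strings, produced by a much louder melody tone,
# corresponds to nearly synchronous starts of the key movements:
print(finger_key_asynchrony_ms(-30.0, v_melody=4.0, v_accomp=1.5))
```

Under these assumed coefficients, the louder melody tone's shorter travel time accounts for almost the entire 30 ms lead, leaving a finger–key asynchrony close to zero.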

In a further evaluation of the piano action data collected in the first part of this thesis, the recording and reproducing capabilities of the two computer-controlled pianos were investigated, an issue of crucial importance that had never been tackled in performance research. The recording accuracy in timing turned out to be ±3 ms for the Bösendorfer SE290 and +20/−28 ms for the Yamaha Disklavier (see Section 2.3, p. 36). This suggests that a performance study examining an effect of the order of some 30 ms would not have been possible with a Disklavier like the one used in the accuracy study. The results of, e.g., Repp (1996a) could have been blurred if the Disklavier used in that study (an MX100A upright piano) had properties comparable to the one measured here. However, this consideration remains speculation until Repp's instrument is measured as well. As reported before (p. 40), the recording accuracy could be enhanced by eliminating the trend over tone intensity using a polynomial curve fit (see Figure 2.13, p. 40).
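Such a trend removal amounts to fitting a low-order polynomial to the timing error as a function of tone intensity and subtracting it. The calibration data below are synthetic; only the procedure is illustrated:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic calibration data: MIDI velocity vs. recording timing error (ms).
velocity = np.linspace(20, 110, 50)
error = 0.004 * (velocity - 65.0) ** 2 - 5.0 + rng.normal(0.0, 1.5, 50)

# Fit a polynomial trend of the error over tone intensity and subtract it,
# leaving only the (much smaller) intensity-independent error.
coeffs = np.polyfit(velocity, error, deg=2)
residual = error - np.polyval(coeffs, velocity)

print(error.std(), residual.std())  # the residual spread is clearly smaller
```

The degree of the fitted polynomial and the exact gain in accuracy depend, of course, on the instrument's actual error profile.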

Although the performance study on melody lead (Chapter 3) generated convincing evidence for the role of mechanical constraints in the genesis of melody lead, its perceptual relevance remained to be studied in detail. The third approach of this thesis therefore involved psychological experiments with judgements by trained musicians, investigating how systematic manipulation of the two parameters examined in the previous study (relative onset timing and variation in the tone balance of chords) altered the perception of individual tones in a multi-voiced musical context.

In a series of seven experiments, two main issues were addressed. The first was the threshold for the perception of asynchronies between two or three almost simultaneous tones. This threshold lay at around 30–40 ms for piano tones and was almost independent of tone type (pure, sawtooth, synthesised piano, and real piano), somewhat larger than typical values reported in the literature (Hirsh, 1959). However, it was strongly dependent on the sound levels of the tones involved. Asynchronies as large as 55 ms were perceived as simultaneous by musically trained listeners when the earlier tone was also considerably louder than the later tone. This finding may be explained either by familiarity with piano music (only unfamiliar combinations of relative timing and intensity are detected as asynchronous by listeners) or by forward masking (the loud and early tone attenuates the sensation level of a subsequent softer tone for some tens of milliseconds).

The second investigative direction examined how manipulation of relative onset timing and intensity altered the perceived salience of individual tones. Five experiments investigated the perception of dyads, three-tone chords, and sequences of three-tone chords with and without intensity manipulation, as well as an excerpt of real music by Chopin. In all experiments, the effects of relative onset timing were relatively small, although many factors were simultaneously manipulated throughout the experiments (type of tone, interval, pitch, position of the target tone). The perceived salience depended primarily on the intensity relations of the stimuli. Only


when intensity was absent as a cue, or when the rated tone (target tone) was softer than the rest, did melody lead help the softer and thus masked tone to be heard. This effect was stronger when the target tone was in the upper or middle voice and weakest with the target tone in the bass. Apart from these masking effects, streaming (Bregman and Pinker, 1978) barely changed the loudness ratings in the experimental condition with repeated chords in comparison to the experiment without repeated chords. According to Bregman's theory of auditory scene analysis (Bregman and Pinker, 1978; Bregman, 1990), early tone onsets should help to group those tones into a melodic stream (stream segregation) in a chordal music context. However, this effect was weak in the present study. In the real music excerpt by Chopin, a delayed voice attenuated and an early voice (anticipation) enhanced loudness ratings, but only in conditions with dynamically balanced voices. When intensity variation between voices was included, both delay and anticipation enhanced loudness ratings, in contrast to the findings from the previous experiments.

These unclear effects of asynchrony raise several questions. If anticipation and delay do not clearly affect loudness ratings, why do we find the principal instrument leading by a similar amount of time in various instrumental ensembles other than the piano (Rasch, 1979, 1988)? Previous research found that an early voice catches the attention, because it comes first and is, for a fraction of a second (though hardly perceivable as such), not masked by any other sound. On the other hand, these ensemble asynchronies may be explained by the simple fact that the leading instrument also leads the ensemble and thus appears some tens of milliseconds ahead of the others (just as the beat of some conductors, especially of large orchestras, is often visibly ahead of the orchestra). However, it might be that asynchronies do not enhance the salience of certain voices, that is, their likelihood of being heard as individual voices by listeners (Parncutt, 1989), but rather increase the salience of all voices. In other words, relative timing differences may render a multi-voiced context more transparent. With respect to the present listening experiments, asking whether a tone or voice became louder or softer with time shift may not have been the right question; it might have been better to ask about voice transparency, singing quality, the expressivity of a voice, or even the number of voices perceptually immediately present to the listeners.

In an interview study by Parncutt and Holming (2000), university students of piano performance were largely unaware of the melody lead phenomenon. This is perhaps unsurprising: first, melody lead occurs automatically with dynamic differentiation between voices, and second, it is hardly detected as sounding asynchronous at all. This would explain why the pedagogic literature contains many statements on how to shape chords timbrally or how to emphasise single voices exclusively with reference to tone intensity, but almost never with reference to small timing changes. To complement the quotes by Horowitz (Eisenberg, 1928, cf. p. 73) and Neuhaus (1973, cf. p. 1), an excerpt from an interview by Konrad Wolff with the pianist Alfred Brendel is cited here.


“(...) at the beginning of the ‘Waldstein’ Sonata you have four-voiced chords. If you play them in the manner recommended in the book1 (the soprano and bass leading and the middle voices slightly in the background) you will get a great deal of clarity but a totally wrong atmosphere. The atmosphere of this beginning is pianissimo misterioso (...)”

“In the case of the ‘Waldstein’, it is not daylight but dawn, I would say, not bright energy but mystery – even within the strict rhythmic pulse – and for me that tips the balance in favour of the inner voices. I play the inner voices slightly stronger than the outer voices. That makes the chord sound softer. This is an important matter. If the outer voices are played louder than the inner voices it does not sound pianissimo, no matter how soft you try to play them. The inner voices, in certain positions, give the dolce character, the warmth.” (Brendel, 1990, p. 241, emphasis in original.)

Both pianists, Brendel and Schnabel, referred to how softly or strongly certain keys should be played, under the tacit assumption that chords are played synchronously, or at least with no reference to tone onset synchronisation. They refer to the overall intensity impression, the timbre of the chord, and its “character.” According to the findings of the present studies (velocity artifact hypothesis), the two inner voices would appear some milliseconds earlier in Brendel's performance, because his inner voices are louder, whereas in Schnabel's performance the opposite would be the case. It might be that the dynamic impression and the character of a chord depend at least as much on the relative timing as on the intensity of its tones. However, the relation between the relative timing and intensity balance of a chord and its perceived timbre has to remain a topic for future investigation.

In the production study we found that pianists start the key movements basically simultaneously, based on evidence inferred from a travel time approximation. However, it could be that special playing techniques entail an early onset of the key movement for an emphasised tone. In this context, instructions by Alfred Cortot for the Chopin Etude op. 10, No. 3 are worth considering.

“A definitive rule must be followed without fail while practising this polyphonic technique: i.e. the weight of the hand should lean towards the fingers which play the predominant musical part, and the muscles of the fingers playing an accessory part should be relaxed and remain limp.” (Cortot, 1915, p. 20)

Cortot regarded as one of the main “difficulties to overcome” in that piece the “intense expressiveness imparted by the weaker fingers and the particular position of the hand arising therefrom” (Cortot, 1915, p. 20). With the hand skewed towards the “weaker fingers,” it is possible that with this particular way of realising the task, the key movement starts somewhat before the other fingers depress the accompaniment keys. Verbal descriptions of playing techniques are always difficult for somebody else to implement and realise on the piano, because bodily awareness of muscular and motor processes differs greatly between players. This is usually overcome in a piano lesson by demonstrating a playing technique directly to the student; it is, however, difficult to achieve through books. To address the issue of asynchrony at the finger–key level more conclusively, video recordings of performing pianists from different pianistic schools, or other special studies, would be required.

1 The purpose of the interview was to discuss the teaching of Artur Schnabel as published by Wolff (1979).

Brendel once mentioned deliberate asynchrony as a means of shaping the tone balance of a chord in a special way. In the following passage, he suggests delaying a middle voice (which he regarded as the most meaningful voice) in order to increase its perceptual salience.

“To my ears, the sound of thirds and sixths should often be on the dark side. That means that the lower voice in Schubert and Brahms has to be at least as prominent and expressive as the main voice – particularly in minor keys. If I listen to the slow movement of Schubert's B flat major Sonata, at least the thought that the inner voice is the most meaningful is valuable to me, even if it is not louder than the soprano. Maybe it comes just a split second after the soprano and thus draws imperceptibly a little attention to itself.” (Brendel, 1990, p. 244)

So, Brendel aims for two equally loud tones, with the lower voice brought slightly more into the foreground by delaying its onset. Such a condition was tested in Experiment VI (Section 4.6, p. 118). There, listeners judged the importance of the two voices to be equal to that in the simultaneous condition, but the delayed middle voice lost perceptual salience in comparison to the opposite condition (early middle voice). This Brendel quote suggests three things. First, a dark chord timbre may develop when the lower (middle) voice is at least as loud as the upper voice. This corresponds with my experience in rendering the stimulus performances for Experiment VI, where the lower voices needed to be strongly reduced in order not to cover the single melody. Secondly, it could be that Brendel rotates his hand, as suggested by Parncutt and Holming (2000), to realise what he intended to do (delay despite equally loud voices). Thirdly, Brendel suggests the delay of a middle voice as a delicate expressive means of emphasising a single voice. When listening to a recording of Brendel playing the movement in question,2 the dark timbre of the two melody voices is immediately apparent. However, the mentioned delay of the middle voice could not be heard by the author. Since it is still impossible with present signal processing methods to reliably determine the individual onsets within a piano chord, we cannot further verify Brendel's statements with this recording, nor prove that he might have changed his mind after 18 years.3

2 The second movement (Andante sostenuto) from Schubert's last piano sonata in B flat major, D. 960, Philips Classics, 456 573-2, recorded on June 25, 1997, live at the Royal Festival Hall in London.

All these considerations lead to the question of whether it is possible at all to play chords with differently shaded individual intensities, but without the melody lead effect, thus consciously cancelling out the mechanical constraints of the keyboard construction. To achieve synchrony at the strings, the softer tones of a chord would have to be depressed slightly before the stronger keystrokes. Parncutt and Troup (2002) suggest lifting the finger that plays the louder tone to a certain distance above the keys while setting the other fingers in motion; the lifted finger then gains on the others on the way to the strings (Parncutt and Troup, 2002, p. 296). There is no reason why such a technique should not be learnable; the question is only whether it is worthwhile. Perhaps, if young students practised in this way regularly and thereby sharpened their perception of asynchronies, they would be able to use asynchrony more consciously as an expressive device. It could also be that, in special performance conditions, absolutely simultaneous chords would sound like an interesting expressive alternative.

A possible application for probing the phenomenon of melody lead further would be an electronic keyboard that compensates for the velocity artifact, so that a sound starts later the more strongly the key is depressed.4 How would pianists react to such altered acoustical feedback? Would they be able to notice that melody lead is missing although they play the melody louder? Would the resulting piano sound seem strange to them, because the supposed "singing" quality is missing (cf. Dunsby, 1996, pp. 67–73)?

Although the melody lead phenomenon, as a small part of the spectrum of tone onset asynchrony effects and possibilities on the piano, was thoroughly investigated in this thesis from three different sides (piano acoustics, performance practice, perceptual experiments), there are still many issues to address in future research. Especially in the case of onset asynchronies, it would be fruitful for piano students, piano teachers, and researchers to collaborate and share their knowledge, not only because this issue has not yet been conclusively examined, but also because not all neighbouring effects have been clarified entirely.
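Such a compensating keyboard could, in principle, delay each note-on by the travel time the keystroke "saved" relative to a very soft reference keystroke, so that intensity no longer shifts the sound onset. The sketch below assumes an illustrative power-curve approximation of travel time; the coefficients are placeholders, not measured values:

```python
def travel_time_ms(v, a=89.2, b=-0.57):
    """Illustrative power-curve travel time (ms) for hammer velocity v (m/s)."""
    return a * v ** b

def compensating_delay_ms(v, v_ref=0.5):
    """Delay to add to a tone of hammer velocity v so that every tone takes
    as long from finger-key contact to sound as a soft reference keystroke
    (v_ref): louder keystrokes get longer delays, cancelling melody lead."""
    return travel_time_ms(v_ref) - travel_time_ms(v)

# A loud melody tone would be delayed by several tens of milliseconds more
# than a soft accompaniment tone:
print(compensating_delay_ms(4.0), compensating_delay_ms(1.5))
```

In practice, the mapping would have to be calibrated per instrument, since the travel time of an electronic keyboard (finger–key contact to note-on) differs from the key–bottom and hammer–string contact times measured in Chapter 2.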

3 The interview with Konrad Wolff dates back to 1979.

4 Such a device would have to deal with the temporal properties of an electronic keyboard: its travel time lasts from finger–key contact to the note-on command, which might differ from the key–bottom or hammer–string contact times as measured in Chapter 2.


Bibliography

Allen, F. J. (1913), “Pianoforte touch,” Nature 91(2278), 424–425.

Askenfelt, A. (Ed.) (1990), Five Lectures on the Acoustics of the Piano (Publications issued by the Royal Swedish Academy of Music, Vol. 64, Stockholm).

Askenfelt, A. (1991), “Measuring the motion of the piano hammer during string contact,” Speech, Music, and Hearing. Quarterly Progress and Status Report 1991(4), 19–34.

Askenfelt, A. (1994), “Observations on the transient components of the piano tone,” in Proceedings of the Stockholm Music Acoustics Conference (SMAC’93), July 28–August 1, 1993, edited by A. Friberg, J. Iwarsson, E. V. Jansson, and J. Sundberg (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 79, pp. 297–301.

Askenfelt, A. (1999), personal communication.

Askenfelt, A., Galembo, A., and Cuddy, L. L. (1998), “On the acoustics and psychology of piano touch and tone,” Journal of the Acoustical Society of America 103(5 Pt. 2), 2873.

Askenfelt, A. and Jansson, E. V. (1990a), “From touch to string vibrations,” in Five Lectures on the Acoustics of the Piano, edited by A. Askenfelt (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 64, pp. 39–57.

Askenfelt, A. and Jansson, E. V. (1990b), “From touch to string vibrations. I. Timing in grand piano action,” Journal of the Acoustical Society of America 88(1), 52–63.

Askenfelt, A. and Jansson, E. V. (1991), “From touch to string vibrations. II. The motion of the key and hammer,” Journal of the Acoustical Society of America 90(5), 2383–2393.

Askenfelt, A. and Jansson, E. V. (1992a), “From touch to string vibrations. III. String motion and spectra,” Journal of the Acoustical Society of America 93(4), 2181–2196.


Askenfelt, A. and Jansson, E. V. (1992b), “On vibration and finger touch in stringed instrument playing,” Music Perception 9(3), 311–350.

Behne, K.-E. and Wetekam, B. (1994), “Musikpsychologische Interpretationsforschung: Individualität und Intention,” in Musikpsychologie. Empirische Forschungen, ästhetische Experimente, edited by K.-E. Behne, G. Kleinen, and H. de la Motte-Haber (Noetzel, Wilhelmshaven), vol. 10, pp. 24–32.

Bharucha, J. J. (1983), Anchoring Effects in Melody Perception: The Abstraction of Harmony from Melody, Ph.D. thesis, Harvard University, Cambridge, USA.

Bolzinger, S. (1995), Contribution à l’étude de la rétroaction dans la pratique musicale par l’analyse de l’influence des variations d’acoustique de la salle sur le jeu du pianiste, unpublished doctoral thesis, Institut de Mécanique de Marseille, Université Aix-Marseille II, Marseille.

Bork, I., Marshall, H., and Meyer, J. (1995), “Zur Abstrahlung des Anschlaggeräusches beim Flügel,” Acustica 81, 300–308.

Bortz, J. (1999), Statistik für Sozialwissenschaftler (Springer, Berlin, Heidelberg, New York), 5th revised ed.

Boutillon, X. (1988), “Model for piano hammers: Experimental determination and digital simulation,” Journal of the Acoustical Society of America 83(2), 746–754.

Bregman, A. S. (1990), Auditory Scene Analysis. The Perceptual Organization of Sound (The MIT Press, Cambridge, Massachusetts).

Bregman, A. S. and Pinker, S. (1978), “Auditory streaming and the building of timbre,” Canadian Journal of Psychology 32, 19–31.

Brendel, A. (1990), Music Sounded Out. Essays, Lectures, Interviews, Afterthoughts (Robson Books, London).

Bresin, R. and Battel, G. U. (2000), “Articulation strategies in expressive piano performance,” Journal of New Music Research 29(3), 211–224.

Bresin, R. and Widmer, G. (2000), “Production of staccato articulation in Mozart sonatas played on a grand piano. Preliminary results,” Speech, Music, and Hearing. Quarterly Progress and Status Report 2000(4), 1–6.

Baron, J. G. (1958), “Physical basis of piano touch,” Journal of the Acoustical Society of America 30(2), 151–152.

Baron, J. G. and Hollo, J. (1935), “Kann die Klangfarbe des Klaviers durch die Art des Anschlages beeinflußt werden?” Zeitschrift für Sinnesphysiologie 66(1/2), 23–32.


Bryan, G. H. (1913a), “Pianoforte touch,” Nature 91(2271), 246–248.

Bryan, G. H. (1913b), “Pianoforte touch,” Nature 91(2281), 503–504.

Bryan, G. H. (1913c), “Pianoforte touch,” Nature 92(2297), 292–293.

Bryan, G. H. (1913d), “Pianoforte touch,” Nature 92(2302), 425.

Burns, E. M. (1999), “Intervals, scales, and tuning,” in The Psychology of Music, edited by D. Deutsch (Academic Press, San Diego), 2nd ed., pp. 215–264.

Cadoz, C., Lisowski, L., and Florens, J.-L. (1990), “A modular feedback keyboard design,” Computer Music Journal 14(2), 47–51.

Chaigne, A. and Askenfelt, A. (1994a), “Numerical simulations of piano strings. I: A physical model for a struck string using finite difference methods,” Journal of the Acoustical Society of America 95(2), 1112–1118.

Chaigne, A. and Askenfelt, A. (1994b), “Numerical simulations of piano strings. II: Comparisons with measurements and systematic exploration of some hammer-string parameters,” Journal of the Acoustical Society of America 95(3), 1631–1640.

Cochran, M. (1931), “Insensitiveness to tone quality,” Australian Journal of Psychology 9, 131–134.

Coenen, A. and Schafer, S. (1992), “Computer-controlled player pianos,” Computer Music Journal 16(4), 104–111.

Conklin, H. A. (1996a), “Design and tone in the mechanoacoustic piano. Part I. Piano hammers and tonal effects,” Journal of the Acoustical Society of America 99(6), 3286–3296.

Conklin, H. A. (1996b), “Design and tone in the mechanoacoustic piano. Part II. Piano structure,” Journal of the Acoustical Society of America 100(2), 695–708.

Conklin, H. A. (1996c), “Design and tone in the mechanoacoustic piano. Part III. Piano strings and scale design,” Journal of the Acoustical Society of America 100(3), 1286–1298.

Cortot, A. (Ed.) (1915), Chopin. 12 Studies Op. 10. Student’s Edition (Editions Salabert, Paris).

Dain, R. (2002), “The engineering of the concert piano,” Ingenia 12(May), 20–39, published online at http://www.pianosonline.co.uk/.

Deutsch, D. (1999a), “Grouping mechanisms in music,” in The Psychology of Music, edited by D. Deutsch (Academic Press, San Diego), 2nd ed., pp. 299–348.


Deutsch, D. (1999b), “The processing of pitch combinations,” in The Psychology of Music, edited by D. Deutsch (Academic Press, San Diego), 2nd ed., pp. 349–411.

DeWitt, L. A. and Crowder, R. G. (1987), “Tonal fusion of consonant musical intervals: The oomph in Stumpf,” Perception and Psychophysics 41(1), 73–84.

DeWitt, L. A. and Samuel, A. G. (1990), “The role of knowledge-based expectations in music perception: Evidence from musical restoration,” Journal of Experimental Psychology: General 119(2), 123–144.

Dietz, F. R. (1968), Steinway Regulation. Das Regulieren von Flügeln bei Steinway (Verlag Das Musikinstrument, Frankfurt am Main).

Divenyi, P. L. and Hirsh, I. J. (1974), “Identification of temporal order in three-tone sequences,” Journal of the Acoustical Society of America 56(1), 144–151.

Dixon, S. E., Goebl, W., and Widmer, G. (2002a), “Real Time Tracking and Visualisation of Musical Expression,” in Proceedings of the Second International Conference on Music and Artificial Intelligence (ICMAI2002), Edinburgh, edited by C. Anagnostopoulou, M. Ferrand, and A. Smaill (Springer, Berlin et al.), pp. 58–68.

Dixon, S. E., Goebl, W., and Widmer, G. (2002b), “The Performance Worm: Real time visualisation based on Langner’s representation,” in Proceedings of the 2002 International Computer Music Conference, Göteborg, Sweden, edited by M. Nordahl (The International Computer Music Association, San Francisco), pp. 361–364.

Dowling, W. J. (1990), “Expectancy and attention in melody perception,” Psychomusicology 9(2), 148–160.

Dunsby, J. (1996), Performing Music: Shared Concerns (Clarendon Press, Oxford).

Eisenberg, J. (1928), “Noted Russian pianist urges students to simplify mechanical problems so that thought and energy may be directed to artistic interpretation,” The Musician 1928(June), 11, available electronically at http://users.bigpond.net.au/nettheim/horowitz/horo28.htm.

Fletcher, H. and Munson, W. A. (1933), “Loudness, its definition, measurement and calculation,” Journal of the Acoustical Society of America 5, 82–108.

Fletcher, N. H. and Rossing, T. D. (1998), The Physics of Musical Instruments (Springer, New York, Berlin), 2nd ed.

Friberg, A. (1995), A Quantitative Rule System for Musical Performance, Ph.D. thesis, Department of Speech, Music and Hearing, Royal Institute of Technology, Stockholm.


Friberg, A. and Sundberg, J. (1995), “Time discrimination in a monotonic, isochronous sequence,” Journal of the Acoustical Society of America 98(5), 2524–2531.

Fucci, D., Harris, D., Petrosino, L., and Banks, M. (1993), “The effect of preference for rock music on magnitude-estimation scaling behavior in young adults,” Perceptual and Motor Skills 76(3, Pt 2), 1171–1176.

Fucci, D., Kabler, H., Webster, D., and McColl, D. (1999), “Comparisons of magnitude estimation scaling of rock music by children, young adults, and older people,” Perceptual and Motor Skills 89, 1133–1138.

Fucci, D., McColl, D., and Petrosino, L. (1998), “Factors related to magnitude estimation scaling of complex auditory stimuli: Aging,” Perceptual and Motor Skills 87(3, Pt 1), 836–838.

Fucci, D., Petrosino, L., McColl, D., Wyatt, D., and Wilcox, C. (1997), “Magnitude estimation scaling of the loudness of a wide range of auditory stimuli,” Perceptual and Motor Skills 85, 1059–1066.

Gabrielsson, A. (1987), “Once again: The Theme from Mozart’s Piano Sonata in A Major (K.331),” in Action and Perception in Rhythm and Music, edited by A. Gabrielsson (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 55, pp. 81–103.

Galembo, A. (1982), “Quality evaluation of musical instruments (in Russian),” Technical Aesthetics 5, 16–17.

Galembo, A. (2001), “Perception of musical instrument by performer and listener (with application to the piano),” in Proceedings of the International Workshop on Human Supervision and Control in Engineering and Music, September 21–24, 2001 (University of Kassel, Kassel, Germany), pp. 257–266, http://www.engineeringandmusic.de/.

Galembo, A. and Cuddy, L. L. (1997), “Large grand versus small upright pianos: Factors of timbral difference,” Journal of the Acoustical Society of America 102(5 Pt. 2), 3107.

Geringer, J. M., Fucci, D., Harris, D., Petrosino, L., and Banks, M. (1993), “Loudness estimations of noise, synthesizer, and music excerpts by musicians and non-musicians,” Psychomusicology 12(1), 22–30.

Gillespie, B. (1992), “Dynamical modeling of the grand piano action,” in Proceedings of the International Computer Music Conference (ICMC 1992) (International Computer Music Association, San Francisco), pp. 77–80.


Giordano, N. (1997), “Simple model of a piano soundboard,” Journal of the Acoustical Society of America 102(2), 1159–1168.

Giordano, N. (1998a), “Mechanical impedance of a piano soundboard,” Journal of the Acoustical Society of America 103(4), 2128–2133.

Giordano, N. (1998b), “Sound production by a vibrating piano soundboard: Experiment,” Journal of the Acoustical Society of America 104(3, Pt. 1), 1648–1653.

Giordano, N. and Winans II, J. P. (2000), “Piano hammers and their force compression characteristics: Does a power law make sense?” Journal of the Acoustical Society of America 107(4), 2248–2255.

Goebl, W. (1999a), “Analysis of piano performance: towards a common performance standard?” in Proceedings of the Society for Music Perception and Cognition Conference (SMPC99) (Northwestern University, Evanston, Illinois, USA).

Goebl, W. (1999b), Numerisch-klassifikatorische Interpretationsanalyse mit dem “Bösendorfer Computerflügel”, Magisterarbeit, Institut für Musikwissenschaft, Universität Wien, Wien, available electronically at http://www.oefai.at/~wernerg/.

Goebl, W. (2000), “Skilled piano performance: Melody lead caused by dynamic differentiation,” in Proceedings of the 6th International Conference on Music Perception and Cognition (ICMPC6), Aug 5–10, 2000, edited by C. Woods, G. Luck, R. Brochard, F. A. Seddon, and J. A. Sloboda (Keele University, Department of Psychology, Keele, UK), pp. 1165–1176.

Goebl, W. (2001), “Melody lead in piano performance: Expressive device or artifact?” Journal of the Acoustical Society of America 110(1), 563–572.

Goebl, W. and Bresin, R. (2001), “Are computer-controlled pianos a reliable tool in music performance research? Recording and reproduction precision of a Yamaha Disklavier grand piano,” in Workshop on Current Research Directions in Computer Music, November 15–17, 2001, edited by C. L. Buyoli and R. Loureiro (Audiovisual Institute, Pompeu Fabra University, Barcelona, Spain), pp. 45–50.

Goebl, W. and Bresin, R. (2003a), “Measurement and reproduction accuracy of computer-controlled grand pianos,” Journal of the Acoustical Society of America 114, in press.

Goebl, W. and Bresin, R. (2003b), “Measurement and reproduction accuracy of computer-controlled grand pianos,” in Proceedings of the Stockholm Music Acoustics Conference (SMAC’03), August 6–9, 2003, edited by R. Bresin (Department of Speech, Music, and Hearing, Royal Institute of Technology, Stockholm, Sweden), vol. 1, pp. 155–158.

Goebl, W., Bresin, R., and Galembo, A. (2003), “The piano action as the performer’s interface: Timing properties, dynamic behaviour, and the performer’s possibilities,” in Proceedings of the Stockholm Music Acoustics Conference (SMAC’03), August 6–9, 2003, edited by R. Bresin (Department of Speech, Music, and Hearing, Royal Institute of Technology, Stockholm, Sweden), vol. 1, pp. 159–162.

Goebl, W. and Parncutt, R. (2001), “Perception of onset asynchronies: Acoustic piano versus synthesized complex versus pure tones,” in Meeting of the Society for Music Perception and Cognition (SMPC2001), August 9–11, 2001 (Queen’s University, Kingston, Ontario, Canada), pp. 21–22.

Goebl, W. and Parncutt, R. (2002), “The influence of relative intensity on the perception of onset asynchronies,” in Proceedings of the 7th International Conference on Music Perception and Cognition, Sydney (ICMPC7), Aug. 17–21, 2002, edited by C. Stevens, D. Burnham, G. McPherson, E. Schubert, and J. Renwick (Causal Productions, Adelaide), pp. 613–616.

Goebl, W. and Parncutt, R. (2003), “Asynchrony versus intensity as cues for melody perception in chords and real music,” in Proceedings of the 5th Triennial ESCOM Conference, September 8–13, 2003, edited by R. Kopiez, A. C. Lehmann, I. Wolther, and C. Wolf (Hanover University of Music and Drama, Hanover, Germany).

Green, D. M. (1971), “Temporal auditory acuity,” Psychological Review 78(6), 540–551.

Gát, J. (1965), The Technique of Piano Playing (Corvina, Budapest), 3rd ed.

Hall, D. E. (1986), “Piano string excitation in the case of small hammer mass,” Journal of the Acoustical Society of America 79(1), 141–147.

Hall, D. E. (1987a), “Piano string excitation II: General solution for a hard narrow hammer,” Journal of the Acoustical Society of America 81(2), 535–546.

Hall, D. E. (1987b), “Piano string excitation III: General solution for a soft narrow hammer,” Journal of the Acoustical Society of America 81(2), 547–555.

Hall, D. E. (1993), “Musical dynamic levels of pipe organ sounds,” Music Perception 10(4), 417–434.

Hall, D. E. (2002), Musical Acoustics (Brooks/Cole, Pacific Grove, CA), 3rd ed.

Hall, D. E. and Askenfelt, A. (1988), “Piano string excitation V: Spectra for real hammers and strings,” Journal of the Acoustical Society of America 83(4), 1627–1638.

Handel, S. (1993), Listening. An Introduction to the Perception of Auditory Events (MIT Press, Cambridge, Massachusetts, London, UK).

Hart, H. C., Fuller, M. W., and Lusby, W. S. (1934), “A precision study of piano touch and tone,” Journal of the Acoustical Society of America 6, 80–94.

Hartmann, A. (1932), “Untersuchungen über das metrische Verhalten in musikalischen Interpretationsvarianten,” Archiv für die gesamte Psychologie 84, 103–192.

Hartmann, W. M. (1998), Signals, Sound, and Sensation, Modern Acoustics and Signal Processing (Springer, New York).

Hayashi, E., Yamane, M., and Mori, H. (1999), “Behavior of piano-action in a grand piano. I. Analysis of the motion of the hammer prior to string contact,” Journal of the Acoustical Society of America 105(6), 3534–3544.

Heaviside, O. (1913), “Pianoforte touch,” Nature 91(2277), 397.

Henderson, M. T. (1936), “Rhythmic organization in artistic piano performance,” in Objective Analysis of Musical Performance, edited by C. E. Seashore (The University Press, Iowa City), vol. IV of University of Iowa Studies in the Psychology of Music, pp. 281–305.

Henning, G. B. and Gaskell, H. (1981), “Monaural phase sensitivity with Ronken’s paradigm,” Journal of the Acoustical Society of America 70(6), 1669–1673.

Hirsh, I. J. (1959), “Auditory perception of temporal order,” Journal of the Acoustical Society of America 31, 759–767.

Hirsh, I. J. and Watson, C. S. (1996), “Auditory psychophysics and perception,” Annual Review of Psychology 47, 461–484.

Hoover, D. M. and Cullari, S. (1992), “Perception of loudness and musical preference: Comparison of musicians and nonmusicians,” Perceptual and Motor Skills 74(3, Pt. 2), 1149–1150.

Hudson, R. (1994), Stolen Time: The History of Tempo Rubato (Clarendon Press, Oxford).

Huron, D. B. (1989), “Voice denumerability in polyphonic music of homogeneous timbres,” Music Perception 6, 361–382.

Huron, D. B. (1993), “Note-onset asynchrony in J. S. Bach’s two-part inventions,” Music Perception 10(4), 435–444.

Huron, D. B. (2001), “Tone and voice: A derivation of the rules of voice-leading from perceptual principles,” Music Perception 19(1), 1–64.

Huron, D. B. and Fantini, D. (1989), “The avoidance of inner-voice entries: perceptual evidence and musical practice,” Music Perception 9, 93–104.

Juslin, P. N. and Madison, G. (1999), “The role of timing patterns in recognition of emotional expression from musical performance,” Music Perception 17(2), 197–221.

Kendall, R. A. and Carterette, E. C. (1990), “The communication of musical expression,” Music Perception 8, 129–164.

Knoblaugh, A. F. (1944), “The clang tone of the pianoforte,” Journal of the Acoustical Society of America 16(1), 102.

Koornhof, G. W. and van der Walt, A. J. (1994), “The influence of touch on piano sound,” in Proceedings of the Stockholm Music Acoustics Conference (SMAC’93), July 28–August 1, 1993, edited by A. Friberg, J. Iwarsson, E. V. Jansson, and J. Sundberg (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 79, pp. 302–308.

Langner, J. and Goebl, W. (2002), “Representing expressive performance in tempo-loudness space,” in ESCOM 10th Anniversary Conference on Musical Creativity, April 5–8, 2002 (Université de Liège, Liège, Belgium), CD-ROM.

Langner, J. and Goebl, W. (in press), “Visualizing expressive performance in tempo-loudness space,” Computer Music Journal.

Langner, J., Kopiez, R., and Feiten, B. (1998), “Perception and Representation of Multiple Tempo Hierarchies in Musical Performance and Composition: Perspectives from a New Theoretical Approach,” in Controlling Creative Processes in Music, edited by R. Kopiez and W. Auhagen (P. Lang: Schriften zur Musikpsychologie und Musikästhetik, Frankfurt a. M.), vol. 12, pp. 13–35.

Langner, J., Kopiez, R., Stoffel, C., and Wilz, M. (2000), “Realtime analysis of dynamic shaping,” in Proceedings of the 6th International Conference on Music Perception and Cognition (ICMPC6), Aug 5–10, 2000, edited by C. Woods, G. Luck, R. Brochard, F. A. Seddon, and J. A. Sloboda (Keele University, Department of Psychology, Keele, UK), pp. 452–455.

Lerdahl, F. and Jackendoff, R. (1983), A Generative Theory of Tonal Music (MIT Press, Cambridge (Mass.), London).

Leshowitz, B. (1971), “Measurement of the two-click threshold,” Journal of the Acoustical Society of America 49(2, Pt. 2), 462–466.

Lieber, E. (1985), “On the possibilities of influencing piano touch,” Das Musikinstrument 34, 58–63.

Lisboa, T., Zicari, M., and Eiholzer, H. (2002), “Mastery through imitation,” in ESCOM 10th Anniversary Conference on Musical Creativity, April 5–8, 2002 (Université de Liège, Liège, Belgium), CD-ROM.

Maria, M. (1999), “Unschärfetests mit hybriden Tasteninstrumenten,” in Global Village – Global Brain – Global Music. KlangArt Kongreß 1999, edited by B. Enders and J. Stange-Elbe (Osnabrück, Germany).

Martin, D. W. (1947), “Decay rates of piano tones,” Journal of the Acoustical Society of America 19(4), 535–541.

Meyer, J. (1965), “Die Richtcharakteristik des Flügels,” Das Musikinstrument 14, 1085–1090.

Meyer, J. (1978), Acoustics and the Performance of Music (Verlag Das Musikinstrument, Frankfurt am Main, Germany).

Meyer, J. (1999), Akustik und musikalische Aufführungspraxis (Bochinsky, Germany), 4th ed.

Meyer, L. B. (1973), Explaining Music: Essays and Explorations (University of California Press, Berkeley, CA).

Meyer-Eppler, W. (1949), Elektrische Klangerzeugung. Elektronische Musik und synthetische Sprache (Dümmler, Bonn).

Moog, R. A. and Rhea, T. L. (1990), “Evolution of the keyboard interface: The Bösendorfer 290 SE recording piano and the Moog multiply-touch-sensitive keyboards,” Computer Music Journal 14(2), 52–60.

Moore, B. C. J. (1997), An Introduction to the Psychology of Hearing (Academic Press, San Diego, CA), 4th ed.

Moore, G. (1979), Am I Too Loud? Memoirs of an Accompanist (Hamish Hamilton, London).

Mori, T. (2000), Ein Vergleich der qualitätsbestimmenden Faktoren von Klavier und Flügel, Braunschweig, TU Carolo-Wilhelmina, Diss. (Verlagsgruppe Mainz, Wissenschaftsverlag, Aachen).

Morton, W. B. (1913), “Pianoforte touch,” Nature 91(2280), 477.

Nakamura, I. (1989), “Fundamental theory and computer simulation of the decay characteristics of piano sound,” Journal of the Acoustical Society of Japan 10(5), 289–297.

Nakamura, T. (1987), “The communication of dynamics between musicians and listeners through musical performance,” Perception and Psychophysics 41(6), 525–533.

Namba, S. and Kuwano, S. (1990), “Continuous multi-dimensional assessment of musical performance,” Journal of the Acoustical Society of Japan 11(1), 43–51.

Namba, S., Kuwano, S., Hatoh, T., and Kato, M. (1991), “Assessment of musical performance by using the method of continuous judgement by selected description,” Music Perception 8, 251–276.

Narmour, E. (1990), The Analysis and Cognition of Basic Melodic Structures: The Implication-Realization Model (University of Chicago Press, Chicago).

Neuhaus, H. (1973), The Art of Piano Playing (Barrie & Jenkins, London).

Ortmann, O. (1925), The Physical Basis of Piano Touch and Tone (Kegan Paul, Trench, Trubner; J. Curwen; E. P. Dutton, London, New York).

Palmer, C. (1989), “Mapping musical thought to musical performance,” Journal of Experimental Psychology: Human Perception and Performance 15(2), 331–346.

Palmer, C. (1996), “On the assignment of structure in music performance,” Music Perception 14(1), 23–56.

Palmer, C. and Brown, J. C. (1991), “Investigations in the amplitude of sounded piano tones,” Journal of the Acoustical Society of America 90(1), 60–66.

Palmer, C. and Holleran, S. (1994), “Harmonic, melodic, and frequency height influences in the perception of multivoiced music,” Perception and Psychophysics 56(3), 301–312.

Palmer, C. and van de Sande, C. (1993), “Units of knowledge in music performance,” Journal of Experimental Psychology: Learning, Memory, and Cognition 19(2), 457–470.

Pampalk, E., Rauber, A., and Merkl, D. (2002), “Content-based organization and visualization of music archives,” in Proceedings of the 10th ACM International Conference on Multimedia (ACM, Juan les Pins, France), pp. 570–579.

Pampalk, E., Widmer, G., and Chan, A. (2003), “A New Approach to Hierarchical Clustering and Structuring of Data with Self-Organizing Maps,” Intelligent Data Analysis Journal 8(2), in press.

Parlitz, D., Peschel, T., and Altenmüller, E. (1998), “Assessment of dynamic finger forces in pianists: Effects of training and expertise,” Journal of Biomechanics 31(11), 1063–1067.

Parncutt, R. (1989), Harmony. A Psychoacoustical Approach (Springer, Berlin).

Parncutt, R. and Holming, P. (2000), “Is scientific research on piano performance useful for pianists?” in Poster presentation at the 6th International Conference on Music Perception and Cognition (ICMPC6), Aug. 5–10, 2000, edited by C. Woods, G. Luck, R. Brochard, F. A. Seddon, and J. A. Sloboda (Keele University, Department of Psychology, Keele, UK), pp. 412–413.

Parncutt, R. and Troup, M. (2002), “Piano,” in The Science and Psychology of Music Performance. Creative Strategies for Teaching and Learning, edited by R. Parncutt and G. McPherson (Oxford University Press, Oxford, New York), pp. 285–302.

Pastore, R. E., Harris, L. B., and Kaplan, J. K. (1982), “Temporal order identification: Some parameter dependencies,” Journal of the Acoustical Society of America 71(2), 430–436.

Pickering, S. (1913a), “Pianoforte touch,” Nature 91(2283), 555–556.

Pickering, S. (1913b), “Pianoforte touch,” Nature 92(2302), 425.

Plomp, R., Wagenaar, W. A., and Mimpen, A. M. (1973), “Musical interval recognition with simultaneous tones,” Acustica 29, 101–109.

Podlesak, M. and Lee, A. R. (1988), “Dispersion of waves in piano strings,” Journal of the Acoustical Society of America 83(1), 305–317.

Rasch, R. A. (1978), “The perception of simultaneous notes such as in polyphonic music,” Acustica 40, 21–33.

Rasch, R. A. (1979), “Synchronization in performed ensemble music,” Acustica 43, 121–131.

Rasch, R. A. (1988), “Timing and synchronization in ensemble performance,” in Generative Processes in Music: The Psychology of Performance, Improvisation, and Composition, edited by J. A. Sloboda (Clarendon Press, Oxford), pp. 70–90.

Rauber, A., Pampalk, E., and Merkl, D. (2002), “Using psycho-acoustic models and self-organizing maps to create a hierarchical structuring of music by sound similarities,” in Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR’02) (IRCAM – Centre Pompidou, Paris, France), pp. 71–80.

Repp, B. H. (1993a), “Music as motion: A synopsis of Alexander Truslit’s ‘Gestaltung und Bewegung in der Musik’,” Psychology of Music 21, 48–72.

Repp, B. H. (1993b), “Some empirical observations on sound level properties of recorded piano tones,” Journal of the Acoustical Society of America 93(2), 1136–1144.

Repp, B. H. (1994), “On determining the basic tempo of an expressive music performance,” Psychology of Music 22, 157–167.

Repp, B. H. (1995a), “Acoustics, perception, and production of legato articulation on a digital piano,” Journal of the Acoustical Society of America 97(6), 3862–3874.

Repp, B. H. (1995b), “Expressive timing in Schumann’s ‘Träumerei’: An analysis of performances by graduate student pianists,” Journal of the Acoustical Society of America 98(5), 2413–2427.

Repp, B. H. (1996a), “Patterns of note onset asynchronies in expressive piano performance,” Journal of the Acoustical Society of America 100(6), 3917–3932.

Repp, B. H. (1996b), “Pedal timing and tempo in expressive piano performance: A preliminary investigation,” Psychology of Music 24(2), 199–221.

Repp, B. H. (1996c), “The art of inaccuracy: Why pianists’ errors are difficult to hear,” Music Perception 14(2), 161–184.

Repp, B. H. (1996d), “The dynamics of expressive piano performance: Schumann’s ‘Träumerei’ revisited,” Journal of the Acoustical Society of America 100(1), 641–650.

Repp, B. H. (1997a), “Acoustics, perception, and production of legato articulation on a computer-controlled grand piano,” Journal of the Acoustical Society of America 102(3), 1878–1890.

Repp, B. H. (1997b), “The effect of tempo on pedal timing in piano performance,” Psychological Research 60(3), 164–172.

Repp, B. H. (1999), “A microcosm of musical expression: II. Quantitative analysis of pianists’ dynamics in the initial measures of Chopin’s Étude in E major,” Journal of the Acoustical Society of America 105(3), 1972–1988.

Reuter, C. (1995), Der Einschwingvorgang nichtperkussiver Musikinstrumente (P. Lang, Frankfurt am Main).

Riley-Butler, K. (2001), “Comparative performance analysis through feedback technology,” in Meeting of the Society for Music Perception and Cognition (SMPC2001), August 9–11, 2001 (Queen’s University, Kingston, Ontario, Canada), pp. 27–28.

Riley-Butler, K. (2002), “Teaching expressivity: An aural–visual feedback–replication model,” in ESCOM 10th Anniversary Conference on Musical Creativity, April 5–8, 2002 (Université de Liège, Liège, Belgium), CD-ROM.

Roads, C. (1986), “Bösendorfer 290 SE computer-based piano,” Computer Music Journal 10(3), 102–103.

Roederer, J. G. (1973), Introduction to the Physics and Psychophysics of Music (Springer, New York, Heidelberg, Berlin).

Rosen, S. and Howell, P. (1987), “Is there a natural sensitivity at 20 ms in relative tone-onset-time continua? A reanalysis of Hirsh’s (1959) data,” in The Psychophysics of Speech Perception, edited by M. E. H. Schouten (Martinus Nijhoff Publishing, Dordrecht, Netherlands), vol. X, pp. 199–209.

Seashore, C. E. (1937), “Piano touch,” Scientific Monthly, New York 45, 360–365.

Shaffer, L. H. (1981), “Performances of Chopin, Bach and Bartók: Studies in motor programming,” Cognitive Psychology 13, 326–376.

Shaffer, L. H. (1984), “Timing in solo and duet piano performances,” Quarterly Journal of Experimental Psychology: Human Experimental Psychology 4, 577–595.

Shaffer, L. H., Clarke, E. F., and Todd, N. P. M. (1985), “Metre and Rhythm in Piano Playing,” Cognition 20, 61–77.

Shaffer, L. H. and Todd, N. P. M. (1987), “The interpretative component in musical performance,” in Action and Perception in Rhythm and Music, edited by A. Gabrielsson (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 55, pp. 139–152.

Skinner, L. and Seashore, C. E. (1936), “A musical pattern score of the first movement of the Beethoven sonata, opus 27, No. 2,” in Objective Analysis of Musical Performance, edited by C. E. Seashore (University Press, Iowa), vol. IV of Studies in the Psychology of Music, pp. 263–279.

Stahnke, W. (2000), Personal communication.

Stevens, S. S. (1961), “The measurement of loudness,” Journal of the Acoustical Society of America 33, 1577–1585.

Suzuki, H. (1986), “Vibration and sound radiation of a piano soundboard,” Journal of the Acoustical Society of America 80(6), 1573–1582.

Suzuki, H. (1987), “Model analysis of a hammer-string interaction,” Journal of the Acoustical Society of America 82(4), 1145–1151.

Taguti, T., Ohtsuki, K., Yamasaki, T., Kuwano, S., and Namba, S. (2002), “Quality of piano tones under different tone stoppings,” Acoustical Science and Technology 23(5), 244–251.

Terhardt, E. (1974), “On the perception of periodic sound fluctuations (roughness),” Acustica 30, 201–213.

Terhardt, E. (1979), “Calculating virtual pitch,” Hearing Research 1(2), 155–182.

Terhardt, E., Stoll, G., and Seewann, M. (1982), “Pitch of complex signals according to virtual-pitch theory: Tests, examples, and predictions,” Journal of the Acoustical Society of America 71(3), 671–678.

Tillmann, B. and Bharucha, J. J. (2002), “Effect of harmonic relatedness on the detection of temporal asynchronies,” Perception and Psychophysics 64(4), 640–649.

Timmers, R., Ashley, R., Desain, P., and Heijink, H. (2000), “The influence of musical context on tempo rubato,” Journal of New Music Research 29(2), 131–158.

Tro, J. (1994), “Perception of micro dynamical variation in piano performance,” in Proceedings of the Stockholm Music Acoustics Conference (SMAC’93), July 28–August 1, 1993, edited by A. Friberg, J. Iwarsson, E. V. Jansson, and J. Sundberg (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 79, pp. 150–154.

Tro, J. (1998), “Micro dynamics deviation as a measure of musical quality in piano performances?” in Proceedings of the 5th International Conference on Music Perception and Cognition (ICMPC5), August 26–30, 1998, edited by S. W. Yi (Western Music Research Institute, Seoul National University, Seoul, Korea).

Tro, J. (2000a), “Aspects of control and perception,” in Proceedings of the COST–G6 Conference on Digital Audio Effects (DAFX–00), December 7–9, 2000, edited by D. Rocchesso and M. Signoretto (Università degli Studi di Verona, Dipartimento Scientifico e Tecnologico, Verona, Italy), pp. 171–176.

Tro, J. (2000b), “Data reliability and reproducibility in music performance measurements,” in Proceedings of the Seventh Western Pacific Regional Acoustics Conference (WESTPRAC–VII), October 3–5, 2000 (The Acoustical Society of Japan, Kumamoto, Japan), pp. 391–394.

Truax, B. (1978), Handbook for Acoustic Ecology, vol. 5 of World Soundscape Project (A.R.C. Publications, Vancouver, B.C.), 1st ed.

Truslit, A. (1938), Gestaltung und Bewegung in der Musik (Chr. Friedrich Vieweg, Berlin-Lichterfelde).

Van den Berghe, G., De Moor, B., and Minten, W. (1995), “Modeling a grand piano key action,” Computer Music Journal 19(2), 15–22.

van Noorden, L. (1975), Temporal Coherence in the Perception of Tone Sequences, Doctoral dissertation, Institute for Perception Research, Eindhoven University of Technology, Eindhoven, The Netherlands.

Vernon, L. N. (1937), “Synchronization of chords in artistic piano music,” in Objective Analysis of Musical Performance, edited by C. E. Seashore (University Press, Iowa), vol. IV of Studies in the Psychology of Music, pp. 306–345.

Vos, J. and Rasch, R. A. (1981a), “The perceptual onset of musical tones,” in Music, Mind and Brain. The Neuropsychology of Music, edited by M. Clynes (Plenum Press, New York, London), pp. 299–319.

Vos, J. and Rasch, R. A. (1981b), “The perceptual onset of musical tones,” Perception and Psychophysics 29(4), 323–335.

Wallach, H., Newman, E. B., and Rosenzweig, M. R. (1949), “The precedence effect in sound localization,” American Journal of Psychology 62, 315–336.

Watkins, A. J. (1985), “Scale, key, and contour in the discrimination of tuned and mistuned approximations to melody,” Perception and Psychophysics 37(4), 275–285.

Weinreich, G. (1977), “Coupled piano strings,” Journal of the Acoustical Society of America 62, 1474–1484.

Weinreich, G. (1990), “The coupled motion of piano strings,” in Five Lectures on the Acoustics of the Piano, edited by A. Askenfelt (Publications issued by the Royal Swedish Academy of Music, Stockholm), vol. 64, pp. 73–81.

Wheatley, C. W. C. (1913), “Pianoforte touch,” Nature 91(2275), 347–348.

White, W. B. (1930), “The human element in piano tone production,” Journal of the Acoustical Society of America 1, 357–367.

Widmer, G. (2001), “Using AI and machine learning to study expressive music performance: Project survey and first report,” AI Communications 14(3), 149–162.

Widmer, G. (2002a), “In search of the Horowitz factor: Interim report on a musical discovery project,” in Proceedings of the 5th International Conference on Discovery Science (DS’02), Lübeck, Germany (Springer, Berlin).

Widmer, G. (2002b), “Machine discoveries: A few simple, robust local expression principles,” Journal of New Music Research 31(1), 37–50.

Wier, C. C. and Green, D. M. (1975), “Temporal acuity as a function of frequency difference,” Journal of the Acoustical Society of America 57(6), 1512–1515.

Winckel, F. (1952), Klangwelt unter der Lupe. Aesthetisch-naturwissenschaftliche Betrachtungen (Hesse, Berlin, Wunsiedel).

Wolff, K. (1979), Interpretation auf dem Klavier (The Teaching of Artur Schnabel). Was wir von Schnabel lernen (R. Piper & Co. Verlag, München, Zürich).

Yost, W. A. (2000), Fundamentals of Hearing (Academic Press, San Diego).

Zera, J. and Green, D. M. (1993a), “Detecting temporal asynchrony with asynchronous standards,” Journal of the Acoustical Society of America 93(3), 1571–1579.

Zera, J. and Green, D. M. (1993b), “Detecting temporal onset and offset asynchrony in multicomponent complexes,” Journal of the Acoustical Society of America 93(2), 1038–1052.

Zera, J. and Green, D. M. (1995), “Effect of signal component phase on asynchrony discrimination,” Journal of the Acoustical Society of America 98(2, Pt. 1), 817–827.

Zwicker, E. and Fastl, H. (1999), Psychoacoustics. Facts and Models, Springer Series in Information Sciences Vol. 22 (Springer, Berlin, Heidelberg), 2nd updated ed.

Appendix A

Ratings of Listening Tests

• Pilot study: Questions 1–2. Ratings for musicians and non-musicians . . . Table A.1, p. 164

• Experiment I: Adjustment ratings by timbre and chord . . . . . . . . . . . Table A.2, p. 168

• Experiment II: Ratings by instrument, timbre, asynchrony, and intensity . . Table A.3, p. 169

• Experiment III: Ratings by instrument, timbre, asynchrony, and intensity . . Table A.4, p. 170

• Experiment IVa: Ratings by instrument, voice, asynchrony, and intensity . . Table A.5, p. 177

• Experiment IVb: Ratings by instrument, voice, asynchrony, and intensity . . Table A.6, p. 178

• Experiment V: Ratings by instrument, voice, asynchrony, and intensity . . . Table A.7, p. 179

• Experiment VI: Ratings by instrument, voice, intensity, and asynchrony . . Table A.8, p. 180

164 Appendix A. Ratings

Table A.1: Pilot experiment (Section 4.3, p. 87). Frequencies of ratings (1: “the upper”, 0: “the lower”) separately for timbre (1: pure, 2: complex, 3: MIDI, 4: samples recorded from the Bösendorfer SE290), interval (8: octave, 7: seventh), relative timing (in ms) for the two questions (Question 1: “Which tone is more prominent?”, and Question 2: “Which tone is earlier?”), with musicians’ (Mus) and non-musicians’ (Non-mus) rating frequencies in separate columns.

Columns: Timbre, Interval, Timing (ms), Rating, then the rating frequencies for Question 1 (Mus, Non-mus) and Question 2 (Mus, Non-mus).

[The individual rating frequencies of Table A.1 are not legibly preserved in this transcript.]

Table A.2: Experiment I (Section 4.4.2, p. 96). Pairs of MIDI velocity units adjusted by the 26 participants (P), separately for five different conditions involving three tone types (pure, sawtooth, and real piano) and three chords (B4/G#4, C5/A5, Db5/Bb5). Every participant had to give at least two adjustments, some of them three, because their previous two adjustments were too inconsistent.

Columns: P, Pure C5/A5, Sawtooth C5/A5, Piano B4/G#4, Piano C5/A5, Piano Db5/Bb5.

[The individual velocity adjustments of Table A.2 are not legibly preserved in this transcript.]

Table A.3: Experiment II (Section 4.4.3, p. 98). Loudness ratings ("Which of the two tones is louder?", from 1: the lower to 7: the upper) by instrument (instr, 1: piano, 0: other instruments), timbre (pure, sawtooth, samples recorded from the Bösendorfer SE290), asynchrony (−54, −27, 0, +27, +54 ms), and velocity combinations (v1: +20/−20, v2: +10/−10, v3: 0/0, v4: −10/+10, v5: −20/+20 MIDI velocity units). The missing data was left blank for the first 17 participants in the pure-tone condition at −54 and −27 ms (see Section 4.4.3, Footnote 9 on p. 99), while the data for the simultaneous pure-tone condition was averaged over three ratings.

Column groups: Pure tone, Sawtooth, Piano; within each, asynchronies of −54, −27, 0, +27, and +54 ms; rows list instr and the ratings for v1–v5.

[The individual rating values of this table were garbled beyond recovery in the source extraction and are omitted here.]


Table A.4: Experiment III (Section 4.4.4, p. 100). Frequencies of asynchrony detection ratings ("Are the two tones simultaneous?", 1: simultaneous, 0: asynchronous), separately for instrument (1: piano, 0: other instruments), timbre (1: pure tones, 2: sawtooth tones, 3: samples recorded from the Bösendorfer SE290), relative timing (in ms), and velocity combinations (v1: +20/−20, v2: +10/−10, v3: 0/0, v4: −10/+10, v5: −20/+20 MIDI velocity units).

Instrument Timbre Timing Velocity Rating Frequency of ratings
0 1 −54 v1 0 1
1 1 −54 v1 0 5
0 2 −54 v1 0 8
1 2 −54 v1 0 10
0 3 −54 v1 0 7
1 3 −54 v1 0 9
0 1 −27 v1 0 1
1 1 −27 v1 0 3
0 2 −27 v1 0 3
1 2 −27 v1 0 5
0 3 −27 v1 0 2
1 3 −27 v1 0 5
0 1 0 v1 0 0
1 1 0 v1 0 0
0 2 0 v1 0 0
1 2 0 v1 0 1
0 3 0 v1 0 1
1 3 0 v1 0 4
0 1 27 v1 0 10
1 1 27 v1 0 15
0 2 27 v1 0 11
1 2 27 v1 0 15
0 3 27 v1 0 11
1 3 27 v1 0 15
0 1 54 v1 0 11
1 1 54 v1 0 15
0 2 54 v1 0 11
1 2 54 v1 0 15
0 3 54 v1 0 11
1 3 54 v1 0 15
0 1 −54 v2 0 2
1 1 −54 v2 0 5
0 2 −54 v2 0 10
1 2 −54 v2 0 9
0 3 −54 v2 0 8
1 3 −54 v2 0 14
0 1 −27 v2 0 1
1 1 −27 v2 0 5
0 2 −27 v2 0 6
1 2 −27 v2 0 5


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 3 −27 v2 0 3
1 3 −27 v2 0 4
0 1 0 v2 0 0
1 1 0 v2 0 1
0 2 0 v2 0 0
1 2 0 v2 0 0
0 3 0 v2 0 4
1 3 0 v2 0 6
0 1 27 v2 0 11
1 1 27 v2 0 15
0 2 27 v2 0 11
1 2 27 v2 0 14
0 3 27 v2 0 11
1 3 27 v2 0 15
0 1 54 v2 0 11
1 1 54 v2 0 15
0 2 54 v2 0 11
1 2 54 v2 0 15
0 3 54 v2 0 11
1 3 54 v2 0 15
0 1 −54 v3 0 2
1 1 −54 v3 0 6
0 2 −54 v3 0 10
1 2 −54 v3 0 14
0 3 −54 v3 0 10
1 3 −54 v3 0 14
0 1 −27 v3 0 1
1 1 −27 v3 0 4
0 2 −27 v3 0 8
1 2 −27 v3 0 9
0 3 −27 v3 0 8
1 3 −27 v3 0 9
0 1 0 v3 0 0
1 1 0 v3 0 1
0 2 0 v3 0 0
1 2 0 v3 0 0
0 3 0 v3 0 6
1 3 0 v3 0 5
0 1 27 v3 0 11
1 1 27 v3 0 14
0 2 27 v3 0 11
1 2 27 v3 0 13
0 3 27 v3 0 10
1 3 27 v3 0 15
0 1 54 v3 0 11
1 1 54 v3 0 15


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 2 54 v3 0 11
1 2 54 v3 0 15
0 3 54 v3 0 11
1 3 54 v3 0 15
0 1 −54 v4 0 2
1 1 −54 v4 0 7
0 2 −54 v4 0 11
1 2 −54 v4 0 15
0 3 −54 v4 0 11
1 3 −54 v4 0 15
0 1 −27 v4 0 1
1 1 −27 v4 0 7
0 2 −27 v4 0 10
1 2 −27 v4 0 12
0 3 −27 v4 0 10
1 3 −27 v4 0 13
0 1 0 v4 0 0
1 1 0 v4 0 0
0 2 0 v4 0 1
1 2 0 v4 0 0
0 3 0 v4 0 1
1 3 0 v4 0 2
0 1 27 v4 0 11
1 1 27 v4 0 12
0 2 27 v4 0 8
1 2 27 v4 0 8
0 3 27 v4 0 9
1 3 27 v4 0 11
0 1 54 v4 0 11
1 1 54 v4 0 14
0 2 54 v4 0 10
1 2 54 v4 0 14
0 3 54 v4 0 10
1 3 54 v4 0 15
0 1 −54 v5 0 2
1 1 −54 v5 0 7
0 2 −54 v5 0 11
1 2 −54 v5 0 15
0 3 −54 v5 0 11
1 3 −54 v5 0 15
0 1 −27 v5 0 2
1 1 −27 v5 0 7
0 2 −27 v5 0 11
1 2 −27 v5 0 14
0 3 −27 v5 0 11
1 3 −27 v5 0 13


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 1 0 v5 0 0
1 1 0 v5 0 0
0 2 0 v5 0 1
1 2 0 v5 0 1
0 3 0 v5 0 0
1 3 0 v5 0 0
0 1 27 v5 0 7
1 1 27 v5 0 9
0 2 27 v5 0 3
1 2 27 v5 0 4
0 3 27 v5 0 3
1 3 27 v5 0 2
0 1 54 v5 0 9
1 1 54 v5 0 14
0 2 54 v5 0 9
1 2 54 v5 0 11
0 3 54 v5 0 7
1 3 54 v5 0 8
0 1 −54 v1 1 1
1 1 −54 v1 1 2
0 2 −54 v1 1 3
1 2 −54 v1 1 5
0 3 −54 v1 1 4
1 3 −54 v1 1 6
0 1 −27 v1 1 1
1 1 −27 v1 1 4
0 2 −27 v1 1 8
1 2 −27 v1 1 10
0 3 −27 v1 1 9
1 3 −27 v1 1 10
0 1 0 v1 1 11
1 1 0 v1 1 15
0 2 0 v1 1 11
1 2 0 v1 1 14
0 3 0 v1 1 10
1 3 0 v1 1 11
0 1 27 v1 1 1
1 1 27 v1 1 0
0 2 27 v1 1 0
1 2 27 v1 1 0
0 3 27 v1 1 0
1 3 27 v1 1 0
0 1 54 v1 1 0
1 1 54 v1 1 0
0 2 54 v1 1 0
1 2 54 v1 1 0


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 3 54 v1 1 0
1 3 54 v1 1 0
0 1 −54 v2 1 0
1 1 −54 v2 1 2
0 2 −54 v2 1 1
1 2 −54 v2 1 6
0 3 −54 v2 1 3
1 3 −54 v2 1 1
0 1 −27 v2 1 1
1 1 −27 v2 1 2
0 2 −27 v2 1 5
1 2 −27 v2 1 10
0 3 −27 v2 1 8
1 3 −27 v2 1 11
0 1 0 v2 1 11
1 1 0 v2 1 14
0 2 0 v2 1 11
1 2 0 v2 1 15
0 3 0 v2 1 7
1 3 0 v2 1 9
0 1 27 v2 1 0
1 1 27 v2 1 0
0 2 27 v2 1 0
1 2 27 v2 1 1
0 3 27 v2 1 0
1 3 27 v2 1 0
0 1 54 v2 1 0
1 1 54 v2 1 0
0 2 54 v2 1 0
1 2 54 v2 1 0
0 3 54 v2 1 0
1 3 54 v2 1 0
0 1 −54 v3 1 0
1 1 −54 v3 1 1
0 2 −54 v3 1 1
1 2 −54 v3 1 1
0 3 −54 v3 1 1
1 3 −54 v3 1 1
0 1 −27 v3 1 1
1 1 −27 v3 1 3
0 2 −27 v3 1 3
1 2 −27 v3 1 6
0 3 −27 v3 1 3
1 3 −27 v3 1 6
0 1 0 v3 1 11
1 1 0 v3 1 14


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 2 0 v3 1 11
1 2 0 v3 1 15
0 3 0 v3 1 5
1 3 0 v3 1 10
0 1 27 v3 1 0
1 1 27 v3 1 1
0 2 27 v3 1 0
1 2 27 v3 1 2
0 3 27 v3 1 1
1 3 27 v3 1 0
0 1 54 v3 1 0
1 1 54 v3 1 0
0 2 54 v3 1 0
1 2 54 v3 1 0
0 3 54 v3 1 0
1 3 54 v3 1 0
0 1 −54 v4 1 0
1 1 −54 v4 1 0
0 2 −54 v4 1 0
1 2 −54 v4 1 0
0 3 −54 v4 1 0
1 3 −54 v4 1 0
0 1 −27 v4 1 1
1 1 −27 v4 1 0
0 2 −27 v4 1 1
1 2 −27 v4 1 3
0 3 −27 v4 1 1
1 3 −27 v4 1 2
0 1 0 v4 1 11
1 1 0 v4 1 15
0 2 0 v4 1 10
1 2 0 v4 1 15
0 3 0 v4 1 10
1 3 0 v4 1 13
0 1 27 v4 1 0
1 1 27 v4 1 3
0 2 27 v4 1 3
1 2 27 v4 1 7
0 3 27 v4 1 2
1 3 27 v4 1 4
0 1 54 v4 1 0
1 1 54 v4 1 1
0 2 54 v4 1 1
1 2 54 v4 1 1
0 3 54 v4 1 1
1 3 54 v4 1 0


Instrument Timbre Timing Velocity Rating Frequency of ratings
0 1 −54 v5 1 0
1 1 −54 v5 1 0
0 2 −54 v5 1 0
1 2 −54 v5 1 0
0 3 −54 v5 1 0
1 3 −54 v5 1 0
0 1 −27 v5 1 0
1 1 −27 v5 1 0
0 2 −27 v5 1 0
1 2 −27 v5 1 1
0 3 −27 v5 1 0
1 3 −27 v5 1 2
0 1 0 v5 1 11
1 1 0 v5 1 15
0 2 0 v5 1 10
1 2 0 v5 1 14
0 3 0 v5 1 11
1 3 0 v5 1 15
0 1 27 v5 1 4
1 1 27 v5 1 6
0 2 27 v5 1 8
1 2 27 v5 1 11
0 3 27 v5 1 8
1 3 27 v5 1 13
0 1 54 v5 1 2
1 1 54 v5 1 1
0 2 54 v5 1 2
1 2 54 v5 1 4
0 3 54 v5 1 4
1 3 54 v5 1 7
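The layout of Table A.4 can be illustrated with a short Python sketch. This is a hypothetical illustration, not code from the thesis: it aggregates rows of (instrument, timbre, timing, velocity, rating, frequency) into the proportion of "simultaneous" responses per relative timing, using a handful of rows copied from the table above (piano, pure tones, v1).

```python
# Sketch: aggregate Table A.4-style rows into "judged simultaneous"
# proportions per relative timing. Each row is
# (instrument, timbre, timing_ms, velocity, rating, frequency);
# rating 1 = "simultaneous", rating 0 = "asynchronous".
from collections import defaultdict

# A few rows taken from Table A.4 (piano, pure tones, velocity v1):
rows = [
    (1, 1, -54, "v1", 0, 5),
    (1, 1, -54, "v1", 1, 2),
    (1, 1, 0, "v1", 0, 0),
    (1, 1, 0, "v1", 1, 15),
    (1, 1, 54, "v1", 0, 15),
    (1, 1, 54, "v1", 1, 0),
]

def simultaneity_rate(rows):
    """Proportion of 'simultaneous' responses per timing value."""
    counts = defaultdict(lambda: [0, 0])  # timing -> [simultaneous, total]
    for _instr, _timbre, timing, _vel, rating, freq in rows:
        if rating == 1:
            counts[timing][0] += freq
        counts[timing][1] += freq
    return {t: sim / tot for t, (sim, tot) in counts.items() if tot}

print(simultaneity_rate(rows))
```

With these rows, simultaneous onsets are always judged simultaneous (rate 1.0 at 0 ms) and a +54 ms lead never is (rate 0.0), mirroring the pattern visible in the table.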


Table A.5: Experiment IVa (Figure 4.13a, p. 112). Loudness ratings averaged over two chords for 26 participants (P), separately for voice (upper, middle, lower), asynchrony (−55, −27, 0, +27, +55 ms), velocity combinations (v1: 80/38, v2: 65/45, v3: 50/50, v4: 38/55, v5: 28/62 MIDI vel. units), and instrument (instr, 1: piano, 0: other instrument).

Upper voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2 2 3 4 6.5 2 2 4 5 6 1.5 2 3.5 6 6 2 1.5 3 5.5 6 1 2 3.5 4.5 72 0 2 2 4.5 5 6.5 2 3 4 5 6.5 1.5 3 4.5 6 6.5 2 2.5 3.5 5 6 2 2 4 5.5 6.53 1 2 3 4 4.5 6 1.5 2.5 3.5 5.5 6.5 1.5 2 4.5 5 5.5 2 3 4 5 6 1.5 2 4 5 64 1 3 3.5 3.5 5 6 2 4 3.5 4.5 6 1.5 3 4 5 6 2 2.5 3 5 6.5 1.5 2.5 4.5 5 5.55 1 2.5 3.5 4 5 6.5 2.5 3.5 3.5 5 6 2 3 4 4.5 5.5 3 2.5 4 4.5 6.5 2 3 4 4.5 6.56 1 2.5 3 3.5 5.5 6.5 2.5 3 4 5 6 2 3.5 4.5 6.5 5.5 2 3.5 3 5 6.5 3.5 3 5 5 6.57 1 2 3 4.5 4.5 6 2 2.5 4 4.5 6.5 1.5 3 4.5 4.5 5.5 1.5 3 4.5 6.5 6.5 2 2.5 4.5 4 5.58 0 2 2 2.5 5.5 6.5 3 2.5 2.5 5 6.5 2 2.5 3 5.5 6 3 2 3.5 5 6.5 2 3.5 2.5 5.5 79 1 1.5 3.5 4 6.5 6.5 1 2.5 4 6 7 1.5 2.5 4 6 6 1 2.5 4 7 6.5 2 3 4 4.5 710 0 3 3 4 5.5 6 2.5 3.5 4 5 6 3 2.5 4 5.5 6.5 3 3 3.5 5.5 6.5 2 2.5 4 5 611 1 3 3 4 5.5 6 3 4 4 4 5.5 3 3.5 3.5 5 6.5 2.5 3.5 4 5 6.5 3 3.5 4 5 6.512 1 3.5 3.5 4 4 6 4 3 3.5 5.5 6.5 2 3.5 3.5 4 6.5 2.5 3 3.5 5 7 3.5 3 4 5 6.513 0 3.5 2 4 5 6.5 3 2.5 3.5 4.5 6 2.5 3 3 5 4.5 2 2.5 3 4.5 6.5 2.5 3 4 4.5 6.514 0 2.5 3 3.5 4.5 6.5 3 3 4 5 6 2 3 4 6 6 3 2.5 4 5 5.5 3 3 4.5 5 615 0 2 3 4.5 6 7 4.5 3.5 3.5 5.5 6.5 1 2 4.5 5.5 6.5 1.5 1.5 4 6 7 1.5 1.5 4 5.5 716 1 3 3 3.5 5.5 6.5 2.5 4 4.5 5 6 2 4 4.5 5.5 6.5 3 3 4 5.5 6.5 3 3 4 5 5.517 1 2 2.5 4 6 7 1.5 3 3.5 5.5 7 1.5 2 3.5 6 7 1.5 2.5 3 5.5 7 2 3 4 4.5 6.518 1 1.5 2.5 4 6 6 2 3 4 5.5 6.5 1 1 4 5 7 1 1 3.5 5.5 7 1.5 2 2.5 5 6.519 1 1.5 3 3.5 5 7 3 3.5 3.5 5 6.5 1 3 4 6.5 6.5 2 3 3.5 5.5 7 2 3 4.5 5 6.520 0 1.5 2 3.5 5 6.5 2 2.5 3.5 5.5 6.5 1.5 1.5 4 4.5 6 1.5 2.5 4 6 6 1.5 2 4 5 6.521 0 2 2.5 4 5 6 4 3 4 4.5 6.5 2.5 3 2.5 5 5.5 3.5 3 3 4 6.5 3 2.5 4 5 622 1 2.5 2 4 5.5 6.5 3 3 3.5 5 6 1.5 2.5 4.5 5 5.5 1 2.5 3.5 5.5 6.5 2 2 4 5.5 723 1 1.5 1.5 5 5 6.5 2.5 2.5 4.5 5.5 7 2 3 4.5 5 7 1.5 2 5 6.5 7 2 3.5 4.5 5 624 1 2 2.5 4 5.5 7 3.5 3.5 3.5 5.5 6.5 1.5 1.5 3.5 6.5 6.5 2.5 2 3 6 6 1.5 2.5 3.5 5.5 725 0 2 2 3 5 7 2.5 3 3.5 5.5 6 1.5 2 4 5 6 1.5 3 3.5 5.5 6 2 2.5 4 5.5 4.526 1 2.5 3 3.5 5.5 7 2.5 3 3.5 5.5 6.5 2 3 3.5 6 7 2.5 3 3 5.5 6.5 1.5 2.5 4 5 6

Middle voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2 2 3.5 5.5 6.5 2.5 4 4 5 6.5 2 2.5 4 4.5 6 2 3.5 4 4.5 6.5 3 4 4 5.5 6.52 0 1.5 3.5 4 4.5 6.5 2 2 3.5 3 6.5 1.5 2.5 3.5 3.5 6.5 2 1.5 3 3 6 2 3 3 5 73 1 1.5 3 4 6 7 2 1.5 4 4 5.5 2 2.5 3.5 5 6.5 1 3 4 5 5 1.5 3 3.5 4.5 64 1 3 3.5 4 4.5 6 2.5 3 4.5 5 6 3 2.5 3.5 4.5 5.5 2 5 3.5 4 5 2 3.5 4 4.5 65 1 2 3.5 4 5 6 2 3 4 4 6.5 2 2.5 4 4.5 6 2.5 2.5 3.5 4 5.5 1.5 4 3.5 4.5 6.56 1 3.5 3.5 3.5 5 6 1.5 4 3.5 5 6.5 2.5 2.5 3.5 5 6 2 3.5 4 5 5.5 2 3 4 5 5.57 1 2.5 3.5 3.5 5.5 6.5 1.5 2 4 3.5 6 1 3.5 3.5 4.5 6.5 2.5 3 3.5 5 6 1.5 4 4 4.5 6.58 0 2.5 3 4 4.5 5 4 2 4 3.5 6 4 3 2.5 3 6.5 3 4 4 3.5 5.5 3.5 3.5 3.5 4.5 69 1 2 2 3 5 6.5 1 2 3 4.5 6.5 2 1.5 3 5.5 7 2 3 3 5 5.5 2 2 4 5 710 0 1.5 4 4 4 5.5 2.5 4.5 3.5 4.5 6 1.5 2.5 3 3.5 6 2.5 3.5 3 4 5 2 4.5 3 4 5.511 1 3.5 4 5 5 6 2.5 3 3.5 4.5 5.5 3 2.5 3.5 5 6 2.5 3 4.5 4.5 6 2.5 4 4.5 4.5 5.512 1 1.5 1.5 3 4 4.5 2.5 3 3 4 5.5 1 1.5 3 3.5 5.5 2 2.5 2.5 4 4.5 1.5 2.5 3.5 3.5 5.513 0 2.5 5 5.5 5 4 3.5 5 5 5 3 2.5 4.5 4 4.5 5 2.5 3.5 4.5 5 3.5 3.5 4 4.5 5 414 0 3 3 3.5 4.5 5 2.5 3.5 4 4 6 2 3 4 4 5 2.5 2.5 4 4.5 5 3 4 3 4 515 0 1 3 5 5 6.5 1.5 2.5 5 5 6.5 1 1.5 4 5 6.5 1.5 2.5 2.5 5 7 1 2.5 3 5 716 1 2 3.5 3.5 5 5.5 2.5 2.5 3 3 5.5 3 3 3.5 4.5 5.5 2.5 3.5 3 4 6 3 3 3.5 4 5.517 1 2 3.5 5 6 7 1 3 3.5 5.5 6.5 1 1 3 5 6 1.5 2.5 3 5 6 1.5 3 3.5 5 6.518 1 2 4 4 5.5 6.5 2 3.5 4 5 6.5 2.5 1.5 3 5 6.5 1.5 2 3.5 4.5 5.5 2 3 3 4 619 1 1.5 3 3.5 5 6 2 2.5 3.5 5 6 2 1.5 4 4.5 6.5 2 5 3 5 6.5 1.5 3.5 3 4.5 720 0 2.5 3 3.5 5 5 2 2.5 4 4 5.5 1 2 3 3.5 4.5 1.5 2 3 4.5 6 1 2.5 3.5 5 621 0 2.5 3.5 4 4 4 3.5 4 3 4.5 5.5 2.5 2 3 3.5 5.5 3 3.5 3 5 5.5 3 3.5 4 4 522 1 1.5 2.5 4 6 6.5 2 2.5 5 4 6 2 2 4.5 4 6 1 3 3.5 5 6.5 2.5 3 3.5 4.5 6.523 1 1 3 3.5 5 7 2.5 3 4 4.5 6.5 3 3 4 5 6.5 2 3 4 5 6.5 2 3.5 4 5 624 1 2 3 4.5 5 6 2.5 3 4 5.5 6 2 3 3.5 4 6 2.5 4.5 3 4.5 6.5 2 4 3 4.5 625 0 1 2.5 3.5 4.5 4.5 1 2 2.5 3.5 4.5 1 1 3.5 4 2.5 1 1.5 2.5 3 4 2 2 1.5 4 426 1 2 3.5 4 5 5.5 2 3.5 4.5 4 6 2 2.5 3.5 4.5 5.5 2 2.5 3.5 5 6 2 3.5 3.5 5 6.5

Lower voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 3 2 4 5.5 6.5 4 2.5 4 6.5 6.5 2.5 3.5 3.5 6.5 6.5 2 3.5 4 4.5 6 3 3 3.5 5 6.52 0 2 3.5 3.5 6 7 3 3.5 4 6 7 3 4.5 3.5 5.5 7 3 3 4 4 6.5 2.5 3 4.5 5.5 73 1 3 3 4.5 5 6.5 3 3.5 4.5 7 6.5 2.5 2.5 3.5 5 6.5 1.5 3.5 4 5.5 6.5 2 3 5.5 5.5 6.54 1 3 3 4 5.5 6 3 4 4.5 5 6 2 3.5 3.5 6 6 2.5 3 4.5 5 6 3 4 4 5 5.55 1 3 3.5 4 5.5 7 3 4 4 5.5 6.5 2.5 4 4 5.5 7 2.5 3.5 4 4.5 6.5 3 4 4.5 5.5 6.56 1 4 4.5 4 5.5 7 3.5 3.5 4.5 6.5 6 3 4 3.5 6 6.5 2.5 2.5 4.5 5 5.5 3 3.5 4 5 6.57 1 2.5 4 4 4.5 7 2 3.5 5.5 6 6.5 2 3 4 5 7 2 3.5 4 5.5 7 1.5 3.5 5 6.5 6.58 0 3 3 2.5 5 7 2 3.5 2.5 5.5 6.5 2 3.5 4 5.5 6.5 3.5 2.5 3 5.5 6 3 4 3.5 5.5 79 1 2 2.5 4 6 7 1.5 4 5 6 7 3 2 4.5 6.5 7 3 3 4.5 6 7 2 2.5 4.5 6 710 0 3 4 3.5 5 6.5 4 4 3.5 6 6.5 2 4 4 6 6 3 5 3 4 6.5 2.5 3.5 4 4 711 1 3.5 3.5 4.5 5.5 6 3.5 3.5 5 5 7 3 3 4.5 5.5 6.5 2.5 4 4 5 6 4 3.5 4.5 5.5 6.512 1 3 3 4 4.5 6.5 1.5 2.5 3 5.5 7 2.5 2.5 4 4.5 7 3 2 3 5.5 6.5 1 3 3.5 4 713 0 3.5 4.5 4.5 5 4.5 4 4.5 4.5 3 4.5 2.5 3.5 4 4 6.5 2.5 3.5 4 5.5 5.5 2.5 3.5 4.5 4.5 6.514 0 2.5 3.5 4.5 5 7 3 4 4 5 6.5 3.5 4 4.5 6 6 2.5 3.5 4 5.5 6.5 3.5 4 5 5 615 0 3 4.5 4 5.5 7 3 4 4 6.5 7 2 4 3.5 7 7 3 3.5 3.5 5.5 7 2 3 4.5 5 716 1 3 3 3.5 5.5 7 3 4 3.5 5.5 6 3 4 3.5 6 6.5 3 2.5 3 4 6.5 3 3 3.5 5 5.517 1 2.5 4.5 4.5 6 7 3.5 4.5 4.5 6.5 6.5 2 4 4.5 7 7 3 3.5 3.5 6 6.5 2.5 4 4 5.5 6.518 1 2.5 3 4 5.5 7 2.5 3.5 4 6.5 7 3.5 4 5 5 7 3.5 3 4 5 4 2.5 4 4 5.5 6.519 1 3 4 4 5.5 7 3 3 4 6.5 7 2 3.5 3.5 6.5 7 3 3.5 4 4.5 7 3 4 4 5 720 0 2.5 5 4 5 7 2.5 3.5 5 5.5 7 2.5 3 3.5 6 7 2.5 3.5 5 5.5 7 2 4 5 6 721 0 3.5 4.5 4.5 5.5 6 4 3 4.5 5.5 5.5 2 3.5 3 5 6.5 3 2.5 3.5 5 4.5 4 4 3 5 5.522 1 2.5 3 4.5 6 7 4 4.5 4 6 7 2 4 4.5 5.5 7 2.5 3.5 3.5 5 6 2 3.5 4.5 5.5 623 1 3 5 4.5 5 7 3.5 4.5 4 6 7 4 4 4.5 6 6.5 3 4 5 5.5 7 2.5 4 4.5 5.5 724 1 3 3.5 4.5 5.5 7 2.5 3.5 4.5 6 7 2 4 3.5 7 6 3 3 4 5 7 3 3.5 4 5 6.525 0 2.5 3.5 4 6 7 3 3.5 4 5 6.5 2 3.5 4.5 4.5 5.5 2.5 3 4.5 5 7 2 4 5 6 526 1 3 4 4.5 5.5 7 3.5 3.5 4.5 6.5 7 3 4 3.5 6.5 7 2.5 3.5 3.5 5 6.5 3 3.5 4.5 5 7
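Each participant row in Tables A.5–A.7 carries 25 ratings, ordered as five asynchrony blocks (−55, −27, 0, +27, +55 ms) of five velocity combinations (v1–v5) each. As a hypothetical illustration of that layout (not code from the thesis, and using made-up rows rather than table values), a sketch that averages such rows into mean ratings per (asynchrony, velocity) cell:

```python
# Sketch: average A.5-style participant rows into mean ratings per
# (asynchrony, velocity combination). The assumed column order is five
# asynchrony blocks of five velocity combinations each, as in the headers.
ASYNC = (-55, -27, 0, 27, 55)
VELS = ("v1", "v2", "v3", "v4", "v5")

def mean_ratings(rows):
    """rows: list of 25-element rating lists; returns {(asynchrony, vel): mean}."""
    n = len(rows)
    means = {}
    for i, a in enumerate(ASYNC):
        for j, v in enumerate(VELS):
            col = 5 * i + j  # column index within a 25-rating row
            means[(a, v)] = sum(r[col] for r in rows) / n
    return means

# Two illustrative (made-up) rows:
rows = [[1] * 5 + [2] * 5 + [3] * 5 + [4] * 5 + [5] * 5,
        [2] * 5 + [3] * 5 + [4] * 5 + [5] * 5 + [6] * 5]
print(mean_ratings(rows)[(-55, "v1")])  # 1.5
```

Averaging over participants in this way yields the per-condition means plotted in Figures 4.13 and 4.14.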


Table A.6: Experiment IVb (Figure 4.13b, p. 112). Mean ratings averaged over two chords. Labelling as in Table A.5, p. 177.

Upper voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 1.5 3.5 3.5 5 6 2 2.5 4 4.5 6 2.5 2 3 5 5 1.5 2.5 4 4.5 6.5 2 2 3.5 5 72 0 2 3.5 4.5 5.5 6 2 2.5 4.5 5.5 6.5 2 2.5 4 5.5 6.5 2 3 5 4.5 6 2 2.5 4 4.5 63 1 2 3 3.5 5.5 6 1 3 4.5 5 6 1.5 3 4.5 5 6 1.5 2.5 4.5 5 6 2 3 4 5.5 74 1 3 2.5 4.5 5.5 6 3 3 4.5 5 4.5 2.5 2 4 4.5 5.5 2 2.5 4 4.5 5 4.5 3 4 4.5 5.55 1 2.5 3 4 4.5 5.5 3 3.5 4 4.5 5.5 2.5 2 4 4.5 6 1.5 3 4 4.5 6 2 3 4 5 66 1 4 2.5 4 5.5 6.5 3 3 4 5.5 5.5 1.5 3.5 3 5 6 2.5 2.5 5 4 5.5 2.5 2.5 5.5 5 67 1 2.5 3 3.5 5.5 6.5 1 3 4.5 6 6.5 1.5 2.5 3.5 5.5 7 1.5 2.5 5 5 6 1.5 3 4 5 78 0 2.5 3 4 5 5 3.5 4 3.5 5 6 2 3 4 5.5 6.5 3.5 3 4 5 6.5 3.5 2 2.5 5.5 6.59 1 2.5 2 4 5 6.5 2 2 4.5 4.5 6 1 1.5 4.5 6 6.5 1.5 1.5 4 6 7 2 2.5 4 6.5 6.510 0 3 3.5 4.5 5 5.5 3 3 4.5 5.5 5.5 3 2.5 4 5 6 2.5 3 5 5 5.5 2 3 4.5 5 611 1 2.5 4.5 4.5 5 6 3.5 4 5 5.5 6.5 2 4 3.5 5 6 3 3 4.5 5 6 3 3.5 4.5 5 6.512 1 4 3 4 4 6 4 3 4 4.5 6 3 2.5 4 4.5 6 2.5 3.5 4 4 5 3 3.5 4 4.5 613 0 4 3 3.5 3.5 6 2.5 3 4 4.5 5.5 2 3.5 4 5 5.5 3.5 3 4 4.5 5.5 2.5 4 3 5 614 0 3 3 4.5 5.5 6 3 2.5 4 4.5 5.5 3 3 3.5 4 6 2.5 3 4 5 5 2 3 4.5 3.5 515 0 2 3 5 6.5 6 2 2.5 4.5 6 6.5 1 1.5 4 5 7 1 2.5 5 5.5 6 1 1.5 5 5 6.516 1 3 3 4 5.5 5.5 3 3 4 4.5 6 2.5 2.5 3.5 5 6 2.5 3 5 4.5 5.5 2.5 3 4 5 5.517 1 2 3 4.5 5.5 7 2.5 2.5 4 5.5 6.5 1.5 2 4 5 7 1.5 2 5 5 6.5 1.5 3 4.5 5 6.518 1 3 3 4 5 6 2.5 3 4 5 5.5 2 2.5 3.5 5 6 2 2.5 4 5 6 2.5 2.5 4 4 619 1 2.5 3 4.5 5 6.5 2.5 3 4.5 5 6.5 3 2 4 5 6 2 3.5 5 4.5 6 1.5 3 4.5 5 6.520 0 2.5 2.5 4 3.5 6 2.5 3 3 5 6.5 1.5 2 3.5 4.5 6 2 2 3.5 4 6 2 3 4 4.5 621 0 3.5 3.5 4.5 5 7 3 4.5 4 5 6.5 3 3.5 2 4 6.5 2 3 4.5 4 6 3 2 4.5 4.5 622 1 2 2.5 4.5 5 6 2 2.5 5 5.5 6.5 2 2 4 5.5 6.5 2 2 4.5 5 6 1.5 2 4 5 623 1 2.5 3 5 5 6.5 3 3 4.5 5 6 2 3 4.5 5 7 2.5 3 5 6 7 2.5 3.5 4.5 4.5 724 1 2 3.5 4.5 5.5 6 2.5 3.5 4.5 5 6 1.5 2.5 4 5 6 1.5 2 4.5 5 6 2.5 3 4.5 5 625 0 2 3 4.5 5 6 2.5 2 4.5 5 6.5 2.5 1.5 3.5 5 6 1.5 3 4.5 3 6 1.5 1.5 4.5 5 626 1 2.5 3 4.5 5.5 6 2.5 2.5 3.5 5 6 2.5 2.5 3 5 6.5 1 3 5 5 6 2 2.5 5 4 6

Middle voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 1.5 3 4.5 4.5 6 2.5 3 4 4 6 2.5 4 3.5 4 5.5 3 3.5 4 4.5 5.5 2 3 4 5 6.52 0 1.5 2 4.5 3.5 6 2.5 3 3 3.5 6 2 3 2.5 4.5 6.5 2 4 2 5 6 2.5 2 4 4 6.53 1 1.5 2.5 4 5 6.5 2.5 2.5 4 5 6 1.5 3 3.5 5 6 1.5 2.5 4 5 6 1.5 3 4 5 64 1 3 3 4 5 6 3.5 4 3.5 5 5 2 3 3.5 4.5 5.5 2 3 3.5 4.5 5 3 3 3.5 4.5 55 1 2.5 2.5 4 4.5 6 2.5 3 4 4 5.5 1.5 3.5 3 4 6 2 3.5 4 4 6 2.5 3 3.5 5 66 1 1.5 2 5 5.5 5.5 2 3 3.5 5 7 2 2 5 5 6 2.5 3 3 5 5.5 2 2.5 4 4.5 5.57 1 2.5 3 4.5 5 7 2.5 3.5 5 5.5 6 2 2.5 4.5 5 6.5 1.5 3.5 4 5 6 4.5 3 4.5 5 68 0 5 4 4.5 5 6 5 3.5 4 6 5.5 7 5 4 3.5 6.5 6 4.5 3.5 5 6 6.5 4 3.5 5 6.59 1 1.5 3 3.5 5 7 2.5 3 4 5 6.5 2 2 3 5 6.5 1 2 2.5 4 7 1.5 3 3.5 5 6.510 0 2.5 3 4 4 6 3 2.5 4 5 5 2.5 3 3 3.5 5.5 2.5 3 4 3.5 3.5 2 3 3 4 511 1 3 3.5 4.5 5 6 3 4 4.5 5 6 2.5 4 4 5 6 3 3.5 4 5 6 2.5 4 4.5 5 5.512 1 3.5 3 4 3.5 5 3 3 3 4 5.5 3 3 3.5 4 5.5 2.5 1.5 2 3 5.5 2.5 3 3 4 513 0 5.5 3.5 5 5 4.5 3 5 5 5 4 4 4 5 5 4 3 4 4.5 5 2.5 3 4.5 5 5 514 0 2.5 3 3 5 5 3 2.5 3.5 4.5 5 2.5 4 4 4 5 3.5 3.5 4 4.5 5 3 3.5 4 5 515 0 2 1.5 4 5.5 7 2.5 2.5 4 4 6 1 1.5 3.5 4.5 6 2 3 2.5 4.5 6 1.5 2.5 4.5 5 6.516 1 3 3 5 4 6 3 3 3.5 3 6 2.5 3 3.5 4 5.5 2.5 2.5 3 3 5 2 3 3 3 4.517 1 2.5 2.5 4.5 5 7 2.5 2.5 4.5 5.5 6.5 1.5 2.5 3.5 4.5 7 1.5 3.5 3 5 6 1 2 4 5 5.518 1 3 3.5 4 5 5.5 3 3 4 5 5.5 2.5 3.5 4 4.5 5 2 3.5 3.5 4 6 3 3 4 4.5 519 1 2.5 2.5 3 5 6 2.5 3 3 5 6 1.5 3 3.5 5 6 3 3 3 4.5 6 2 2.5 4 5 620 0 2.5 3 3 5 6 2.5 2.5 3.5 4 5.5 2.5 2.5 4 4 5.5 1.5 2.5 3.5 4.5 5.5 1 3 3.5 4.5 621 0 2 2.5 4 4 6 2.5 3 3.5 4 5.5 2.5 2.5 3.5 4.5 5 2 2.5 2.5 4.5 6 2.5 2 4 4 5.522 1 1 2 3 5 6 2 2.5 4 5 6 1.5 4 4 4.5 7 2 2.5 3 5 6 1.5 2.5 4.5 5 623 1 2 3 4.5 5 6 3 3.5 4.5 5 5.5 2.5 3.5 4 5 6 3 3.5 4 5 5.5 3.5 4 4 5 624 1 4 2.5 3.5 4.5 6 3 3 3.5 4.5 6 2.5 2.5 3 5 6 2.5 3 3 4.5 6 2.5 3 3.5 5 625 0 2.5 1.5 2.5 4 6 2 2 3 3 6 1 2.5 2 3.5 5.5 2 2.5 2 4.5 3.5 1 2 4.5 3.5 5.526 1 3 2.5 4 4.5 6 3 3 5 5 6 2.5 3 3.5 5 5 2.5 3.5 3 5 6 2 3 3 5 6.5

Lower voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2.5 3 3 5 6 2.5 4 3 5 6.5 3 3.5 4 5 6 3.5 3 3.5 5 6 3 3.5 4 5 62 0 2.5 3.5 3 5.5 6 2 3 4 4.5 6 2 2.5 4.5 6 7 3 3 4.5 5 6.5 2 3.5 2.5 5.5 73 1 2 3 5 6 6 2.5 3 4 6 6 3 3.5 4.5 6.5 6.5 2 3.5 4.5 5 6.5 2 3 4.5 6 6.54 1 3 3.5 3.5 5 5 3 2.5 4.5 4.5 6 2 3 5 5 6 2.5 3.5 3 5 5.5 2.5 3.5 4 5 5.55 1 3 3 4 5 6 2.5 4 4 4.5 6 3 4 4 5 6.5 3 3.5 4 4.5 6 2.5 4 4 5 5.56 1 2.5 4 3.5 5 6.5 3 3 5 5 5.5 2.5 4 4 5.5 6 3 3.5 4.5 5.5 6 2 4 3 5 67 1 3.5 3.5 5 6 7 5 4 5 6.5 7 3 4.5 5 7 7 3 4 5.5 6 7 2 4 5 6 5.58 0 4.5 3 3.5 6.5 7 5.5 2.5 4 6.5 6 5.5 3.5 3 5.5 6.5 5.5 4.5 4 5.5 6.5 4.5 3 4 6.5 6.59 1 3 3 5 6.5 7 2 3.5 5 6 7 2.5 4 3.5 5.5 7 2.5 4 4 6.5 7 2 3.5 4 5.5 6.510 0 2 4 3.5 4 5.5 2.5 3.5 4 4.5 6 2 3.5 3.5 5 6 3.5 3.5 3.5 5 6 2 4 3 5 611 1 2 5 5 5 6 4 4 5 5.5 6 2.5 4 4.5 5.5 6.5 2.5 4 5 5.5 6 3 3.5 5 5 6.512 1 3 3 4 5 6 3 4 4 4.5 6.5 2.5 4 3.5 4.5 6 2 3 4 4 6 2.5 3.5 4 5 6.513 0 3.5 5 4.5 5 6 3 3 5 5 6.5 3.5 3.5 4 5 5 3 4.5 5 5 5 4 4 5 5.5 6.514 0 3 3 3.5 5.5 5.5 3 3.5 4.5 5 6 2.5 3.5 5 5.5 6 3.5 4 4 5 6 2 4 4.5 5 615 0 2.5 5 4 6 7 3 3.5 4.5 5.5 6 2.5 3.5 3.5 5.5 7 3.5 4 4.5 6 6.5 2.5 4 3.5 5 6.516 1 3 3 4 5 6 2.5 3 3.5 5 6 3 4 5 5 6 3 4 5 4.5 5 2.5 3 4 5 6.517 1 2.5 4 4 5.5 7 2.5 4.5 4.5 5.5 6.5 2 3.5 4.5 6 7 3 3.5 5 5 6.5 2.5 4 3.5 5 6.518 1 3 3.5 4 5 6 3 4 4 6 6.5 3.5 3 4 5.5 6 3 4 4 5.5 6.5 2.5 3 4 5.5 619 1 2.5 3 3.5 5 6.5 2.5 3 4.5 5 6.5 2 3.5 4.5 5 6 2.5 3.5 4.5 5 6 2.5 3.5 3.5 6 620 0 3 3 4.5 6 6.5 3 3.5 5 6 7 2.5 4 4.5 6 7 3 3.5 5 5.5 6.5 2 3.5 5 6 6.521 0 3 4.5 3.5 5.5 6 3 4 4.5 5 6 2.5 4 4 6 7 3.5 4 4.5 5.5 5.5 2.5 4.5 5 5.5 622 1 2 4 3.5 5.5 6.5 3 3.5 5 5.5 6 2 4 4.5 6 7 3.5 3 5 5.5 6.5 2.5 4 3.5 5.5 723 1 3 4 5 5.5 6.5 3 4 5 6 6.5 3 4 4.5 5.5 3.5 3.5 3.5 5 5 6.5 3.5 4 5 5.5 6.524 1 3 3.5 4 5 5.5 3 3.5 4 5 5.5 2.5 3.5 4 6 6 3 3.5 4.5 5 5.5 2.5 3.5 3.5 5 625 0 2.5 4.5 3.5 5.5 5.5 3.5 4 5 4 6 2.5 2.5 4.5 5.5 7 2.5 3 5 5 6 2 3.5 3 5 626 1 3 4 5 6 6.5 3 3.5 5 5 6.5 2.5 3 5 5.5 7 3.5 3.5 5 6 7 2.5 4.5 4.5 5 6.5


Table A.7: Experiment V (Figure 4.14, p. 113). Mean ratings averaged over two chords. Labelling as in Table A.5, p. 177.

Upper voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2.5 2.5 3 5 7 2 3 4 4.5 6.5 1.5 2.5 4 6 6 1.5 2 3.5 5.5 7 2 2 3.5 4.5 6.52 0 1 1.5 4 6 6.5 2 2 4 5 7 1.5 2 4.5 7 6 2 1.5 3 5 7 1.5 2 4.5 6 6.53 1 1 3 4.5 5 6 1 2.5 5 5 6 1 2 4 5 6.5 1.5 2 4 4.5 6 1.5 2 4 5.5 6.54 1 2 3.5 4 5.5 6 2.5 3 3.5 4.5 6.5 2 1.5 4.5 6 6 1.5 2 3.5 5 6.5 2 2 3.5 4.5 65 1 2.5 2.5 4 5.5 7 2 3.5 3.5 4 6.5 1.5 2.5 4.5 5 5.5 1.5 2 3.5 4 7 1 2.5 3.5 5 66 1 2.5 3.5 4.5 5.5 7 2.5 3.5 2.5 5 7 2 2.5 4 6 6 3 2.5 3.5 4.5 6.5 2 3 3 5.5 5.57 1 1 3 5 5.5 7 1.5 2.5 5.5 5 7 1 1.5 3.5 4 6.5 1.5 2 3 5 6 1 2.5 3 5.5 6.58 0 3 1.5 2 5 6.5 2 2 2 6.5 6.5 1 3.5 3 5 7 1 2 2.5 6.5 7 2 1.5 4 6 79 1 1 1.5 3.5 5 7 1.5 2 3.5 7 6.5 1 1 2.5 5.5 6.5 1 1 4 7 6.5 1 1.5 3 6 6.510 0 2 2.5 3.5 5.5 6.5 2 3 3.5 5 6.5 1.5 2 4 6 6 1.5 2.5 3 5 6 1.5 2.5 2.5 5 611 1 1.5 3.5 5 5.5 7 2.5 3.5 4.5 6 6.5 2 2 4.5 5.5 6.5 2 2.5 3.5 6 6 2 3 4.5 6 612 1 2 3 4 6.5 6.5 2 3.5 4.5 5 6.5 2 2 3.5 4.5 7 2.5 2 3.5 5.5 6.5 2.5 2 4.5 5 713 0 1.5 1.5 3.5 5.5 6.5 2.5 2 4.5 5 6 2 2.5 3 5.5 5.5 2.5 3 3 4 6 2 2 3.5 4 714 0 2 2.5 3.5 5.5 5.5 2.5 2.5 4 5 6 2 2.5 3.5 5.5 5 2.5 2.5 3.5 4.5 6 3 2.5 4 5 5.515 0 1.5 2.5 4 6 7 2 3 3.5 5.5 7 1 1 4.5 6 6 1 1.5 4 5 7 1 1.5 4 6 6.516 1 2 2 3.5 5.5 7 2 2 4 5.5 7 1.5 1.5 5 7 6 1 1.5 3 5 7 1 1.5 3 5 6.517 1 2 3 4.5 6 7 2 2.5 3.5 5.5 7 1 1 4 6.5 7 1 1.5 3 5 7 1 1 3.5 5 6.518 1 2 3 4 5 6 2.5 3 4 5 6.5 1.5 1.5 4 5 6.5 1 2 3.5 4.5 6 1.5 2.5 3 4.5 5.519 1 2 2.5 3.5 5 7 2 2.5 3.5 5 7 1.5 3 4 5.5 6 2 1.5 3.5 4.5 7 1.5 2.5 3.5 5 620 0 2 2.5 4.5 5 6.5 2 2 5 5.5 6 1 1 3 4.5 6 1 1 2.5 5 6 1 1.5 3 5.5 6.521 0 3 3 3.5 5.5 7 2.5 2.5 4 5.5 6.5 3.5 3 4 6.5 6 2.5 2.5 3.5 4.5 6.5 2 2.5 3 5 722 1 1 2 4 5.5 7 2 2.5 3.5 6 7 1 2 4 5.5 6 1 1.5 3 5 7 1 1.5 3 5 6.523 1 1.5 3 4 5.5 7 1 3 3.5 5.5 7 1 2 4 5 7 1 2 4.5 5 6.5 1.5 2.5 3.5 5.5 6.524 1 1.5 2.5 4 5.5 6.5 2 2.5 4 5 6.5 2 1.5 4.5 6 6.5 2.5 2 3 5 6.5 1.5 2.5 3.5 5 6.525 0 2 2.5 4.5 5.5 7 2 2.5 4 5.5 7 1.5 2 4.5 5.5 7 1.5 1.5 3.5 5.5 6.5 1 2 3.5 5.5 726 1 2.5 2.5 4 5.5 6.5 2 2.5 3 5 6.5 1.5 2 4 6 5.5 1.5 1.5 2.5 5 6.5 1 1.5 3.5 5 6.5

Middle voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2.5 3 3.5 5 6.5 3 3.5 4.5 4.5 6.5 2 3 3 5 5.5 2 2.5 4 4.5 6 2 2 3.5 5 6.52 0 1 2 2.5 5.5 4 2 2 2.5 5.5 6 2 3 2.5 5 6 1.5 3 3.5 5 6.5 1.5 2 3 5.5 43 1 1.5 2 3 5.5 6 2 3 4 5 6 1 2 3.5 5.5 6 1 2 3.5 5 6 2 2 4 5 64 1 4.5 3 4 5 5.5 2 3 3.5 5 6.5 1.5 1.5 2.5 5 6 1 2 3.5 4 5.5 2 2.5 3.5 5 65 1 1.5 3.5 3 4 6.5 2 3 3.5 4.5 6 2.5 2.5 4 4 6 2.5 2 4 4.5 5 1.5 2.5 3.5 4.5 66 1 1.5 3 2.5 5.5 6 3 3 2.5 5 6 1.5 2.5 3 6 5.5 2 2 4 5 6 2 2 2.5 5 67 1 2 3 3.5 5.5 6.5 2 2 4.5 5 6.5 1.5 1.5 3 5 7 1 1.5 3 4.5 6.5 1 2 4.5 5 68 0 3 3 4 4.5 6.5 4 2.5 4 4.5 6.5 3 3 1.5 2 7 4.5 3.5 3.5 4 7 3 2 3.5 4.5 5.59 1 1.5 2 4 5 7 1 2 3.5 4.5 6.5 1 1.5 3 5 6.5 1 2 3.5 4 7 1 1 3.5 5 710 0 1.5 2.5 3.5 4.5 6 1.5 3 2.5 5 6 1.5 1.5 3 3.5 5.5 1 2 2.5 4.5 5 1.5 1.5 3.5 3.5 511 1 2.5 3 4 5.5 6.5 2 4.5 4 5 7 5 2 3.5 5.5 6.5 2 3 4 4.5 6 2 3.5 3.5 5.5 612 1 2 2.5 3.5 4 6 3 2 3 3.5 6 1.5 2 2 3 6 2 1.5 2.5 2.5 6 1 2 2 4.5 4.513 0 1.5 3 4 4.5 4 5.5 3.5 3.5 5 2 2.5 3.5 3.5 4 1.5 2 2.5 3 5 4 4.5 3 4 5.5 1.514 0 2.5 3 3 5 5 3 3 3.5 5 5.5 2 3 3.5 4 5 2.5 4 3.5 4 5.5 2.5 3 3.5 5 5.515 0 1.5 2 3 5 6.5 1 2 3 5.5 6 1 1 2.5 3.5 6.5 1 1 3.5 5 6.5 1 1 2 5 6.516 1 2.5 3 3 5.5 7 2 2.5 2.5 5 7 1.5 1.5 2.5 3.5 6.5 1.5 2 2 3.5 6.5 1.5 2 2 4 617 1 2 2.5 4 5.5 6 2 2.5 3 5.5 7 1 1.5 3 5 7 1 1 3.5 5 6.5 1 1 3 4.5 718 1 2 3.5 4 5 6 2.5 2.5 4 4.5 6 1.5 2 3.5 5 6 1 1.5 4 4.5 5.5 1.5 2 3.5 4 5.519 1 2 3 3 5 6.5 2 3 3 5 6 1 2.5 4 5 6.5 2 2 3.5 5 7 1.5 2 3.5 4 6.520 0 2 2.5 2.5 3.5 6.5 2 2.5 3 4.5 6 1 1.5 2 3.5 5.5 1.5 1 3 3 6 1.5 2 3.5 4.5 6.521 0 2.5 3.5 3 4.5 6 2.5 3 3 5 6.5 1 1 3.5 4 6.5 1 2 2 4.5 6 1.5 2 2 4 622 1 1 1.5 3 5.5 7 1.5 2.5 3 5.5 6.5 1.5 2 3 4.5 6.5 1.5 1.5 3.5 5.5 6.5 2 1.5 2.5 5 6.523 1 1.5 3 4.5 5.5 6 2 3 4 5 6 2 3 4 5 6.5 2.5 3 3.5 5 7 2.5 3 3.5 5 624 1 1.5 4.5 3 5 6 2.5 3 3 5 6 1.5 2 3 4 6 2.5 2 3 4 6 1.5 2 3 5 625 0 2 2.5 4 5.5 6 1.5 3.5 2 4 6.5 1 1 1.5 2.5 6 1 1.5 1.5 2.5 6 1.5 1 2 5 626 1 2 2.5 2.5 5.5 6 2 2.5 3 5 6 1 2.5 3 5 6 1.5 2.5 3.5 5 6 2 2 3 4.5 6.5

Lower voice
−55 ms −27 ms 0 ms +27 ms +55 ms

P instr v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5 v1 v2 v3 v4 v5

1 1 2 4 3.5 6.5 7 3.5 3.5 4 5.5 6.5 3.5 3.5 4 5.5 6.5 2.5 4 3.5 5 7 2.5 3.5 4 6 72 0 2 3.5 4 6.5 7 2.5 3 4 6 7 1.5 3 3.5 5.5 7 1.5 2.5 2.5 4 7 2 3 3 6.5 73 1 1.5 3.5 5 6 6.5 2 3.5 4 5.5 7 1.5 3.5 5 5.5 7 2 3.5 4.5 5.5 7 1.5 2 4 5.5 74 1 2.5 4 4.5 6 6 3 3.5 4 5.5 6.5 1.5 2.5 3.5 6 6 2 2.5 3.5 4.5 6 2 3.5 3.5 5.5 6.55 1 3 4 4 5.5 6.5 3 4 4 5.5 7 1.5 3.5 4 6 7 2 4 4 5 6 2.5 4 4 5.5 76 1 3 4 5 6 6 3 3.5 5 5 6.5 1.5 3 3 6 7 2 2.5 3.5 5 5.5 2.5 2.5 3 5.5 67 1 2.5 3.5 5.5 6 7 2 3.5 5 5.5 7 3 3 4.5 5.5 7 2.5 3.5 4.5 6 7 2.5 2.5 4.5 5.5 78 0 2 3 4 5.5 6.5 2.5 4 5 6 6 3.5 3 3 6 7 4 4 4.5 5 7 3 3 3.5 5.5 69 1 2 4 4 6 7 2 3.5 4 6.5 7 2.5 3.5 3.5 6.5 7 2 3 4.5 7 7 3 2.5 4 6 710 0 2 3.5 3.5 6.5 6 2 3 4.5 5.5 6.5 2 2.5 3.5 5.5 6.5 2.5 3.5 3.5 3 6.5 2 3 3 5.5 611 1 3 4 5 5.5 7 3 3.5 4.5 5.5 6.5 2.5 3.5 4 6 6.5 2.5 5 4 5.5 6.5 3 3.5 4.5 5.5 712 1 2.5 3 3.5 5.5 7 3 3 3 5 7 1.5 3 3.5 6 7 2 2.5 3 4 7 1.5 3 2 5.5 6.513 0 1.5 4 4.5 4 6.5 2.5 4 4.5 5.5 4.5 1.5 3 4.5 4 4 2.5 3 4 5 6.5 2.5 3 4 6.5 414 0 2.5 4 4 6 6.5 3.5 3.5 4 5 6 2.5 3.5 4 5 6.5 3.5 4 4 4.5 6 2.5 4 4 5.5 615 0 2 4 4.5 7 7 2.5 3 4.5 6.5 7 2 3 3.5 6.5 7 2.5 4 3.5 5 7 1.5 3.5 3 6.5 716 1 3 3 3 7 7 3 3.5 4 6 7 2 2.5 3.5 5.5 7 2 3 3 5 6.5 2.5 3 3 6 717 1 2.5 3.5 3.5 7 7 2.5 3 4.5 5.5 7 1.5 3.5 4.5 6.5 7 2.5 4 4 5 6.5 2 3 3 7 6.518 1 2 3 4 5 6 2.5 3.5 4 5 6.5 2.5 3 4 6 6.5 3 3.5 4 5 6.5 2.5 3 4 5.5 6.519 1 2.5 3.5 3 6.5 6.5 2 3 4 5.5 7 2 3 4.5 6 7 2.5 4 4 5 6.5 2.5 3 3.5 5.5 6.520 0 2 3 4.5 6.5 7 2 3.5 4.5 5.5 7 2.5 3 4 6.5 7 3 3.5 4.5 6.5 6.5 2.5 2.5 4.5 5 721 0 2.5 4 5 6.5 6.5 3 3.5 4 6 7 3 4.5 3.5 5.5 7 2.5 3.5 3 4.5 7 2.5 4.5 3.5 6 622 1 3 3 3.5 6.5 7 2 3.5 5 6 7 2 2.5 4.5 6 7 2 4 3.5 4.5 6.5 2 3.5 4 6.5 723 1 2.5 3.5 5 6 6.5 2 4.5 5 5.5 7 2.5 3.5 5 6.5 7 2 4 4 6.5 6.5 2.5 3 5 7 724 1 2.5 3 3.5 6 6.5 2.5 3 4 5.5 6 2 2.5 3 6 7 2.5 3 3.5 4.5 6 2 3 3.5 6 625 0 2 3.5 4 7 7 2.5 2.5 4.5 6 6.5 2 2.5 3 5 6.5 1.5 3 3 4.5 7 2 3 3.5 6 726 1 2.5 3.5 3.5 7 7 2.5 4 5 6 7 2 3.5 4.5 6 7 2 3.5 4 5 6.5 2.5 3.5 3.5 6.5 4


Table A.8: Experiment VI (Section 4.6, p. 118). Ratings of the 26 participants (P) by instrument (instr, 1: piano, 0: other instruments), voice (upper, middle), velocity combinations (+0, +10, +20 MIDI velocity units, MV), and asynchrony (−55, −27, 0, +27, +55 ms).

Column groups: Upper voice and Middle voice; within each, +0 MV (−55, −27, 0, +27, +55 ms), +10 MV (−27, 0, +27 ms), and +20 MV (−55, 0, +55 ms); rows list P, instr, and the ratings.

[The individual rating values of this table were garbled beyond recovery in the source extraction and are omitted here.]

Appendix B

Curriculum Vitae

Name: Werner Goebl
Address: Bonygasse 29/2, A-1120 Vienna, Austria

For current contact details, please refer to my webpage http://www.oefai.at/~werner.goebl.

I was born in Klagenfurt, Austria, on September 12, 1973.

1979–1983 Primary School (Volksschule) in Pettendorf near Regensburg, Germany.

1983–1991 High school “Musisches Gymnasium” BG III in Salzburg, Austria.

June 1991 Graduation with distinction (Matura mit Auszeichnung) at BG III Salzburg in German, Latin, Mathematics, Music, and Physics.

1990–1995 University of Music “Mozarteum,” Salzburg, and University of Music, Vienna: piano pedagogy, degree awarded in June 1995 (Instrumental- und Gesangspädagogik, IGP I).

1993–1999 University of Vienna: major in Musicology, with Psychology, Sociology, and History as electives.

December 1999 Mag. phil. with the Master’s thesis “Numerisch-klassifikatorische Interpretationsanalyse mit dem ‘Bösendorfer Computerflügel’,” University of Vienna.

1994–2000 Piano performance studies (Konzertfach) at the University of Music, Vienna (class of Prof. Noel Flores).

January 2000 Concert diploma (1. Diplom).


Since February 2000 Researcher at the Austrian Research Institute for Artificial Intelligence in the project “Computer-Based Music Research: Artificial Intelligence Models of Musical Expression” with Prof. Gerhard Widmer.

Since October 2000 PhD student at the Institut für Musikwissenschaft, Karl-Franzens-Universität Graz, with Prof. Richard Parncutt as supervisor.

April–July 2001 Guest researcher at the Department of Speech, Music and Hearing at the Royal Institute of Technology, Stockholm, Sweden.

Since October 2002 Piano chamber-music studies with Prof. Avo Kouyoumdjian at the University of Music, Vienna.