Biological Psychology 86 (2011) 158–167
Contents lists available at ScienceDirect: Biological Psychology
Journal homepage: www.elsevier.com/locate/biopsycho

Event-related potential correlates of the expectancy violation effect during emotional prosody processing

Xuhai Chen a,b, Lun Zhao c, Aishi Jiang a,b, Yufang Yang a,*

a State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, 4A Datun Road, Chaoyang District, Beijing 100101, China
b Graduate University of Chinese Academy of Sciences, Beijing 100049, China
c Center for Visual Art & Brain Cognition, Beijing Shengkun Yan Lun Technology Co., Ltd, Beijing 100192, China

Article history: Received 5 April 2010; accepted 9 November 2010; available online 18 November 2010.

Keywords: Event-related potential; Emotional prosody; Expectancy violation; Mismatch negativity; Prosodic Expectancy Positivity; LPC

Abstract

The present study investigated the expectancy violation effects evoked by deviation in sentential emotional prosody (EP), and their association with the deviation patterns. Event-related potentials (ERPs) were recorded for mismatching EPs with different patterns of deviation and for matching control EPs while subjects performed an emotional congruousness judgment task in Experiment 1 and a visual probe detection task in Experiment 2. In the control experiment, EPs and acoustically matched non-emotional materials were presented and ERPs were recorded while participants judged sound intensity congruousness. It was found that an early negativity, whose peak latency varied with deviation pattern, was elicited by mismatching EPs relative to matching ones, irrespective of task relevance. A late positivity was specifically induced by mismatching EPs, and was modulated by both deviation pattern and task relevance. Moreover, these effects cannot be simply attributed to changes in non-emotional acoustic properties.
These findings suggest that the brain detects EP deviation rapidly and then integrates it with context for comprehension, during which emotionality plays a role in speeding up perception and enhancing vigilance.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

In addition to the verbal emotion conveyed by semantic content, a speaker expresses his or her emotion in prosody by manipulating acoustic parameters such as pitch, intensity, and speech rate, termed emotional prosody (EP) or vocal emotion. Banse and Scherer (1996) found that each emotion has its own acoustic profile. For example, the vocalization of anger shows a higher fundamental frequency (F0) and intensity than that of a neutral utterance. Successful emotional communication is critical for social interaction. In particular, it is adaptively important to detect EP deviation in time, since changes in EP are common in spoken interactions, and a sudden variation in prosody is a direct signal of a speaker's emotional change. Despite the fact that human beings are born with the competence to process deviation in EP efficiently, the underlying neural mechanism remains unclear. Therefore, it is highly important to investigate the neural correlates of the perception of EP deviation and to further clarify the cognitive mechanism of vocal emotion perception.

* Corresponding author. Tel.: +86 10 64888629; fax: +86 10 64872070. E-mail address: [email protected] (Y. Yang).

1.1. ERP evidence on expectancy violation in auditory signals

Auditory perception is based on predictive representations of temporal regularities, which continuously generate expectations about the future behavior of sound sources (Winkler, 2007; Winkler et al., 2009). In fact, ERP correlates underlying expectancy violation in sequential auditory processing have been widely investigated.
It was found that pitch deviations in melody and spoken language elicit an early negativity followed by a prominent P300 (Brattico et al., 2006; Magne et al., 2006; Schön et al., 2004). Moreover, syntactic deviants in auditory language elicited an Early Left-lateralized Anterior Negativity (ELAN) (Hahne and Friederici, 1999), while unexpected music chords elicited an Early Right-lateralized Anterior Negativity (ERAN) (Koelsch et al., 2000, 2007). In short, all these studies suggest that expectancy violations in sequential auditory signals can be detected rapidly, as manifested by an early negativity in brain potentials.

0301-0511/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.biopsycho.2010.11.004

The decoding of EP, similar to that of other sequential auditory material such as language and music, is based on predictive representations of temporal regularities. Because of the pitch contour and temporal structure inherent to a specific EP, specific events are expected at given time points as acoustic events unfold. Therefore, deviation in EP is bound to bring about expectancy violation; that is, the expectation for the EP's development is violated by deviant sounds (Kotz and Paulmann, 2007). In fact, several studies


investigated neural correlates of the expectancy violations in EP. Prosodic expectancy violations were reported to elicit a positive deflection 350 ms post violation (the Prosodic Expectancy Positivity, PEP), while combined prosodic-semantic expectancy violations elicited a negative deflection 100 ms after the violation onset, regardless of task relevance, emotional category and speaker identity (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b).

1.2. Evidence for rapid processing of EP

It was reported by a handful of studies that the human brain can differentiate emotional from neutral prosody rapidly. For instance, it was found that ERPs induced by emotional prosodic materials differed from those induced by neutral materials in the P200 component (Paulmann and Kotz, 2008a). Furthermore, studies using the oddball paradigm found that emotional category changes elicited early negative responses around 200 ms (Goydke et al., 2004; Schirmer et al., 2005; Thönnessen et al., 2010), indicating that the brain is able to use the complex acoustic differences between emotional expressions for rapid categorization. In addition, studies by Brosch et al. (2008, 2009) indicated that response times for targets were faster when they appeared at the location of emotional compared to neutral prosody. The explanation for this processing difference is that the evolutionary significance of emotion can lead to prioritized processing strategies which entail fast attentional orienting to emotional stimuli.

A comprehensive working model of EP perception assumed that vocal emotional comprehension is a multi-stage process with individual sub-processes: analyzing acoustic parameters in a time frame of approximately 100 ms, deriving and integrating emotional significance from acoustic cues at about 200 ms, and applying the emotional significance to higher cognitive processing at a later time point (Schirmer and Kotz, 2006). According to this Multi-stage model, to comprehend deviation in EP, one has to analyze the acoustic features of the deviant sound to extract emotional significance and, subsequently, to integrate them with the prosodic context preceding the deviation. Moreover, given the evidence of rapid differentiation of vocal emotion (Goydke et al., 2004; Thönnessen et al., 2010), it is conceivable that the brain detects the deviation in EP during the first two stages and completes all these processes in a short time.

1.3. The present study

Despite extensive studies of expectancy violation in language and music, the expectancy violation in EP remains underdetermined. The PEP starting 350 ms after violation onset (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b) seems unlikely to be the earliest marker of the brain's detection of vocal emotion deviation, as it is incompatible with the long-observed facts that the brain detects auditory deviance before 200 ms and distinguishes emotional from neutral prosodies within 200 ms. Moreover, only deviations with the emotion transition "neutral-to-emotional" were examined in these studies, leaving the temporal features of the opposite transition direction, "emotional-to-neutral", which is lower in magnitude of deviance and emotional significance, as yet unknown.

Therefore, the current study addressed when the human brain detects EP deviation in sentential prosody in both directions, and whether the brain's detection of EP deviation is independent of attention allocation. Specifically, we measured participants' brain responses to prosodic changes in two experiments, one of which required subjects to judge the congruence of the emotion conveyed by the prosodies, while the other required them to detect visual probes while ignoring the prosodies. Experimental materials comprised two types of mismatching EPs with bidirectional deviations: "neutral-to-angry", which shifts from a state of calmness to an intense state of anger, while "angry-to-neutral"


has the opposite shift. Matching angry and neutral prosodies served as the respective control comparisons. The mismatching EPs were created through the method of cross-splicing auditory signals, which has proven an effective way to investigate the nature of prosody processing (Astesano et al., 2004; Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b; Steinhauer et al., 1999). Based on the oddball research on EP (Goydke et al., 2004; Thönnessen et al., 2010) and the research on auditory expectancy violation (Brattico et al., 2006; Schön et al., 2004) elaborated above, we hypothesized that the brain might detect the deviation in sentential EP rapidly, probably indexed by enhanced early negativities. Moreover, late positivities comparable to the PEP (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b) elicited by mismatching EPs were also expected. Additionally, to exclude the possibility of the brain detecting the deviation simply via low-level acoustic features, and to specify the role of emotion in EP deviation detection, a control experiment was carried out in which participants were asked to detect sound intensity changes while listening to EPs and their non-emotional spectrally rotated counterparts (Blesser, 1972; Sauter and Eimer, 2010; Warren et al., 2006).

2. Experiment 1

Experiment 1 was conducted in order to investigate when the brain detects the deviation in EPs and how the deviation pattern influences deviation processing. For this purpose, two types of mismatching EPs with different patterns of deviation and their corresponding control EPs were presented while subjects were instructed to decide the congruence of the emotion conveyed by the prosodies.

2.1. Methods

2.1.1. Participants
Fifteen right-handed university students (seven women, aged 20–26, mean age 22.36) were recruited to participate in the experiment for payment. All participants were healthy native speakers of Mandarin Chinese and had no history of affective or hearing disorder. All participants signed an informed consent form before the experiment. One participant was excluded from the analysis because of heavy artifacts during the EEG recording session.

2.1.2. Stimuli
Fifty sentences with neutral content (see Fig. 1 for an example) were produced by a trained native male actor of Mandarin Chinese in neutral and angry prosodies and recorded in a soundproof chamber at a sampling rate of 22 kHz. The materials were acoustically analyzed using Praat (Boersma and Weenink, 2006). Statistical analysis of the primary acoustic measurements found that, consistent with previous research (Banse and Scherer, 1996), angry prosodies had higher mean F0 (197 Hz vs. 132 Hz, t(49) = 28.12, P < .001) and intensity (70 dB vs. 63 dB, t(49) = 23.18, P < .001), and a faster speech rate (206 ms vs. 216 ms per syllable, t(49) = 5.43, P < .01) than neutral prosodies (see Fig. 1 for a graphical illustration). Moreover, an emotional recognition task performed by 8 volunteers showed that the two prosodies were distinguished accurately. It is noteworthy that, distinct from the materials of previous studies, which differed mainly in pitch (Kotz and Paulmann, 2007), the materials in the present study differed in F0, intensity and speech rate, in line with the finding that Mandarin Chinese depends more on intensity and speech rate to express emotion (Ross et al., 1986). Two types of mismatching EPs ("neutral-to-angry" and "angry-to-neutral") were obtained by cross-splicing the first part of a neutral prosody with the second part of an angry prosody, and vice versa (see Fig. 1 for a graphical illustration). Two splicing positions were used to increase the variability of the prosody development, such that the occurrence of deviation was unpredictable. Two hundred sentences, including the two types of mismatching EPs and the two types of matching EPs, were presented to participants.

2.1.3. Procedure
Sentences were presented in a pseudo-randomized order in four blocks of 50 trials while each participant was seated comfortably at a distance of 115 cm from a computer monitor in a sound-attenuating chamber. In each block, sentences from the same type of prosody were presented in up to three consecutive trials. Each trial began with a fixation cross in the center of the monitor for 300 ms, and then the sentence was presented aurally via headphones while the cross remained on the screen. The sound volume was adjusted for each participant to ensure that all sentences were heard clearly. The offset of sentences was followed by a question mark


Fig. 1. Example materials and design. Illustration (A) explains the splicing procedure. The NA prosodies were obtained by splicing the first part of neutral prosodies to the second part of angry prosodies (green to red), at the onset of the fifth syllable (*) or at the onset of the ninth syllable (#). The AN prosodies were created through cross-splicing in the opposite direction (red to green). Triggers were placed at the splicing point and at the corresponding point in the control prosodies (* or #). Illustration (B) shows the acoustic features of an example NA prosody and its non-emotional control NAr material (only the splicing point at the ninth syllable is illustrated). The panel consists of an oscillogram (top) and voice spectrographs (bottom) with uncorrected pitch contours (blue line) and intensity contours (yellow line) superimposed. In the control experiment, all stimuli were low-pass filtered at 4 kHz and then spectrally rotated using the method described by Blesser (1972), in which the filtered signal is convolved with a sinusoid at 4 kHz, followed by low-pass filtering at 4 kHz. This acoustic manipulation produced unintelligible sounds that lacked the human vocal quality of the original stimuli but shared the same mean F0, intensity, speech rate and long-term average spectrum with the original sounds (abbreviations: AA—all Angry; AN—Angry-to-Neutral; NA—Neutral-to-Angry; NN—all Neutral; NAr—rotated Neutral-to-Angry; AAr—rotated Angry; the same below). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
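The spectral rotation described in the caption can be approximated as ring modulation with a 4 kHz sinusoid bracketed by low-pass filters: multiplying by the sinusoid maps each component f to 4000 − f and 4000 + f, and the second low-pass keeps only the mirrored components. The sketch below is one plausible reading of the Blesser (1972) procedure, not the authors' exact implementation; the filter order and the use of scipy are assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def spectrally_rotate(signal, sr, pivot_hz=4000.0):
    """Rotate the spectrum about pivot_hz: low-pass the input below the
    pivot, multiply by a sinusoid at the pivot frequency (ring modulation),
    then low-pass again to keep only the mirrored (pivot - f) components."""
    sos = butter(8, pivot_hz, btype="low", fs=sr, output="sos")
    band_limited = sosfiltfilt(sos, signal)       # restrict to 0..pivot Hz
    t = np.arange(len(signal)) / sr
    modulated = band_limited * np.cos(2 * np.pi * pivot_hz * t)
    return sosfiltfilt(sos, modulated)            # discard pivot + f images
```

For example, a 1 kHz tone processed this way ends up concentrated near 3 kHz, which is why the rotated stimuli preserve the long-term spectrum shape while destroying intelligibility.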

"?" for 300 ms. Participants were asked to respond as quickly and accurately as possible whether the emotion expressed by the prosody was congruent or not by pressing the "J" or "F" button on the keyboard. The button for "yes" or "no" was counterbalanced between participants. Participants were asked to look at the fixation cross, avoiding eye movements during sentence presentation. The inter-trial interval was 1500 ms. Practice trials were used to familiarize subjects with the procedure.

2.1.4. ERP recording and analysis
Electroencephalogram (EEG) was recorded with 64 Ag–AgCl electrodes mounted in an elastic cap at a sampling rate of 500 Hz. EEG data were referenced online to the left mastoid and re-referenced offline to the algebraic average of the two mastoids. Vertical electrooculograms (EOGs) were recorded supra- and infra-orbitally at the left eye. Horizontal EOG was recorded from the left vs. right orbital rim. Impedances were kept below 5 kΩ. EEG and EOG recordings were amplified with a high cutoff of 100 Hz.

EEG data were processed with the software NeuroScan 4.3. EEG and EOG signals were screened off-line for eye movements and electrode drifting. Epochs with a signal change exceeding ±75 μV at any EEG electrode were rejected from averaging. The data were filtered offline with a low-pass filter of 30 Hz. Trials with incorrect responses were excluded, and more than 40 trials per condition remained for averaging. The reported data were time-locked to the splicing point and were baseline-corrected first to the 200 ms before the start of the sentences and then to the 200 ms before the splicing point. Grand average ERPs were computed for each condition separately with an epoch including the 200 ms preceding and 1000 ms following the violation points. According to visual inspection of these grand averages, the following time ranges were chosen for the various ERP components: 150–250 ms (around the peak of the early negativity), 250–450 ms (around the peak of the following positivity) and 450–900 ms (LPC).
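The epoching steps described above (200 ms pre-splice baseline, 1000 ms post-splice window, rejection of epochs exceeding ±75 μV) can be sketched as follows. This is an illustrative reimplementation, not the NeuroScan pipeline actually used; the single-step baseline and applying the rejection criterion to the maximum absolute amplitude after baseline correction are simplifying assumptions:

```python
import numpy as np

def extract_epochs(eeg, events, sr=500, pre_ms=200, post_ms=1000, reject_uv=75.0):
    """eeg: (n_channels, n_samples) array in microvolts.
    events: sample indices of the splicing points.
    Returns baseline-corrected epochs, shape (n_kept, n_channels, n_times)."""
    pre = int(pre_ms * sr / 1000)
    post = int(post_ms * sr / 1000)
    kept = []
    for onset in events:
        epoch = eeg[:, onset - pre:onset + post].astype(float)
        # Subtract the mean of the 200 ms preceding the splicing point.
        epoch -= epoch[:, :pre].mean(axis=1, keepdims=True)
        # Crude artifact rejection: drop the epoch if any channel exceeds +/-75 uV.
        if np.abs(epoch).max() <= reject_uv:
            kept.append(epoch)
    return np.stack(kept) if kept else np.empty((0, eeg.shape[0], pre + post))

# A condition's grand average would then be extract_epochs(...).mean(axis=0).
```

Per-condition averages over the surviving epochs give the waveforms that the grand averages are built from.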

The mean amplitudes over the midline (Fz, Cz, Pz) in the defined time windows were statistically analyzed by repeated-measures ANOVAs with Match (match: "angry", "neutral" vs. mismatch: "neutral-to-angry", "angry-to-neutral"), Critical-prosody (the prosody time-locked, "angry" vs. "neutral") and Violation-position (fifth vs. ninth) as within-subject factors. The results revealed no significant interaction involving Match and Position, indicating that both violation positions showed a similar Match effect. Because the effect of violation position was not the focus of the current study, the ERPs for the two violation positions were merged in further analysis.

To further test the Match effect, detailed statistical analysis was performed on the accuracy data and on the mean voltage for each condition in the defined time windows. Reaction time was not analyzed, because the reaction time data were not time-locked to the violation point. For the accuracy data, a repeated-measures ANOVA with Match and Critical-prosody as within-subject factors was performed. As to the ERP data, repeated-measures ANOVAs in each time window were conducted with Match, Critical-prosody, Region (anterior: left—F7, F5, F3, FT7, FC5, FC3, middle—F1, FZ, F2, FCZ, FC1, FC2, right—F8, F6, F4, FT8, FC6, FC4; central: left—T7, C5, C3, TP7, CP5, CP3, middle—C1, CZ, C2, CP1, CP2, CPZ, right—T8, C6, C4, TP8, CP6, CP4; posterior: left—P7, P5, P3, PO7, PO3, O1, middle—P1, PZ, P2, POZ, OZ, right—P8, P6, P4, PO8, PO4, O2) (for regional averaging, see Dien and Santuzzi, 2004) and Hemisphere (left, middle, right) as within-subject factors. The degrees of freedom of the F-ratio were corrected according to the Greenhouse–Geisser method in all these analyses.
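The Greenhouse–Geisser method mentioned above corrects the ANOVA degrees of freedom by an estimated sphericity factor ε = tr(V)² / ((k−1)·tr(V²)), where V = M S Mᵀ, S is the covariance matrix of the k repeated-measures conditions and M holds orthonormal contrasts. A compact sketch of the estimator (not the authors' statistics software):

```python
import numpy as np

def gg_epsilon(data):
    """Greenhouse-Geisser epsilon for data of shape (n_subjects, k_conditions).
    Returns a value in [1/(k-1), 1]; 1 means sphericity holds."""
    n, k = data.shape
    S = np.cov(data, rowvar=False)             # k x k condition covariance
    # Build (k-1) orthonormal contrasts orthogonal to the constant vector
    # via QR of [ones | e1 .. e_{k-1}].
    A = np.column_stack([np.ones(k), np.eye(k)[:, : k - 1]])
    Q, _ = np.linalg.qr(A)
    M = Q[:, 1:].T                             # (k-1, k), rows orthonormal
    V = M @ S @ M.T
    return np.trace(V) ** 2 / ((k - 1) * np.trace(V @ V))
```

The corrected test then uses ε·(k−1) and ε·(k−1)·(n−1) degrees of freedom; when the condition covariances are spherical, ε = 1 and the correction vanishes.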

2.2. Results

2.2.1. Behavioral results
The accuracy for the four types of prosodies was around 90% (see Table 1), with no significant difference across the four types of prosodies.

2.2.2. ERP results
Fig. 2 shows the ERPs elicited by all prosodies. Both mismatching EPs elicited early negative deflections relative to the matching ones, peaking earlier for "neutral-to-angry" prosodies than for "angry-to-neutral" ones, which was confirmed statistically by determining the peak latency of the most negative peak in the 100–300 ms time window [midline: Fz, Cz, Pz; around 171 ms vs. 206 ms, t(13) = 5.02, P < .001]. Following the early negativity, positive deflections were elicited by mismatching EPs in comparison to matching ones. The results for the particular analysis time windows are described below.
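Determining the peak latency of the most negative deflection in the 100–300 ms window reduces to an argmin over the windowed waveform. A minimal sketch with a synthetic waveform (the Gaussian trough below is illustrative, not real data):

```python
import numpy as np

def negative_peak_latency(erp, times, t_min=100, t_max=300):
    """Latency (ms) of the most negative sample of an ERP waveform within
    [t_min, t_max]. erp: 1-D amplitude array; times: matching 1-D array of
    latencies in ms relative to the splicing point."""
    window = np.where((times >= t_min) & (times <= t_max))[0]
    return times[window[np.argmin(erp[window])]]

# Example: a synthetic negativity peaking at 170 ms (500 Hz sampling -> 2 ms steps).
times = np.arange(-200, 1000, 2)
erp = -3.0 * np.exp(-((times - 170) / 40.0) ** 2)   # Gaussian trough at 170 ms
assert negative_peak_latency(erp, times) == 170
```

The per-subject latencies obtained this way are what the paired t-test on the two mismatch conditions compares.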

2.2.2.1. The early negative effect (150–250 ms). The ANOVA over this interval showed robust effects of Region, F(2,26) = 18.47, P < .001, ηp² = .59, and Match, F(1,13) = 14.60, P < .01, ηp² = .53. In addition, these two factors interacted with each other, F(2,26) = 41.32, P < .001, ηp² = .76, and further interacted with the Critical-prosody factor, F(2,26) = 8.41, P < .01, ηp² = .39. Further simple tests showed that "angry-to-neutral" prosodies elicited a more negative-going deflection than "neutral" ones did over anterior and central areas (Ps < .001), while the "neutral-to-angry" prosodies elicited a more negative-going deflection than "angry" ones did only over anterior areas (P < .05).

Table 1
Mean and standard deviation of accuracy rates (ACC, %) and reaction times (RT, ms) for the three experiments as a function of stimulus type.

       Experiment 1 (n = 14)                           Experiment 2 (n = 14)                           Experiment 3 (n = 16)
       AA          NA          NN          AN          AA          NA          NN          AN          AA           NA           AAr          NAr
RT     525 (215)   417 (203)   459 (185)   399 (223)   681 (203)   679 (204)   681 (199)   694 (204)   1054 (188)   1035 (185)   1044 (202)   1043 (189)
ACC    89.0 (8.9)  88.9 (14.0) 86.3 (14.0) 86.9 (19.6) 98.0 (2.0)  98.0 (1.8)  97.2 (2.5)  97.5 (3.1)  92.0 (4.1)   87.8 (4.2)   90.3 (7.8)   78.9 (9.0)

2.2.2.2. The early positive effect (250–450 ms). The ANOVA in this time window yielded a significant effect of Hemisphere, F(2,26) = 8.36, P < .01, ηp² = .39. Post hoc comparison indicated that the ERPs were more positive going over midline areas than over the left and right hemispheres (P < .01). Moreover, the main effects of Critical-prosody and Match were also significant [F(1,13) = 6.38, P < .05, and F(1,13) = 11.49, P < .01, ηp² = .47, respectively]. Additionally, the Match factor interacted with the Region factor, F(2,26) = 6.84, P < .05, ηp² = .35, the Hemisphere factor, F(2,26) = 13.12, P < .001, ηp² = .50, and the Critical-prosody factor, F(1,13) = 12.43, P < .01. The three-way Region × Critical-prosody × Match interaction was also significant, F(2,26) = 18.00, P < .001, ηp² = .58. Simple effect tests showed that "neutral-to-angry" prosodies elicited a more positive-going deflection than "angry" ones did at anterior (P < .001), central (P < .001) and posterior areas (P < .01), while "angry-to-neutral" prosodies elicited no significant positive deflection compared with "neutral" ones (Ps > .10).

Fig. 2. Grand-average ERP waveforms of Experiment 1. (A) ERPs elicited by the four types of prosodies at selected electrode sites. The thick solid red line indicates potentials elicited by NA prosodies and the thin dashed red line responses to its control comparison, AA prosodies. The thick solid blue line represents potentials elicited by AN prosodies and the thin dashed blue line responses to its control comparison, NN prosodies. In this figure, as in the following ones, amplitude (in μV) is plotted on the ordinate (negative up) and time (in ms) on the abscissa. (B) Topographies of the difference curves (NA minus AA vs. AN minus NN, viewed from the top) in selected time periods (for abbreviations see Fig. 1). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

2.2.2.3. The LPC effect (450–900 ms). The analysis over this interval found significant main effects of Region and Match [F(2,26) = 42.56, P < .001, ηp² = .77; F(1,13) = 20.23, P < .001, ηp² = .61; respectively]. The Match factor also interacted with the Region factor, F(2,26) = 37.66, P < .001, ηp² = .74, and the Hemisphere factor, F(2,26) = 7.83, P < .01, ηp² = .38. Further simple analysis revealed that the Match effect reached significance over central and posterior areas [F(1,13) = 21.44, P < .001; F(1,13) = 34.97, P < .001; respectively].

2.3. Discussion

Mismatching EPs elicited a more negative early deflection compared with matching ones, and this negativity was largest over anterior-central areas. As predicted, this negativity might signal the human brain's detection of the deviation in the sentential EP. One possible explanation for this negative deflection is an exogenous N1 evoked by a sudden acoustic change in the level of energy impinging

on the sensory receptors (Clynes, 1969; Näätänen and Picton, 1987; Thierry and Roberts, 2007). However, our finding of an enhanced early negativity in the mismatching conditions should not be simply attributed to physical feature changes. First, the peak latency was later than that of the classic N1, which usually peaks between 80 and 120 ms and returns to baseline at 160–180 ms post stimulus onset (Woods, 1995). Second, mismatching "angry-to-neutral" prosodies elicited a larger negative deflection than "neutral-to-angry" ones did, contradicting the notion that the N1 is smaller to a less intense deviant stimulus (Näätänen and Picton, 1987). Third, it has been shown that the latency of the N1 elicited by a deviant stimulus does not vary with the magnitude of deviation (Lawson and Gaillard, 1981). Obviously, that was not the case in the present study, in which the "neutral-to-angry" prosodies elicited an earlier negativity than "angry-to-neutral" ones did.

An alternative explanation is that the negativity reflected the expectancy violation of EP development. We constructed EP deviations through cross-splicing two types of natural prosodies. Consequently, the acoustic profile of the spliced part did not match the preceding context (see Fig. 1 for detail). As acoustic events unfolded over time, the brain generated expectations for the oncoming events based on regularities extracted from the preceding context and knowledge stored in long-term memory (Winkler, 2007). The brain detects prosodic changes that violate expectation, as reflected by mismatch negativity in brain potentials (for review see Näätänen et al., 2007; Winkler, 2007). Moreover, the early negativity might overlap with the N2b. The participants in the present experiment actively attended to the deviations in order to judge the emotional incongruence. Prior studies indicated that when a sound stream is attended, the mismatch negativity elicited by deviants is at least partially overlapped by the N2b (Näätänen, 1982; Näätänen et al., 2007; Schröger, 1998). The smaller magnitude of deviance of the "angry-to-neutral" prosodies required more effort for the brain to detect the emotional incongruence, and consequently produced larger N2b amplitudes, which are associated with increased difficulty of discrimination (Fitzgerald and Picton, 1983).

Following the early negativity, mismatching EPs elicited a more positive deflection in comparison to matching ones. In line with previous studies (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b), "neutral-to-angry" prosodies elicited a frontal-central maximal positive deflection comparable to the PEP over the interval of 250–450 ms. However, the "angry-to-neutral" prosodies elicited no salient positive deflection compared with their control prosodies in this time window, indicating that this positivity may be influenced by deviation pattern. Furthermore, both patterns of mismatching prosodies elicited a late positive component (LPC) in the interval of 450–900 ms, mainly over posterior areas, which may signal an integration process of the expectancy violation (Besson and Schön, 2001; Brattico et al., 2006).

In short, the current results suggested that the brain detects EP deviation quickly and then integrates it with the context when the deviation is task relevant. However, it is likely that ERPs indicative of EP violation depend on task-related strategy and focused attention, and overlap with potentials that emerge when the deviation has to be detected (such as N2b). To exclude this possibility, stimuli identical to Experiment 1 were presented to participants under the instruction to verify visual probes while ignoring the prosodies in Experiment 2.

3. Experiment 2

3.1. Methods

3.1.1. Participants
Fifteen right-handed university students (eight women, aged 20–26, mean 22.14) participated in the experiment for payment. None of them had participated in Experiment 1. All participants were native speakers of Mandarin Chinese with no history of affective or hearing disorder. One participant was excluded from the analysis because of too many artifacts during the EEG recording session.

3.1.2. Stimuli and procedure
Stimuli and procedure were identical to Experiment 1 except that the task for participants was to decide whether a visual probe had occurred in the preceding sentence while ignoring the prosody. After the sentence offset, a two-syllable word serving as visual probe was presented for 300 ms and the participants had to respond as quickly and accurately as possible by pressing the “J” or “F” button on the keyboard.

3.1.3. ERP recording and analysis
ERP recording and analysis were identical to Experiment 1, except for the time windows of interest. According to visual inspection of the grand averages, the following time ranges were chosen for the ERP components: 150–250 ms (around the peak of the early negativity) and 250–400 ms (around the peak of the following positivity). ANOVAs testing the interaction between Match and Position revealed no significant effects, and thus the ERPs for the two positions were merged in further analysis.

3.2. Results

3.2.1. Behavioral results
Participants scored approximately 98% correct responses in probe detection (see Table 1). The ANOVAs for accuracy rate and reaction time yielded no statistically significant differences.

3.2.2. ERP results
Fig. 3 shows the brain responses to all prosodies. While the matching EPs only elicited amplitude variations waving around the baseline, the mismatching EPs elicited an early negative deflection, peaking earlier for “neutral-to-angry” prosodies than for “angry-to-neutral” ones. This was confirmed statistically by determining the peak latency of the most negative peak in the 100–300 ms time window [midline: Fz, Cz, Pz; around 179 ms vs. 206 ms, t(13) = 3.37, p < .01]. Furthermore, the “neutral-to-angry” prosodies elicited a positive deflection compared with their control matching EPs, peaking around 330 ms. The following are the results for the particular analysis time windows.
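The peak-latency measure used here (and again in Experiment 3) — the latency of the most negative sample within a fixed search window — can be sketched as follows. The sampling rate and the synthetic waveform are illustrative assumptions, not the study’s recording parameters.

```python
import numpy as np

def peak_latency_ms(erp, fs, window_ms=(100, 300)):
    """Latency (ms) of the most negative sample within the search window,
    for a single post-violation ERP waveform sampled at `fs` Hz."""
    start = int(window_ms[0] * fs / 1000)
    stop = int(window_ms[1] * fs / 1000)
    idx = start + int(np.argmin(erp[start:stop]))
    return 1000.0 * idx / fs

# synthetic waveform with a negative peak near 180 ms (assumed fs of 500 Hz)
fs = 500
t = np.arange(0, 0.5, 1.0 / fs)
erp = -np.exp(-((t - 0.18) ** 2) / (2 * 0.02 ** 2))
latency = peak_latency_ms(erp, fs)
```

Per-condition latencies obtained this way can then be compared across participants with a paired t test, as reported in the text.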

3.2.2.1. The early negative effect (150–250 ms). The ANOVA over this interval showed salient main effects of Region, F(2,26) = 7.87, P < .01, ηp² = .48, and Match, F(1,13) = 8.91, P < .05, ηp² = .41. In addition, the two factors interacted with each other, F(2,26) = 26.61, P < .001, ηp² = .67. The three-way Region × Hemisphere × Match interaction was also significant, F(2,26) = 4.09, P < .05, ηp² = .24. Further simple tests showed that the mismatching EPs elicited more negative-going deflection than matching ones did in anterior and central areas over all hemispheres (Ps < .01 or .05).
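As a side note on the effect sizes reported throughout, partial eta squared for these repeated-measures F tests follows directly from the F value and its degrees of freedom, ηp² = F·df1 / (F·df1 + df2). A minimal check against the Match effect above:

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """eta_p^2 = SS_effect / (SS_effect + SS_error)
               = F * df_effect / (F * df_effect + df_error)."""
    return f_value * df_effect / (f_value * df_effect + df_error)

# Match effect in the 150-250 ms window: F(1,13) = 8.91, reported eta_p^2 = .41
match_eta = partial_eta_squared(8.91, 1, 13)
```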

3.2.2.2. The positive effect (250–400 ms). The ANOVA over this interval yielded a significant two-way Match × Critical-prosody interaction, F(1,13) = 5.72, P < .05, ηp² = .30, a three-way Region × Critical-prosody × Match interaction, F(2,26) = 6.81, P < .05, ηp² = .34, and a three-way Hemisphere × Critical-prosody × Match interaction, F(2,26) = 5.71, P < .05, ηp² = .31. Further simple tests showed that “neutral-to-angry” prosodies elicited more positive deflection than their control comparison over anterior (P = .05) and central areas (P < .05), while “angry-to-neutral” prosodies had no such effect (Ps > .10). Moreover, “neutral-to-angry” prosodies elicited more positive deflection than their control “angry” ones in the left hemisphere and midline (Ps < .05), while the effect was only marginally significant (P = .07) in the right hemisphere.

3.3. Discussion

When participants performed a task unrelated to the deviation, mismatching EPs elicited salient anterior-central negative ERPs compared with matching ones, indicating that the human brain is capable of detecting EP deviation quickly even in the absence of focused attention and task-related strategy. The mean amplitudes of the early negativities were not significantly different between “neutral-to-angry” and “angry-to-neutral” prosodies in the current experiment. This excluded the possibility of N2b overlap. Taking into account their features, we reasoned that the early negativities were caused by EP expectancy violation rather than merely by a single acoustic change. As mentioned before, the auditory system can automatically extract the regularity from the sensory input and activate the long-term memory of the prosodic features of EP. If a vocal emotion change brought in a violation of the detected regularity, mismatch negativity emerged automatically.

Fig. 3. Grand-average ERP waveforms of Experiment 2. (A) ERPs elicited by four types of prosodies at selected electrode sites. The thick solid red line indicates potentials elicited by NA prosodies and the thin dashed red line responses to its control comparison AA prosodies. The thick solid blue line represents potentials elicited by AN prosodies and the thin dashed blue line responses to its control comparison NN prosodies. (B) Topographies of difference curves (NA minus AA vs. AN minus NN, viewed from the top) in selected time periods (for abbreviations see Fig. 1). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

Additionally, there were significantly more positive deflections for “neutral-to-angry” prosodies in the interval 250–400 ms over anterior-central areas. This finding is in line with previous studies (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b), which indicated that EP expectancy violation elicited more positive-going deflection than conditions with no EP expectancy violation. This probably resulted from the fact that the participants immediately raised their vigilance and assigned significance in speech comprehension when they heard “neutral-to-angry” prosodies with an emotion transition from neutral to angry, even without deliberate attention. However, “angry-to-neutral” prosodies elicited no salient positive deflection in comparison to their matching prosodies, possibly because the transition from angry to neutral only signals a relaxation of anger.

In sum, the results of these two experiments suggest that a rapid detection of the EP deviation takes place in anterior-central brain regions, followed by an integration of the deviation with the preceding context. However, it could be argued that these effects might simply reflect the ability of the brain to detect and process changes in acoustic properties whether they have emotional meaning or not. To focus more on EP processing, we tried to specify the role of emotionality in such processes in the following experiment.

4. Experiment 3

4.1. Methods

4.1.1. Participants
Sixteen right-handed native speakers of Mandarin Chinese (nine women, aged 22–25, mean 23.44), who did not participate in the former two experiments, were recruited to participate in the experiment. All participants reported normal audition, normal or corrected-to-normal visual acuity, and no neurological, psychiatric, or other medical problems. Participants gave informed consent and received monetary compensation.

4.1.2. Stimuli and procedure
Half of the stimuli were EPs (only “neutral-to-angry” prosodies and their control “angry” prosodies were included) identical to those used in Experiment 1. The other half were spectrally rotated versions of the EPs. Spectral rotation (see Fig. 1) preserves amplitude envelope and duration information, as well as pitch and pitch variation, while distorting spectral information (Blesser, 1972), so as to provide a baseline condition sharing low-level acoustic features but without affective properties (Sauter and Eimer, 2010; Warren et al., 2006). The experimental procedure was identical to Experiment 1, except that the task for the participants was to decide whether an obvious sound intensity deviation had happened in the sentence after they heard the whole sentence.
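The stimuli were rotated following Blesser (1972); as a rough modern approximation, spectral rotation can be sketched as mirroring the FFT bins below a cutoff, so that energy at frequency f moves to (cutoff − f). The 4 kHz cutoff and the test tone are illustrative assumptions, not the parameters used for the actual stimuli.

```python
import numpy as np

def spectrally_rotate(signal, fs, max_freq=4000.0):
    """Schematic spectral rotation: flip the spectrum below `max_freq`
    end-for-end, scrambling spectral (formant) structure while keeping
    overall energy and duration."""
    n = len(signal)
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    band = freqs <= max_freq
    rotated = spec.copy()
    rotated[band] = spec[band][::-1]  # mirror the band about max_freq / 2
    return np.fft.irfft(rotated, n)

# a 1 kHz tone should come out dominated by 3 kHz (= 4 kHz - 1 kHz)
fs = 16000
t = np.arange(fs) / fs
rotated_tone = spectrally_rotate(np.sin(2 * np.pi * 1000 * t), fs)
```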

4.1.3. ERP recording and analysis
ERP recording and analysis were identical to Experiment 1, except that the factor “Critical-prosody” was replaced by “Type” and the time windows of interest were changed to 130–230 ms, 230–430 ms, and 430–900 ms for the various ERP components.

4.2. Results

4.2.1. Behavioral results
Behavioral results are illustrated in Table 1. The ANOVA on accuracy rates yielded significant main effects of Type, F(1,15) = 9.89, P < .01, ηp² = .40, and Match, F(1,15) = 18.73, P < .001, ηp² = .56, and a two-way interaction of Type × Match, F(1,15) = 16.02, P < .001, ηp² = .52. Further simple tests showed that accuracy rates were higher for the match than the mismatch conditions under both Type conditions (P < .05 and P < .001, respectively). No detailed analysis of reaction time was performed because of the delay of the response.

4.2.2. ERP results
Fig. 4 shows the brain responses to all prosodies. The mismatching stimuli elicited an enhanced early negative deflection and a late positive shift compared with their matching control materials, irrespective of material type. The early negative deflection peaked earlier for EPs than for their rotated counterparts, which was confirmed statistically by determining the peak latency of the most negative peak in the 100–300 ms time window [midline: Fz, Cz, Pz; around 171 ms vs. 185 ms, t(15) = 3.45, P < .001]. The following are the results for the particular analysis time windows.

4.2.2.1. The early negative effect (130–230 ms). The ANOVA showed salient main effects of Region, F(2,30) = 13.18, P < .01, ηp² = .38, and Match, F(1,15) = 10.45, P < .01, ηp² = .41. In addition, the Match factor interacted with the Region factor, F(2,30) = 3.31, P < .05, ηp² = .18. Further simple tests showed that the Match effect reached significance over anterior, F(1,15) = 11.17, P < .01, and central areas, F(1,15) = 11.32, P < .01.

4.2.2.2. The positive effect (230–430 ms). The ANOVA showed significant effects of Hemisphere, F(2,30) = 18.89, P < .001, ηp² = .56, and Match, F(1,15) = 27.79, P < .001, ηp² = .65. In addition, the Match factor interacted with Type, F(1,15) = 9.82, P < .01, ηp² = .40, and Hemisphere, F(2,30) = 14.67, P < .001, ηp² = .49. The three-way Region × Hemisphere × Match interaction was also significant, F(2,30) = 13.69, P < .001, ηp² = .48. Simple effect tests showed that both mismatching EPs and their rotated counterparts elicited more positive-going deflection than their control materials (P < .001 and P < .01, respectively), and this effect was significant over all regions and hemispheres (Ps < .01).

4.2.2.3. The LPC effect (430–900 ms). The analysis yielded significant main effects of Region, F(2,30) = 35.17, P < .001, ηp² = .70, and Type, F(1,15) = 5.98, P < .05. Significant two-way Match × Region and Type × Region interactions were also found [F(2,30) = 6.50, P < .05; F(2,30) = 4.95, P < .05, ηp² = .25; respectively]. Further simple analysis revealed that the Match effect and the Type effect were significant over posterior areas [F(1,15) = 8.81, P < .01; F(1,15) = 17.23, P < .001; respectively].


4.3. Discussion

The present data, consistent with previous studies (Näätänen et al., 2007; Thierry and Roberts, 2007), suggested that changes in acoustic properties alone elicit the negative-positive deflection when they are task relevant. However, as compared to the changes in acoustically matched control sounds, the changes in EPs elicited earlier negativity and larger positive deflection. These differences resulted solely from the emotionality of the prosodies, as the EPs share their basic acoustic features with their spectrally rotated counterparts. In view of the long-observed fact of faster and prioritized processing of emotional stimuli (Ito et al., 1998; Yuan et al., 2007), this phenomenon is very much conceivable. Moreover, it is worth noting that the task in the current experiment was emotion irrelevant, which implies that the contribution of emotionality to change detection is independent of task-relevance.

5. General discussion

Because of the predictive encoding of sequential auditory information and the fast differentiation of emotional and neutral prosodies, we hypothesized that the brain could detect the EP deviation quickly and then integrate it with the preceding context during the processing of sentential EP. Consistent with our prediction, the mismatching EPs elicited early negativities relative to the matching ones irrespective of attention access, and the peak latency was shorter when the deviation brought in larger deviance. Moreover, the deviations also elicited late positive deflection, which was influenced by deviation pattern and task relevance. Although changes in non-emotional acoustic clusters also evoked a similar pattern of effects, prominent differences in early peak latency and positive amplitude were observed between EPs and their non-emotional counterparts, implying that emotionality plays a special role in the perception of EP deviation.

5.1. Early negative ERP effect

Both mismatching EPs elicited early anterior-central dominant negativity compared with matching ones, indicating that the brain can detect the deviation in sentential EP rapidly. The question arises, then, what aspect of the deviation elicited the current early negativity. As stated in Experiment 1, an extraneous N1 account of the early negativity is farfetched. Alternatively, the early negativity might be mismatch negativity elicited when incoming sounds violate regularity encoded in the auditory system. As the auditory system may prepare for encountering certain sounds in the immediate future in predictable sound sequences, mismatch negativity would be elicited automatically if incoming sounds violated the prediction (Winkler, 2007; Winkler et al., 2009). The current finding is consistent with prior studies indicating that the brain is capable of performing a fast and automatic categorization according to complex acoustic features of emotional expression (Bostanov and Kotchoubey, 2004; Goydke et al., 2004; Thönnessen et al., 2010). With the oddball design in prior studies, the standard stimuli within one emotional category allowed the participants to generate expectations for the upcoming events, while the deviant stimuli outside the emotional category violated the expectation and consequently elicited mismatch negativity. Similarly, the context preceding the violation point in the present study allowed participants to generate expectations, and the sudden deviation violated the expectation and thus elicited the current early negativity.

Also, one may argue that the prediction and expectation were simply based on low-level acoustic properties, and thus the effects observed might have nothing to do with emotion. Nevertheless, the results of the control experiment proved that the change of a cluster of acoustic features alone cannot completely explain the effects we observed in the former two experiments, because deviation in emotional prosodies elicited earlier negativities. These results allow us to speculate that the brain detects the emotional prosodic deviation via both acoustic properties and the emotionality conveyed by the prosody. As stated in the introduction, vocal emotion is delivered in the variations in pitch, intensity, rhythm, pauses, and vocal quality (Juslin and Laukka, 2003; Scherer, 2003). It has been proven that, simply based on acoustic features, the brain is able to formulate regularity and detect change in an auditory stream (Näätänen et al., 2007; Winkler et al., 2009). However, when the cluster of acoustic parameters is combined as an auditory object expressing a certain emotion, the emotionality is likely to speed up the processing.

Fig. 4. Grand-average ERP waveforms of Experiment 3. (A) ERPs elicited by four types of materials at selected electrode sites. The thick solid red line indicates potentials elicited by NA prosodies and the thin dashed red line responses to its control comparison AA prosodies. The thick solid blue line represents potentials elicited by NAr prosodies and the thin dashed blue line responses to its control comparison AAr prosodies. (B) Topographies of difference curves (NA minus AA vs. NAr minus AAr, viewed from the top) in selected time periods (for abbreviations see Fig. 1). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

Early negativity was observed when participants were distracted from the sudden deviation, suggesting that the detection of deviation in emotional prosody can be accomplished without task-related strategy. This is in line with previous studies revealing that mismatch negativity reflecting expectancy violation can be evoked when subjects perform an irrelevant task, such as detecting a deviant instrument (Koelsch et al., 2007) or watching a silent movie (Brattico et al., 2006; Goydke et al., 2004; Thönnessen et al., 2010). Moreover, consistent with the notion that the peak latency of mismatch negativity decreases with increasing magnitude of stimulus change (Näätänen et al., 2007), the peak latency of the negativity was shorter as the saliency of the deviation increased.

In addition, the early negativity is remarkably similar to the ERP effects found for expectancy violation in spoken language and music. For example, syntactic incongruities in auditory language elicited an ELAN (Hahne and Friederici, 1999), while incongruity of pitch contour (Brattico et al., 2006; Schön et al., 2004), harmony (Koelsch et al., 2000, 2007), or rhythm (Vuust et al., 2009) in music perception elicited an early negativity (ERAN). Sequential auditory signals, including spoken language and music, are all rule-based systems composed of basic elements (acoustic events) that are combined into higher order structures through abstract rules like syntax (Besson and Schön, 2001). Moreover, the perception of all these signals relies on predictive regularity representations (Winkler, 2007; Winkler et al., 2009). Therefore, it is conceivable that deviations in these materials would elicit similar ERP effects.

Early negativity was not found in the studies by Kotz and Paulmann (2007). Two reasons presumably account for this discrepancy. First, whereas the ERPs in previous studies were time-locked to the onset of the sentence, those in the present study were time-locked to the violation point, which may be more sensitive to the early component. Second, while the materials used in the previous studies mainly differed in pitch, the materials in the current study differed in pitch contour, intensity, and speech rate. Some studies found that the early negativity was absent if materials were only incongruent in pitch (Astesano et al., 2004; Magne et al., 2006). Nonetheless, the early negativity was always robust in studies with harmonic or rhythm incongruity (Brattico et al., 2006; Koelsch et al., 2007; Vuust et al., 2009).

5.2. Late positive ERP effects

The “neutral-to-angry” prosodies elicited significant positive deflection irrespective of task relevance, although the time interval was shorter under the task-irrelevant condition. This finding coincided with the PEP reported by Kotz and Paulmann (2007) and Paulmann and Kotz (2008b). As suggested, this positivity may reflect a prosodic transition process that extracts vocal emotion related parameters in order to assign significance during emotional speech comprehension (Kotz and Paulmann, 2007), probably because the large deviance of the “neutral-to-angry” prosodies would immediately increase the vigilance of participants. Contrary to “neutral-to-angry” prosodies, the “angry-to-neutral” prosodies with smaller magnitude of deviance only elicited a posteriorly distributed late positivity under the task-relevant condition. In the control experiment, the anterior-central dominant positivity was elicited by both “neutral-to-angry” prosodies and their corresponding spectrally rotated counterparts. Moreover, the amplitude of the positivity was larger for “neutral-to-angry” prosody as compared with the spectrally rotated version, even though the task was emotion irrelevant, implying that the emotionality reflects vigilance enhancement. Once again, these results proved that the “neutral-to-angry” deviation might automatically enhance the vigilance of the participants.

These results allow us to propose that the positivity evoked by EP expectancy violation is modulated by deviation pattern and task relevance. Similar to the previously reported P600 (Brattico et al., 2006) or P800 (Astesano et al., 2004) elicited by expectancy violation in music or linguistic prosody, the positive deflection elicited by prosodic expectancy violation may reflect re-analysis of the prosodic cues that violate the expected EP development in an attempt to integrate mismatching information. When hearing the prosody with a transition from angry to neutral, the brain detects the deviation but ignores it if no task related to the deviation is demanded. However, because of the large magnitude of deviance and the emotional significance of “neutral-to-angry” prosodies, the brain detects the deviation even earlier but is unable to ignore it; consequently an involuntary shift of attention happens and elicits the previously reported PEP (Kotz and Paulmann, 2007; Paulmann and Kotz, 2008b). However, under the task-relevant condition, a top-down process requires subjects to re-analyze and integrate the deviation, and thus the late positive component (LPC) was elicited by both patterns of deviations. This finding is compatible with many studies exploring music expectancy violation, which demonstrated that the expectancy violation caused by incongruence of pitch (Brattico et al., 2006; Schön et al., 2004), chord harmonics (Koelsch et al., 2000, 2007), or rhythm (Vuust et al., 2009) elicited an early mismatch negativity followed by a late positive component.

The current study did not find a robust impact of violation position on the deviation effect. However, prior studies indicated that the global trend of fundamental frequency (Paeschke, 2004) and intonation (Bänziger and Scherer, 2005; Juslin and Laukka, 2003) play important roles in emotional expressions. Moreover, Koelsch et al. (2000) found that deviation effects were smaller when elicited by deviants at the third position compared to those at the fifth position in a music piece. Thus, the violation position is likely to modulate the deviation effect. The null impact of violation position on the deviation effect is presumably due to the small number of trials for each condition. To clearly specify the effect of violation position, further study with a larger number of experimental trials or more participants is needed.


6. Conclusion

The present study demonstrated that the brain is able to rapidly detect deviation in sentential EP irrespective of attention allocation. Moreover, an EP deviation with large emotional significance, like “neutral-to-angry”, increases vigilance and is assigned significance during speech comprehension. If the deviation is task relevant, the brain re-analyzes and integrates the deviation with the context. During these processes, the emotionality of EP seems to speed up perception and heighten vigilance. These findings extend previous studies by demonstrating that emotional prosodic violation during sentential EP processing elicits several ERP components in addition to the PEP, suggesting that the brain can quickly detect the deviation and apply the significance of the deviance to higher order cognition.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (31070989). We would like to thank Jiajin Yuan and Weijun Li for very helpful comments on an earlier version of the manuscript.

References

Astesano, C., Besson, M., Alter, K., 2004. Brain potentials during semantic and prosodic processing in French. Cognitive Brain Research 18, 172–184.

Bänziger, T., Scherer, K.R., 2005. The role of intonation in emotional expressions. Speech Communication 46, 252–267.

Banse, R., Scherer, K., 1996. Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology 70, 614–636.

Besson, M., Schön, D., 2001. Comparison between language and music. Annals of the New York Academy of Sciences 930, 232–258.

Blesser, B., 1972. Speech perception under conditions of spectral transformation. I. Phonetic characteristics. Journal of Speech and Hearing Research 15, 5–41.

Boersma, P., Weenink, D., 2006. Praat: doing phonetics by computer [computer program] (Version 4.2). Retrieved July 2006, from http://www.praat.org/.

Bostanov, V., Kotchoubey, B., 2004. Recognition of affective prosody: continuous wavelet measures of event-related brain potentials to emotional exclamations. Psychophysiology 41, 259–268.

Brattico, E., Tervaniemi, M., Näätänen, R., Peretz, I., 2006. Musical scale properties are automatically processed in the human auditory cortex. Brain Research 1117, 162–174.

Brosch, T., Grandjean, D., Sander, D., Scherer, K.R., 2008. Behold the voice of wrath: cross-modal modulation of visual attention by anger prosody. Cognition 106, 1497–1503.

Brosch, T., Grandjean, D., Sander, D., Scherer, K.R., 2009. Cross-modal emotional attention: emotional voices modulate early stages of visual processing. Journal of Cognitive Neuroscience 21, 1670–1679.

Clynes, M., 1969. Dynamics of vertex evoked potentials: the RM brain function. In: Averaged Evoked Potentials: Methods, Results, and Evaluations, pp. 363–374.

Dien, J., Santuzzi, A., 2004. Application of repeated measures ANOVA to high-density ERP datasets: a review and tutorial. In: Handy (Ed.), Event-Related Potentials: A Methods Handbook. MIT Press, Cambridge, MA, pp. 57–82.

Fitzgerald, P., Picton, T., 1983. Event-related potentials recorded during the discrimination of improbable stimuli. Biological Psychology 17, 241–276.

Goydke, K.N., Altenmüller, E., Möller, J., Münte, T.F., 2004. Changes in emotional tone and instrumental timbre are reflected by the mismatch negativity. Cognitive Brain Research 21, 351–359.

Hahne, A., Friederici, A., 1999. Electrophysiological evidence for two steps in syntactic analysis: early automatic and late controlled processes. Journal of Cognitive Neuroscience 11, 194–205.

Ito, T.A., Larsen, J.T., Smith, N.K., Cacioppo, J.T., 1998. Negative information weighs more heavily on the brain: the negativity bias in evaluative categorizations. Journal of Personality and Social Psychology 75, 887–900.

Juslin, P., Laukka, P., 2003. Communication of emotions in vocal expression and music performance: different channels, same code? Psychological Bulletin 129, 770–814.

Koelsch, S., Gunter, T., Friederici, A., Schröger, E., 2000. Brain indices of music processing: nonmusicians are musical. Journal of Cognitive Neuroscience 12, 520–541.

Koelsch, S., Jentschke, S., Sammler, D., Mietchen, D., 2007. Untangling syntactic and sensory processing: an ERP study of music perception. Psychophysiology 44, 476–490.

Kotz, S.A., Paulmann, S., 2007. When emotional prosody and semantics dance cheek to cheek: ERP evidence. Brain Research 1151, 107–118.

Lawson, E.A., Gaillard, A.W.K., 1981. Evoked potentials to consonant-vowel syllables. Acta Psychologica 49, 17–25.

Magne, C., Schön, D., Besson, M., 2006. Musician children detect pitch violations in both music and language better than nonmusician children: behavioral and electrophysiological approaches. Journal of Cognitive Neuroscience 18, 199–211.

Näätänen, R., 1982. Processing negativity: an evoked-potential reflection of selective attention. Psychological Bulletin 92, 605–640.

Näätänen, R., Paavilainen, P., Rinne, T., Alho, K., 2007. The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clinical Neurophysiology 118, 2544–2590.

Näätänen, R., Picton, T., 1987. The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure. Psychophysiology 24, 375–425.

Paeschke, A., 2004. Global trend of fundamental frequency in emotional speech. In: Paper Presented at the International Conference, 2004 – ISCA.

Paulmann, S., Kotz, S.A., 2008a. Early emotional prosody perception based on different speaker voices. Neuroreport 19, 209.

Paulmann, S., Kotz, S.A., 2008b. An ERP investigation on the temporal dynamics of emotional prosody and emotional semantics in pseudo- and lexical-sentence context. Brain and Language 105, 59–69.

Ross, E.D., Edmondson, J.A., Seibert, G.B., 1986. The effect of affect on various acoustic measures of prosody in tone and non-tone languages: a comparison based on computer analysis of voice. Journal of Phonetics 14, 283–302.

Sauter, D.A., Eimer, M., 2010. Rapid detection of emotion from human vocalizations. Journal of Cognitive Neuroscience 22, 474–481.

Schön, D., Magne, C., Besson, M., 2004. The music of speech: music training facilitates pitch processing in both music and language. Psychophysiology 41, 341–349.

Scherer, K.R., 2003. Vocal communication of emotion: a review of research paradigms. Speech Communication 40 (1–2), 227–256.

Schirmer, A., Kotz, S.A., 2006. Beyond the right hemisphere: brain mechanisms mediating vocal emotional processing. Trends in Cognitive Sciences 10, 24–30.

Schirmer, A., Striano, T., Friederici, A., 2005. Sex differences in the preattentive processing of vocal emotional expressions. Neuroreport 16, 635–639.

Schröger, E., 1998. Measurement and interpretation of the mismatch negativity. Behavior Research Methods, Instruments, & Computers 30, 131–145.

Steinhauer, K., Alter, K., Friederici, A., 1999. Brain potentials indicate immediate use of prosodic cues in natural speech processing. Nature Neuroscience 2, 191–196.

Thönnessen, H., Boers, F., Dammers, J., Chen, Y.H., Norra, C., Mathiak, K., 2010. Early sensory encoding of affective prosody: neuromagnetic tomography of emotional category changes. NeuroImage 50, 250–259.

Thierry, G., Roberts, M., 2007. Event-related potential study of attention capture by affective sounds. Neuroreport 18, 245.

Vuust, P., Ostergaard, L., Pallesen, K.J., Bailey, C., Roepstorff, A., 2009. Predictive coding of music – brain responses to rhythmic incongruity. Cortex 45, 80–92.

Warren, J., Sauter, D., Eisner, F., Wiland, J., Dresner, M., Wise, R., et al., 2006. Positive emotions preferentially engage an auditory-motor “mirror” system. Journal of Neuroscience 26, 13067–13075.

Winkler, I., 2007. Interpreting the mismatch negativity. Journal of Psychophysiology 21, 147.

Winkler, I., Denham, S., Nelken, I., 2009. Modeling the auditory scene: predictive regularity representations and perceptual objects. Trends in Cognitive Sciences 13, 532–540.

Woods, D., 1995. The component structure of the N1 wave of the human auditory evoked potential. Electroencephalography and Clinical Neurophysiology, 102–109 (Supplements only).

Yuan, J., Zhang, Q., Chen, A., Li, H., Wang, Q., Zhuang, Z., et al., 2007. Are we sensitive to valence differences in emotionally negative stimuli? Electrophysiological evidence from an ERP study. Neuropsychologia 45, 2764–2771.