
The Power of Music Valence: Cross-modal transference of valence from the auditory to the visual domain

By Natalia Drosogianni

Abstract

The current study examines cross-modal transference of valence from the auditory to the visual domain, using a simultaneous presentation task with 40 participants. Of particular interest is the attempt to bring the aspect of time into this relationship, as well as alexithymia as a potential mediating factor. It is assumed that while emotions are induced at the early stage of the task, at the late stage these emotions have accumulated to create mood (that is, aggregated emotions), so that the effects of crossmodal transfer will be greater. Results, however, were non-significant for all factors, so the null hypotheses were retained.

Introduction

Since ancient times, music has been used as an effective communicator of emotions. To the ears of the listener, music can induce a range of emotions and moods, such as anger, fear, joy, and relaxation. Only in recent years, however, have researchers started to systematically investigate this special property of music to influence emotions, and an even more recent tendency has been to investigate music in a multisensory context, which is closer to real-life emotional processing, where stimuli are not presented in isolation. In such settings, emotions regarding a current stimulus may not be affected solely by its contents; rather, they can be modulated by another affective stimulus, described as a prime stimulus. This process is known as affective priming (Marin, Gingras, and Bhattacharya, 2011).

Research in the field makes a distinction between musical emotion and musical mood, a distinction that is not so clear. Opinions among researchers in the field vary greatly, but typically mood refers to “affective states that feature a lower felt intensity than emotions, that do not have a clear object, and that last much longer than emotions”, whereas emotions are defined as “relatively intense affective responses that usually involve a number of sub-components (subjective feeling, physiological arousal, expression, action tendency, and regulation) which last minutes to a few hours” (Marin and Bhattacharya, 2009; Juslin and Västfjäll, 2008, p. 561). Qualitatively, in studies involving longer presentation of the priming stimuli, it could be said that what is under examination is mood induction rather than emotion induction.

The study of multisensory integration examines how two different stimuli from separate sensory modalities are bound into one single percept. Typically, in the research paradigm used in the field, an auditory stimulus serves as a prime, and the visual image presented shortly after the prime is called a target. What is measured in such studies is typically an evaluation of the target after participants have been subjected to different primes. A growing number of studies show that cross-modal emotional transfer needs to be treated as a multidimensional phenomenon, a tendency that follows the trend towards a more naturalistic approach to information processing.

Crossmodal transfer, as defined by Marin, Gingras, and Bhattacharya, refers to “the measurable effects that one emotional stimulus has on the processing of an emotional stimulus coming from a different sensory domain”. Over the years many methods have been used to study the transfer of perceived or felt emotions, from self-report measures to the paradigm whereby two stimuli presented consecutively have to be congruent or incongruent with respect to at least one variable under investigation.

Studies on film music are a characteristic example of mood induction, since they normally involve longer stimuli. Studies following a categorical approach to emotion and examining the simultaneous presentation of musical and visual stimuli have shown clear cross-modal emotional influences in terms of valence. However, these effects do not seem to hold for all emotion categories.

In 2006, Baumgartner, Esslen, and Jäncke combined a set of complex affective pictures from the IAPS that were classified as fearful, sad, or happy with a set of 70-second classical music excerpts from the same emotional categories, and also presented the stimuli in isolation. The researchers concluded that participants reported increased levels of involvement in the combined conditions. Physiological data showed that in the emotionally congruent combined condition, emotions were more effectively induced than in the separate music and separate picture conditions. Furthermore, participants reported increased emotional involvement in the combined and music conditions compared to the picture condition. The findings, however, do not hold for the emotion of fear, as the induced quality of fear and valence in the combined condition did not vary significantly from the picture condition.

Similarly, Kutas, Urbach, Altenmüller, and Münte (2006) presented participants with stimuli from the visual and auditory modalities. Participants were asked to ignore one of the two modalities when rating the perceived emotion expressed by the other modality. This time, the auditory stimuli were emotional voices (sung notes) belonging to the sad, happy, or neutral emotional category, and the visual stimuli were IAPS pictures. Findings indicated that valence was more strongly perceived only for the congruent pairings involving happy and neutral stimuli, and only in the condition where participants had to ignore the pictures and attend to the sounds. This result did not hold for the condition where listeners had to attend to the pictures, or for the sad pairings.

Perceptual effects in cross-modal presentation have been demonstrated numerous times in psychophysical research. For instance, Spelke (1979) presented children with a film in two conditions: in the first, the film was accompanied by an appropriate soundtrack, while in the second it was accompanied by an inappropriate one. Results showed that children attended significantly more to the film with the appropriate soundtrack. This effect has been explained using various theories. What is of essence in all these paradigms is the sensitivity to invariance, which is core to theories such as Gestalt theory, information theory, and information pickup. Congruence between the internal structure of music and visual stimuli alters the attentional strategy employed by the receiver, and consequently the subsequent encoding of information from the film. It follows that associations and connotations from the music bias the listener, setting the context in which visual stimuli are later interpreted (Marshall and Cohen, 1998). The form, affect, and connotative nature of the structure of information contained in music affect the interpretation of the music, and therefore of any other stimuli primed by it. Marshall and Cohen (1998) argue that the manner in which this interaction arises can be categorized into three distinct processes: the generation of the meaning of music from the information it carries, the selective attention prompted by congruencies between the music and the film, and finally the association between the meaning of the auditory stimuli and the attended film items.

“Music is a sequential stimulus for which meaning is achieved over time” (Marshall and Cohen, 1998), and “the assignment of accent to events will affect retention, processing, and interpretation”. This feature of music has led to the logical assumption that, as time goes by, the effects of the structure of a piece accumulate, creating stronger congruency effects.

Research on the multisensory integration of musical stimuli presented simultaneously with static visual stimuli in cross-modal emotional transfer has mainly focused on the role of valence, whereas arousal has been largely neglected (Fazio et al., 1986). This could be partly attributed to the common use of verbal stimuli in earlier affective priming studies as discussed above, which, as would be expected, vary more in valence than in intensity. However, a recent trend in the field involves studying how film, as an art form comprising both musical and visual stimuli, can have the arousal associated with music modulate the level of emotion expressed by visual materials, as well as the meaning of these visual materials (Cohen, 1998). It is widely thought that film music can affect the emotional experience of the film.

Following evidence from cognitive studies that the relationship between film and music is additive, Ellis and Simmons (2005) examined how music affects the perception of film in terms of emotion self-report and physiological reactions. The experimenters used eight 270-s tracks of instrumental art music, varying in valence and arousal, overlaid onto eight films. They concluded that even though self-reports show a direct and clear-cut additive relationship between music and film, the physiological data indicate a more complex one.

Expanding on the connotative and structural attributes of music and the relations among tones, Cohen conducted an experiment involving music and film that brought these aspects together. Music, as well as visual stimuli, can generate impressions such as aggression or warmth through specific aspects of its structure that bring connotative attributes to the mind of the receiver (Marshall and Cohen, 1998; Levi, 1982; Meyer, 1956). The film selected for this study was a 2-minute abstract animation by Heider and Simmel (1944) depicting movements of geometric shapes, which was interpreted as aggressive by almost all viewers prior to the main experiment. The researchers tested the hypothesis that two musical excerpts that differ on a particular dimension will influence judgments of the meaning of the film on that specific dimension. The dimensions under examination were evaluation (good/bad), potency (weak/strong), and activity (active/passive). The pattern of results indicated that the meaning of the music on the potency and activity dimensions was directly associated with the meaning of the film. An inverse relation, however, was observed for the evaluative dimension, which the experimenters attributed to the complexity of the interaction of film and music on that dimension, and the type of cognitive congruency that affects this relation (Marshall and Cohen, 1998).

Typically, individual differences are not considered in research on crossmodal transfer; in recent years, however, there has been expanding interest in the trait of alexithymia, which relates directly to emotional perception. Studying such a trait could prove very interesting for the field of crossmodal emotional transfer, since this emotional deficit could help explain the ways in which the transfer occurs. Alexithymia was originally described in clinical settings, in disorders characterized by poor social functioning. The personality construct represents a deficit in cognitive processing, one that has been especially linked with somatisation. It is characterized by difficulties in identifying, differentiating, and describing feelings, as well as by an important component of externally oriented thinking. A 2008 study by Karlsson, Näätänen, and Stenman used films of positive, negative, and neutral valence for mood induction, while emotional activation was measured in 10 healthy women with alexithymia and 11 healthy women without alexithymia. Results demonstrated clear differences in the mode of emotion processing between the two groups. Women with alexithymia showed a tendency to over-activate their “bodily” brain regions, and also reported self-observed differences in experiencing these emotions, such as difficulty identifying them.

Overview of the present experiment

The central aim of the current experiment was to examine whether and how pleasantness induced by music can be transferred to the visual domain, thereby affecting ratings of the pleasantness of emotional faces. The present experiment sought to explore the emotional response to the visual stimuli by collecting ratings of felt pleasantness, while also exploring possible effects of alexithymia. Essential to the research was the selection of musical pieces that varied significantly in terms of valence, or in other words pleasantness. The visual stimuli were a set of facial expressions that also varied in pleasantness. In light of the current literature on cross-modal emotional priming and the limited research on multisensory integration, it was hypothesized that musical primes would strongly affect participants’ subjective ratings of the facial expressions in a manner congruent with the valence of the musical piece. Another important aim of the present study, however, was to bring the aspect of time into this relationship. It is assumed that at the early stage of the task emotions have been induced, but that at the late stage these emotions have accumulated to create mood, that is, aggregated emotions. No clear-cut definition of emotion and mood has been offered by past research in the field of cross-modal study, but in this particular study mood and emotion are not treated as different variables; rather, emotion is what is studied, with the difference that at the late stage emotion is seen as having accumulated after 2.5 minutes of listening to the musical excerpt, becoming what is most widely known as “mood”.

While examining how valence from the auditory domain differentially affects perceived valence in the visual domain, and how time interacts with this relationship, a third variable was thought to be of particular interest in this cross-examination: alexithymia. Since this trait is so highly correlated with emotion, it is only logical to assume that it must at least partly mediate the relationship between the auditory and the visual domain. The hypotheses the experimenter set for this study can be grouped into three categories. Firstly, the valence of the musical piece will significantly affect participants’ subsequent ratings of the facial expressions. This congruency effect is expected to be very strong, and the effect of the musical priming is expected to be observed most clearly in the neutral emotional valence condition, where participants are expected to be more likely to evaluate the ambiguous face according to the valence of the musical excerpt they are given. The second hypothesis is that this congruency effect will become stronger in the later stages of presentation. Simply put, it is assumed that, with time, the effects of positive valence will be aggregated for the high-valence facial expression condition, in the same way that negative valence effects will be aggregated for the low-valence condition. Finally, the trait of alexithymia, as observed in non-clinical populations, is hypothesized to significantly affect the interaction between valence from the auditory domain and ratings of the facial expressions, so that participants scoring high on the alexithymia facets will be less affected by the valence of the musical piece, and participants showing low alexithymia traits will be affected by the music to a greater extent.

Method

Design

The experiment had a 2 x 3 x 2 within-participants design with type of musical valence (positive vs. negative), type of emotion of the facial expression (happy vs. sad vs. neutral), and time stage (early stage: 0 to 2.5 minutes vs. late stage: 2.5 to 5 minutes) as factors. The dependent variable (DV) was participants’ ratings of the valence of the emotion of the facial expression on a scale from 1 = extremely sad to 7 = extremely happy. Furthermore, participants’ scores on a self-reported alexithymia scale (TAS-20) were used as a between-subjects factor.
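To make the factorial structure concrete, the sketch below lays out the 2 x 3 x 2 within-subjects cells plus the between-subjects TAS-20 grouping as a long-format table, the shape typically required by repeated-measures analyses. It is a minimal illustration only: the column names, the ordering of the group assignment, and the idea of storing one mean rating per cell are assumptions rather than the actual data file used in the study.

```python
import itertools
import pandas as pd

# Factor levels of the 2 x 3 x 2 within-participants design
music_levels = ["positive", "negative"]
face_levels = ["happy", "sad", "neutral"]
stage_levels = ["early", "late"]

rows = []
for participant in range(1, 41):                       # 40 participants
    # Illustrative between-subjects grouping (the study had 15 alexithymia, 25 non-alexithymia)
    tas_group = "alexithymia" if participant <= 15 else "non-alexithymia"
    for music, face, stage in itertools.product(music_levels, face_levels, stage_levels):
        rows.append({
            "participant": participant,
            "tas_group": tas_group,
            "music": music,
            "face": face,
            "stage": stage,
            "rating": None,    # mean 1-7 valence rating for this cell would go here
        })

design = pd.DataFrame(rows)
print(design.shape)   # (480, 6): 40 participants x 12 within-subject cells
```

Each participant thus contributes 12 cell means (2 x 3 x 2), which is the unit of analysis for the repeated-measures ANOVA reported in the Results.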

Subjects

A total of forty university students (27.5% male, 72.5% female) aged between 19 and 34 (mean age: 21.83, SD = 3.57) participated in this study. The majority of the participants were first-year Goldsmiths undergraduate students. They were recruited through the Psychology Department’s Research Participation Scheme and took part in exchange for course credits. Participants were exposed to all levels of the three independent variables, while for the musical valence variable the order was reversed for half of the participants for counterbalancing purposes. All participants had been living in the UK for at least the past 3 years and were enculturated in Western tonal music. The sample reported normal hearing and no neurological or psychiatric disorder. All participants gave their informed consent to participate. Ethical approval from the ethics committee of Goldsmiths, University of London was received prior to commencing this research.

Measures and Materials

The task was programmed in MATLAB version 7.10.0 software

(Natick, Massachusetts: The MathWorks Inc., 2010) and was

displayed on a computer.

The task consisted of two musical pieces of 5 minutes’ length each, and a set of NimStim facial expressions (The Research Network on Early Experience and Brain Development).

a) Musical Pieces:

The two musical pieces were chosen on the basis of the dimensions of arousal and valence. Both stimuli were carefully selected so that they would be high in arousal and would vary only in valence.

For the positive-valence mood induction, Bach’s Brandenburg Concerto No. 3 performed by Hubert Laws was used, while for the negative-valence mood induction, Prokofiev’s Alexander Nevsky: Russia under the Mongolian Yoke was administered. Both musical pieces have been validated in previous mood research as significantly increasing and decreasing positive affect, respectively, from an initial slightly positive mood (Rowe, Hirsh, and Anderson, 2007).

b) NimStim set of facial stimuli

A set of face stimuli called the NimStim Set of Facial Expressions was used for the second independent variable. An advantage of the set is that it contains a large multiracial sample of photographs of actors who were asked to perform naturally occurring facial expressions. The facial expressions used from this set for the purposes of the current experiment were those depicting happy, neutral, and sad expressions (see Figure 1). The set was tested on a sample of untrained individuals in a 2009 study by Tottenham et al. and showed high validity and reliability, as well as high intra-participant agreement across two testing sessions (Tottenham et al., 2009).

Figure 1. Examples of a) sad and b) happy facial expressions from the NimStim Face Stimulus Set. Courtesy of The Research Network on Early Experience and Brain Development.

c) Alexithymia Measure

The Toronto Alexithymia Scale (TAS-20) was administered to participants after the completion of the task to estimate each participant’s level of alexithymia (Appendix II). The scale is the most widely used instrument for assessing alexithymia in both research and clinical practice. The TAS-20 comprises three factors, namely externally oriented thinking, difficulty identifying feelings, and difficulty describing feelings. Although it could be argued that self-report is problematic for a construct that involves impairments in self-awareness, the scale has been assessed on a sample of 1,933 individuals, showing replicability of the factor structure as well as internal reliability (Parker, Taylor, and Bagby, 2003).
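As an illustration of how TAS-20 responses might be turned into the between-subjects grouping used later in the analysis, the sketch below computes a total score and splits participants at a cutoff. The item keying (reverse-keyed items 4, 5, 10, 18, and 19) and the cutoff of 61 follow the commonly cited scoring guidelines rather than anything stated in this paper, and the data layout is assumed, so this should be read purely as a sketch.

```python
# Minimal TAS-20 scoring sketch (assumed data layout; keying and cutoff per common guidelines).
REVERSE_KEYED = {4, 5, 10, 18, 19}   # items scored in the reverse direction
ALEXITHYMIA_CUTOFF = 61              # conventional total-score threshold

def tas20_total(responses):
    """responses: dict mapping item number (1-20) to a 1-5 Likert answer."""
    total = 0
    for item in range(1, 21):
        answer = responses[item]
        total += (6 - answer) if item in REVERSE_KEYED else answer
    return total

def tas20_group(responses):
    """Return the between-subjects label used for the mixed ANOVA."""
    return "alexithymia" if tas20_total(responses) >= ALEXITHYMIA_CUTOFF else "non-alexithymia"

# Example: answering 3 to every item gives a total of 60, just below the cutoff.
print(tas20_group({item: 3 for item in range(1, 21)}))   # non-alexithymia
```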

Procedure

All participants were tested individually in a quiet cubicle with no external disturbances. They were first asked to carefully read and sign the consent form (Appendix I). After that, they were seated approximately 40 cm in front of the computer screen, which was set at eye level. During the task, participants were asked to listen to the two excerpts of music, each lasting 5 minutes. The experiment was divided into two tasks, which the participants were asked to complete. In the first task, participants listened to the high-arousal auditory stimulus; during this time they were presented with 100 randomized facial stimuli from the NimStim set, each displayed for 1 second. After this, participants were given 1 second in which to provide a rating of the valence of the emotion presented, on a Likert-type scale from 1 to 7 (1 being extremely sad, 4 being neutral, and 7 being extremely happy). Their response was given simply by pressing, on the keyboard, the number they thought the emotion represented. After this process was repeated for both pieces differing in valence, participants were played the same musical excerpts again; however, this time, at the points where the visual stimuli had appeared in the previous task, a fixation cross appeared in the centre of the screen, and participants were asked to rate the perceived level of valence at that specific moment in time. The order in which participants listened to each piece in both tasks was counterbalanced so as to avoid order effects. In total, the four task runs lasted approximately 25 minutes. After the two main tasks were finished, participants were given a questionnaire containing demographics, questions on the perceived pleasantness of the two pieces, and a short standardized alexithymia questionnaire (Appendix II). When participants finished completing the questionnaire, they were fully debriefed about the aim and nature of the experiment.
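The original task was programmed in MATLAB and its code is not reproduced here; the sketch below is a Python reconstruction of the trial bookkeeping implied by the procedure: a counterbalanced music order and 100 randomized face presentations per excerpt, each followed by a rating window, with the early/late stage split at 2.5 minutes. The even spacing of trials across the 5-minute excerpt and all names are assumptions made for illustration.

```python
import random

N_FACES_PER_PIECE = 100      # randomized face presentations per musical excerpt
EXCERPT_DURATION_S = 300.0   # each musical piece lasts 5 minutes
STAGE_SPLIT_S = 150.0        # early stage: 0-2.5 min, late stage: 2.5-5 min

def build_trial_schedule(participant_id, face_pool):
    """Return the counterbalanced music order and a randomized trial list
    for one participant. face_pool is a list of dicts with 'file' and
    'category' keys and must contain at least 100 entries (assumption)."""
    # Counterbalancing: half of the participants hear the positive piece first.
    music_order = (["positive", "negative"] if participant_id % 2 == 0
                   else ["negative", "positive"])
    spacing = EXCERPT_DURATION_S / N_FACES_PER_PIECE   # assumed even spacing of trials
    trials = []
    for music in music_order:
        for idx, face in enumerate(random.sample(face_pool, N_FACES_PER_PIECE)):
            onset = idx * spacing
            trials.append({
                "music": music,
                "face_file": face["file"],
                "face_category": face["category"],
                "onset_s": onset,          # 1 s face display, then a 1 s rating window
                "stage": "early" if onset < STAGE_SPLIT_S else "late",
            })
    return music_order, trials
```

The second (fixation-cross) task would reuse the same onsets, with the face presentation replaced by a cross and a rating of the music itself collected instead.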

Results

To evaluate the emotional spaces, the average and range of all ratings for the happy, sad, and neutral emotional expressions, as well as the average and range for both musical pieces, were calculated.

Table 1. Descriptive statistics for the perceived valence of each emotional category of faces.

Face category   Minimum   Maximum   Mean     Std. Deviation
Happy Face      4.47      6.82      5.8312   0.43794
Neutral Face    2.03      4.22      3.6294   0.40961
Sad Face        1.03      3.82      2.2343   0.61114
Valid N (listwise) = 40

Table 2. Descriptive statistics for the perceived valence of each auditory stimulus.

Stimulus         N    Minimum   Maximum   Mean     Std. Deviation
Positive Music   40   3.00      7.00      5.7250   0.87669
Negative Music   40   1.00      5.00      2.5500   1.25983
Valid N (listwise) = 40

Perceived emotional ratings were assessed using a factorial repeated-measures analysis of variance (ANOVA) with valence of musical emotion (two levels: positive and negative), valence of facial emotion (three levels: happy, sad, neutral), and the time stage at which the rating took place (early stage, late stage) as within-subjects factors, and alexithymia (alexithymia or non-alexithymia) as a between-subjects factor.
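A sketch of how the within-subjects part of this analysis could be run on long-format cell means is given below, using statsmodels’ AnovaRM. This is not the original analysis script: the file and column names are assumptions, the between-subjects alexithymia factor is not handled by AnovaRM and would need a separate mixed-ANOVA or mixed-model routine, and the sphericity checks reported below are likewise not produced by this call.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format file: one row per participant x music x face x stage trial.
ratings = pd.read_csv("ratings_long.csv")

# Average the trial-level ratings into one value per within-subject cell.
cell_means = (ratings
              .groupby(["participant", "music", "face", "stage"], as_index=False)["rating"]
              .mean())

# Three-way repeated-measures ANOVA on the within-subjects factors only.
res = AnovaRM(data=cell_means, depvar="rating", subject="participant",
              within=["music", "face", "stage"]).fit()
print(res)
```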

A) The main interaction between music valence and the

subjective ratings of the valence of presented facial

expressions

Table 3. The mean ratings and standard deviations of the valence of emotional faces (on a scale from 1 = extremely sad to 7 = extremely happy), with respect to the valence of the musical piece.

Music Valence / Expression of Emotional Face   Mean    Std. Deviation   N
Positive music, happy face                     5.867   0.427            40
Positive music, sad face                       2.262   0.622            40
Positive music, neutral face                   3.629   0.412            40
Negative music, happy face                     5.796   0.460            40
Negative music, sad face                       2.207   0.606            40
Negative music, neutral face                   3.630   0.413            40

Figure 1. The mean ratings of the valence of facial expressions (on a scale from 1 = extremely sad to 7 = extremely happy), with respect to the valence of the musical piece. [Bar chart: x-axis = expression of emotional face (happy, sad, neutral); y-axis = ratings of positive valence (0-7); series = positive music vs. negative music.]

The means and standard deviations are presented in Table 3. Mauchly’s test indicated that the assumption of sphericity had been violated for the main effect of the valence of emotional face factor, χ²(2) = 10.125, p = .006, and for the interaction between the valence of the emotional face factor and the music valence factor, χ²(2) = 6.262, p = .044. Degrees of freedom were therefore corrected using Greenhouse-Geisser estimates of sphericity (ε = .81 for the main effect of emotional face and ε = .87 for the interaction between emotional faces and musical valence).
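As a worked illustration of what this correction does, assuming the standard adjustment in which both degrees of freedom are multiplied by the estimated ε:

\[
\mathrm{df}_{\text{GG}} = \varepsilon \times \mathrm{df}: \qquad
0.81 \times (2,\ 78) = (1.62,\ 63.18), \qquad
0.87 \times (2,\ 78) = (1.74,\ 67.86)
\]

so the sphericity-corrected tests would be evaluated against F(1.62, 63.18) for the main effect of emotional face and F(1.74, 67.86) for its interaction with musical valence, rather than F(2, 78).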

The two-factor analysis of variance showed a significant main effect of the type of expression of the emotional face on the valence ratings of the facial expression, F(2,78) = 477.73, p < .001, but no significant main effect of the music valence factor, F(1,39) = .700, p > .05. The interaction between the valence of the emotional face factor and the music valence factor was not significant either, F(2,78) = .750, p > .05.

B) The effect of time on the interaction between music valence and the subjective ratings of the valence of presented emotional faces

Table 4. The means and standard errors for the interaction of music valence with the rating of facial expressions (on a scale from 1 = extremely sad to 7 = extremely happy), with respect to the time stage (early time stage = 0-2.5 minutes, late time stage = 2.5-5 minutes).

Music Valence   Expression of Emotional Face   Time Stage   Mean   Std. Error
Positive        Happy                          Early        5.84   0.079
                                               Late         5.89   0.066
                Sad                            Early        2.23   0.116
                                               Late         2.32   0.096
                Neutral                        Early        3.62   0.070
                                               Late         3.65   0.067
Negative        Happy                          Early        5.81   0.075
                                               Late         5.80   0.075
                Sad                            Early        2.18   0.107
                                               Late         2.25   0.092
                Neutral                        Early        3.65   0.069
                                               Late         3.61   0.069

Figure 2. The mean ratings of valence of facial expressions (on a scale from 1 = extremely sad to 7 = extremely happy) with respect to the time stage (early time stage = 0-2.5 minutes, late time stage = 2.5-5 minutes) for the musical pieces of positive and negative valence. [Bar chart: x-axis = expression of emotional face (happy, sad, neutral) under positive and negative music; y-axis = ratings of positive valence (0-7); series = early vs. late stage.]

The means and standard errors are presented in Table 4. The three-factor (2 x 3 x 2) analysis of variance showed no significant interaction of time with the valence of the music and the rating of the emotional faces, F(2,78) = .084, p > .05.

C) The effect of alexithymia on music valence and the

subjective ratings of the valence of presented facial

expressions

Table 5. The means and standard deviations of the effect of musical valence on ratings of the facial expressions, with respect to the alexithymic state of the participant.

                                 Non-alexithymia (N = 25)     Alexithymia (N = 15)
Music / Face                     Mean    Std. Deviation       Mean    Std. Deviation
Positive music, happy face       5.93    0.36                 5.76    0.52
Positive music, sad face         2.21    0.57                 2.34    0.71
Positive music, neutral face     3.68    0.39                 3.55    0.44
Negative music, happy face       5.88    0.43                 5.67    0.47
Negative music, sad face         2.10    0.58                 2.37    0.64
Negative music, neutral face     3.63    0.43                 3.63    0.40

Figure 3. The mean effect of musical valence on ratings of the facial expressions with respect to the alexithymic state of the participant. [Bar chart: x-axis = expression of emotional face (happy, sad, neutral) under positive and negative music; y-axis = ratings of positive valence (0-7); series = alexithymia vs. non-alexithymia.]

Although it is clear that there are some small effects, the mixed ANOVA reveals that alexithymia is not a strong factor affecting the relationship between the valence of the musical piece and the rating of the emotional faces. The valence of the musical piece x valence of the facial expression x alexithymia interaction was non-significant, F(2,76) = 0.43.

Discussion

The current study provides evidence of a dissociation in the emotional transfer of valence from the auditory to the visual domain. The results obtained contrast with past research in the field using similar paradigms, the majority of which has found strong transfer of valence from the auditory to the visual domain (Logeswaran and Bhattacharya, 2009; Ellis and Simmons, 2005). In their 2009 study, Logeswaran and Bhattacharya used expressive faces as visual targets and found a significant effect of musical priming on the perceived happiness of the face presented. Electrophysiological data also showed an event-related brain potential component at a very early stage of neuronal information processing. In a study by Britton et al. (2006), participants viewing expressive faces showed increased activation of the superior temporal gyrus, insula, and anterior cingulate, all of which are areas of the brain associated with emotional responses to music (Marin et al., 2011; Britton et al., 2006). The researchers also argue that, as a social phenomenon, music can strongly affect the perception of emotional expressions, which are inherently more social than complex affective pictures. Therefore, in the current experiment the choice of facial emotions as visual stimuli, instead of, for example, complex affective pictures, is justified and presumably cannot account for the non-significance of the results.

The results are in line with Chen et al. (2008) and Marin et al. (2011), who both found a dissociation in the transfer of valence from the visual to the auditory domain. In the research conducted by the latter, the authors argue that this dissociation arises from the difference in the variance of the prime and target stimuli. A similar procedure of pre-evaluation of the emotional spaces was carried out for the present experiment as well, indicating a very high variance in the auditory stimuli compared to a relatively small variance in the visual stimuli (Tables 1 and 2), for which mean evaluations were very close, especially for the sad and neutral facial expressions.

This could possibly be attributed to the face set. In the testing sessions, participants would often comment at the end of the task that the model in the picture seemed to be exaggerating, especially for the sad face, creating a somewhat “grotesque” depiction of the facial expression that the participant did not perceive as actually sad. Although a slight pulling down of the lip corners is present, other characteristics relating to sadness, such as a lost focus of the eyes and drooping upper eyelids, are not. It has been argued that actors exaggerate aspects of naturally occurring expressions, which can lead to confusion in the viewer (Tottenham et al., 2009). Perhaps, since the emotional aspects of the musical pieces reflect differences in valence while high arousal levels remain equal, the effect of crossmodal transfer could be studied for facial emotions matched on both dimensions, that is, happy (perhaps with the open-mouth detail) or exuberant, and angry, which reflects an emotion of high arousal and low valence, in contrast to sadness, which reflects an emotion of low valence but also low arousal. It is possible that this comparison would yield a significant effect for the interaction between musical valence and the emotion of the facial expression.

One of the main aims of this experiment was to address the effect of time. It was hypothesized that at the later time stage emotions would have accumulated, creating an even larger congruency effect, so that happy emotional expressions would be perceived as more happy and sad expressions as more sad in the emotionally congruent situations, while neutral faces were expected to follow the emotional valence of the piece. As can be seen in Figure 2, even though there is a small effect of elevated congruent emotions for both the neutral and happy emotional categories, this effect is non-significant. What is of particular interest, however, is that from time stage 1 to time stage 2 the ratings of the sad faces under the negative-valence music show a minor increase instead of the minor decrease that would be expected. Even though, again, this corresponds to a non-significant observation, the effect could be explained in terms of a negative emotion bias, that is, that negative, unpleasant emotions induce higher arousal levels (Marin, Gingras, and Bhattacharya, 2011).

Alexithymia was also hypothesized by the experimenter to mediate the crossmodal transfer of valence from the auditory to the visual domain. As the mixed ANOVA revealed, however, the effect of this personality construct is also non-significant. A close look at Figure 3 reveals small differences between the alexithymia and non-alexithymia conditions; however, the lack of any significant effect can be attributed to the underrepresentation of individuals scoring high in alexithymia.

The overall non-significant results can also be attributed to the static nature of the visual stimuli. Using a film as the visual stimulus could arguably create a more prolonged emotional response, since it would simulate real-life emotional situations better than static stimuli.

Another limitation is the overrepresentation of women and underrepresentation of men. Women have been shown to display an increased negativity bias, as well as other characteristics that may have affected the results of the experiment. A more balanced representation of the two sexes would have been more reliable and would probably have yielded interesting results for comparing the two.

The current study explored the emotional spaces of primes and targets on the same rating scales, providing a clearer interpretation of the results. Since the emotional spaces have been established, future research on crossmodal emotional transfer between the auditory and the visual domain should focus on replicating the procedure with various other types of visual and auditory stimuli varying in emotional valence and semantic content.

It is also suggested that larger samples could be of vital importance, as they could accentuate the effect of congruency and reveal a significant interaction between the type of musical valence, the perception of facial emotion, time, and alexithymia.

Following the research paradigms of Ellis and Simmons (2005) and Logeswaran and Bhattacharya (2009), who focused not only on behavioural data but also on physiological measures as objective indicators of emotion, as well as the recent Karlsson, Näätänen, and Stenman (2008) experiment, which examined cortical activation in response to emotional stimuli in alexithymia, it is clear that research should be directed towards physiological data, which provide a richer and more faithful account of the effects of cross-modal transfer of valence from the auditory to the visual domain.

One of the dominant research questions that would also be of great interest to answer is how a lifetime of intensive musical training changes the brains of musicians compared to non-musicians. Imaging studies have found that musicians have increased gray matter in auditory regions and increased connectivity between the two hemispheres via the corpus callosum (Ellis, 2011). Electrophysiological studies have revealed that musicians show enhanced cortical representation of musical stimuli, speech stimuli, and emotional vocalizations (Ellis, 2011). These results suggest that a lifetime of musical training does not just selectively enhance sensitivity to music itself, but has facilitatory transfer effects on broader cognitive processes such as attention, language processing, and memory. For these reasons, it is argued that invaluable insight would be provided by comparing the ratings of musicians and non-musicians in the experimental paradigm used in the present study. This is something future research should focus upon.

Finally, future research could focus on other dimensions of emotion, such as arousal, or include other personality measures that could be shown to relate to the interaction of valence between the auditory and the visual domain.

Conclusions

There is reasonable evidence to suggest that there are significant overlaps between musical emotions and visual emotions. The present study aimed to examine whether, beyond this overlap, musically induced emotions could affect the interpretation of visually induced ones. Despite past research reporting a significant transfer of valence from the auditory to the visual domain, the present experiment did not yield any significant results regarding this interaction. The further null hypotheses were also retained: the effect of time on this interaction was found to be non-significant, as was the effect of alexithymia.