Evidence for effects of task difficulty but not learning on neurophysiological variables associated...


International Journal of Psychophysiology 93 (2014) 242–252

Contents lists available at ScienceDirect

International Journal of Psychophysiology

journal homepage: www.elsevier.com/locate/ijpsycho

Evidence for effects of task difficulty but not learning on neurophysiological variables associated with effort

Anne-Marie Brouwer ⁎, Maarten A. Hogervorst, Michael Holewijn, Jan B.F. van Erp

TNO, P.O. Box 23, 3769 ZG Soesterberg, The Netherlands

⁎ Corresponding author. Tel.: +31 888665960.

E-mail addresses: [email protected] (A.-M. Brouwer), [email protected] (M.A. Hogervorst), [email protected] (M. Holewijn), [email protected] (J.B.F. van Erp).

http://dx.doi.org/10.1016/j.ijpsycho.2014.05.004

0167-8760/© 2014 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 23 December 2013
Received in revised form 8 May 2014
Accepted 10 May 2014
Available online 16 May 2014

Keywords: learning, effort, workload, physiology, EEG, eye

Abstract

Learning to master a task is expected to be accompanied by a decrease in effort during task execution. We examine the possibility of monitoring learning using physiological measures that have been reported to reflect effort or workload. Thirty-five participants performed different difficulty levels of the n-back task while a range of physiological and performance measurements were recorded. In order to dissociate non-specific time-related effects from effects of learning, we used the easiest level as a baseline condition. This condition is expected to reflect only non-specific effects of time. Performance and subjective measures confirmed more learning for the difficult level than for the easy level. The difficulty levels affected physiological variables in the expected way, thereby showing their sensitivity. However, while most of the physiological variables were also affected by time, time-related effects were generally the same for the easy and the difficult level. Thus, in a well-controlled experiment that enabled the dissociation of general time effects from learning, we did not find physiological variables to indicate decreasing effort associated with learning. Theoretical and practical implications are discussed.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

We need to monitor learning for a number of reasons, such as providing trainees with appropriate feedback, determining whether a trainee has learned sufficiently well and evaluating educative systems. One straightforward way to do this is to monitor behavioral performance, e.g. the time it takes to perform a task and the number of errors made. However, behavioral performance is not only determined by (learned) skills. Another important factor is mental effort, where the negative effect of lacking skills on behavioral performance can be counteracted by investing a large amount of effort. This means that while trainees may have reached the desired level of performance, they may need a large amount of effort in order to maintain this level. In such a case, additional learning may still be required in order to transform effortful, controlled cognitive processes into more automatic and efficient processes (Gopher and Kimchi, 1989; Liu and Wickens, 1994; Schneider and Fisk, 1982). Thus, information about performance and effort is needed to monitor the learning process. Information about effort could be extracted from physiological measures as discussed next.


1.1. Effort and its indicators: peripheral physiology, EEG and eye-related measures

Effort, or as termed by Brehm and Self, motivational arousal, only occurs if a number of conditions are met (Brehm and Self, 1989). Firstly, there should be the expectation that a certain behavior will lead to certain desirable outcome values (task incentive). Secondly, the required behavior should be difficult but considered to be within one's capacity and justified by the potential gain. When the required behavior is considered to be too difficult, i.e. outside one's capacity or outweighing the potential gain, effort will not be invested. When the required behavior is easy to perform, effort will be low or absent since the organism will strive to conserve energy. A concept that is very close to mental effort is mental workload (Gaillard and Wientjes, 1994; Hockey, 1986). While the term ‘workload’ evokes associations with externally imposed task demands, workload involves internal factors such as the ability of the individual to cope with these demands (Borghini et al., 2012; Gopher and Donchin, 1986; Kantowitz, 1988; O’Donnell and Eggemeier, 1986) as well as the motivation of the individual to perform the task at hand (Veltman, 2002). Thus, just like effort, workload can only be high when the task is difficult but perceived to be feasible, and leads to, and is in proportion with, rewarding outcomes.

The function of effort is the production of appropriate behavior, therewith rendering measures of physiological arousal (high sympathetic relative to parasympathetic activation) likely candidates for measuring effort (Brehm and Self, 1989; Gawron et al., 1989; Mulder and Mulder, 1987). Changes in sympathetic and parasympathetic autonomic nervous system activity can be estimated through several peripheral physiological measures such as skin conductance (Roth, 1983), heart rate and heart rate variability (Berntson et al., 1997). Electrical skin conductance varies with the moisture level of the skin. Since the sweat glands are controlled by the sympathetic part of the autonomic nervous system (Roth, 1983), electrodermal measures indicate the level of sympathetic activity. Indeed, a large body of literature describes the positive effect of arousal on skin conductance (e.g. Boucsein, 1992, 1999; Brouwer et al., 2013; Greenwald et al., 1989; Winton et al., 1984). Heart rate and its variability are affected by activation and suppression of both the sympathetic and parasympathetic nervous systems (Berntson et al., 1997). Fast changes in heart rate (0.15–0.50 Hz) reflect the adjustment of heart rate to breathing such that blood pressure is kept around a certain point (Aasman et al., 1987; Mulder, 1980) and gas exchange between the lungs and the blood is facilitated (Grossman and Taylor, 2007). High frequency heart rate variability reflects only the (fast) parasympathetic nervous system (Berntson et al., 1997) while mid frequency heart rate variability (0.07–0.14 Hz) also reflects sympathetic activity (Berntson et al., 1997; Veltman and Gaillard, 1998). Suppression of parasympathetic activity (associated with high effort) results in less heart rate adaptation and hence less heart rate variability. A range of studies show that effort and workload are indeed reflected by physiological arousal as indicated by heart rate (e.g. studies as reviewed by Vogt et al., 2006), heart rate variability (reviewed by Aasman et al., 1987; Hancock et al., 1985) and electrodermal measures (Kohlisch and Schaefer, 1996; Reimer and Mehler, 2011). Respiration frequency has been observed to increase with effort (Karavidas et al., 2010; Mehler et al., 2009; Wientjes, 1992). This is also in line with the idea that effort is associated with increased arousal since arousal increases metabolic demand, which could be the cause of the observed increases in respiration frequency as well as the observed increases in heart rate (Veltman and Gaillard, 1998).

Besides arousal, effort is expected to be associated with cognitive processes as reflected by EEG measures. EEG alpha activity (power in the 8–12 Hz band) has been linked to idling (Pfurtscheller et al., 1996), default mode brain activity (Jann et al., 2009; Laufs et al., 2003) and cortical inhibition (Brouwer et al., 2009; van Dijk et al., 2008; Foxe et al., 1998). This suggests that this measure would reflect different levels of effort, with high alpha for low levels of effort. This has indeed been reported in several workload studies (e.g. Brouwer et al., 2012; Fink et al., 2005). Another EEG frequency band that has been related to workload associated processes is theta (4–8 Hz). Evidence for an association between theta and working memory processes or mental effort has been summarized in several reviews by Klimesch (1996, 1997, 1999). Theta increases as task requirements increase (e.g. Esposito et al., 2009; Jensen and Tesche, 2002; Miyata et al., 1990; Raghavachari et al., 2001). A number of studies on workload reported both alpha and theta effects (e.g. Brookings et al., 1996; Fournier et al., 1999; Gevins et al., 1998; Gundel and Wilson, 1992).

A final set of measures that has been found to reflect effort or mental workload is related to the eyes. Pupil dilation is caused not only by decreasing luminance but also by increasing workload (Beatty, 1982; Hampson et al., 2010; Kahneman and Beatty, 1966; Kahneman et al., 1969; May et al., 1990; Porter et al., 2007). While the underlying function is unclear, the effect has been observed in studies that varied task difficulty without varying the visual environment (Kahneman and Beatty, 1966; Kahneman et al., 1969) indicating that it does not serve purposes related to visual perception. Observed reductions of blink frequency and blink duration with workload could be attributed to maximizing detection of visual information (Bauer et al., 1987; Fogarty and Stern, 1989) in the sense that the claimed sensitivity of these parameters could be explained by high workload being confounded by the presence of large amounts of visual information (Brookings et al., 1996).

1.2. Varying effort experimentally

The majority of studies investigating physiological effects of workload or effort vary (externally defined) task difficulty, where required effort is increased by increasing task difficulty. Participants are required to perform additional tasks (e.g. Fairclough et al., 2005; Tole et al., 1982), memory load is heightened (Ayaz et al., 2012; Brouwer et al., 2012) or task complexity changes (also) in other ways, e.g. different scenarios in a flight simulator (Veltman and Gaillard, 1996) or in an air traffic control task (Wilson and Russell, 2003). Several studies show that physiological variables indicate low effort when task requirements are so high that they become nearly impossible to fulfill and the expected gain no longer outweighs the costs of investing effort (Light and Obrist, 1983; Obrist et al., 1978). Some studies aimed to affect effort or workload through other routes than varying task difficulty, and found comparable results. Ewing and Fairclough (2010) varied task incentive. They found that a high incentive (presumably leading to high effort) decreased EEG alpha activity and increased heart rate compared to a low incentive. Gendolla and Richter (2005, 2006) led their participants to believe that an attention related task was either predictive of academic success or used to pass the time. The task was either easy or participants were asked to produce short reaction times. Only participants who had reasons to invest a large amount of effort (i.e. the ones who both believed the task was meaningful and were asked to produce short reaction times) showed physiological signs of high workload. Scher et al. (1984) varied task difficulty and incentive in an aversive noise avoidance task with the expected effects found on heart rate and ECG T-wave amplitude. Fowles et al. (1982) and Tranel et al. (1982) varied monetary incentive in a button pressing task and found heart rate to increase with incentive. Capa and Audiffren (2009) and Capa et al. (2008) compared groups of participants with different personality characteristics (approach-driven versus avoidance-driven individuals). They found evidence for approach-driven participants investing more effort especially for difficult tasks as reflected by both physiology (mid frequency band of heart rate variability) and performance.

In the current study, we examine whether physiological variables reflect a decrease in effort when individuals are learning to master a task (i.e. when they improve their skills), similar to what has been reported when task difficulty is low or when invested effort is low because of low incentive or personality factors. If so, physiological variables could be used to monitor learning in addition to performance measures. Learning to master a task is a process that takes place over time. When investigating the effects of learning, care should be taken to separate learning effects from other potential effects that co-vary with time such as fatigue. In order to do this, we will compare time effects in a condition in which no or little learning is expected (an easy task) to time effects in a condition where learning is expected to take place (a difficult task), where the conditions are presented in blocks interspersed over time.

1.3. Fairclough et al. (2005): Effects of time and task difficulty on neurophysiology

Our study is inspired by Fairclough et al. (2005). They examined the effect of both time and task difficulty (high or low) in a multi task situation on a range of physiological and performance measures. While most performance measures were consistent with learning, i.e. indicating improvement of performance over time especially when task difficulty was high, some showed a decrease in performance over time. Most of the measured physiological variables were sensitive to task difficulty in the expected direction, i.e. reflecting more effort when task difficulty was high. Also, effects of time were found. As mentioned above and as also indicated by Fairclough et al. (2005), effects of time need not be related to learning. Fairclough et al. (2005) interpret the effects of time they found on respiration frequency and blink frequency not as effects of learning but as effects of anxiety at the start of the task which then dissipates over time. However, they argued that for some other variables, effects of learning on physiology were indicated by the results of a regression analysis on a performance measure (constructed by combining three of the performance measures) and physiological variables for the high task demand condition only. This analysis showed some correlations at certain points in time. For example, when examining changes from the first to the second out of four time periods, an increase in performance was associated with an increased heart rate. However, such effects may not relate to learning; in our opinion, they would be more consistent with increased invested effort resulting in increased performance.

1.4. Current study

Similar to Fairclough et al. (2005), we here investigate the effects of time and task difficulty on performance and on roughly the same set of physiological variables with the addition of pupil size and skin conductance. However, we use a different task than Fairclough et al. (2005). Their multi-task may have remained challenging over the course of the experiment, and it allowed many different prioritizing strategies. As they mentioned themselves, this made it hard to interpret the different performance measures, some of which showed an improvement of performance over time while others showed a deterioration. To overcome these problems, we decided to use different levels of an n-back task in order to vary task difficulty. This task requires participants to indicate for each of a series of successively presented letters whether it is a target letter or not. The task is easy when the target letter is an ‘x’ (0-back), intermediate when the target letter is a letter that is the same as the one before (1-back) and difficult when the target letter is a letter that is the same as two letters before (2-back). Performance measures are speed and accuracy of the responses. Another advantage of this task over the multi-task as used by Fairclough et al. (2005) is that visual input and motor output remain the same across difficulty levels. This means that potential effects of task difficulty are due to differences in mental processes and not for instance due to more joystick or eye movements in the difficult condition compared to the easy condition.

We hypothesize that effort will be reflected by a main effect of n-back level on physiological variables in the same way as found previously in the literature. This means that compared to the easy level, the difficult level should show higher skin conductance, faster breathing, faster heart rate, lower heart rate variability, lower EEG alpha power, higher EEG theta power, larger pupil size and perhaps fewer and shorter eye blinks. Learning should be reflected by improved performance over time and in physiological variables as a decrease in effort over time. This means that for each dependent variable (whether behavioral, physiological or subjective), the difference between the easy and difficult condition should be larger in the beginning (before learning) than in the end. This is because after learning, the difficult task is expected to be performed better and to require less effort than before, and as such has become more similar to the easy task. Several mental and physical states that are not related to learning are expected to change over the course of the experiment, such as fatigue, anxiety related to participation in the experiment, or effects related to sitting still. By presenting the conditions in short blocks equally dispersed over the course of the experiment, we control for these long term time effects and avoid confusing them with effects of learning. Thus, the easy (0-back) condition forms a baseline condition for the difficult (2-back) learning condition. The intermediate (1-back) condition should fall somewhere in between.
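The learning hypothesis amounts to a shrinking gap between the difficult and the easy condition over blocks. The sketch below illustrates only that logic; it is not the statistical analysis used in the study. The function name `learning_contrast`, the split into an early and a late half, and the example values are assumptions made for illustration.

```python
from statistics import mean


def learning_contrast(easy, difficult):
    """Compare the difficult-minus-easy gap early versus late in the experiment.

    easy, difficult: equal-length lists with one value per block, in block order
    (for example, 8 RSME ratings per condition). Hypothetical helper.
    """
    if len(easy) != len(difficult) or len(easy) < 2:
        raise ValueError("need two equal-length series with at least 2 blocks")
    half = len(easy) // 2
    # Subtracting the easy baseline removes time effects shared by both
    # conditions (fatigue, habituation), leaving the condition-specific part.
    gaps = [d - e for d, e in zip(difficult, easy)]
    early_gap = mean(gaps[:half])
    late_gap = mean(gaps[half:])
    return early_gap, late_gap, late_gap - early_gap


# Made-up effort ratings: the easy level stays flat, the difficult level drops.
early, late, change = learning_contrast([10, 10, 10, 10], [50, 40, 30, 30])
print(early, late, change)
```

A negative change for an effort-related variable would be in line with learning; a change near zero means that any time effect was common to both difficulty levels.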

As described above, the main aim of this study is to examine whether learning is reflected by a learning-specific decrease of effort as indicated through physiological variables. Besides examining learning, we will provide a comprehensive overview of the effects of effort as varied by task difficulty on a range of physiological variables. This allows a rough comparison of their relative sensitivity to mental effort. While effects of task difficulty on different physiological variables have been reported before (Berka et al., 2007; Brookings et al., 1996; Christensen et al., 2012; Fairclough et al., 2005; Taylor et al., 2010; Wilson and Russell, 2003), task difficulty conditions in our study do not co-vary with speech, eye movements, other types of body movement and sensory stimuli. Finally, our study will give a comprehensive overview of time-related effects on the physiological variables under study.

2. Methods

2.1. Participants

Thirty-five participants took part in the experiment. Participants were aged between 19 and 40 years (mean age 27); 19 were female and 16 male. One participant was left-handed. The experiment was approved by the local ethics committee and performed in accordance with the ethical standards as laid down in the Declaration of Helsinki.

2.2. Materials

Stimuli (letters), subjective effort scales and announcements about the type of the n-back task to follow were presented on a Tobii T60 Eye Tracker monitor, at a distance of about 50 cm from the participants’ eyes. Feedback about task performance was presented through Labtec LCS-1050 speakers in the form of high and low pitched tones. Participants used a keyboard to indicate whether presented letters were targets or non-targets. Which of the keys (1 or 2 on the numerical pad) indicated ‘target’ and which ‘non-target’ was counterbalanced between participants. Participants used the mouse to rate subjective effort on a scale (RSME) between stimulus blocks.

EEG (electroencephalogram) was recorded through a g.tec USBamp and g.tec Au electrodes placed at Fz, FCz, Pz, C3, C4, F3 and F4, referenced to linked mastoid electrodes. A ground electrode was placed at FPz. Impedance was kept below 5 kΩ. EEG data were filtered by a 0.1 Hz high-pass and a 100 Hz low-pass filter and sampled with a frequency of 256 Hz (USB Biosignal Amplifier, g.tec medical engineering GmbH). ECG (electrocardiogram) and skin conductance were recorded using a MindWare BioNex 8-slot chassis with a 3-channel Bio-Potential and GSR amplifier. A 4-channel transducer amplifier was used to measure respiration. For ECG measurement, self-adhesive 1 1/2" electrodes with 7% chloride wet gel were attached just below the right collarbone, just below the left lower rib and above the right hip. To record skin conductance, two self-adhesive 1 5/8" electrodes with 1% chloride wet gel were attached to the palm of the left hand that was not used for pressing the keys: one below the thumb and one below the little finger. Respiration was recorded using a MindWare respiration belt around the waist at the height of the lower side of the sternum. MindWare's BioLab software was used to acquire physiological data. ECG, skin conductance and respiration were sampled with a frequency of 300 Hz. They were acquired with gain settings of 1000, 10 and 500 and filtered with 0.5, 1 and 5 Hz high-pass filters, respectively. Pupil size, blink rate and blink duration were measured using a Tobii T60 Eye Tracker that was integrated into a 17" monitor. Recording frequency was 60 Hz. All physiological signals were synchronized using the TCAP signal from The Observer XT (Zimmerman et al., 2009).

We used the RSME scale (Rating Scale Mental Effort; Zijlstra, 1993) to measure subjectively experienced mental effort. This scale runs from 0 to 150 with higher values reflecting higher effort. It has nine descriptors along the axis, e.g. ‘not effortful’ at value 2 and ‘rather effortful’ at value 58. Verwey and Veltman (1996) concluded that this simple one-dimensional scale is more sensitive than the often-used NASA-TLX (Hart and Staveland, 1988).

2.3. Task

Participants viewed letters, successively presented on a screen. For each letter, they pressed a button to indicate whether the letter was a target or a non-target. In the 0-back condition, the letter x is the target. In the 1-back condition, a letter is a target when it is the same as the one before. In the 2-back condition, a letter is a target when it is the same as two letters before. With this version of the n-back task, the level of workload is varied without varying visual input or frequency and type of motor output (button presses). A 3-back condition was not used, due to evidence that many participants find it too difficult and tend to give up (Ayaz et al., 2007; Izzetoglu et al., 2007).
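The target rule for the three conditions can be written down compactly. The following is a minimal sketch, not the experiment software; the function name `is_target` and the treatment of the first n letters (which have no letter n positions back and are therefore taken to be non-targets) are assumptions.

```python
def is_target(letters, i, n):
    """Return True if letters[i] is a target in the n-back condition (n = 0, 1 or 2)."""
    if n == 0:
        # 0-back: the fixed letter x is the target.
        return letters[i] == "x"
    if i < n:
        # Assumption: no letter n positions back yet, so not a target.
        return False
    # 1-back and 2-back: target if identical to the letter n positions earlier.
    return letters[i] == letters[i - n]


letters = list("bkbkkx")
print([is_target(letters, i, 2) for i in range(len(letters))])
```

The same letter stream yields different targets under each rule, which is why visual input and button presses can stay identical across difficulty levels.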

After every button press, participants received feedback on whether the decision was correct, in the form of a high (correct) or a low (incorrect) pitched tone. This was intended to help the participant, who in our experiment switched rather often between n-back conditions, and to increase the task incentive since the participant knew the experiment leader would hear the sounds as well.

2.4. Stimuli

The letters used in the n-back task were black (font style: Matlab standard, approximately 3 cm high) and were presented on a light grey background. The letters were presented for 500 ms followed by a 2000-ms inter-stimulus interval during which the letter was replaced by a fixation cross. In all conditions, 33% of letters were targets. Except for the letter x in the 0-back task, letters were randomly selected from English consonants. Vowels were excluded to reduce the likelihood of participants developing chunking strategies which reduce mental effort, as suggested in Grimes et al. (2008).
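One way to build a letter stream with a fixed share of targets is sketched below. This is an illustrative reconstruction, not the original stimulus code. The name `make_block`, the defaults of 48 letters with 16 targets (one third), the inclusion of y among the consonants, and keeping x in the pool for the 1-back and 2-back levels are all assumptions.

```python
import random
import string

# Assumption: the 21 non-vowel letters, including y.
CONSONANTS = [c for c in string.ascii_lowercase if c not in "aeiou"]


def make_block(n, length=48, n_targets=16, rng=None):
    """Return (letters, target_positions) for one n-back block (hypothetical helper)."""
    rng = rng or random.Random()
    # The first n letters cannot be targets: there is no letter n positions back.
    target_pos = set(rng.sample(range(n, length), n_targets))
    # In 0-back, x is reserved for targets so that non-targets never show it.
    pool = [c for c in CONSONANTS if c != "x"] if n == 0 else CONSONANTS
    letters = []
    for i in range(length):
        if n == 0:
            letters.append("x" if i in target_pos else rng.choice(pool))
        elif i in target_pos:
            letters.append(letters[i - n])  # repeat the letter n positions back
        else:
            # Non-target: avoid an accidental match with the letter n back.
            forbidden = letters[i - n] if i >= n else None
            letters.append(rng.choice([c for c in pool if c != forbidden]))
    return letters, target_pos


letters, targets = make_block(2, rng=random.Random(0))
print("".join(letters), sorted(targets))
```

Excluding accidental matches on non-target positions keeps the target proportion exact rather than approximate.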

2.5. Design

The three conditions (0-back, 1-back, 2-back) were presented in 2-min blocks divided across four sessions. Each session consisted of two repetitions of each of the three blocks. Thus, for each of the three conditions participants performed 4 sessions * 2 repetitions = 8 blocks. In each block, 48 letters were presented, 16 of which were targets. The blocks were presented in pseudorandom order, such that each condition was presented once in the first half of the session and once in the second half of the session, and that blocks of the same condition never occurred directly after each other. Each session was preceded by a rest interval of two minutes in which the participant quietly fixated a cross on the screen.
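The ordering constraints can be met with a simple shuffle-and-reject scheme, sketched here under assumptions: the function name `session_order` is hypothetical, and the source does not say how the pseudorandom orders were actually generated.

```python
import random

CONDITIONS = ["0-back", "1-back", "2-back"]


def session_order(rng=None):
    """Return the six blocks of one session in a constrained random order."""
    rng = rng or random.Random()
    # Each half contains every condition exactly once, so repeats can only
    # happen at the boundary between the two halves.
    first = rng.sample(CONDITIONS, 3)
    while True:
        second = rng.sample(CONDITIONS, 3)
        if second[0] != first[-1]:
            return first + second


# Four sessions give 4 * 2 = 8 blocks per condition.
rng = random.Random(1)
schedule = [session_order(rng) for _ in range(4)]
print(schedule)
```

Spreading every condition over both halves of every session is what allows the easy level to serve as a time-matched baseline for the difficult level.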

2.6. Procedure

After entering the lab, participants read about the experimental procedure and had it explained to them. They were told that the experiment was about how (neuro)physiological signals vary with different difficulty levels of a task. Learning was not mentioned. Participants then signed an informed consent form. The physiological sensors were attached and the Tobii eye tracker was calibrated. The three conditions were practiced up to the point that the task was clear. Regardless of this, all participants completed at least one block of the 2-back task in order to also practice the RSME rating that appeared at the end of the block. It was stressed that the 2-back task could be difficult, but that even when the participant thought it was too difficult he or she should keep trying to do as well as possible. Participants were asked to avoid movement as much as possible while performing the task and to use the breaks in between the blocks to make necessary movements. Before the start of each block, the participant was informed about the nature of the block (rest, 0-back, 1-back or 2-back) via the monitor. After each block, the RSME scale was presented and the participant rated subjective mental effort by clicking the appropriate location on the scale using the mouse. The next block started after the participant indicated to be ready by pressing a button (which usually occurred after a few seconds). Between sessions, participants had somewhat longer breaks, chatting with the experiment leader or having a drink. These breaks were intended to keep each individual participant as motivated and fresh during the entire experiment as possible.

2.7. Dependent variables

For each participant, each of the three n-back conditions and each of the 8 blocks, we determined the value of a range of dependent variables to measure task performance and physiology as specified below.

Task performance measures were median button press reaction time and fraction correct. A response could be categorized as either a ‘hit’ or a ‘miss’ when a target was presented, or as a ‘false alarm’ or a ‘correct rejection’ when a non-target was presented. ‘Fraction correct’ is the total number of hits and correct rejections divided by the total number of stimuli. Subjectively experienced effort was measured through RSME ratings that participants provided for each block, with high values representing high effort.
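The performance measures follow directly from these definitions. The sketch below assumes one response per stimulus (the source does not describe how missing responses were handled) and uses hypothetical names and made-up data.

```python
from statistics import median


def score_block(is_target, said_target, reaction_times_ms):
    """Return (counts, fraction_correct, median_rt_ms) for one block.

    is_target, said_target: per-stimulus booleans; reaction_times_ms: per-stimulus
    button press times. Assumes a response was given to every stimulus.
    """
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_rejection": 0}
    for target, response in zip(is_target, said_target):
        if target:
            counts["hit" if response else "miss"] += 1
        else:
            counts["false_alarm" if response else "correct_rejection"] += 1
    n_stimuli = sum(counts.values())
    # Fraction correct = (hits + correct rejections) / number of stimuli.
    fraction_correct = (counts["hit"] + counts["correct_rejection"]) / n_stimuli
    return counts, fraction_correct, median(reaction_times_ms)


counts, fraction, rt = score_block(
    [True, False, False, True, False, False],
    [True, False, True, False, False, False],
    [500, 450, 700, 620, 480, 510],
)
print(counts, fraction, rt)
```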

EEG measures that we examined here are power in the alpha band at Pz, and power in the theta band at Fz. While the exact location of the alpha effect varies with modality and task (Pfurtscheller et al., 1994), for effortful and attentive processing alpha reduction is observed at parietal regions (Keil et al., 2006; Klimesch et al., 2000). Theta increases with increasing task difficulty have been found to be most pronounced over frontal electrode locations (e.g. Esposito et al., 2009; Jensen and Tesche, 2002; Miyata et al., 1990; Raghavachari et al., 2001). Power was determined in frequency bands of interest across intervals from 500 ms before stimulus onset until 2000 ms after using Matlab and the FieldTrip open source Matlab toolbox (Oostenveld et al., 2011). Intervals for which the standard deviation of EEG traces exceeded 100 μV were excluded from analysis, which was less than 1% of the data. For Pz, power was determined in frequency bands from 8 to 13 Hz in about 0.5 Hz steps. Alpha power was then determined by averaging the natural log transformed power across band widths. For Fz, power was determined in frequency bands from 4 to 8 Hz in about 0.5 Hz steps. Theta power was then determined by averaging the natural log transformed power across band widths.
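The two steps that follow the spectral estimate, interval rejection and averaging of log power within a band, can be sketched as below. This is not the FieldTrip pipeline used in the study: the spectrum is assumed to be already available as a list of frequencies and power values, the function names are hypothetical, and the use of the population standard deviation for the rejection criterion is an assumption.

```python
import math
from statistics import mean, pstdev


def keep_interval(samples_uv, max_sd_uv=100.0):
    """Keep an EEG interval only if its standard deviation is at most 100 uV.

    Assumption: population SD; the text does not say which estimator was used.
    """
    return pstdev(samples_uv) <= max_sd_uv


def band_log_power(freqs_hz, power, lo_hz, hi_hz):
    """Average the natural-log power over all frequency bins in [lo_hz, hi_hz]."""
    logs = [math.log(p) for f, p in zip(freqs_hz, power) if lo_hz <= f <= hi_hz]
    return mean(logs)
```

With bins spaced about 0.5 Hz apart, `band_log_power(freqs, power, 8, 13)` would correspond to the alpha measure at Pz and `band_log_power(freqs, power, 4, 8)` to the theta measure at Fz.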

Skin conductance level was determined by averaging skin conductance over each block. Inspection of the raw data showed that frequently, skin conductance peaked around the onset of a block (i.e. after rating subjective effort of the previous block) after which skin conductance rapidly decreased and remained around the same level. This led us to also use minimum skin conductance of each block as a dependent variable.

As a measure of heart rate, we determined the mean RRI for each block. RRI is the interval between successive heart beats or, more precisely, the interval between subsequent R-peaks in the ECG. The peak detection algorithm that we used to identify these peaks required an R-peak to occur at least 222 ms after the previous one (corresponding to a maximum allowed heart rate of 270 beats/min). The first R-peak in an ECG trace needed to be between 1 and 5 mV (as measured between the R-peak and the subsequent S-valley) while subsequent peaks were identified as such if they crossed a threshold starting at the height of the just identified peak and then exponentially decreasing over time to an asymptote of 1 mV. This procedure proved to reliably detect heart beats as indicated by visual inspection of the raw ECG signal with labeled peaks. Based on the RRIs, three measures of heart rate variability were computed. The root mean squared successive difference (RMSSD: Goedhart et al., 2007) between the RRIs reflects high frequency heart rate variability. High-frequency heart rate variability was also computed as the power in the high frequency range (0.15–0.5 Hz) of the RRI over time using Welch’s method (Welch, 1967) as implemented in Matlab. For mid-frequency heart rate variability a frequency range of 0.07–0.15 Hz was used.
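A minimal sketch of the time-domain part of this procedure is given below: the decaying detection threshold, the 222 ms minimum interval, and RMSSD. It is an illustration, not the authors' algorithm. In particular, the time constant `tau_ms` of the exponential decay is not reported in the text and is a made-up value, and the function names are hypothetical.

```python
import math

MIN_RRI_MS = 222.0  # minimum interval between R-peaks, as stated in the text


def decaying_threshold(prev_peak_mv, ms_since_peak, tau_ms=500.0, floor_mv=1.0):
    """Threshold that starts at the previous peak height and decays to 1 mV.

    Assumption: tau_ms is hypothetical; the text gives no decay constant.
    """
    return floor_mv + (prev_peak_mv - floor_mv) * math.exp(-ms_since_peak / tau_ms)


def rr_intervals(peak_times_ms):
    """Drop candidate peaks closer than 222 ms to the last accepted peak,
    then return the intervals between the remaining peaks."""
    kept = [peak_times_ms[0]]
    for t in peak_times_ms[1:]:
        if t - kept[-1] >= MIN_RRI_MS:
            kept.append(t)
    return [b - a for a, b in zip(kept, kept[1:])]


def rmssd(rri_ms):
    """Root mean squared successive difference of the RR intervals."""
    diffs = [b - a for a, b in zip(rri_ms, rri_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

The mean of the list returned by `rr_intervals` gives the mean RRI of a block; the spectral HRV measures would additionally require resampling the RRI series and a Welch power estimate, which is left out here.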

The respiration signal was filtered using a running Gaussian blurring window with a kernel width of 0.39 s as implemented in Matlab. Subsequently, peaks and troughs were detected using the zero-crossings of the derivative of the signal. Breathing frequency was defined as the mean time interval between the peaks.
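The smoothing and peak detection steps can be sketched as follows. This is a plain Python illustration rather than the Matlab implementation used in the study. Two assumptions are made: the reported kernel width of 0.39 s is interpreted as the standard deviation of the Gaussian, and samples closer to the edges than the kernel radius are dropped instead of padded.

```python
import math


def gaussian_smooth(signal, fs_hz, sigma_s=0.39):
    """Smooth with a Gaussian kernel; only fully covered samples are returned.

    Assumption: sigma_s is the kernel SD in seconds (the text says 'width').
    """
    sigma = sigma_s * fs_hz  # kernel SD in samples
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    return [
        sum(w * signal[i + k] for k, w in zip(offsets, weights)) / total
        for i in range(radius, len(signal) - radius)
    ]


def peak_indices(signal):
    """Indices where the first difference changes from positive to non-positive."""
    d = [b - a for a, b in zip(signal, signal[1:])]
    return [i + 1 for i in range(len(d) - 1) if d[i] > 0 and d[i + 1] <= 0]


def mean_peak_interval(signal, fs_hz):
    """Mean time in seconds between successive peaks (respiration interval)."""
    peaks = peak_indices(signal)
    gaps = [(b - a) / fs_hz for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps)
```

Troughs can be found in the same way by applying `peak_indices` to the negated signal.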

246 A.-M. Brouwer et al. / International Journal of Psychophysiology 93 (2014) 242–252

Pupil size was determined by the Tobii eye tracker and the ClearView algorithms as the average size of the left and right pupils whenever they were both detected. When the eye tracker did not detect the pupil for both eyes for minimally two successive frames (i.e. 33 ms) and maximally 25 successive frames (416 ms), this was considered to be a blink. For each blink, blink duration was determined. Blink rate is the average number of blinks per minute.
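The blink criterion can be sketched as a run-length rule on the per-frame tracking status. The sketch assumes a 60 Hz frame rate, which is consistent with two frames lasting about 33 ms and 25 frames lasting about 416 ms but is not stated explicitly here. The input format (one boolean per frame that is true when neither pupil was detected) and the function names are hypothetical.

```python
FRAME_MS = 1000.0 / 60.0  # assumed 60 Hz sampling (2 frames ~ 33 ms)


def find_blinks(missing, min_frames=2, max_frames=25):
    """Return (start frame, duration in ms) for each run of missing frames
    that is between min_frames and max_frames long."""
    blinks = []
    run_start = None
    # A trailing False closes a run that lasts until the end of the recording.
    for i, lost in enumerate(list(missing) + [False]):
        if lost and run_start is None:
            run_start = i
        elif not lost and run_start is not None:
            n_frames = i - run_start
            if min_frames <= n_frames <= max_frames:
                blinks.append((run_start, n_frames * FRAME_MS))
            run_start = None
    return blinks


def blink_rate_per_min(blinks, n_frames):
    """Number of blinks per minute of recording."""
    minutes = n_frames * FRAME_MS / 60000.0
    return len(blinks) / minutes
```

Runs shorter than two frames are treated as tracking dropouts and runs longer than 25 frames as something other than a blink, so neither contributes to blink rate or blink duration.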

2.8. Analyses

3 × 8 repeated measures ANOVAs were conducted on each variable with factors n-back (0, 1 and 2) and block (1 through 8). A variable’s sensitivity to effort as induced by varying task difficulty should be reflected by a main effect of n-back condition. An effect of a factor covarying with time should be reflected by a main effect of block. Of special interest to us, an effect of learning should be reflected in an interaction between n-back condition and block. Finding an interaction alone does not suffice to speak of a learning mediated effect on physiological measures related to effort. The way that the variables interact should be consistent with a larger difference between the 0- and 2-back condition in the beginning of the experiment (when the 2-back task requires a lot more effort than the easy 0-back task since the 2-back task has not been well learned yet) compared to the end (when the 2-back task has been learned and the required amount of effort needed for the learned 2-back task is closer to the amount of effort required for the 0-back task). Thus, for variables that show an interaction effect between n-back condition and block, we tested whether this particular pattern was present by comparing the difference between the 0- and 2-back condition in the first block to the difference between the 0- and 2-back condition in the last block using a planned comparison paired t-test.
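The planned comparison amounts to a paired t-test on two difference scores per participant. A minimal sketch is given below. The function names are hypothetical, the sign convention (0-back minus 2-back, first block minus last block) is an assumption, and the p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two equal-length samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def learning_contrast(zero_first, two_first, zero_last, two_last):
    """Compare the 0-back minus 2-back difference in the first block with the
    same difference in the last block (one value per participant in each list)."""
    first = [z - t for z, t in zip(zero_first, two_first)]
    last = [z - t for z, t in zip(zero_last, two_last)]
    return paired_t(first, last)
```

For a variable such as reaction time, where the 2-back value is higher than the 0-back value, a negative t under this convention would indicate that the two conditions differ more in the first block than in the last block.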

Effect sizes for effects as explored by the repeated measures ANOVAs were determined by computing partial eta squared ($\eta_p^2$). Effect sizes for effects as explored by the paired t-tests were determined by computing Cohen’s d on difference values between pairs of samples (d).
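Both effect sizes can be recovered from the reported test statistics through standard identities: partial eta squared equals F·df1 / (F·df1 + df2), and Cohen's d on difference values equals t divided by the square root of the number of pairs. The sketch below applies these identities; the function names are hypothetical and the code does not reproduce the authors' own computation.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


def cohens_d_paired(t_value, n_pairs):
    """Cohen's d on difference scores from a paired t statistic."""
    return t_value / math.sqrt(n_pairs)
```

As a check against values reported in this paper, `partial_eta_squared(41.86, 2, 60)` gives approximately 0.58, the effect size listed for the main effect of n-back on reaction time, and `cohens_d_paired(-4.22, 31)` gives approximately −0.76, the d reported for the planned comparison on reaction time.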

3. Results

3.1. Missing data and outliers

Some data were missing due to technical errors during recording or storage errors. For each dependent variable, we only included data of participants with complete data sets (i.e. with valid data in each of the 8 blocks and each of the 3 n-back conditions). In order to spot outliers, data were plotted separately per participant, block and n-back condition for each dependent variable. This led us to exclude two participants for EEG who showed for two out of the 24 blocks values larger than their mean + 2 standard deviations (both for alpha and theta). For each variable, the number of included participants is plotted in the figures presenting the data.

Fig. 1. Median reaction time (ms; left) and fraction correct (%; right) separately for each block (1–8) and each n-back condition (0-back, 1-back, 2-back). The number of included participants is indicated by ‘n=’ (n = 31 for both panels). Font color of the letters ‘n’ (n-back), ‘b’ (block), and ‘n*b’ (interaction) indicates whether the corresponding effects are significant (black font) or not (grey font) as tested using repeated measures ANOVAs.

3.2. Performance

Fig. 1 shows reaction time and fraction correct separately for each n-back condition and each consecutive block. Table 1 presents the results of the repeated measures ANOVAs on the performance measures and all other dependent variables in this study. Table 2 provides means and standard errors of the mean of the performance measures as well as of all other dependent variables in the first and last block of the 0-back and 2-back condition.

Behavioral performance is consistent with participants performing worse when task difficulty is high. Reaction time increases with n-back level (main effect of n-back on median reaction time: F(2,60) = 41.86, p < 0.01, $\eta_p^2$ = 0.58) and fraction correct decreases (F(2,60) = 36.75, p < 0.01, $\eta_p^2$ = 0.55). A main effect of block on reaction time (F(7,210) = 8.01, p < 0.01, $\eta_p^2$ = 0.21) and fraction correct (F(7,210) = 8.41, p < 0.01, $\eta_p^2$ = 0.22) reflects that reaction time decreases and fraction correct increases over time, which is consistent with learning. Importantly, both performance measures also show the anticipated interaction between n-back condition and block that is consistent with a stronger learning effect for the 2-back condition than for the 0-back condition, with the 1-back condition in between (reaction time: F(14,420) = 4.08, p < 0.01, $\eta_p^2$ = 0.12; fraction correct: F(14,420) = 2.37, p < 0.01, $\eta_p^2$ = 0.07). The paired t-test on the reaction time difference between the 0-back and the 2-back condition in the first versus the last block indicates that the difference between the easy and difficult condition is larger in the beginning than in the end (t(30) = −4.22, p < 0.01, d = −0.76). Similarly, fraction correct also showed a larger difference between 0- and 2-back conditions during the first than during the last block (t(30) = 3.32, p < 0.01, d = 0.60). We can thus conclude that learning in the 2-back condition is much more pronounced than in the 0-back condition.

To check whether performance measures indicated a modest amount of learning in the 0-back task or none at all, we compared reaction time and fraction correct during the first block versus the last block separately for 0- and 2-back conditions using paired t-tests. Reaction time did not decrease in the 0-back condition (t(30) = 1.75, p = 0.09, d = 0.31) while it did in the 2-back condition (t(30) = 4.45, p < 0.01, d = 0.80). Fraction correct did not only increase in the 2-back condition (t(30) = −4.98, p < 0.01, d = 0.63), it also increased for the 0-back condition (t(30) = −3.48, p < 0.01, d = −0.89). Thus, fraction correct suggested that some learning also took place in the 0-back task.

3.3. Subjective ratings

Fig. 2 shows the subjective mental effort (RSME) separately for each n-back condition and each consecutive block. RSME increases with n-back condition (main effect of n-back: F(2,70) = 43.88, p < 0.01, $\eta_p^2$ = 0.56) and decreases across time (main effect of block: F(7,245) = 3.42, p < 0.01, $\eta_p^2$ = 0.09) where the decrease seems stronger for the 2-back condition than the other conditions (interaction between n-back condition and block: F(14,490) = 2.25, p < 0.01, $\eta_p^2$ = 0.06). While the planned paired t-test that compared the 0- and 2-back differences between blocks 1 and 8 did not reach significance (t(35) = −0.61, p = 0.54, d = −0.10), the 0- and 2-back differences between blocks 1 and 7 were significantly different (t(35) = 2.87, p < 0.01, d = −0.48; see the trend-breaking RSME increase at the very end of the experiment in Fig. 2). Similar to reaction time, paired t-tests between RSME during the first block versus the last block indicate that RSME did not differ between the first and the last block (t(35) = 0.37, p = 0.71, d = 0.06) for the 0-back condition. For the 2-back condition, the RSME in the first and the last block did not differ significantly (t(35) = 1.03, p = 0.31, d = 0.17), but RSME was significantly lower in block 7 compared to block 1 (t(35) = 4.04, p < 0.01, d = 0.67).

Fig. 2. Subjective mental effort as indicated by the RSME (Rating Scale Mental Effort) per block and n-back condition (n = 36). Conventions as in Fig. 1.

Table 1
Results of repeated measures ANOVAs: p- and F values for the main effects of n-back (0, 1 or 2), block (1 through 8) and the interaction between n-back and block. Also indicated are the accompanying effect sizes. The first three variables are behavioral and subjective measures, the remainder is physiological data. Significant effects are highlighted by one or two asterisks (alpha levels lower than 0.05 or 0.01 respectively).

| Variable | N-back: p | N-back: F (df1, df2) | N-back: $\eta_p^2$ | Block: p | Block: F (df1, df2) | Block: $\eta_p^2$ | N-back*block: p | N-back*block: F (df1, df2) | N-back*block: $\eta_p^2$ |
|---|---|---|---|---|---|---|---|---|---|
| Reaction time | <0.01** | 41.86 (2, 60) | 0.58 | <0.01** | 8.01 (7, 210) | 0.21 | <0.01** | 4.08 (14, 420) | 0.12 |
| Fraction correct | <0.01** | 36.75 (2, 60) | 0.55 | <0.01** | 8.41 (7, 210) | 0.22 | <0.01** | 2.37 (14, 420) | 0.07 |
| RSME score | <0.01** | 43.88 (2, 70) | 0.56 | <0.01** | 3.42 (7, 245) | 0.09 | <0.01** | 2.25 (14, 490) | 0.06 |
| EEG alpha Pz | <0.01** | 12.40 (2, 66) | 0.27 | <0.01** | 4.61 (7, 231) | 0.12 | 0.01* | 2.07 (14, 462) | 0.06 |
| EEG theta Fz | 0.61 | 0.50 (2, 66) | 0.01 | 0.19 | 1.44 (7, 231) | 0.04 | 0.24 | 1.24 (14, 462) | 0.04 |
| Mean skin conductance | 0.02* | 3.99 (2, 60) | 0.12 | 0.13 | 1.61 (7, 210) | 0.05 | 0.25 | 1.23 (14, 420) | 0.04 |
| Minimum skin conductance | <0.01** | 8.07 (2, 60) | 0.21 | 0.24 | 1.33 (7, 210) | 0.04 | 0.53 | 0.93 (14, 420) | 0.03 |
| Respiration interval | <0.01** | 21.02 (2, 60) | 0.41 | 0.09 | 1.79 (7, 210) | 0.06 | 0.45 | 1.00 (14, 420) | 0.03 |
| RRI | <0.01** | 13.04 (2, 56) | 0.32 | <0.01** | 25.46 (7, 196) | 0.48 | 0.07 | 1.64 (14, 392) | 0.06 |
| HRV: RMSSD | <0.01** | 7.53 (2, 56) | 0.21 | <0.01** | 6.65 (7, 196) | 0.19 | 0.74 | 0.74 (14, 392) | 0.03 |
| HRV: mid frequency | 0.25 | 1.43 (2, 56) | 0.05 | 0.09 | 1.79 (7, 196) | 0.06 | 0.51 | 0.94 (14, 392) | 0.03 |
| HRV: high frequency | 0.01* | 4.92 (2, 56) | 0.15 | <0.01** | 5.31 (7, 196) | 0.16 | 0.16 | 1.38 (14, 392) | 0.05 |
| Pupil size | <0.01** | 26.72 (2, 30) | 0.64 | <0.01** | 10.98 (7, 105) | 0.42 | 0.78 | 0.69 (14, 210) | 0.04 |
| Number of blinks | 0.28 | 1.33 (2, 30) | 0.08 | 0.97 | 0.24 (7, 105) | 0.02 | 0.38 | 1.07 (14, 210) | 0.07 |
| Blink duration | 0.26 | 1.43 (2, 28) | 0.09 | <0.01* | 2.97 (7, 98) | 0.18 | 0.79 | 0.68 (14, 196) | 0.05 |

Table 2
Means and standard errors of the mean of all dependent variables in the first and last block of the 0-back and 2-back condition.

| Variable | 0-back, Block 1 | 0-back, Block 8 | 2-back, Block 1 | 2-back, Block 8 |
|---|---|---|---|---|
| Reaction time (ms) | 537 ± 24 | 513 ± 21 | 735 ± 41 | 625 ± 28 |
| Fraction correct (%) | 96.6 ± 0.6 | 98.3 ± 0.4 | 87.6 ± 1.3 | 92.6 ± 1.0 |
| RSME score | 34.6 ± 3.6 | 33.3 ± 3.5 | 65.2 ± 3.5 | 60.3 ± 4.5 |
| EEG alpha Pz (ln, μV²*S) | 0.100 ± 0.091 | 0.302 ± 0.109 | 0.040 ± 0.096 | 0.104 ± 0.103 |
| EEG theta Fz (ln, μV²*S) | 0.757 ± 0.082 | 0.771 ± 0.090 | 0.833 ± 0.087 | 0.791 ± 0.089 |
| Mean skin conductance (mS) | 7.88 ± 0.94 | 7.33 ± 0.78 | 7.99 ± 0.95 | 7.31 ± 0.77 |
| Minimum skin conductance (mS) | 7.21 ± 0.87 | 6.76 ± 0.72 | 7.36 ± 0.90 | 6.86 ± 0.73 |
| Respiration interval (s) | 3.54 ± 0.12 | 3.66 ± 0.09 | 3.32 ± 0.12 | 3.32 ± 0.10 |
| RRI (ms) | 887.9 ± 28.9 | 951.8 ± 33.6 | 849.7 ± 29.6 | 924.4 ± 31.7 |
| HRV: RMSSD (ms) | 50.3 ± 4.9 | 63.7 ± 7.6 | 46.2 ± 5.3 | 55.3 ± 7.2 |
| HRV: mid frequency (log, s) | 5.55 ± 0.18 | 5.85 ± 0.20 | 5.67 ± 0.17 | 5.76 ± 0.19 |
| HRV: high frequency (log, s) | 5.72 ± 0.23 | 6.15 ± 0.23 | 5.62 ± 0.24 | 5.84 ± 0.24 |
| Pupil size (mm diameter) | 2.95 ± 0.09 | 2.80 ± 0.08 | 3.10 ± 0.09 | 2.96 ± 0.08 |
| Number of blinks | 53.2 ± 8.5 | 54.3 ± 7.7 | 53.13 ± 8.5 | 54.2 ± 8.2 |
| Blink duration (ms) | 131.5 ± 8.6 | 136.8 ± 6.5 | 125.2 ± 6.6 | 140.4 ± 6.1 |

3.4. Physiological variables

Fig. 3 shows each dependent physiological variable separately for block and n-back condition.

N-back condition affected most physiological variables as expected, where variables are consistent with higher effort for the more difficult condition. The graphs indicate that the difference is mostly between the 2-back and the other two conditions. Power in the alpha band is smaller, skin conductance larger (reflected both by mean and minimum), respiration faster, heart rate faster, high frequency heart rate variability lower (reflected both by RMSSD and power in the high frequency band) and pupil size larger for the 2-back condition compared to the (1- and) 0-back condition. Most variables that do not show a significant effect of n-back condition do show the expected trends: the mid-range HRV tends to be smaller for the 2-back condition than for the 0-back condition, eye blinks tend to be shorter and less frequent (a larger interval between blinks) in the 2-back condition than in the 0-back condition. Note that relatively much data were missing for the eye tracker variables. Theta power is the only physiological variable that does not seem to be affected by n-back condition at all.

Power in the alpha band, heart rate related variables, pupil size and blink duration showed (near) significant effects of block. These effects were consistent with decreasing effort over time: alpha increased, RRI increased (i.e. heart rate decreased), high frequency HRV increased as reflected both by RMSSD and power in the high frequency band (the same trend was visible for mid frequency HRV), pupil size decreased, and blink duration increased.

In order to interpret the main effects of block as a decrease of effort due to learning rather than as non-specific effects of time, we should see an interaction between n-back level and block similar to what we observe in the performance and subjective data that reflect more learning in the 2-back condition compared to the 0-back condition. The only physiological variable that showed an interaction effect was EEG alpha (F(14,462) = 2.07, p = 0.01, $\eta_p^2$ = 0.06). However, the direction of this interaction is not consistent with learning. The difference in alpha power between the 0-back and the 2-back condition is larger in the last block than in the first block (t(33) = −3.27, p < 0.01, d = −0.56), which is opposite to the effect expected by learning.

4. Discussion

In the current study we examined whether learning to master a task is reflected by a decreasing amount of effort over time as indicated by physiological variables that are expected to reflect effort. Therefore, we tested the effect of time, task difficulty and their interaction on a range of performance, subjective and physiological variables in a well-controlled experiment that enabled the dissociation of general time-related effects from learning. We did not find evidence for decreasing effort associated with learning as reflected by physiological variables. Most of our variables are sensitive to effort as induced by task difficulty, and many variables show an effect of time in general, but none of the variables show an interaction between task difficulty and time that is in a direction consistent with learning. These results and their implications are discussed in more detail below.

4.1. Learning

Our performance results are consistent with the learning effect that we intended to induce: participants get better (faster and more accurate) over time for the 2-back task and not, or less so, for the 0-back task. Consistent with the performance results, subjective effort ratings indicated a larger decrease in effort over time for the 2-back task than for the 0-back task. The effects of n-back condition on the physiological measures show that almost all of them are in principle sensitive to effort. Still, physiological data do not reflect a smaller difference in effort between the 0-back and 2-back condition at the end of the experiment compared to the beginning. There are several possible explanations for why we did not find evidence for a physiological correlate of learning.

Firstly, the effect on effort as induced by n-back level could have been much stronger than the effect on workload as induced by learning. However, for performance measures and subjective effort, the change due to learning in the 2-back condition is in the same order of magnitude as the difference between the 1-back and 2-back condition. While performance measures and subjective effort measures do not equate to effort, these measures do not suggest that effects of learning are negligible compared to effects of n-back condition.

Secondly, physiological variables in the 2-back condition may not show a stronger decrease in effort because even though increased skills reduced the required amount of effort to fulfill the task in an acceptable way, participants may have chosen to try and further improve performance rather than to reduce effort. This may especially be the case in this study where participants could have noticed that, even though they got faster and better, their performance in the 2-back task was still far worse than that in the 0-back task. A first argument against this possible explanation is that the subjective effort ratings are at odds with it: participants reported reducing effort more strongly over time in the 2-back than in the 0-back task. However, subjective effort ratings may not reliably indicate effort (Sander et al., 2005; Vogt et al., 2002). In this particular case participants may have been influenced by performance where they mistook their ability to respond faster and with fewer errors for lower effort. A second argument against the idea that participants kept effort at a consistently high level in the 2-back task is that performance does not improve much after the third or fourth block, indicating that if they did not reduce effort, this at least did not appear to result in the desired improved performance. Given the idea that organisms strive to conserve energy and not needlessly invest effort (Brehm and Self, 1989), this does not seem to be a very sensible strategy. However, it remains the case that we do not know what the exact relation is between skills, effort and performance, in particular how participants weigh (expenses of) effort and (improvements in, or chances to improve) performance. An improved version of the current study would entail a task in which the performance of the difficult task after learning would be the same as performance of the easy task. Under such circumstances it is particularly strongly expected that participants will decrease invested effort after having reached the ‘best possible’ performance.

Finally, we may not have found a learning-related decrease in effort as reflected by physiology because effort as affected by n-back level and learning may be different constructs not from a subjective, but from a physiological point of view. Energy expenses and relative contributions of the parasympathetic and sympathetic autonomic nervous system could underlie physiological correlates of effort as affected by task difficulty, as well as effort as affected by task incentive (e.g. Ewing and Fairclough, 2010; Scher et al., 1984) and personality (Capa and Audiffren, 2009; Capa et al., 2008). In these cases, high effort goes together with high energy needs and relatively strong sympathetic activity or physiological arousal (Brehm and Self, 1989; Gawron et al., 1989; Mulder and Mulder, 1987). However, different levels of effort because of different skill level may be more strongly associated with different types of brain processes, or different strategies, that could be more or less independent from the effects on the autonomic nervous system and energetic needs. In this view, the 2-back task requires a certain (high) amount of arousal or costs a certain (high) amount of energy regardless of skill level, but with higher skill level the way that the task is solved has changed, resulting in better performance and lower subjectively experienced effort.

4.2. Effects of task difficulty

Effects of n-back condition are generally strongest for the performance and subjective effort measures (see F-values and effect sizes in Table 1). However, most of the physiological variables also responded in the predicted way. Pupil size, respiration frequency, RRI and EEG alpha seem to be the most sensitive measures (F-values and effect sizes in Table 1; the effect size of pupil size is even larger than the effect sizes of the performance and subjective variables). Our results suggest that RMSSD is a more robust measure of high frequency heart rate variability than the variable as determined by spectral analysis. The effect of n-back condition did not reach significance for the mid frequency heart rate variability measure. Our results suggest that heart rate is a relatively reliable indicator of task difficulty (in accordance with a review by Vogt et al., 2006) and heart rate variability less so (in contrast to a review by Hancock et al., 1985). Note that heart rate variability is a quite complicated variable that is for instance strongly affected by missed R-peaks in the ECG signal and sensitive to breathing (Berntson et al., 1997; Task force, 1996; Veltman and Gaillard, 1996) where slow, deep breathing produces a strong increase in heart rate variability (Angelone and Coulter, 1964; Grossman and Taylor, 2007). While these factors are unlikely to have negatively affected the sensitivity of heart rate variability in our study, these factors can act as confounds and strengthen or weaken effects of effort on heart rate variability. Studies that examined EEG spectral variables next to other physiological variables such as different eye and heart related measures concluded or suggested EEG to be the most sensitive or promising indicator of workload or effort (Berka et al., 2007; Brookings et al., 1996; Christensen et al., 2012; Taylor et al., 2010). Indeed, we find EEG alpha to be relatively sensitive but not EEG theta. The studies mentioned did not record pupil size. In our study, pupil size displayed (by far) the largest n-back effect size. Pupil size may have been an especially sensitive variable in our study since we kept lighting conditions relatively constant over time. Our finding that eye blink frequency and duration were not affected by task difficulty is not surprising given that our task was not primarily visual (cf. Bauer et al., 1987; Fogarty and Stern, 1989). Relevant visual information was presented at known time intervals and easy to detect.

In applied settings, the choice of variables to measure effort would depend not only on their sensitivity to effort, but also on their ease of measurement and sensitivity to noise in the situation at hand. As indicated above, pupil size turned out to provide a good estimate of n-back condition in the present context, but may not be a good indicator when lighting strongly fluctuates. We are currently examining whether and how to best estimate effort as affected by task difficulty on the basis of multiple physiological measurements for a single individual. Improvement through combination would be expected because of the variables’ different sensitivity to particular forms of noise. Furthermore, they are expected to be differentially sensitive to different aspects of processes related to effort: EEG is expected to reflect the cognitive processes associated with task difficulty while arousal (that is also expected to covary with task difficulty) is captured by skin conductance. Given the complexity of the effort concept that leaves room for the effect of individuals’ different personality and strategies, as well as differences in individual (neuro)physiology, models tailored to the individual are expected to outperform general models.

Fig. 3. Physiological variables per n-back condition: EEG alpha (n = 34), EEG theta (n = 34), mean skin conductance (n = 31), minimum skin conductance (n = 31), respiration interval (inverse of respiration frequency; n = 31), RRI (inverse of heart rate; n = 29), heart rate variability as determined through RMSSD (n = 29), power in the mid frequency band (n = 29), power in the high frequency band (n = 29), pupil size (n = 16), number of eye blinks (n = 16) and blink duration (n = 15). Conventions as in Fig. 1.

4.3. Effects associated with time

Our study highlights and provides an overview of the sensitivity of a range of physiological variables to non-specific time-related effects. In fact, most physiological variables that showed an effect of n-back level also showed an effect of block. Blink duration was the only variable that showed only an effect of block and not of n-back level. The direction of the time-related effects seems consistent with decreasing effort. Trends for skin conductance variables were in the same direction. However, since we did not find the expected interaction between task difficulty and time, the change over time is more likely to be due to physical rest and getting used to (or bored by) the task and setting. These strong time effects on physiology again stress the need for careful experimental design and caution when classifying and interpreting physiological data. Effects that look like decreasing workload over time, which could easily be interpreted as due to learning, may in fact be non-specific timing effects.

Acknowledgements

We would like to thank the editor and two reviewers for their constructive comments, Emily Coffey for her work on a previous version of the experiment during her internship at TNO, Roel Boussardt for running the participants, Pjotr van Amerongen and Rob van de Pijpekamp for technical assistance building the experimental setup and Patrick Zimmerman and Tobias Heffelaar for assistance with the MindWare equipment.

References

Aasman, J., Mulder, G., Mulder, L.J.M., 1987. Operator effort and the measurement of heartrate variability. Hum. Factors 29, 161–170.

Angelone, A., Coulter, N.A., 1964. Respiratory-sinus arrhytmia: A frequency dependentphenomenon. J. Appl. Physiol. 19, 479–482.

Ayaz, H., Izzetoglu, M., Bunce, S., Heiman-Patterson, T., Onaral, B., 2007. Detecting cogni-tive activity related hemodynamic signal for brain computer interface using function-al near infrared spectroscopy. 3rd International IEEE/EMBS Conference on NeuralEngineering, Kohala Coast, HI, USA, pp. 342–345.

Ayaz, H., Shewokis, P.A., Bunce, S., Izzetoglu, K., Willems, B., Onaral, B., 2012. Optical brainmonitoring for operator training and mental workload assessment. Neuroimage 59(1), 36–47 (Jan 2).

Bauer, L.O., Goldstein, R., Stern, J.A., 1987. Effects of information-processing demands onphysiological response patterns. Hum. Factors 29, 213–234.

Beatty, J., 1982. Task-evoked pupillary responses, processing load, and the structure ofprocessing resources. Psychol. Bull. 91 (2), 276–292.

Berka, C., Levendowski, D.J., Lumicao, M.N., Yau, A., Davis, G., Zivkovic, V.T., Olmstead, R.E.,Tremoulet, P.D., Craven, P.L., 2007. EEG correlates of task engagement andmental work-load in vigilance, learning, and memory tasks. Aviat. Space Environ. Med. 78 (5 Suppl.).

251A.-M. Brouwer et al. / International Journal of Psychophysiology 93 (2014) 242–252

Berntson, G.G., Bigger, J.T., Eckberg, D.L., Grossman, P., Kaufmann, P.G., Malik, M., Nagaraja,H.N., Porges, S.W., Saul, J.P., Stone, P.H., van der Molen, M.W., 1997. Heart ratevariability: Origins, methods, and interpretive caveats. Psychophysiology 34 (6),623–648.

Borghini, G., Astolfi, L., Vecchiato, G., Mattia, D., Babiloni, F., 2012. Measuring neurophys-iological signals in aircraft pilots and car drivers for the assessment of mental work-load, fatigue and drowsiness. Neurosci. Biobehav. Rev. http://dx.doi.org/10.1016/j.neubiorev.2012.10.003 (Oct 30, pii: S0149-7634(12)00170-4, Epub ahead of print).

Boucsein, W., 1992. Electrodermal activity. Plenum Press, New York.Boucsein, W., 1999. Electrodermal activity as an indicator of emotional processes. Korean

J. Sci. Emot. Sensibility 2, 1–25.Brehm, J.W., Self, E.A., 1989. The intensity of motivation. Annu. Rev. Psychol. 40, 109–131.Brookings, J.B., Wilson, G.F., Swain, C.R., 1996. Psychophysiological responses to changes

in workload during simulated air traffic control. Biol. Psychol. 42, 361–377.Brouwer, A.-M., Hogervorst, M.A., Herman, P., Kooi, F., 2009. Are you really looking? Find-

ing the answer through fixation patterns and EEG. Proceedings of the 5th Interna-tional Conference on Foundations of Augmented Cognition. Lecture notes inartificial intelligence, vol. 5638. Springer, Berlin/Heidelberg, pp. 329–338.

Brouwer, A.-M., Hogervorst, M.A., van Erp, J.B.F., Heffelaar, T., Zimmerman, P.H., Oostenveld, R., 2012. Estimating workload using EEG spectral power and ERPs in the n-back task. J. Neural Eng. 9 (4), 045008.

Brouwer, A.-M., van Wouwe, N., Mühl, C., van Erp, J.B.F., Toet, A., 2013. Perceiving blocks of emotional pictures and sounds: Effects on physiological variables. Front. Hum. Neurosci. 7, 295. http://dx.doi.org/10.3389/fnhum.2013.00295.

Capa, R.L., Audiffren, M., 2009. How does achievement motivation influence mental effort mobilization? Physiological evidence of deteriorative effects of negative affects on the level of engagement. Int. J. Psychophysiol. 74 (3), 236–242.

Capa, R.L., Audiffren, M., Ragot, S., 2008. The interactive effect of achievement motivation and task difficulty on mental effort. Int. J. Psychophysiol. 70 (2), 144–150.

Christensen, J.C., Estepp, J.R., Wilson, G.F., Russell, C.A., 2012. The effects of day-to-day variability of physiological data on operator state classification. Neuroimage 59, 57–63.

Esposito, F., Aragri, A., Piccoli, T., Tedeschi, G., Goebel, R., Di Salle, F., 2009. Distributed analysis of simultaneous EEG–fMRI time-series: Modeling and interpretation issues. Magn. Reson. Imaging 27 (8), 1120–1130.

Ewing, K.C., Fairclough, S.H., 2010. The effect of an extrinsic incentive on psychophysiological measures of mental effort and motivational disposition when task demand is varied. Proc. Hum. Factors Ergon. Soc. 1, 259–263.

Fairclough, S.H., Venables, L., Tattersall, A., 2005. The influence of task demand and learning on the psychophysiological response. Int. J. Psychophysiol. 56 (2), 171–184.

Fink, A., Grabner, R.H., Neuper, C., Neubauer, A.C., 2005. EEG alpha band dissociation with increasing task demands. Cogn. Brain Res. 24 (2), 252–259.

Fogarty, C., Stern, J.A., 1989. Eye movements and blinks: Their relationship to higher cognitive processes. Int. J. Psychophysiol. 8, 35–42.

Fournier, L.R., Wilson, G.F., Swain, C.R., 1999. Electrophysiological, behavioral, and subjective indexes of workload when performing multiple tasks: Manipulations of task difficulty and training. Int. J. Psychophysiol. 31 (2), 129–145.

Fowles, D.C., Fisher, A.E., Tranel, D.T., 1982. The heart beats to reward: The effect of monetary incentives on heart rate. Psychophysiology 19, 506–513.

Foxe, J.J., Simpson, G.V., Ahlfors, S.P., 1998. Parieto-occipital ~10 Hz activity reflects anticipatory state of visual attention mechanisms. Neuroreport 9, 3929–3933.

Gaillard, A.W.K., Wientjes, C.J.E., 1994. Mental load and work stress as two types of energy mobilization. Work Stress 8, 141–152.

Gawron, V.J., Schiflett, S.G., Miller, J.C., 1989. Measures of in-flight workload. In: Jensen, R.S. (Ed.), Aviation psychology. Brookfield, Aldershot, pp. 240–287.

Gendolla, G.H.E., Richter, M., 2005. Ego involvement and effort: Cardiovascular, electrodermal, and performance effects. Psychophysiology 42 (5), 595–603.

Gendolla, G.H.E., Richter, M., 2006. Ego-involvement and the difficulty law of motivation: Effects on performance-related cardiovascular response. Personal. Soc. Psychol. Bull. 32 (9), 1188–1203.

Gevins, A., Smith, M.E., Leong, H., McEvoy, L., Whitfield, S., Du, R., Rush, G., 1998. Monitoring working memory load during computer-based tasks with EEG pattern recognition methods. Hum. Factors 40 (1), 79–91.

Goedhart, A.D., Van der Sluis, S., Houtveen, J.H., Willemsen, G., De Geus, E.J.C., 2007. Comparison of time and frequency domains of RSA in ambulatory recordings. Psychophysiology 44, 203–215.

Gopher, D., Donchin, E., 1986. Workload: An examination of the concept. In: Boff, K.R., Kaufman, L., Thomas, J.P. (Eds.), Handbook of perception and human performance, volume II. John Wiley and Sons, New York, pp. 1–49.

Gopher, D., Kimchi, R., 1989. Engineering psychology. Annu. Rev. Psychol. 40, 431–455.

Greenwald, M.K., Cook, E.W., Lang, P.J., 1989. Affective judgment and psychophysiological response: Dimensional covariation in the evaluation of pictorial stimuli. J. Psychophysiol. 3 (1), 51–64.

Grimes, D., Tan, D.S., Hudson, S.E., Shenoy, P., Rao, R.P., 2008. Feasibility and pragmatics of classifying working memory load with an electroencephalograph. Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems. ACM, Florence, Italy, pp. 835–844.

Grossman, P., Taylor, E.W., 2007. Toward understanding respiratory sinus arrhythmia: Relations to cardiac vagal tone, evolution and biobehavioral functions. Biol. Psychol. 74, 263–285.

Gundel, A., Wilson, G.F., 1992. Topographical changes in the ongoing EEG related to the difficulty of mental task. Brain Topogr. 5, 17–25.

Hampson, R.E., Opris, I., Deadwyler, S.A., 2010. Neural correlates of fast pupil dilation in nonhuman primates: Relation to behavioral performance and cognitive workload. Behav. Brain Res. 212 (1), 1–11.

Hancock, P.A., Meshkati, N., Robertson, M.M., 1985. Physiological reflections of mental workload. Aviat. Space Environ. Med. 56, 1110–1114.

Hart, S.G., Staveland, L.E., 1988. Development of a multi-dimensional workload rating scale: Results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (Eds.), Human mental workload. Elsevier, Amsterdam, The Netherlands, pp. 139–183.

Hockey, G.R.J., 1986. Changes in operator efficiency as a function of environmental stress, fatigue, and circadian rhythms. In: Boff, K.R., Kaufman, L., Thomas, J.P. (Eds.), Handbook of perception and human performance, vol. 2. John Wiley, New York, pp. 44.1–44.49.

Izzetoglu, M., Bunce, S.C., Izzetoglu, K., Onaral, B., Pourrezaei, A.K., 2007. Functional brain imaging using near-infrared technology. IEEE Eng. Med. Biol. Mag. 26 (4), 38–46.

Jann, K., Dierks, T., Boesch, C., Kottlow, M., Strik, W., Koenig, T., 2009. BOLD correlates of EEG alpha phase-locking and the fMRI default mode network. Neuroimage 45, 903–916.

Jensen, O., Tesche, C.D., 2002. Frontal theta activity in humans increases with memory load in a working memory task. Eur. J. Neurosci. 15, 1395–1399.

Kahneman, D., Beatty, J., 1966. Pupil diameter and load on memory. Science 154, 1583–1585.

Kahneman, D., Tursky, B., Shapiro, D., Crider, A., 1969. Pupillary, heart rate, and skin resistance changes during a mental task. J. Exp. Psychol. 79, 164–167.

Kantowitz, B.H., 1988. Mental workload. In: Hancock, P.A. (Ed.), Human factors psychology. Elsevier, Amsterdam.

Karavidas, M.K., Lehrer, P.M., Lu, S.-E., Vaschillo, E., Vaschillo, B., Cheng, A., 2010. The effects of workload on respiratory variables in simulated flight: A preliminary study. Biol. Psychol. 84 (1), 157–160.

Keil, A., Mussweiler, T., Epstude, K., 2006. Alpha-band activity reflects reduction of mental effort in a comparison task: A source space analysis. Brain Res. 1121, 117–127.

Klimesch, W., 1996. Memory processes, brain oscillations and EEG synchronization. Int. J. Psychophysiol. 24, 61–100.

Klimesch, W., 1997. EEG-alpha rhythms and memory processes. Int. J. Psychophysiol. 26, 319–340.

Klimesch, W., 1999. EEG alpha and theta oscillations reflect cognitive and memory performance: A review and analysis. Brain Res. Brain Res. Rev. 29, 169–195.

Klimesch, W., Doppelmayr, M., Röhm, D., Pöllhuber, D., Stadler, W., 2000. Simultaneous desynchronization and synchronization of different alpha responses in the human electroencephalograph: A neglected paradox? Neurosci. Lett. 284, 97–100.

Kohlisch, O., Schaefer, F., 1996. Physiological changes during computer tasks: Responses to mental load or to motor demands? Ergonomics 39 (2), 213–224.

Laufs, H., Krakow, K., Sterzer, P., Eger, E., Beyerle, A., Salek-Haddadi, A., Kleinschmidt, A., 2003. Electroencephalographic signatures of attentional and cognitive default modes in spontaneous brain activity fluctuations at rest. Proc. Natl Acad. Sci. USA 100, 11053–11058.

Light, K.C., Obrist, P.A., 1983. Task difficulty, heart rate reactivity, and cardiovascular responses to an appetitive reaction time task. Psychophysiology 20, 301–311.

Liu, Y., Wickens, C.D., 1994. Mental workload and cognitive task automaticity: An evaluation of subjective and time estimation metrics. Ergonomics 37, 1843–1854.

May, J.G., Kennedy, R.S., Williams, M.C., Dunlap, W.P., Brannan, J.R., 1990. Eye movement indices of mental workload. Acta Psychol. 75, 75–89.

Mehler, B., Reimer, B., Coughlin, J.F., Dusek, J.A., 2009. Impact of incremental increases in cognitive workload on physiological arousal and performance in young adult drivers. Transp. Res. Rec. (2138), 6–12.

Miyata, Y., Tanaka, Y., Hono, T., 1990. Long term observation on Fm-theta during mental effort. Neuroscience 16, 145–148.

Mulder, G., 1980. The heart of mental effort. (Thesis), University of Groningen, Groningen.

Mulder, L.J.M., Mulder, G., 1987. Cardiovascular reactivity and mental workload. In: Kitney, R.I., Rompelman, O. (Eds.), The beat-by-beat investigation of cardiovascular function. Clarendon Press, Oxford, pp. 216–253.

O’Donnell, R.D., Eggemeier, F.T., 1986. Workload assessment methodology. In: Boff, K., Kaufman, L., Thomas, J.P. (Eds.), Handbook of perception and human performance. Wiley, New York, pp. 42.1–44.49.

Obrist, P.A., Gaebelein, C.J., Teller, E.S., Langer, A.W., Grignolo, A., Light, K.C., McCubbin, J.A., 1978. The relationship among heart rate, carotid dP/dt, and blood pressure in humans as a function of type of stress. Psychophysiology 15, 102–115.

Oostenveld, R., Fries, P., Maris, E., Schoffelen, J.M., 2011. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. http://dx.doi.org/10.1155/2011/156869 (Article ID 156869).

Pfurtscheller, G., Neuper, C., Berger, J., 1994. Source localization using event related desynchronization (ERD) within the alpha band. Brain Topogr. 6, 269–275.

Pfurtscheller, G., Stancak, A., Neuper, C., 1996. Event-related synchronization (ERS) in the alpha band: An electrophysiological correlate of cortical idling. Int. J. Psychophysiol. 24, 39–46.

Porter, G., Troscianko, T., Gilchrist, I.D., 2007. Effort during visual search and counting: Insights from pupillometry. Q. J. Exp. Psychol. 60 (2), 211–229.

Raghavachari, S., Kahana, M.J., Rizzuto, D.S., Caplan, J.B., Kirschen, M.P., Bourgeois, B., Madsen, J.R., Lisman, J.E., 2001. Gating of human theta oscillations by a working memory task. J. Neurosci. 21, 3175–3183.

Reimer, B., Mehler, B., 2011. The impact of cognitive workload on physiological arousal in young adult drivers: A field study and simulation validation. Ergonomics 54 (10), 932–942.

Roth, W.T., 1983. A comparison of P300 and the skin conductance response. In: Gaillard, A.W.K., Ritter, W. (Eds.), Tutorials in ERP research—endogenous components. North-Holland, Amsterdam, pp. 177–199.

Sander, D., Grandjean, D., Scherer, K.R., 2005. A systems approach to appraisal mechanisms in emotion. Neural Netw. 18 (4), 317–352.

Scher, H., Furedy, J.J., Heslegrave, R.J., 1984. Phasic T-wave amplitude and heart rate changes as indices of mental effort and task incentive. Psychophysiology 21, 326–333.

Schneider, W., Fisk, A.D., 1982. Degree of consistent training: Improvements in search performance and automatic process developments. Percept. Psychophys. 31, 160–168.

Task Force of the European Society of Cardiology the North American Society of Pacing Electrophysiology, 1996. Heart rate variability: Standards of measurement, physiological interpretation and clinical use. Circulation 93, 1043–1065.

Taylor, G., Reinerman-Jones, L.E., Cosenzo, K., Nicholson, D., 2010. Comparison of multiple physiological sensors to classify operator state in adaptive automation systems. Proceedings of the 54th Annual Meeting of the Human Factors and Ergonomics Society (HFES).

Tole, J.R., Stephens, A.T., Harris, R.L., Ephrath, A.R., 1982. Visual scanning behavior and mental workload in aircraft pilots. Aviat. Space Environ. Med. 53 (1), 54–61.

Tranel, D.T., Fisher, A.E., Fowles, D.C., 1982. Magnitude of incentive effects upon the heart. Psychophysiology 19, 514–519.

van Dijk, H., Schoffelen, J.M., Oostenveld, R., Jensen, O., 2008. Pre-stimulus oscillatory activity in the alpha band predicts visual discrimination ability. J. Neurosci. (28), 1816–1823.

Veltman, J.A., 2002. A comparative study of psychophysiological reactions during simulator and real flight. Int. J. Aviat. Psychol. 12, 33–48.

Veltman, J.A., Gaillard, A.W.K., 1996. Physiological indices of workload in a simulated flight task. Biol. Psychol. 42, 323–342.

Veltman, J.A., Gaillard, A.W.K., 1998. Physiological workload reactions to increasing levels of task difficulty. Ergonomics 41, 656–669.

Verwey, W.B., Veltman, H.A., 1996. Detecting short periods of elevated workload: A comparison of nine workload assessment techniques. J. Exp. Psychol. 2, 270–285.

Vogt, J., Adolph, L., Ayan, T., Udovic, A., Kastner, M., 2002. Stress in modern air traffic control systems and potential influences on memory performance. J. Hum. Factors Aerosp. Saf. 2 (4), 355–378.

Vogt, J., Hagemann, T., Kastner, M., 2006. The impact of workload on heart rate and blood pressure in en-route and tower air traffic control. J. Psychophysiol. 20, 297–314.

Welch, P.D., 1967. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. 15, 70–73.

Wientjes, C.J.E., 1992. Respiration in psychophysiology: Methods and applications. Biol. Psychol. 34 (2/3), 179–204.

Wilson, G.F., Russell, C.A., 2003. Operator functional state classification using multiple psychophysiological features in an air traffic control task. Hum. Factors 45 (3), 381–389.

Winton, W.M., Putnam, L.E., Krauss, R.M., 1984. Facial and autonomic manifestations of the dimensional structure of emotion. J. Exp. Soc. Psychol. 20, 195–216.

Zijlstra, F.R.H., 1993. Efficiency in work behaviour. A design approach for modern tools. (PhD thesis, Delft University of Technology), Delft University Press, Delft, The Netherlands.

Zimmerman, P.H., Bolhuis, J.E., Willemsen, A., Meyer, E.S., Noldus, L.P.J.J., 2009. The Observer XT: A tool for the integration and synchronization of multimodal signals. Behav. Res. Methods 41, 731–735.