
International Journal of Psychophysiology 89 (2013) 106–114


Journal homepage: www.elsevier.com/locate/ijpsycho

Long-term test-retest reliability of the P3 NoGo wave and two independent components decomposed from the P3 NoGo wave in a visual Go/NoGo task

Jan Ferenc Brunner a,b,⁎, Tor Ivar Hansen b, Alexander Olsen a,c, Toril Skandsen a,b, Asta Håberg b, Juri Kropotov d,e,f

a Department of Physical medicine and rehabilitation, St.Olav Hospital, Trondheim University Hospital, Norway
b Department of Neuroscience, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
c MI Lab and Department of Circulation and Medical Imaging (NTNU), Norway
d Department of Psychology, NTNU, Trondheim, Norway
e Institute of the Human Brain of Russian Academy of Sciences, St. Petersburg, Russian Federation
f Andrzej Frycz Modrzewski Krakow University, Krakow, Poland

⁎ Corresponding author at: Fysikalsk Medisin, Lian, Vådanveien 39, 7024 Trondheim, Norway. Tel.: +47 97713663; fax: +47 72822601.

E-mail address: [email protected] (J.F. Brunner).

0167-8760/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.ijpsycho.2013.06.005

Article info

Article history: Received 2 November 2012; Received in revised form 31 May 2013; Accepted 4 June 2013; Available online 12 June 2013

Keywords: ERP; Independent component analysis; P3 NoGo; Reliability

Abstract

The objective of the current study was to determine long-term test-retest reliability of the P3 NoGo wave as well as two independent components (IC P3 NoGo early and IC P3 NoGo late) decomposed from this wave by independent component analysis (ICA). For this purpose 19-channel EEG was recorded during a cued visual Go/NoGo task. First, spatial filters of the two independent components (ICs) were obtained by application of ICA to ERPs of 102 healthy adults. Second, in 26 individuals, ERPs were recorded from the same task a second time 6–18 months after the first recording. Statistical analyses were performed on both the P3 NoGo waves and the activation curves of the two ICs from both recordings. Amplitude and latency were estimated by “peak” and “fractional area” (FA) methods for both the P3 NoGo wave and ICs. Intraclass correlations (ICC) for latency were excellent (ICC > .90) for both the P3 NoGo wave and the two ICs when measured with the FA method, and good (ICC > .75) for amplitude measured by both methods (FA or peak). We conclude that the long-term stability of the P3 NoGo wave and the two ICs makes them well suited for reliable assessment of cognitive control function in research and clinic.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

There is increased optimism for the application of ERPs for clinical purposes (Duncan et al., 2009; Dockree and Robertson, 2011; Folmer et al., 2011). However, to become a useful tool for clinical assessment, ERPs need to be valid and reliable at the individual level. Furthermore, high test-retest reliability is important. It has been suggested that for research purposes (i.e., exploring the differences between groups), test-retest reliability above .80 is adequate, while for clinical assessment a test-retest reliability of .90 should be the minimum (Nunnally and Bernstein, 1994). Most of the previous studies on test-retest reliability in ERP research have focused on the P3b wave in the auditory oddball paradigm or the error-related negativity (ERN) in paradigms where errors are committed. Test-retest reliability varies across these studies, from 0.50 to 0.86 for amplitude parameters and from 0.40 to 0.88 for latency parameters (Segalowitz and Barnes, 1993; Mathalon et al., 2000; Sandman and Patterson, 2000; Walhovd and Fjell, 2002; Williams et al., 2005; Hall et al., 2006; Segalowitz et al., 2010; Cassidy et al., 2012).

Test-retest reliability of ERPs depends on many factors, including fluctuations in the functional state of the examinee and measurement errors (Segalowitz and Barnes, 1993). The measurement errors partly depend on the method for measuring latency and amplitude of the ERP wave. The most commonly used method is to measure the amplitude and latency of the ERP wave at its maximum or minimum (peak measurements). For ERPs without a clear peak (e.g. P300) a better technique is provided by the fractional area (FA) approach (Hansen and Hillyard, 1980; Kiesel et al., 2008; Kappenman and Luck, 2012). A high number of trials seems important for latency measurement, and averaging of more than 70 trials has been recommended for assessment of waves like the P300 (Kiesel et al., 2008). The P300 wave (usually P3b/P3 Go) often does not show a clear onset and its peak latency can be difficult to determine, because the wave has a wide temporal extension without a sharp peak.

The Go/NoGo task is commonly used to study cognitive control functions. There are different versions of this paradigm, all consisting of sequential and random presentation of stimuli that a person is instructed to respond to (Go stimuli) or not respond to (NoGo stimuli). If electroencephalogram (EEG) is recorded during this task, event related potentials (ERPs) can be computed by averaging EEG in time locked periods for Go and NoGo stimuli. The NoGo stimuli evoke a positive P3 wave (P3 NoGo) which differs both in latency and topography from the P3 wave in the Go condition (Pfefferbaum et al., 1985; Falkenstein et al., 1999). To our knowledge only two studies (Fallgatter et al., 2001, 2002) have reported test-retest reliability of the P3 NoGo wave in a Go/NoGo task.

There is no consensus regarding the functional correlate of the P3 NoGo wave. It has been associated with inhibition (Fallgatter and Strik, 1999; Kok et al., 2004) and evaluation of the outcome of an inhibitory process (Bruin et al., 2001; Liotti et al., 2005; Schmajuk et al., 2006). It is unlikely that the P3 NoGo wave reflects an inhibitory operation itself, because of its late appearance (usually later than 300 ms), but it may be regarded as reflecting a conscious decision to withhold a response (van Gaal et al., 2011; Gajewski and Falkenstein, 2013).

The difficulty in defining the functional correlate of the P3 NoGo wave could be partly explained by its multiple generators. ERP waves are regarded as the sum of multiple sources generated in different locations and associated with different neural processes (Kappenman and Luck, 2012). Different attempts have been made to separate the multiple sources making up the ERP waves, including the use of principal and independent component analysis (ICA) (Makeig et al., 1996).

There are at least three different approaches using ICA in ERP research. The input data for the first approach are single-trial EEG epochs from a given individual (Makeig et al., 2004; Debener et al., 2005; Zeman et al., 2007; De Vos et al., 2012) or from a group of individuals (Mehta et al., 2009; Eichele et al., 2011). The input data for the second method are averaged ERPs recorded in response to many stimulus types and many task conditions (Makeig et al., 1999). In the third method the input data are averaged ERPs recorded in a few task conditions, but in many individuals (Olbrich et al., 2005; Kropotov and Ponomarev, 2009; Liu et al., 2009). In our previous studies the third approach has been applied to ERPs elicited in the Go/NoGo paradigm, showing that the P3 NoGo wave can be decomposed into two independent components (IC P3 NoGo early and IC P3 NoGo late). These ICs have different latencies, different topographies and different functional meaning (Kropotov et al., 2011).

The primary goal of the present study was to estimate and compare the test–retest reliability of the P3 NoGo ICs and the P3 NoGo wave. We also wanted to compare two scoring methods, peak and FA, for both latency and amplitude of the ERPs, as well as investigate the effect of number of trials on test-retest reliability.

2. Materials and Methods

2.1. Participants

In order to identify the ICs of the P3 NoGo wave a large number of individual ERPs are needed. In the present study a total of 102 healthy adults were recruited by advertising among family and friends of patients with head injury, and among acquaintances of researchers and staff. EEG was recorded from this group during a Go/NoGo paradigm and ERPs were computed. ICA decomposition identified the two ICs (IC P3 NoGo early and IC P3 NoGo late) and their spatial filters were obtained.

In order to investigate test-retest reliability, 27 individuals from the group of 102 were recorded a second time (Time 2) during the same Go/NoGo paradigm, with a test-retest interval of 6 to 18 months (median: 10 months, interquartile range (IQR): 7–13). One person was excluded on the basis of EEG artifacts in the second recording, leaving 26 individuals (10 females, median: 27.5 years, IQR: 22–46) for the final test-retest analysis. None of the participants were excluded on the basis of task performance. The individual ICs used in the test-retest analysis were obtained by spatial filtration using the unmixing matrix acquired from ICA on the 102 ERPs.

The study was approved by The Regional Committee for Medical Research Ethics. Participants gave their written consent prior to participation in the study.

2.2. EEG Recording

EEG was recorded with a 21-channel EEG system manufactured by Mitsar Ltd. (http://www.mitsarmedical.com), with a 19-channel electrode cap (Electro-cap) with tin electrodes that included Fz, Cz, Pz, Fp1/2, F3/4, F7/8, T3/4, T5/6, C3/4, P3/4, and O1/2. The cap was placed on the scalp according to the standard 10–20 system. Electrodes were referenced to linked earlobes and the input signals were sampled at a rate of 250 Hz (bandpass 0.3–50 Hz). The ground electrode was placed on the forehead. Impedance was kept below 5 kΩ. The participants sat upright in a comfortable chair looking at a 17-inch computer screen 1.5 meters in front of them. All recordings were performed by the first author. ERP waveforms were computed offline in the common average montage. Trials with omission and commission errors were automatically excluded from averaging.

2.3. Behavioral Task

The cued Go/NoGo task consisted of 400 trials sequentially presented to the participant every 3 s. Three categories of visual stimuli were used: 1) 20 different images of animals, referred to later as A; 2) 20 different images of plants, P; 3) 20 different images of people of different professions (presented together with a distracting “novel” sound), referred to as H. The trials consisted of presentations of pairs of stimuli with inter-stimulus intervals of 1000 ms. Duration of each stimulus presentation was 100 ms. Four categories of trials were used: A–A, A–P, P–P, and P–H. In the trials with A–A and P–P pairs the two pictures were identical. The trials were grouped into four blocks with one hundred trials each. In each block a unique set of five A stimuli, five P and five H stimuli was selected. Each block consisted of a pseudo-random presentation of 100 pairs of stimuli with equal probability for each stimulus category and each trial category. The participants were instructed to press a button with the right hand to all A–A pairs (prime and target) as fast as possible. This was the Go condition. A–P pairs represented the NoGo condition, in which the person should withhold a response. For the P–P and P–H pairs it was assumed that the first stimulus would signal that no preparation for action was needed and that the trial could be ignored. Participants performed 10 trials without recording to ensure that they understood the instructions. They rested for a few minutes after completing 100 trials. Stimuli occupied about 3.8° of the visual field around the center of the screen. Visual stimuli had similar 2D sizes and luminosities.

Mean reaction time (RT) and the standard deviation (SD) of RT were calculated across trials for each participant. Omission errors (not pressing the button to Go trials) and commission errors (pressing the button to NoGo trials) were also computed for each participant separately. A response was considered correct if it occurred in relation to the appropriate second stimulus and took place during the time interval from 200 to 1000 ms after the second stimulus presentation.
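The scoring rules above can be sketched in a few lines of Python. This is only an illustration and not the authors' scoring software: the function names, the tuple layout of a trial, the use of the sample standard deviation, and the treatment of a Go response outside the 200–1000 ms window as an omission are all assumptions made here.

```python
from statistics import mean, stdev

# From the text: a response counts if it falls 200-1000 ms after the second stimulus.
RESPONSE_WINDOW_MS = (200, 1000)


def classify_trial(condition, rt_ms):
    """Label one trial.

    condition: "go" (A-A pair) or "nogo" (A-P pair).
    rt_ms: reaction time in ms relative to the second stimulus, or None if no press.
    Assumption: a press outside the response window is treated as no response.
    """
    lo, hi = RESPONSE_WINDOW_MS
    responded = rt_ms is not None and lo <= rt_ms <= hi
    if condition == "go":
        return "correct" if responded else "omission"
    return "commission" if responded else "correct"


def summarize(trials):
    """Return mean RT, SD of RT, and error counts for a list of (condition, rt_ms)."""
    labels = [classify_trial(cond, rt) for cond, rt in trials]
    go_rts = [rt for (cond, rt), label in zip(trials, labels)
              if cond == "go" and label == "correct"]
    return {
        "mean_rt": mean(go_rts),
        "sd_rt": stdev(go_rts),  # sample SD; needs at least two correct Go trials
        "omissions": labels.count("omission"),
        "commissions": labels.count("commission"),
    }


# Hypothetical data for one participant.
example = [("go", 300), ("go", 340), ("go", None), ("nogo", None), ("nogo", 450)]
summary = summarize(example)
```

Only correct Go trials contribute to the RT statistics in this sketch, which matches the idea that RT is defined for valid button presses.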

2.4. Artifact Correction Procedures

Eye blink artifacts were corrected by zeroing the activation curves of individual independent components corresponding to eye blinks. These components were obtained by application of ICA to the raw EEG fragments (Vigario, 1997; Jung et al., 2000). Epochs with excessive absolute amplitude of filtered EEG and/or excessive faster and/or slower frequency activity were automatically marked and excluded from further analysis. Exclusion thresholds were set as follows: (1) 100 μV for non-filtered EEG; (2) 50 μV for slow waves in the 0–1 Hz band; and (3) 35 μV for fast waves filtered in the 20–35 Hz band. In addition, recordings were inspected visually and remaining artifacts were excluded. After application of ICA for correcting eye movement artifacts we used fixed thresholds for high and low frequency activity in order to eliminate non-movement artifacts. In the first recording (Time 1) of the 26 individuals the average number of artifact free trials used for computing the ERPs was 92.3 (SD = 7.5) (min. 68, max. 100) for the Go condition and 93 (SD = 5.6) (min. 67, max. 100) for the NoGo condition. The average number of artifact free trials in the second recording (Time 2) was 93 (SD = 6) for the Go condition (min. 75, max. 100) and 94 (SD = 5) for the NoGo condition (min. 83, max. 100).

Fig. 1. Illustration of the Fractional Area (FA) method for estimating amplitude and latency of the grand average P3 NoGo wave. First, in a selected time window (230–480 ms for the P3 NoGo wave) the amplitude difference between minimum (min) and maximum (max) is measured. Second, the fractional area (FA) is defined as the time interval in which the potential exceeds 50% of the max–min amplitude (grey area). Third, the amplitude of the wave is measured as the mean amplitude on the FA. Fourth, the latency of the wave is measured as the median in the FA.

2.5. Decomposition of Collection of ERPs Into Independent Components

To get valid ICs the number of training points is essential (Onton and Makeig, 2006). If ICA is applied to collections of ERPs, at least 100 individuals are required to obtain a robust decomposition (Kropotov et al., 2011). In the present study, ICA (Infomax algorithm) was performed on a two-dimensional matrix (19 scalp locations × ERP time series concatenated for the group of 102 adults). The 700 ms interval after the second stimulus in the two conditions (Go and NoGo) with sampling rate 250 samples/second was selected. No re-sampling was done. Consequently, the matrix included 19 rows and 2 × 250 × 0.7 × 102 columns, thus giving 35,700 time points.
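As a quick check of the matrix dimensions stated above, the arithmetic can be reproduced as follows. The variable names are chosen here for illustration only.

```python
# Values taken from the text.
n_channels = 19          # scalp locations (rows of the ICA input matrix)
n_conditions = 2         # Go and NoGo
sampling_rate_hz = 250   # samples per second
epoch_s = 0.7            # 700 ms interval after the second stimulus
n_participants = 102

# 250 samples/s over 0.7 s gives 175 samples per condition and participant.
samples_per_epoch = round(sampling_rate_hz * epoch_s)

# Epochs are concatenated over conditions and participants along the time axis.
n_columns = n_conditions * samples_per_epoch * n_participants
matrix_shape = (n_channels, n_columns)
```

Each participant therefore contributes 350 time points (175 per condition), and the 102 participants together give the 35,700 columns reported in the text.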

Assumptions underlying the application of ICA to individual ERPs are as follows: (1) summation of the electric currents induced by separate generators is linear at the scalp electrodes; and (2) the spatial distribution of the components' generators remains fixed across time (Onton and Makeig, 2006). The ICA method used in the present study was implemented in the analysis software described in Kropotov (2009). Topographies of the ICs are presented as topographic maps, while time courses of the components (also called “activation time courses”) are presented as graphs with time on the x-axis. Spatial filters were obtained and applied to individual ERPs in order to estimate the corresponding components in a single individual.
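Under the linear mixing assumption, applying a spatial filter amounts to a weighted sum over channels at each time point, and back-projection to an electrode rescales the resulting activation by that electrode's weight in the component topography. The sketch below illustrates this with a hypothetical two-channel mixing matrix. It is not the implementation in the software used by the authors, and all names and numbers are assumptions made for illustration.

```python
def apply_spatial_filter(weights, erp):
    """Return the activation time course of one component.

    weights: one row of the unmixing matrix (one weight per channel).
    erp: list of channels, each a list of samples of equal length.
    """
    n_samples = len(erp[0])
    return [sum(w * channel[t] for w, channel in zip(weights, erp))
            for t in range(n_samples)]


def back_project(topography_weight, activation):
    """Project a component activation back to a single electrode."""
    return [topography_weight * a for a in activation]


# Hypothetical 2-channel example: mixing matrix A and its inverse W (unmixing).
A = [[2, 1],
     [1, 1]]
W = [[1, -1],
     [-1, 2]]

sources = [[1, 0, 2],   # component 0
           [0, 3, 1]]   # component 1

# Linear mixing at the electrodes: x = A s
erp = [[sum(A[ch][c] * sources[c][t] for c in range(2)) for t in range(3)]
       for ch in range(2)]

activation_0 = apply_spatial_filter(W[0], erp)      # recovers component 0
at_channel_0 = back_project(A[0][0], activation_0)  # its contribution to channel 0
```

In the study itself the filter would have 19 weights, one per electrode, and the same filter is applied unchanged to the ERPs of every individual.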

2.6. Measurement of Latency and Amplitude of ERP Waves/Components

Two methods were used to measure amplitude and latency of the ERPs. First, the conventional peak measurement method defines latency and amplitude at the time point where the potential reaches its maximum. The second method was a relative criterion version of the fractional area approach (FA). In the FA approach the onset of the waveform was defined as the time point where the amplitude exceeds 50% of the peak-to-peak amplitude, and the offset was set to the time point where the amplitude reached the same level as at onset (Fig. 1). The amplitude of the ERP was then measured as the mean amplitude of the FA, and latency as the median in the FA. We limited the total time window for calculation of FA as follows: 230–480 ms for the P3 NoGo wave, 230–410 ms for the IC P3 NoGo early and 270–480 ms for the IC P3 NoGo late. If the value of the onset amplitude did not match the offset value, the time point with the closest amplitude value (from the max peak to the end of the time window) was chosen as the offset time for calculation of FA.

Scripts were made in MATLAB (version R2010a) for detecting peak latency/amplitude and calculating fractional area (FA) latency and amplitude for the NoGo wave and both ICs. Latency estimates depend on the sampling rate of the EEG. To estimate the FA as precisely as possible (especially for ERPs with steep slopes), measures were linearly interpolated to a time resolution of 1 ms. The interpolated curves were used for detecting or measuring all parameters of latency and amplitude on both the P3 NoGo wave and the ICs.
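The two scoring methods can be sketched as follows. This is an illustrative Python reimplementation, not the authors' MATLAB scripts, and it rests on several assumptions that the text leaves open: the 50% level is taken as min + 0.5 × (max − min) within the window, the FA is the contiguous interval around the maximum where the wave stays at or above that level (a simplification of the offset rule described above), and the "median" latency is taken as the time point that splits the area above the 50% level in half. All function names are hypothetical.

```python
def interpolate_to_1ms(samples, step_ms=4):
    """Linearly interpolate equally spaced samples (4 ms at 250 Hz) to 1 ms."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend(a + (b - a) * k / step_ms for k in range(step_ms))
    out.append(samples[-1])
    return out


def peak_measure(wave, t0_ms, win_start, win_end):
    """Peak amplitude and latency (ms) in a window; wave has 1 ms resolution."""
    seg = wave[win_start - t0_ms: win_end - t0_ms + 1]
    amplitude = max(seg)
    return amplitude, win_start + seg.index(amplitude)


def fractional_area(wave, t0_ms, win_start, win_end, fraction=0.5):
    """FA amplitude and latency (ms) in a window; wave has 1 ms resolution."""
    seg = wave[win_start - t0_ms: win_end - t0_ms + 1]
    lo, hi = min(seg), max(seg)
    level = lo + fraction * (hi - lo)      # assumed definition of the 50% level
    left = right = seg.index(hi)
    while left > 0 and seg[left - 1] >= level:
        left -= 1                          # walk back to the onset
    while right < len(seg) - 1 and seg[right + 1] >= level:
        right += 1                         # walk forward to the offset
    region = seg[left:right + 1]
    amplitude = sum(region) / len(region)  # mean amplitude over the FA
    # Assumed "median": the time point that halves the area above the level.
    areas = [v - level for v in region]
    half, running = sum(areas) / 2, 0.0
    for k, area in enumerate(areas):
        running += area
        if running >= half:
            break
    return amplitude, win_start + left + k


# Hypothetical symmetric wave peaking at 50 ms, sampled every 1 ms from 0 ms.
wave = [max(0.0, 10 - abs(t - 50)) for t in range(101)]
fa_amplitude, fa_latency = fractional_area(wave, 0, 30, 70)
```

For a symmetric wave both methods give the same latency. They diverge for broad or skewed waves, which is where the FA approach is expected to be more stable.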

2.7. Statistics

All ERP and behavioral data were analyzed with IBM SPSS 19.0. Test–retest reliability was calculated by the intraclass correlation (ICC) method using a two-way mixed-effects model with the setting of absolute agreement in SPSS. In contrast to Pearson's correlation, ICC takes into consideration systematic errors that might influence the measurements of variables (Shrout and Fleiss, 1979). There are different models that try to account for where the error or random effect might be coming from. A two-way mixed model (model 3 in Shrout and Fleiss, 1979) was chosen because the error variance in measuring and scoring the ERPs was considered minimal, and the error variance of the observations (ERPs) was unknown. Absolute agreement ICC measures the degree to which the actual values agree, not merely whether they are linearly related (McGraw and Wong, 1996). Absolute agreement thus represents a more conservative estimation of reliability than Pearson correlations. Portney and Watkins' (1999) classification of levels of reliability was applied: ICC below .50 was considered poor, from .50 to .75 moderate, from .75 to .90 good, and higher than .90 excellent. Test–retest reliability analyses were performed on both methods of scoring, but only FA values were used in other statistical analyses.
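For readers without access to SPSS, the single-measure, absolute-agreement coefficient can be computed from the two-way ANOVA mean squares. The sketch below uses the form usually written ICC(A,1) in McGraw and Wong (1996). It is an assumption that this corresponds exactly to the SPSS setting used in the study, and the function name and example data are hypothetical.

```python
def icc_absolute_single(scores):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    scores: one row per participant, one column per recording session.
    ICC(A,1) = (MSR - MSE) / (MSR + (k - 1) * MSE + (k / n) * (MSC - MSE))
    """
    n = len(scores)        # participants
    k = len(scores[0])     # sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-participant mean square
    ms_cols = ss_cols / (k - 1)                # between-session mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error))


# Hypothetical data: Time 2 is exactly Time 1 plus a constant offset of 1.
shifted = [[1, 2], [2, 3], [3, 4]]
icc_shifted = icc_absolute_single(shifted)
```

In this example the Pearson correlation between sessions is 1, yet the absolute-agreement ICC is about .67, because the systematic shift between sessions counts against agreement. This illustrates why absolute agreement is the more conservative estimate.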

The difference in SD of RT between Time 1 and Time 2 was assessed using one-way ANOVA and the Kruskal–Wallis test, while differences in behavioral and ERP parameters between the two recordings were tested using the two-tailed paired-sample t-test and the two-tailed Wilcoxon signed-rank test for normally and non-normally distributed data, respectively.

Difference scores (Δ) were calculated for behavioral parameters (Δ RT, Δ SD) and ERP parameters (Δ P3 NoGo wave, Δ IC P3 NoGo early and Δ IC P3 NoGo late) by subtracting the values in the first recording from those in the second recording (Time 2 − Time 1). For investigation of possible effects of gender and age we used the independent-sample t-test and the Mann–Whitney U test. We also performed correlation analyses (parametric or non-parametric as appropriate) between behavioral variables and ERPs in each recording.

2.8. Effect of Number of Trials

To investigate the effect of number of trials on the test–retest reliability of the ERPs, we performed the same analysis on the first 50 trials of each ERP from the two recordings as we did with the complete number of trials (90 trials on average).

3. Results

3.1. Decomposition of the P3 NoGo wave

The grand average P3 NoGo wave at Cz and Fz electrodes from the 102 healthy adults is shown in Fig. 2a. The P3 NoGo wave was decomposed into two ICs: IC P3 NoGo early and IC P3 NoGo late. IC P3 NoGo early had a peak latency of 328 ms and was centrally distributed. IC P3 NoGo late had a peak latency of 386 ms and was fronto-centrally distributed. The ICs were back-projected to the Cz electrode by the corresponding spatial filtration to reconstruct the corresponding activation in the original data space (for more detail see Kropotov and Ponomarev, 2009). The grand averages of the two ICs at Cz are presented in Fig. 2b.

The grand average P3 NoGo waves and the two back-projected ICs from 26 individuals in Time 1 and Time 2 are presented in Fig. 3.

Fig. 2. Decomposition of the P3 NoGo wave into two independent components (ICs) from 102 individuals. (a) The P3 NoGo wave (black line) from midline electrode Fz and Cz averaged for 102 healthy adults. (b) Grand average of the IC P3 NoGo early (blue line) and the IC P3 NoGo late (red line) back-projected to Fz and Cz. (d) Topographies of the P3 NoGo wave and the two ICs at their peak latencies.

3.2. Test–Retest Reliability

The test-retest reliability for RT was good (ICC = .86) despite a significant difference in reaction time from Time 1 to Time 2. For SD of RT the test–retest reliability was moderate (ICC = .62).

For the P3 NoGo wave both methods of measuring latency produced close to excellent test-retest reliability (ICC > .89). The latency of IC P3 NoGo early had excellent test–retest reliability (ICC > .96) when measured with the FA method. With the peak method the ICC value was also good (ICC > .86). The variance of the latency difference between the recordings was reduced to almost half by use of the FA method for the IC P3 NoGo early (Table 1). For IC P3 NoGo late latency the ICC value increased from .79 to .91 with the use of FA, as compared to the peak method. Moreover, for this component the variance of the difference in latency was also strongly reduced (Table 1).

Both the peak and the FA method for measuring amplitude of the wave and ICs produced about the same good (ICC > .80) test–retest reliability. This was also the case for the amplitude of the P3 NoGo wave and the IC P3 NoGo early, despite the significant difference between the two recordings (p < .05).

Fig. 3. Grand average P3 NoGo wave and independent components (ICs) at test and retest for 26 individuals (green: Time 1, and red: Time 2). (a) Grand average of P3 NoGo wave from midline electrode Fz and Cz averaged for 26 subjects. (b) Grand average of IC P3 NoGo early back-projected to Cz electrode. (c) Grand average of IC P3 NoGo late back-projected to Cz electrode. The time windows that set the limits for defining the FA are shown for each ERP.


In Fig. 4 the P3 NoGo wave, IC P3 NoGo early and IC P3 NoGo late are shown for each individual in the two recordings.

3.3. Comparing Time 1 and Time 2

The mean number of artifact free trials in Go and NoGo conditions did not differ significantly between Time 1 and Time 2. In both recordings the participants performed accurately, with no significant difference in omission or commission errors (see Table 1). For the Go condition the participants responded significantly faster (p < .05) in Time 2 compared with Time 1. There was a significant negative correlation between RT and commission errors in both recordings (rho = −.63, p < .001 in Time 1 and rho = −.64, p < .001 in Time 2).

The latency of the P3 NoGo wave correlated significantly with RT in both recordings (Time 1: rho = .65, p < .001 and Time 2: rho = .69, p < .001). The latency for both ICs also correlated significantly with RT in both recordings (IC P3 NoGo early: Time 1: rho = .55, p < .003 and Time 2: rho = .45, p < .02; IC P3 NoGo late: Time 1: rho = .63, p < .001 and Time 2: rho = .44, p < .02). The amplitudes of the P3 NoGo wave or ICs were not significantly correlated with RT in any of the recordings. There was a strong significant correlation between latencies for the two ICs (Time 1: r = .82, p < .001, Time 2: r = .77, p < .001). In all participants the latency of IC P3 NoGo early was shorter than the latency of IC P3 NoGo late. There were no significant correlations between amplitudes of the two ICs within each recording. Between the two recordings the correlation for latency between IC P3 NoGo early Time 1 and IC P3 NoGo late Time 2 was r = .74 (p < .001) and between IC P3 NoGo late Time 1 and IC P3 NoGo early Time 2 was r = .81 (p < .001). The correlation for amplitude between IC P3 NoGo early Time 1 and IC P3 NoGo late Time 2 was r = −.29 (NS) and between IC P3 NoGo late Time 1 and IC P3 NoGo early Time 2 was r = −.18 (NS).

Table 1. Test–retest reliability assessed with intraclass correlation coefficient (ICC) for parameters of behavior and ERPs at Time 1 and Time 2 (n = 26). Means/median and standard deviations (SD)/inter quartile range (IQR): (25%–75%) of the parameters for the two recordings and the difference between parameters (Time 2 − Time 1) are presented in the first three columns. Two-tailed paired-sample T-test and two-tailed Wilcoxon sign-rank test were used. Significant differences are marked by ⁎. FA = fractional area (see Method for details).

| Parameter | n | Time 1 mean (SD)/median (IQR) | Time 2 mean (SD)/median (IQR) | Time 2 − Time 1 diff. in mean (SD)/median (IQR) | Single measure ICC |
|---|---|---|---|---|---|
| Behavioral parameters | | | | | |
| Reaction time (RT) ms | 26 | 314 (299–363) | 307 (286–342) | 7 (−1.7–21) ⁎ | 0.86 ⁎⁎⁎ |
| Standard deviation RT (SD) | 26 | 63 (17) | 55 (14) | −8 (14) | 0.53 ⁎⁎⁎ |
| Omissions | 26 | 0 (0–1) | 0 (0–1) | 0 | NA |
| Commissions | 26 | 0 (0–1.25) | 0.5 (0–1) | 0.5 | NA |
| Amplitude (μV) | | | | | |
| P3 NoGo wave peak | 26 | 12.6 (3.4) | 13.7 (4.2) | 1.3 (2.2) ⁎ | 0.81 ⁎⁎⁎ |
| P3 NoGo wave FA | 26 | 9.8 (2.9) | 10.7 (3.6) | 0.9 (1.9) ⁎ | 0.80 ⁎⁎⁎ |
| IC P3 NoGo early peak | 26 | 8.1 (4.4) | 9.2 (4.7) | 1.0 (2.4) ⁎ | 0.85 ⁎⁎⁎ |
| IC P3 NoGo early FA | 26 | 6.5 (3.8) | 7.4 (3.9) | 0.9 (2.1) ⁎ | 0.84 ⁎⁎⁎ |
| IC P3 NoGo late peak | 26 | 6.7 (2.4) | 6.9 (2.8) | 0.2 (1.7) | 0.80 ⁎⁎⁎ |
| IC P3 NoGo late FA | 26 | 5.2 (2.3) | 5.5 (2.5) | 0.2 (1.4) | 0.82 ⁎⁎⁎ |
| Latency (ms) | | | | | |
| P3 NoGo wave peak | 26 | 356 (30) | 356 (30) | 0 (14) | 0.90 ⁎⁎⁎ |
| P3 NoGo wave FA | 26 | 355 (25) | 355 (23) | 0 (11) | 0.89 ⁎⁎⁎ |
| IC P3 NoGo early peak | 26 | 328 (25) | 328 (24) | 0.8 (13) | 0.86 ⁎⁎⁎ |
| IC P3 NoGo early FA | 26 | 327 (26) | 327 (26) | −0.4 (7) | 0.96 ⁎⁎⁎ |
| IC P3 NoGo late peak | 26 | 385 (22) | 387 (23) | 1.3 (14) | 0.79 ⁎⁎⁎ |
| IC P3 NoGo late FA | 26 | 390 (20) | 390 (21) | −0.8 (8) ⁎⁎ | 0.91 ⁎⁎⁎ |

⁎ p < 0.05 level. ⁎⁎ p < 0.01 level. ⁎⁎⁎ p < 0.001 level.

Fig. 4. Individual P3 NoGo waves recorded at Cz and the two independent components (ICs) back-projected to Cz in Time 1 (green) and Time 2 (red). Age for each individual is presented below the graphs.

The amplitudes of the P3 NoGo wave and the IC P3 NoGo early were significantly higher (p < .05) at Time 2 compared to Time 1 (Table 1), regardless of the method for measuring amplitude.

3.4. Difference scores (Time 2–Time 1)

The difference score for IC P3 NoGo early amplitude (Δ IC P3 NoGo) was significantly negatively correlated with Δ RT (r = −.61, p < .001). Otherwise no significant correlations were found between difference scores of behavioral and ERP parameters. The Δ IC P3 NoGo early latency and Δ IC P3 NoGo late latency correlated significantly positively (r = .42, p < .05). The Δ IC P3 NoGo early amplitude and the Δ IC P3 NoGo late amplitude correlated significantly negatively (r = −.43, p < .02).
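The difference-score analysis above can be illustrated with a minimal sketch: each participant's change (Time 2 − Time 1) is computed for two measures, and the changes are then correlated. The function name and the four-participant values below are hypothetical and are not data from the study; Pearson's r is assumed here because the difference-score results are reported as r.

```python
import numpy as np

def difference_score_correlation(t1_a, t2_a, t1_b, t2_b):
    """Pearson r between the (Time 2 - Time 1) changes of two measures."""
    delta_a = np.asarray(t2_a, dtype=float) - np.asarray(t1_a, dtype=float)
    delta_b = np.asarray(t2_b, dtype=float) - np.asarray(t1_b, dtype=float)
    return float(np.corrcoef(delta_a, delta_b)[0, 1])

# Hypothetical values for four participants (not data from the study).
amp_t1 = [5.0, 6.0, 7.0, 8.0]          # amplitude at Time 1
amp_t2 = [6.0, 8.0, 7.0, 10.0]         # amplitude at Time 2
rt_t1 = [400.0, 420.0, 410.0, 430.0]   # reaction time (ms) at Time 1
rt_t2 = [395.0, 400.0, 415.0, 405.0]   # reaction time (ms) at Time 2

r = difference_score_correlation(amp_t1, amp_t2, rt_t1, rt_t2)
print(f"r = {r:.2f}")
```

In this toy example the participants whose amplitude increased most also sped up most, which yields a negative correlation of the same sign as the one reported for Δ IC P3 NoGo early amplitude and Δ RT.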

Age was not significantly correlated with difference scores of behavioral and ERP parameters. Nor was there any significant correlation between time to retest and difference scores of behavioral and ERP parameters. We also analyzed possible effects of gender (10 females, 16 males) on difference scores, but no significant effects were found.

3.5. The Effect of Number of Trials

Analysis of the first 50 trials showed a reduction of test–retest reliability in RT (ICC = .76) and SD of RT (ICC = .36), while the ICCs for the different ERPs were more or less identical to the ICCs obtained using the complete number of trials (average 90 trials) (see Table A1 in the appendix). The correlation between Δ IC P3 NoGo and Δ RT was reduced for the first 50 trials (rho = −.44, p < .02), compared with the correlation for the complete number of trials (r = −.61, p < .001).

4. Discussion

This study estimated long-term test-retest reliability of ICA-derived ERP components obtained during a cued visual Go/NoGo paradigm. Our data show that the two ICs of P3 NoGo (early and late) have good (ICC > .75) test–retest reliability for amplitude and excellent (ICC > .90) test-retest reliability for latency. These results are comparable to the test-retest reliability coefficients for the P3 NoGo wave in the present study. The high test-retest reliability estimates for the latencies of both the P3 NoGo wave and the early and late ICs support their use as a trait index in both experimental studies and the clinic. Documenting the reliability of these ICs is an important step for further investigations of the functional meanings of the P3 NoGo wave. A low reliability would compromise the validity of inferences drawn from these investigations. In the present study we found that test–retest reliability for latency of the ICs was higher than for amplitude. Our data are in line with previous test–retest reliability studies of ERP waves reporting superior reliability for latency (Joost et al., 1992; Segalowitz and Barnes, 1993; Hall et al., 2006; Sarnthein et al., 2009), and support the assumption that latency, as compared to amplitude, might be less susceptible to fluctuations in individual state.

4.1. Limitation of the ICA Decomposition

It should be stressed that a critical issue for reliability of all ICA methods is the number of training points. This number must be high enough to reliably find N² (where N is number of electrodes) weights of the mixing matrix. Some experimental evidence suggests that the number of training points must be considerably larger than 20 N² (Onton and Makeig, 2006). By applying ICA on a collection of ERPs from many individuals this requirement can be met. With more than 100 individuals the topographies and latencies of the ICs found in the present study were similar to those obtained in our previous studies (Kropotov and Ponomarev, 2009; Kropotov et al., 2011). By using spatial filters defined from a large collection of ERPs for back-projecting individual ERPs to the components, we implicitly presume that topographies of the ICs do not change from one individual to another. This assumption is still not documented in any study and hence represents a limitation.
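The size of this requirement for the 19-channel montage used here can be worked through with a short sketch. The function names are hypothetical, and the number of time points each individual ERP contributed is not stated in this section, so the second helper only reports how many samples per participant would be needed to reach the 20 N² bound.

```python
import math

def ica_training_requirement(n_channels, factor=20):
    """Return the N^2 weight count and the factor * N^2 rule-of-thumb bound."""
    n_weights = n_channels ** 2
    return {"n_weights": n_weights, "min_points": factor * n_weights}

def samples_needed_per_participant(n_channels, n_participants, factor=20):
    """Smallest per-participant sample count whose pooled total reaches the bound."""
    min_points = ica_training_requirement(n_channels, factor)["min_points"]
    return math.ceil(min_points / n_participants)

req = ica_training_requirement(19)
print(req["n_weights"], req["min_points"])      # 361 weights, bound of 7220 points
print(samples_needed_per_participant(19, 102))  # 71 samples from each of 102 ERPs
```

A single short ERP epoch is unlikely to supply this many points on its own, which is why pooling ERPs across many individuals is the practical route to meeting the bound.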

4.2. Comparison with Previous Test–Retest Reliability Studies of P3 NoGo Wave

The test–retest reliability of the P3 NoGo wave in our study can to some degree be compared with two studies of reliability of the P3 NoGo wave by Fallgatter et al. (2001, 2002). In their first study they measured test-alternate retest reliability of the P3 NoGo wave in two versions of a visual cued CPT task and found excellent ICC values for peak amplitude (ICC = .92), but moderate ICC values (ICC = .75) for peak latency. In the second study, they used the same cued CPT task twice and found good test–retest reliability for both peak amplitude (ICC = .87) and peak latency (ICC = .74). The ICC model, types, and measures used are not specified in their second test-retest reliability study, so exact comparison is not possible. Specification of ICC measurements is generally not explicitly stated in studies of test-retest reliability of ERPs, as is also the case in other areas of research (Lee et al., 2012). This limits the possibility for valid comparisons. Additional limitations for comparison with the studies of Fallgatter et al. are the number of trials for averaging the P3 NoGo wave, the different Go/NoGo paradigm, and the different sample size and age of the participants.
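Why the ICC specification matters for comparison can be shown with a minimal sketch that computes two common single-measure forms from the same two-way ANOVA mean squares: a consistency form (often written ICC(3,1) or ICC(C,1)) and an absolute-agreement form (ICC(2,1) or ICC(A,1)), following the nomenclature of Shrout and Fleiss (1979) and McGraw and Wong (1996). The function name and the example scores are hypothetical, and the sketch does not claim to reproduce the ICC model used in the present study.

```python
import numpy as np

def single_measure_icc(scores):
    """Return (consistency, absolute agreement) single-measure ICCs.

    scores: array of shape (n_participants, k_sessions).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for participants (rows), sessions (columns), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return float(consistency), float(agreement)

# Hypothetical scores: every participant increases by exactly 2 units at retest.
scores = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0], [4.0, 6.0]]
consistency, agreement = single_measure_icc(scores)
print(f"consistency = {consistency:.2f}, agreement = {agreement:.2f}")
```

With a uniform shift between sessions, the rank order and spacing of participants are preserved, so the consistency form is at its maximum while the absolute-agreement form is penalized for the session effect. Two studies reporting "ICC" without naming the form may therefore not be directly comparable.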

4.3. Other Factors Affecting the Test-Retest Reliability

The present study demonstrated that the method for measuring ERP latency affects test-retest reliability. Excellent test-retest reliability (ICC > .91) was found for the ICs only when latency was measured with the FA method. The advantage of the FA method compared to peak measurement for the ICs was also shown by the 40% reduction of variance for the difference in latency between the two recordings (Table 1). This supports the findings by Kiesel et al. (2008) of the advantage of the fractional area method for measuring latency of P3 waves. The FA method for measuring amplitude did not result in higher ICC for any of the ERPs compared to peak measurement.
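The difference between the two latency measures can be sketched as follows. Peak latency is the time of the maximum within a measurement window, whereas fractional area latency is the time at which a given fraction of the area under the waveform within that window has accumulated. The function names, the 50% fraction, the use of positive-going area only, and the synthetic Gaussian waveform are assumptions for illustration; the study's actual window and fraction are not given in this section.

```python
import numpy as np

def _window(times, erp, window):
    """Select the samples that fall inside the measurement window."""
    times = np.asarray(times, dtype=float)
    erp = np.asarray(erp, dtype=float)
    mask = (times >= window[0]) & (times <= window[1])
    return times[mask], erp[mask]

def peak_latency(times, erp, window):
    """Time of the maximum amplitude inside the window."""
    t, v = _window(times, erp, window)
    return float(t[np.argmax(v)])

def fractional_area_latency(times, erp, window, fraction=0.5):
    """Time at which `fraction` of the positive area in the window is reached."""
    t, v = _window(times, erp, window)
    v = np.clip(v, 0.0, None)                      # positive-going area only (assumption)
    segments = 0.5 * (v[1:] + v[:-1]) * np.diff(t)  # trapezoid area per sample step
    cumulative = np.concatenate(([0.0], np.cumsum(segments)))
    if cumulative[-1] <= 0.0:
        raise ValueError("no positive area inside the measurement window")
    # Linear interpolation between samples locates the crossing time.
    return float(np.interp(fraction * cumulative[-1], cumulative, t))

# Synthetic waveform: a Gaussian deflection centred at 400 ms, sampled every 4 ms.
times = np.arange(0.0, 801.0, 4.0)
erp = np.exp(-0.5 * ((times - 400.0) / 50.0) ** 2)
fa = fractional_area_latency(times, erp, window=(300.0, 500.0))
pk = peak_latency(times, erp, window=(300.0, 500.0))
print(fa, pk)
```

Because the fractional area measure integrates over the whole window, a single noisy sample near the maximum shifts it far less than it can shift the peak estimate, which is one plausible reason for its higher retest reliability.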

Another factor that might affect the test–retest reliability is the number of trials included in the ERPs. Reducing the number of trials to 50 did not affect the test–retest reliability in our data (for comparisons see Tables 1 and A1 in the appendix). The recommendation from Kiesel et al. (2008) to have more than 70 trials for P3 waves does not seem to apply to the P3 NoGo wave. The reason for this might be the canonical shape of the P3 NoGo wave, which most likely makes latency determination easier.

The effect of the number of trials on stable amplitude measurements has been more thoroughly investigated for ERP waves other than the P3 NoGo wave (Cohen and Polich, 1997; Olvet and Hajcak, 2009; Larson et al., 2010). It seems from these studies that a quite low number of trials (fewer than 20) is needed. Although a high number of trials increases the signal-to-noise ratio and most likely improves reliability, too many trials can produce fatigue, drowsiness or decreased motivation. A decrease of amplitude and increase of latency have been documented under these conditions for the P3 NoGo wave (Kato et al., 2009). In the present study we tried to avoid fatigue by giving the participants some rest every 5 min. On the other hand, decision-making under time pressure as well as preparation for a Go-response by a cue have been associated with a stronger signal of the P3 NoGo wave (Smith et al., 2006, 2007). Test-retest reliability for amplitudes of the first 50 NoGo trials was highly similar to test-retest reliability for amplitudes of the complete number of trials. For future research using the P3 NoGo potential(s), 50 trials seem to be sufficient to include in the grand mean with regard to temporal stability.

An additional challenge is to make the retest as similar to the test as possible. Comparing behavioral performance for each individual in the two recordings gives an indication of the degree of similarity, at least concerning the degree of involvement and effort. In the present study a significant reduction of RT from Time 1 to Time 2 was found. It is not an uncommon observation in test–retest reliability studies that participants improve their RT when performing the same task the second time (Fallgatter et al., 2002; Kropp et al., 2000; Larson et al., 2010).

There are several potential explanations for this, such as familiarity with the testing situation, learning effects or state-dependent factors that were not controlled for. Reaction time can also easily be reduced by instructing the participants to emphasize speed. This has been done in a Go/NoGo task with the effect of a decreased RT for Go stimuli by 81 ms and increased amplitude of N2 NoGo (Band et al., 2003). The effect of speeded instructions on ERPs is also reported for amplitude of P3b in the auditory oddball task (Carrillo-de-la-Peña and Cadaveira, 2000). A negative correlation between the amplitude of the P3 NoGo wave and RT for Go stimuli is reported in several studies (Smith et al., 2006; Nakata et al., 2012). The strong negative significant correlation between the Δ IC P3 NoGo early amplitude and the Δ RT in the present study supports this relationship. Regarding the assessment of test-retest reliability of the IC P3 NoGo early amplitude, this points to the importance of emphasizing both speed and accuracy equally in the instructions to the participants at both time points for recording. It is a limitation of the present study that we made no assessment (e.g. by using a self-report questionnaire) of what the participants emphasized the most (i.e., speed or accuracy) during the task. Neither did we have any formal control of their experience of attentional effort or fatigue after each recording. This information could have helped explain some of the variability of the behavioral as well as ERP data.

The time interval between recordings can also hypothetically affect test-retest reliability. A very short interval increases the carry-over effects due to practice, and a very long interval increases the likelihood of change in mental status. In our study the time between the two recordings varied from 6 to 18 months. In line with other studies of test-retest reliability of ERPs (Lew et al., 2007), we did not find any effect of time to retest on any of the difference scores on ERPs, implying that these ERPs can be measured reliably over retest intervals ranging from 6 to 18 months.

4.4. Functional Meaning of the ICs

The inconsistency regarding the functional meaning of the P3 NoGo wave in previous studies (inhibition vs. outcome monitoring) might be explained by the fact that the wave can be decomposed into at least two functionally distinct independent components (Kropotov et al., 2011). The present study supports the association between the amplitude of IC P3 NoGo early and inhibitory processes, through the strong negative correlation between change in amplitude of IC P3 NoGo early and change in RT. Faster responses are more difficult to inhibit than slower responses (significant negative correlation between RT and commission errors) and additional effort is needed. The IC P3 NoGo early has been source-localized to the pre-supplementary motor area/supplementary motor area (Kropotov et al., 2011). This area is considered to be part of a control network involved in inhibition and reprogramming of action under conflict (Nachev et al., 2007; Coxon et al., 2009; Isoda and Hikosaka, 2011), and specifically to exert a facilitatory influence over motor processes in decision-making (Forstmann et al., 2008; Neubert et al., 2010; Scangos and Stuphorn, 2010). In this framework the amplitude of the IC P3 NoGo early might be one of the candidates to index the capacity for conscious attentional effort (Sarter et al., 2006) or energization (Stuss et al., 2005) that has been mobilized during inhibition.

The IC P3 NoGo late has been source-localized to the anterior cingulate area (Kropotov et al., 2011), an area often associated with monitoring of actions (Gehring et al., 2012). Interestingly, we found a significant negative correlation between the Δ IC P3 NoGo early amplitude and the Δ IC P3 NoGo late amplitude. This might indicate that the two ICs represent dynamic processes involved in a tradeoff between speed and accuracy, where the IC P3 NoGo late is related to accuracy or error monitoring. The inverse correlation between commission errors and mean RT in both recordings most likely reflects the behavioral tradeoff between speed and accuracy. If this is the case, both hypotheses regarding the functional meaning of the P3 NoGo wave (inhibition and evaluation of outcome) are supported by the two independent components.

In the present study the number of errors that each individual made was too small for further investigating this hypothesis, but our future research will focus on this issue.

5. Conclusions

The P3 NoGo wave and the two ICs of P3 NoGo (early and late) extracted by ICA (Independent Component Analysis) demonstrated good (ICC > .75) test-retest reliability for amplitude and excellent (ICC > .90) test–retest reliability for latency. Both the P3 NoGo wave and the two independent components represent highly stable long-term neurophysiological indexes of cognitive control in healthy participants. For latency measurement of both the ICs and the P3 NoGo wave, the test–retest reliability reached the acceptable range for clinical application.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.ijpsycho.2013.06.005.

Acknowledgements

This work was funded by the Liaison Committee between the Central Norway Regional Authority (RHA) and the Norwegian University of Science and Technology (NTNU).

References

Band, G.P.H., Ridderinkhof, K.R., van der Molen, M.W., 2003. Speed-accuracy modulation in case of conflict: the roles of activation and inhibition. Psychological Research 67, 266–279.

Bruin, K.J., Wijers, A.A., van Staveren, A.S., 2001. Response priming in a go/nogo task: do we have to explain the go/nogo N2 effect in terms of response activation instead of inhibition? Clinical Neurophysiology 112, 1660–1671.

Carrillo-de-la-Peña, M.T., Cadaveira, F., 2000. The effect of motivational instructions on P300 amplitude. Neurophysiologie Clinique 30, 232–239.

Cassidy, S.M., Robertson, I.H., O’Connell, R.G., 2012. Retest reliability of event-related potentials: evidence from a variety of paradigms. Psychophysiology 49, 659–664.

Cohen, J., Polich, J., 1997. On the number of trials needed for P300. International Journal of Psychophysiology 25, 249–255.

Coxon, J.P., Stinear, C.M., Byblow, W.D., 2009. Stop and go: the neural basis of selective movement prevention. Journal of Cognitive Neuroscience 21, 1193–1203.

De Vos, M., Thorne, J.D., Yovel, G., Debener, S., 2012. Let's face it, from trial to trial: Comparing procedures for N170 single-trial estimation. NeuroImage 63, 1196–1202.

Debener, S., Makeig, S., Delorme, A., Engel, A.K., 2005. What is novel in the novelty oddball paradigm? Functional significance of the novelty P3 event-related potential as revealed by independent component analysis. Cognitive Brain Research 22, 309–321.

Dockree, P.M., Robertson, I.H., 2011. Electrophysiological markers of cognitive deficits in traumatic brain injury: a review. International Journal of Psychophysiology 82, 53–60.

Duncan, C.C., Barry, R.J., Connolly, J.F., Fischer, C., Michie, P.T., Näätänen, R., Polich, J., Reinvang, I., Van Petten, C., 2009. Event-related potentials in clinical research: guidelines for eliciting, recording, and quantifying mismatch negativity, P300, and N400. Clinical Neurophysiology 120, 1883–1908.

Eichele, T., Rachakonda, S., Brakedal, B., Eikeland, R., Calhoun, V.D., 2011. EEGIFT: group independent component analysis for event-related EEG data. Computational Intelligence and Neuroscience 2011 (Article ID 129365).

Falkenstein, M., Hoormann, J., Hohnsbein, J., 1999. ERP components in Go/Nogo tasks and their relation to inhibition. Acta Psychologica 101, 267–291.

Fallgatter, A.J., Strik, W.K., 1999. The NoGo-anteriorization as a neurophysiological standard-index for cognitive response control. International Journal of Psychophysiology 32, 233–238.

Fallgatter, A.J., Bartsch, A.J., Strik, W.K., Mueller, T.J., Eisenack, S.S., Neuhauser, B., Aranda, D., Herrmann, M.J., 2001. Test–retest reliability of electrophysiological parameters related to cognitive motor control. Clinical Neurophysiology 112, 198–204.

Fallgatter, A.J., Aranda, D.R., Bartsch, A.J., Herrmann, M.J., 2002. Long-term reliability of electrophysiologic response control parameters. Journal of Clinical Neurophysiology 19, 61–66.

Folmer, R.L., Billings, C.J., Diedesch-Rouse, A.C., Gallun, F.J., Lew, H.L., 2011. Electrophysiological assessments of cognition and sensory processing in TBI: applications for diagnosis, prognosis and rehabilitation. International Journal of Psychophysiology 82, 4–15.

Forstmann, B.U., Dutilh, G., Brown, S., Neumann, J., von Cramon, D.Y., Ridderinkhof, K.R., Wagenmakers, E.J., 2008. Striatum and pre-SMA facilitate decision-making under time pressure. Proceedings of the National Academy of Sciences of the United States of America 105, 17538–17542.

Gajewski, P.D., Falkenstein, M., 2013. Effects of task complexity on ERP components in Go/Nogo tasks. International Journal of Psychophysiology 87 (3), 273–278.

Gehring, W.J., Liu, Y., Orr, J.M., Carp, J., 2012. The error-related negativity (ERN/Ne). In: Luck, S., Kappenman, E. (Eds.), Oxford Handbook of Event-Related Potential Components, pp. 231–294.

Hall, M.H., Schulze, K., Rijsdijk, F., Picchioni, M., Ettinger, U., Bramon, E., Freedman, R., Murray, R.M., Sham, P., 2006. Heritability and reliability of P300, P50 and duration mismatch negativity. Behavior Genetics 36, 845–857.

Hansen, J.C., Hillyard, S.A., 1980. Endogenous brain potentials associated with selective auditory attention. Electroencephalography and Clinical Neurophysiology 49, 277–290.

Isoda, M., Hikosaka, O., 2011. Cortico-basal ganglia mechanisms for overcoming innate, habitual and motivational behaviors. The European Journal of Neuroscience 33, 2058–2069.

Joost, W., Bach, M., Schulte-Mönting, J., 1992. Influence of mood on visually evoked potentials: a prospective longitudinal study. International Journal of Psychophysiology 12, 147–153.

Jung, T.P., Makeig, S., Westerfield, M., Townsend, J., Courchesne, E., Sejnowski, T.J., 2000. Removal of eye activity artifacts from visual event-related potentials in normal and clinical subjects. Clinical Neurophysiology 111, 1745–1758.

Kappenman, E., Luck, S., 2012. ERP components: the ups and downs of brainwave recordings. In: Luck, S., Kappenman, E. (Eds.), Oxford Handbook of Event-Related Potential Components, pp. 3–30.

Kato, Y., Endo, H., Kizuka, T., 2009. Mental fatigue and impaired response processes: event-related brain potentials in a Go/NoGo task. International Journal of Psychophysiology 72, 204–211.

Kiesel, A., Miller, J., Jolicoeur, P., Brisson, B., 2008. Measurement of ERP latency differences: a comparison of single-participant and jackknife-based scoring methods. Psychophysiology 45, 250–274.

Kok, A., Ramautar, J.R., De Ruiter, M.B., Band, G.P.H., Ridderinkhof, K.R., 2004. ERP components associated with successful and unsuccessful stopping in a stop-signal task. Psychophysiology 41, 9–20.

Kropotov, J.D., 2009. Quantitative EEG, Event-related Potentials and Neurotherapy. Elsevier Academic Press.

Kropotov, J.D., Ponomarev, V.A., 2009. Decomposing N2 NOGO wave of event-related potentials into independent components. NeuroReport 20, 1592–1596.

Kropotov, J.D., Ponomarev, V.A., Hollup, S., Mueller, A., 2011. Dissociating action inhibition, conflict monitoring and sensory mismatch into independent components of event related potentials in GO/NOGO task. NeuroImage 57, 565–575.

Kropp, P., Kiewitt, A., Göbel, H., Vetter, P., Gerber, W.D., 2000. Reliability and stability of contingent negative variation. Applied Psychophysiology and Biofeedback 25, 33–41.

Larson, M.J., Baldwin, S.A., Good, D.A., Fair, J.E., 2010. Temporal stability of the error-related negativity (ERN) and post-error positivity (Pe): the role of number of trials. Psychophysiology 47, 1167–1171.

Lee, K.M., Lee, J., Chung, C.Y., Ahn, S., Sung, K.H., Kim, T.W., Lee, H.J., Park, M.S., 2012. Pitfalls and important issues in testing reliability using intraclass correlation coefficients in orthopaedic research. Clinics in Orthopedic Surgery 4, 149–155.

Lew, H.L., Gray, M., Poole, J.H., 2007. Temporal stability of auditory event-related potentials in healthy individuals and patients with traumatic brain injury. Journal of Clinical Neurophysiology 24, 392–397.

Liotti, M., Pliszka, S.R., Perez, R., Kothmann, D., Woldorff, M.G., 2005. Abnormal brain activity related to performance monitoring and error detection in children with ADHD. Cortex 41, 377–388.

Liu, J., Kiehl, K.A., Pearlson, G., Perrone-Bizzozero, N.I., Eichele, T., Calhoun, V.D., 2009. Genetic determinants of target and novelty-related event-related potentials in the auditory oddball response. NeuroImage 46, 809–816.

Makeig, S., Bell, A.-J., Jung, T.P., Sejnowski, T.-J., 1996. Independent component analysis of electroencephalographic data. Advances in Neural Information Processing Systems 8, 145–151.

Makeig, S., Westerfield, M., Jung, T.-P., Covington, J., Townsend, J., Sejnowski, T.J., Courchesne, E., 1999. Functionally independent components of the late positive event-related potential during visual spatial attention. The Journal of Neuroscience 19, 2665–2680.


Makeig, S., Delorme, A., Westerfield, M., Jung, T.-P., Townsend, J., Courchesne, E., Sejnowski, T.J., 2004. Electroencephalographic brain dynamics following manually responded visual targets. PLoS Biology 2, 0747–0762.

Mathalon, D.H., Ford, J.M., Pfefferbaum, A., 2000. Trait and state aspects of P300 amplitude reduction in schizophrenia: a retrospective longitudinal study. Biological Psychiatry 47, 434–449.

McGraw, K.O., Wong, S.P., 1996. Forming inferences about some intraclass correlation coefficients. Psychological Methods 1, 30–46.

Mehta, J., Jerger, S., Jerger, J., Martin, J., 2009. Electrophysiological correlates of word comprehension: event-related potential (ERP) and independent component analysis (ICA). International Journal of Audiology 48, 1–11.

Nachev, P., Wydell, H., O'Neill, K., Husain, M., Kennard, C., 2007. The role of the pre-supplementary motor area in the control of action. NeuroImage 36, T155–T163.

Nakata, H., Sakamoto, K., Kakigi, R., 2012. The relationship between reaction time and response variability and somatosensory No-go potentials. European Journal of Applied Physiology 112, 207–214.

Neubert, F., Mars, R.B., Buch, E.R., Olivier, E., Rushworth, M.F.S., 2010. Cortical and subcortical interactions during action reprogramming and their related white matter pathways. Proceedings of the National Academy of Sciences of the United States of America 107, 13240–13245.

Nunnally, J.C., Bernstein, I.H., 1994. Psychometric Theory. McGraw-Hill, New York.

Olbrich, H.M., Maes, H., Valerius, G., Langosch, J.M., Feige, B., 2005. Event-related potential correlates selectively reflect cognitive dysfunction in schizophrenics. Journal of Neural Transmission 112, 283–295.

Olvet, D.M., Hajcak, G., 2009. The stability of error-related brain activity with increasing trials. Psychophysiology 46, 957–961.

Onton, J., Makeig, S., 2006. Information-based modeling of event-related brain dynamics. Progress in Brain Research 159, 99–120.

Pfefferbaum, A., Ford, J., Weller, B.K., 1985. ERPs to response production and inhibition. Electroencephalography and Clinical Neurophysiology 60, 423–434.

Portney, L., Watkins, M., 1999. Foundations of Clinical Research: Application to Practice, 2nd edition. Prentice Hall.

Sandman, C.A., Patterson, J.V., 2000. The auditory event-related potential is a stable and reliable measure in elderly subjects over a 3 year period. Clinical Neurophysiology 111, 1427–1437.

Sarnthein, J., Andersson, M., Zimmermann, M.B., Zumsteg, D., 2009. High test–retest reliability of checkerboard reversal visual evoked potentials (VEP) over 8 months. Clinical Neurophysiology 120, 1835–1840.

Sarter, M., Gehring, W.J., Kozak, R., 2006. More attention must be paid: the neurobiology of attentional effort. Brain Research Reviews 51, 145–160.

Scangos, K.W., Stuphorn, V., 2010. Medial frontal cortex motivates but does not control movement initiation in the countermanding task. The Journal of Neuroscience 30, 1968–1982.

Schmajuk, M., Liotti, M., Busse, L., Woldorff, M.G., 2006. Electrophysiological activity underlying inhibitory control processes in normal adults. Neuropsychologia 44, 384–395.

Segalowitz, S.J., Barnes, K.L., 1993. The reliability of ERP components in the auditory oddball paradigm. Psychophysiology 30, 451–459.

Segalowitz, S.J., Santesso, D.L., Murphy, T.I., Homan, D., Chantziantoniou, D.K., Khan, S., 2010. Retest reliability of medial frontal negativities during performance monitoring. Psychophysiology 47, 260–270.

Shrout, P.E., Fleiss, J.L., 1979. Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin 86, 420–428.

Smith, J.L., Johnstone, S.J., Barry, R.J., 2006. Effects of pre-stimulus processing on subsequent events in a warned Go/NoGo paradigm: response preparation, execution and inhibition. International Journal of Psychophysiology 61, 121–133.

Smith, J.L., Johnstone, S.J., Barry, R.J., 2007. Response priming in the Go/NoGo task: the N2 reflects neither inhibition nor conflict. Clinical Neurophysiology 118, 343–355.

Stuss, D.T., Alexander, M.P., Shallice, T., Picton, T.W., Binns, M.A., Macdonald, R., Borowiec, A., Katz, D.I., 2005. Multiple frontal systems controlling response speed. Neuropsychologia 43, 396–417.

van Gaal, S., Lamme, V.A.F., Fahrenfort, J.J., Ridderinkhof, K.R., 2011. Dissociable brain mechanisms underlying the conscious and unconscious control of behavior. Journal of Cognitive Neuroscience 23, 91–105.

Vigario, R.N., 1997. Extraction of ocular artefacts from EEG using independent component analysis. Electroencephalography and Clinical Neurophysiology 103, 395–404.

Walhovd, K.B., Fjell, A.M., 2002. One-year test–retest reliability of auditory ERPs in young and old adults. International Journal of Psychophysiology 46, 29–40.

Williams, L.M., Simms, E., Clark, C.R., Paul, R.H., Rowe, D., Gordon, E., 2005. The test-retest reliability of a standardized neurocognitive and neurophysiological test battery: “neuromarker”. The International Journal of Neuroscience 115, 1605–1630.

Zeman, P.M., Till, B.C., Livingston, N.J., Tanaka, J.W., Driessen, P.F., 2007. Independent component analysis and clustering improve signal-to-noise ratio for statistical analysis of event-related potentials. Clinical Neurophysiology 118, 2591–2604.