Executive control tasks in bilingualism research: A multi-trait multi-method validity analysis of...


+ Executive control tasks in bilingualism research: A multi-trait multi-method validity analysis of the Stroop, Simon, and ANT

Nick B. Pandža ([email protected])
Program in Second Language Acquisition
University of Maryland, College Park

+ Measuring executive control (EC) in bilingualism research

- Non-selectivity hypothesis: both languages are active when one is in use, creating an attentional control problem in bilinguals (e.g., Colomé, 2001)
- EC is an attentional network consisting of three broad-purpose cognitive processes which monitor and resolve conflict from input (Costa, Hernández, & Sebastián-Gallés, 2008; Miyake et al., 2000)
- Subconstructs:
  - Inhibition/conflict resolution
  - Information updating/monitoring
  - Mental set shifting/task-switching

+ Measuring executive control (EC)

- Used for (in particular):
  - Measuring the bilingual advantage (e.g., Bialystok, 2009)
    - Better/more efficient EC thanks to lifelong exercise with language conflict
  - Measuring individual differences (IDs) in EC as predictive of second language processing (e.g., Mercier et al., 2014)
    - Evidence for IDs in EC predicting spoken word recognition in L2

+ Executive control (EC)

- Common tasks of EC:
  - Stroop (Stroop, 1935)
  - Simon (Simon & Rudell, 1967)
  - Flanker (Eriksen & Eriksen, 1974)
- Each task consists of:
  - A congruent condition
  - An incongruent condition
  - A neutral condition*

*Sometimes, but not always, included

+ Executive control (EC)

- Common tasks of EC:
  - Stroop (Schroeter et al., 2002)
  - Simon (Bialystok et al., 2004)
  - Attentional Network Task (ANT; Fan et al., 2002)
- Each can be analyzed in the same way, using up to 5 different effect calculations
  - Switching is not discussed here

+ Analyses of EC tasks: Inhibition/Conflict resolution

- Conflict effect (Costa et al., 2008, 2009; Bialystok et al., 2008; Poarch & van Hell, 2012)
  - a.k.a. Stroop/Simon effect
  - incongruent – congruent trials
  - Ha: bilinguals < monolinguals
- Interference effect (Hernández et al., 2010; Fan et al., 2003; Schroeter et al., 2002; MacLeod, 1991)
  - incongruent – neutral trials
  - Ha: bilinguals < monolinguals

+ Analyses of EC tasks: Information updating/Monitoring

- Facilitation effect (Hernández et al., 2010; Bialystok, Craik & Luk, 2008)
  - neutral – congruent trials
  - Ha: bilinguals > monolinguals
- Monitoring effect (Hernández et al., 2010; Costa et al., 2008; Costa et al., 2009; Poarch & van Hell, 2012)
  - overall mean RT across trial types
  - Ha: bilinguals < monolinguals
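
The four effect calculations defined on the two "Analyses of EC tasks" slides can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis script used in the study: the function name `ec_effects` and the example RTs are hypothetical, and the monitoring effect is assumed to pool all trials across trial types (a mean of the three condition means would be another reasonable reading).

```python
from statistics import mean


def ec_effects(rts):
    """Compute the four effect scores (ms) for one participant on one task.

    rts maps "congruent", "neutral", and "incongruent" to lists of RTs (ms).
    """
    means = {cond: mean(values) for cond, values in rts.items()}
    pooled = [rt for values in rts.values() for rt in values]
    return {
        "conflict": means["incongruent"] - means["congruent"],
        "interference": means["incongruent"] - means["neutral"],
        "facilitation": means["neutral"] - means["congruent"],
        # Assumption: "overall mean RT" pools all trials across trial types.
        "monitoring": mean(pooled),
    }


# Hypothetical RTs for illustration only (not data from this study).
example = {
    "congruent": [400, 420],
    "neutral": [430, 450],
    "incongruent": [470, 490],
}
effects = ec_effects(example)
print(effects)
```

By construction, conflict = interference + facilitation, so the three difference scores are not independent of one another.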

+ Motivation (Construct validity)

- Bilingual advantage difficult to observe (e.g., Costa et al., 2009)
  - May be a lack of an actual advantage (Paap & Greenberg, 2013)
  - May be difficult to observe thanks to competition with other monolingual and/or bilingual benefits (Valian, 2015)
  - May be due to poor quality of behavioral measures

+ Motivation (Construct validity)

- No “standard” measure or even task of EC
  - New EC tasks frequently created
    - e.g., number Stroop, arrow Stroop/Simon (Mercier et al., 2014)
  - No standard ‘effect’ choice or theoretical motivation for analysis of EC subcomponents
  - No standard for number of trials, presence of neutral trials, number of practice trials/block, etc.
- Tasks and effects are used interchangeably
- Are we measuring what we think we’re measuring?

+ Motivation (Construct validity)

- Multi-trait multi-method (MTMM) framework (Campbell & Fiske, 1959)
  - A useful framework for evaluating convergent and discriminant validity evidence for construct validity
  - Convergent validity evidence:
    - Measures designed to measure the same (sub)construct should correlate highly
  - Discriminant validity evidence:
    - Measures designed to measure different (sub)constructs should not correlate highly
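
As a rough illustration of how an MTMM matrix is read, every pair of measures falls into one of four kinds of cell depending on whether the two measures share a trait and/or a method. The sketch below assumes that traits are the effect calculations and methods are the tasks; the function name `mtmm_cell` is hypothetical.

```python
def mtmm_cell(trait_a, method_a, trait_b, method_b):
    """Classify the correlation between two measures within an MTMM matrix."""
    if trait_a == trait_b and method_a == method_b:
        # A measure paired with itself: the reliability diagonal.
        return "reliability diagonal"
    if trait_a == trait_b:
        # Same trait, different methods: convergent validity evidence.
        return "validity diagonal (mono-trait hetero-method)"
    if method_a == method_b:
        # Different traits, same method: shared method variance is possible.
        return "hetero-trait mono-method"
    # Different traits and different methods: expected to be the smallest.
    return "hetero-trait hetero-method"


print(mtmm_cell("conflict", "Stroop", "conflict", "Simon"))
print(mtmm_cell("conflict", "Stroop", "monitoring", "Stroop"))
print(mtmm_cell("conflict", "Stroop", "monitoring", "ANT"))
```

Convergent evidence comes from high values on the validity diagonal; discriminant evidence comes from the hetero-trait cells being smaller than those values.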

+ Research Questions

1. Are common method measures of executive control (Stroop, Simon, and ANT) interchangeable?
   i. Will each method show similar correlations for the same effect across different tasks?
2. Are common trait measures of executive control interchangeable?
   i. Will each effect show similar correlations within each task? (Are all effect measures just measuring ‘general EC’?)
   ii. Will conflict and interference effects and facilitation and monitoring effects pattern together within and across tasks? (Are they measuring distinct subconstructs of EC?)

+ Study Design

- 58 total participants
  - Aged 18-38
  - 4 datasets lost due to technical difficulty
- Pseudo-randomized task order:
  - Stroop, Simon, and ANT

+ Measures of executive control administered via SuperLab 4

- Stroop (Schroeter et al., 2002): e.g., BLUE (5 min)
- Simon (Bialystok et al., 2004): (5 min)
- ANT (Fan et al., 2002): e.g., →→←→→ (20 min)

Henceforth, only accurate responses within 2 SDs of an individual participant’s mean RT are reported; α = .05
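
The trimming rule stated above (accurate responses only, within 2 SDs of the individual participant's RT) might look like the sketch below. Several details are assumptions, because the slide does not specify them: the mean and SD are computed over the participant's accurate trials, the sample SD is used, and trimming is not done separately per condition. The function name `trim_rts` and the trial values are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(trials, n_sd=2.0):
    """Keep accurate RTs within n_sd SDs of the participant's own mean.

    trials is a list of (rt_ms, correct) tuples for one participant.
    """
    accurate = [rt for rt, correct in trials if correct]
    if len(accurate) < 2:
        # An SD cannot be estimated from fewer than two trials.
        return accurate
    center, spread = mean(accurate), stdev(accurate)
    return [rt for rt in accurate if abs(rt - center) <= n_sd * spread]


# Hypothetical trials for illustration only (not data from this study).
trials = [
    (480, True), (490, True), (495, True), (500, True),
    (505, True), (510, True), (520, True),
    (1500, True),   # slow outlier, trimmed
    (300, False),   # inaccurate response, excluded before trimming
]
kept = trim_rts(trials)
print(kept)
```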

+ Descriptive Statistics

Table 1. Descriptive statistics for tasks by trial type. Mean reaction times (RTs) in ms with standard deviations (SDs) in parentheses.

Task    Condition    Monolinguals  Bilinguals
Stroop  Neutral      1678 (72)     1789 (130)
        Congruent    1673 (68)     1773 (124)
        Incongruent  1760 (106)    1848 (156)
Simon   Neutral      444 (46)      483 (90)
        Congruent    423 (45)      463 (98)
        Incongruent  463 (40)      501 (99)
ANT     Neutral      559 (60)      593 (68)
        Congruent    562 (65)      590 (70)
        Incongruent  637 (71)      672 (95)

Table 2. Reliability estimates for each task based on the three trial types.

Task    Reliability (α)
Stroop  .96
Simon   .97
ANT     .95
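
For readers unfamiliar with the reliability coefficient, here is a minimal sketch of how such an estimate could be computed, assuming that α denotes Cronbach's alpha and that the "items" are each participant's mean RTs in the three trial types. The function name `cronbach_alpha` and the scores are hypothetical; this is not the author's script.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-participant rows (one column per item)."""
    k = len(scores[0])
    # Variance of each item (column) across participants.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each participant's total score across items.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical mean RTs (neutral, congruent, incongruent) per participant.
scores = [
    [440, 420, 460],
    [480, 465, 505],
    [455, 430, 470],
    [510, 495, 540],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

Alpha approaches 1 when participants keep the same rank order across items, which is typical of raw RTs because general speed is shared by every condition.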

+ Descriptive Statistics

Table 3. Descriptive statistics for tasks by type of executive control operationalization. Effects in ms with SDs in parentheses.

Effect               Task    Monolinguals  Bilinguals
Conflict effect      Stroop  88 (63)       74 (72)
                     Simon   39 (36)       38 (30)
                     ANT     75 (24)       81 (56)
Interference effect  Stroop  83 (57)       59 (61)
                     Simon   19 (32)       18 (36)
                     ANT     78 (24)       78 (58)
Facilitation effect  Stroop  5 (48)        16 (51)
                     Simon   20 (35)       20 (41)
                     ANT     -3 (15)       2 (12)
Monitoring effect    Stroop  1704 (77)     1801 (130)
                     Simon   443 (39)      483 (94)
                     ANT     585 (65)      620 (73)

Table 4. Reliability estimates for each effect calculation.

Effect        Reliability (α)
Conflict      .26
Interference  .11
Facilitation  .14
Monitoring    .82

Table 6. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks.

Empty matrix layout: rows and columns are the Conflict, Interference, Facilitation, and Monitoring effects within each of the Stroop, Simon, and ANT tasks.

*p < .05, **p < .01, ***p < .001

Shading key (colors not preserved), from dark green to light green:
- validity diagonals (expected high correlations)
- related- and hetero-trait mono- and hetero-method (expected small to moderate correlations)
- hetero-trait mono-method (expected small correlations)
- hetero-trait hetero-method (expected small to nonexistent correlations)

Table 7. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Cells shown: correlations within each task.

Stroop        Interf.  Facil.   Monit.
Conflict      .71***   .53***   .41**
Interference           -.23     .35*
Facilitation                    .15

Simon         Interf.  Facil.   Monit.
Conflict      .32*     .55**    .03
Interference           -.62***  .15
Facilitation                    -.11

ANT           Interf.  Facil.   Monit.
Conflict      .96***   .009     .19
Interference           -.26     .23
Facilitation                    -.19

*p < .05, **p < .01, ***p < .001

Table 8. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Cells shown: correlations between the conflict/interference effects of one task and the facilitation/monitoring effects of another.

                     Simon Facil.  Simon Monit.   ANT Facil.  ANT Monit.
Stroop Conflict      .22           .08            .09         .18
Stroop Interference  .20           -.11           -.12        .05

                     Simon Conf.   Simon Interf.  ANT Conf.   ANT Interf.
Stroop Facilitation  .12           .06            -.03        -.10
Stroop Monitoring    .24           .14            .15         .14

                     ANT Facil.    ANT Monit.
Simon Conflict       -.01          .14
Simon Interference   .11           .13

                     ANT Conf.     ANT Interf.
Simon Facilitation   -.001         .03
Simon Monitoring     -.10          .11

*p < .05, **p < .01, ***p < .001

Table 9. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Cells shown: conflict with interference, and facilitation with monitoring, within and across tasks.

Row effect           Column effect        r
Stroop Conflict      Stroop Interference  .71***
Stroop Conflict      Simon Interference   -.05
Stroop Conflict      ANT Interference     .13
Stroop Interference  Simon Conflict       .14
Stroop Interference  ANT Conflict         .20
Simon Conflict       Simon Interference   .32*
Simon Conflict       ANT Interference     -.09
Simon Interference   ANT Conflict         -.09
ANT Conflict         ANT Interference     .96***
Stroop Facilitation  Stroop Monitoring    .15
Stroop Facilitation  Simon Monitoring     .24
Stroop Facilitation  ANT Monitoring       .19
Stroop Monitoring    Simon Facilitation   .08
Stroop Monitoring    ANT Facilitation     .02
Simon Facilitation   Simon Monitoring     -.11
Simon Facilitation   ANT Monitoring       .002
Simon Monitoring     ANT Facilitation     -.02
ANT Facilitation     ANT Monitoring       -.19

*p < .05, **p < .01, ***p < .001

Table 10. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Cells shown: the same effect measured in two different tasks (validity diagonals).

Effect        Stroop-Simon  Stroop-ANT  Simon-ANT
Conflict      .21           .16         -.10
Interference  -.10          .23         -.12
Facilitation  .05           -.28        .11
Monitoring    .62***        .64***      .67***

*p < .05, **p < .01, ***p < .001

Table 11. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Full matrix, with the reliability diagonal in parentheses.

                     Stroop                          Simon                           ANT
                     Conf.   Interf. Facil.  Monit.  Conf.   Interf. Facil.  Monit.  Conf.   Interf. Facil.  Monit.
Stroop Conflict      (.68)   .71***  .53***  .41**   .21     -.05    .22     .08     .16     .13     .09     .18
Stroop Interference          (.43)   -.23    .35*    .14     -.10    .20     -.11    .20     .23     -.12    .05
Stroop Facilitation                  (.43)   .15     .12     .06     .05     .24     -.03    -.10    -.28    .19
Stroop Monitoring                            (.73)   .24     .14     .08     .62***  .15     .14     .02     .64***
Simon Conflict                                       (.47)   .32*    .55**   .03     -.10    -.09    -.01    .14
Simon Interference                                           (.19)   -.62*** .15     -.09    -.12    .11     .13
Simon Facilitation                                                   (.21)   -.11    -.001   .03     .11     .002
Simon Monitoring                                                             (.52)   -.10    .11     -.02    .67***
ANT Conflict                                                                         (.48)   .96***  .009    .19
ANT Interference                                                                             (.42)   -.26    .23
ANT Facilitation                                                                                     (.16)   -.19
ANT Monitoring                                                                                               (.61)

*p < .05, **p < .01, ***p < .001

+ Sawilowsky I distribution-free test for trend in MTMM matrices (2002)

Table 12. Test for increasing trend.

             Minimum     Median      Maximum
Level        Value  I    Value  I    Value  I
Reliability  .16    0    .45    0    .73    0
Validity     .05    0    .185   1    .67    2
H-M          .009   0    .245   3    .96    6
H-H          .001   0    .15    2    .24    4

H-M = hetero-trait mono-method; H-H = hetero-trait hetero-method.

Total inversions = 18; p = .12

- If significant, provides evidence of construct validity
- No significant trend found
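
The inversion counts (I) in Table 12 can be reproduced with a simple counting rule: each value scores one inversion for every value at a higher-ranked level that it exceeds. The rule as coded here is inferred from the numbers in the table rather than taken from the author's analysis, and the sketch does not compute the p value, which depends on the test's reference distribution. The function name `count_inversions` is hypothetical.

```python
# (minimum, median, maximum) at each level, in the expected order
# from largest to smallest values, copied from Table 12.
levels = [
    ("Reliability", (0.16, 0.45, 0.73)),
    ("Validity", (0.05, 0.185, 0.67)),
    ("H-M", (0.009, 0.245, 0.96)),
    ("H-H", (0.001, 0.15, 0.24)),
]


def count_inversions(levels):
    """Count, for each value, how many higher-level values it exceeds."""
    counts = {}
    higher_values = []
    for name, values in levels:
        counts[name] = [sum(v > h for h in higher_values) for v in values]
        higher_values.extend(values)
    return counts


inversions = count_inversions(levels)
total = sum(sum(per_level) for per_level in inversions.values())
print(inversions)
print(total)  # 18
```

A perfectly ordered matrix would give zero inversions; a larger total means the observed correlations depart further from the pattern expected under construct validity.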

Table 13. Pearson’s r correlations between conflict, interference, facilitation, and monitoring effects across tasks. Cells shown: significant correlations only, with the reliability diagonal in parentheses.

Within Stroop:
- Conflict x Interference .71***
- Conflict x Facilitation .53***
- Conflict x Monitoring .41**
- Interference x Monitoring .35*

Within Simon:
- Conflict x Interference .32*
- Conflict x Facilitation .55**
- Interference x Facilitation -.62***

Within ANT:
- Conflict x Interference .96***

Across tasks (Monitoring only):
- Stroop-Simon .62***
- Stroop-ANT .64***
- Simon-ANT .67***

Reliability diagonal:

Effect        Stroop  Simon  ANT
Conflict      (.68)   (.47)  (.48)
Interference  (.43)   (.19)  (.42)
Facilitation  (.43)   (.21)  (.16)
Monitoring    (.73)   (.52)  (.61)

*p < .05, **p < .01, ***p < .001

+ Discussion: RQs revisited

1. Are common method measures of executive control (Stroop, Simon, and ANT) interchangeable?
   i. Will each method show similar correlations for the same effect across different tasks?
      A. No, with the exception of the monitoring effect
         - However, the correlations may be due to non-EC variance in common (e.g., physical reaction speed)

+ Discussion: RQs revisited

2. Are common trait measures of executive control interchangeable?
   i. Will each effect show similar correlations within each task? (Are all effect measures just measuring ‘general EC’?)
      A. Maybe; there is a pattern within the Stroop and Simon tasks
         - However, the pattern is not ubiquitous; with the lack of cross-task correlations, the correlations here could be due to non-EC, task-specific variance in common

+ Discussion: RQs revisited

2. Are common trait measures of executive control interchangeable?
   ii. Will conflict and interference effects and facilitation and monitoring effects pattern together within and across tasks? (Are they measuring distinct subconstructs of EC?)
      A. Within: Maybe for the Stroop task
         - There are some missing correlations, but the pattern thus far is as expected: conflict and interference have a high correlation, and correlations between them and the monitoring and facilitation effects are smaller, when significant

+ Discussion: RQs revisited

2. Are common trait measures of executive control interchangeable?
   ii. Will conflict and interference effects and facilitation and monitoring effects pattern together within and across tasks? (Are they measuring distinct subconstructs of EC?)
      A. Across: No.

+ Conclusions

- While the Stroop, Simon, and ANT data showed high reliability across conditions in the RT data, reliability was unacceptably low for three of the four effect calculations
- The Sawilowsky I test did not provide support for construct validity, although descriptive MTMM matrix analysis provided some weak convergent/discriminant validity evidence
- There appears to be some convergent validity evidence that the conflict and interference effects may be interchangeable within a particular EC task, but it is not present across tasks

+ Conclusions

- There is strong convergent validity evidence that the monitoring effect is reliable and consistent across tasks
- There is discriminant validity evidence in a broad sense that there were few spurious correlations inexplicable by theory, with two exceptions:
  - The magnitudes of the correlations in the Simon task were not in the expected direction
  - Two of the three significant facilitation correlations are in the wrong direction

+ Conclusions

- There is by no means clear construct validity in the sample
- This study stresses the need for corroborating/triangulating EC task results with at least a second measure of EC
  - Is there a bilingual effect in the data?
  - Is it consistent across tasks?
  - If not, it may not be measuring what we think it’s measuring
- This in turn highlights the need for the development and use of standardized task(s) of EC with identical parameters, such as:
  - Number of practice/condition trials
  - Presence or absence of neutral trials
  - Response time-out periods
  - Consistency of stimuli (pictures/colors/arrows)

+ Limitations/Future research

- Did not assess switching effects
- No corrections for multiple comparisons
- Lack of stronger validity evidence could be speaking to:
  - The high measurement error of the tasks
  - The smaller n
  - The lack of consistent task parameters such as number/type of practice trials for working up a prepotent response
  - Or some combination thereof

+ Limitations/Future research

- Latent means modeling
  - Larger n needed (MTurk?)
  - Do bilinguals have more of latent variable(s) compared to monolinguals?
  - SEM, partialling out measurement error, could make it easier to observe the bilingual effect (or lack thereof)
- If EC cannot be reliably measured with behavioral tasks, perhaps neurocognitive studies (fMRI/VBM) are the way to go
  - Cognitive validity evidence?

+ And thank you for executing control of your attention in my general direction!

Nick B. Pandža ([email protected])
Program in Second Language Acquisition
University of Maryland, College Park

Special thanks to:
- Rebecca Sachs, Virginia International University
- Kaitlyn Tagarelli, Dalhousie University
- Cristina Sanz, Georgetown University
- Steve Ross, University of Maryland, College Park

+ ANOVAs

Table 5. ANOVAs for operationalizations of executive control between bilinguals and monolinguals.

Effect        Task    df  Sum of squares  Mean squares  F      p      [unlabeled]
Conflict      Stroop  1   3752.44         3752.44       .76    .39    .02
              Simon   1   5.45            5.45          .63    .43    .01
              ANT     1   405.21          405.21        .17    .68    .004
Interference  Stroop  1   7479.68         7479.68       2.07   .16    .04
              Simon   1   745.75          745.75        .63    .43    .01
              ANT     1   11.28           11.28         .004   .95    <.001
Facilitation  Stroop  1   636.45          636.45        .25    .62    .01
              Simon   1   878.73          878.73        .58    .45    .01
              ANT     1   281.30          281.30        1.57   .22    .03
Monitoring    Stroop  1   51,191.37       51,191.37     3.96   .052+  .08
              Simon   1   13,168.90       13,168.90     2.04   .16    .04
              ANT     1   6176.23         6176.23       1.215  .28    .03

+p < .10, *p < .05

- 17 monolinguals
- 37 bilinguals
- LHQ (Li et al., 2006)

Bilingual benefit? Not found here, but that doesn’t mean EC tasks shouldn’t still be accurately measuring EC

+ EC MTMM path diagram for latent means modeling

(Path diagram not preserved.)

+ Executive control (EC)

- Common tasks of EC:
  - Stroop (Schroeter et al., 2002)
  - Simon (Bialystok et al., 2004)
  - Attentional Network Task (ANT; Fan et al., 2002)
- Each can be analyzed in the same way, using up to 5 different effect calculations, including:
  - Inhibition/conflict resolution
    1. The conflict effect (a.k.a. Stroop/Simon effect)
    2. The interference effect
  - Information updating/monitoring
    3. The monitoring effect
    4. The facilitation effect
  - Mental set shifting/task-switching
    5. The switching effect

+ Analyses of EC tasks: Mental set shifting/Task-switching

- Switching effect (e.g., Costa et al., 2006)
  - switch – non-switch trials
  - Ha: bilinguals < monolinguals

+ Stroop Instructions

In the following experiment, you will see words on the screen. Your job is to decide what color the word appears in.

Press the key that corresponds to the FIRST LETTER of the COLOR of the word.

Press R if the color of the word is RED.
Press Y if the color of the word is YELLOW.
Press G if the color of the word is GREEN.
Press B if the color of the word is BLUE.

Try a few for practice. Do you have any questions?

Press SPACE BAR to practice.

+ Three common measures

Task
- Stroop Task (Schroeter et al., 2002)
- Simon Task (Bialystok et al., 2004)
- Attentional Network Task (ANT; Fan et al., 2002)

Stroop example stimuli (ink colors not preserved):
- Congruent: BLUE, GREEN, YELLOW, RED
- Incongruent: BLUE, GREEN, YELLOW, RED
- Neutral: XXXX, XXXX, XXXX, XXXX

+ Simon Instructions

Welcome to the Simon task! In this task, you will see colored squares on the right and left sides of the screen. You will be asked to press the button that matches the COLOR of the SQUARE, NOT the LOCATION.

Press SPACE BAR to continue.

+ Three common measures: Simon Task (Bialystok et al., 2004)

Example stimuli (images not preserved; each slide showed two alternative displays):
- Congruent
- Incongruent
- Neutral

+ Three common measures: Attentional Network Task (ANT; Fan et al., 2002)

Example stimuli:
- Congruent: → → → → → or ← ← ← ← ←
- Incongruent: → → ← → → or ← ← → ← ←
- Neutral: –– –– ← –– –– or –– –– → –– ––

+ Three common measures: Task Breakdown, Stroop Task (Schroeter et al., 2002)

- 3 practice trials
- 20 neutral trials
- 20 congruent trials
- 20 incongruent trials
- 60 total trials
- Randomized order
- 2000 ms cutoff

+ Three common measures: Task Breakdown, Simon Task (Bialystok et al., 2004)

- 8 practice trials
- 14 neutral trials (7/7)
- 14 congruent trials (7/7)
- 14 incongruent trials (7/7)
- 42 total trials
- Randomized order
- 1000 ms cutoff

+ Three common measures: Task Breakdown, Attentional Network Task (ANT; Fan et al., 2002)

- 8 practice trials
- 3 blocks, each with:
  - 32 neutral trials
  - 32 congruent trials
  - 32 incongruent trials
- 288 total trials
- Randomized order
- 1700 ms cutoff
- Takes approximately 20 min