
Running head: COLLOCATIONAL PROCESSING AND SEMANTIC TRANSPARENCY

Collocational processing in the light of a phraseological continuum model: Does

semantic transparency matter?

Henrik Gyllstad

Centre for Languages and Literature

Lund University

Box 201, 221 00 Lund

Sweden

Brent Wolter

Department of English and Philosophy

Idaho State University

USA

Accepted for publication in

Language Learning on 13

February, 2015.


Abstract

The present study investigates whether two types of word combinations differ in terms of

processing: free combinations and collocations. As such, it tests predictions made in

Howarth’s Continuum Model (1996, 1998), which is based on word combination typologies

from a phraseological tradition. A visual semantic judgement task was administered to

advanced Swedish learners of English (N = 27) with native speakers of English (N = 38) as

controls. Reaction times and error rates were recorded for three critical conditions: free

combinations, collocations, and baseline items. The results showed that there was a

processing cost for collocations compared to free combinations, for both groups of

participants. This is believed to stem from the semantically semi-transparent nature of

collocations as they are defined in the phraseological tradition. Furthermore, phrasal

frequency based on corpus values also predicted reaction times. These results lend initial

support for the Continuum Model from a processing perspective and suggest that degree of

semantic transparency together with phrasal frequency plays an important role in collocational

processing.

Keywords: collocation; free combination; phraseology; advanced learners


Introduction


Within the discipline of Phraseology, broadly defined as the study of word combinations

(Granger & Meunier, 2008; see also Kuiper, 2010), a number of subtypes of combinations are

assumed that differ in terms of parameters like compositionality, figurativeness, and syntactic

fixedness. Generally speaking, commonly studied word combinations are ‘collocations’,

‘idioms’, ‘binomials’ and ‘lexical bundles’. When it comes to research into the processing and

representation of word combinations, however, there has been a clear bias towards

investigations of idioms, both in L1 (see e.g. Swinney & Cutler, 1979; Gibbs & Gonzales,

1985; Abel, 2003; Tabossi, Fanari & Wolf, 2009) and L2 research (see e.g. Abel, 2003;

Underwood et al., 2004; Cieślicka, 2006; Conklin & Schmitt, 2008; Siyanova-Chanturia,

Conklin & Schmitt, 2011).

One type of word combination for which there is a comparative lack of research in

terms of processing and representation is collocation (though see e.g. Wolter & Gyllstad,

2011, 2013; Wolter & Yamashita, 2014). Whereas idioms are typically described as non-

compositional, frozen and rather semantically opaque word combinations (Skandera, 2004),

collocations are treated as more semantically transparent. Different definitions of what a

collocation is abound in the literature, but it is possible to discern two main approaches: a

frequency-based approach and a phraseological approach (see Nesselhauf, 2004; Barfield &

Gyllstad, 2009). In the former, whether a word combination is a collocation or not is based on

how frequently the words in the combination co-occur in written and/or spoken corpora.

Often, some measure of association is used to distinguish statistically significant co-

occurrence from co-occurrence that is seen as random (Schmitt, 2010). In the phraseological

approach, on the other hand, collocations are identified more on typological grounds, with


grammatical structure and degree of semantic transparency as guiding principles. As such,

frequency of occurrence is not seen as a primary characteristic.

The choice of approach is not without importance as obtained results for learners’

processing and acquisition of collocations may in theory be strongly affected by how they are

conceptualized. It is not uncommon for authors in the Second Language Acquisition (SLA)

literature to claim that collocations normally do not pose problems for learners from a

comprehension perspective compared to idioms, but that it is in production that problems

occur (see e.g. Biskup, 1992; Nesselhauf, 2005; Henriksen & Stenius Staehr, 2009: 225;

Laufer & Waldman, 2011; Henriksen, 2013), especially for incongruent collocations (see

below). However, there is an important issue at play here, having to do with the treatment of

collocation as a generally clear and well-defined concept, which it is not. A look at

published collocation research makes it clear that what is referred to as a collocation varies

greatly both within and across studies. In some studies, collocation is defined purely based on

frequency (e.g. Wolter & Gyllstad, 2013), whereas in others, semantic properties are chiefly

taken into account (e.g. Nesselhauf, 2005). In fact it is not unreasonable to suggest that what

is defined as a collocation under one tradition may not be designated as such in another

tradition. For example, in Webb, Newton & Chang (2013), following a frequency-based

approach to collocation, an item such as pull strings is treated as a collocation; in the

phraseological tradition, the same item would be treated as an idiom, as the meaning (‘to

secretly use your influence with important people in order to get what you want or to help

someone else’) is not compositionally available. Thus, the choice of approach directly

determines which items are classified as collocations.

When it comes to research on collocational processing in a second language (L2) to

date, two factors have been found to have tangible effects: congruency and frequency of input.

In three recent studies (Wolter & Gyllstad, 2011, 2013; Yamashita & Jiang, 2010) it was


found that congruent collocations (see also Bahns, 1993; Nesselhauf, 2005) demonstrated

faster processing than incongruent ones. A congruent L2 collocation has a corresponding

equivalent in the L1 in terms of the core meanings of the constituent words, in a word-for-

word translation. An example of congruent items from English and Swedish would be express

an opinion and uttrycka en åsikt (express[verb] a[indefinite article] opinion[noun]), where

each constituent has a straightforward equivalent across the two languages. For an item like

pay a visit, however, a word-for-word core meaning translation would render an infelicitous

Swedish phrase: *betala ett besök (pay[verb] a[indefinite article] visit[noun]), and thus be

treated as incongruent. Furthermore, frequent collocations are processed faster than less

frequent collocations (Wolter & Gyllstad, 2013), though this frequency effect seems to be

tempered by the congruency/incongruency distinction. These are important findings both for

SLA theory and for more practical teaching perspectives. To the best of our knowledge,

though, in none of these studies were semantic criteria like figurativeness and transparency

taken into account in the item selection process. Semantic transparency is very likely to affect

processing of collocations, as research investigating noun compound processing has indicated

that a constituent’s semantic value (transparent or opaque) affects reading times in

eyetracking studies and reaction times in primed lexical decision tasks (Frisson, Niswander-Klement

& Pollatsek, 2008; El-Bialy, Gagné & Spalding, 2013). By adding semantic transparency as a

factor in modelling collocational processing in an L2, we will therefore be able to ascertain

its role in a comprehension-related mode.

In this study, we therefore define collocation along the lines of the phraseological

tradition. More specifically, we draw on Howarth’s Continuum Model (1996, 1998) in

investigating non-native and native speakers’ processing of two types of word combinations:

free combinations and collocations. Our focus on these two categories is warranted due to the

previous bias in the literature on idiom processing, and the common assumption that


collocations are typically not problematic from a comprehension perspective. Although we

will not measure comprehension per se, we believe that our study and its methods will shed

light on the very early stages of receptive meaning activation. We are mainly interested in

collocation processing in an L2, but because there is a lack of studies investigating how native

speakers process these two types of word combinations, we will also look at L1 processing.

Theoretical background and previous research

Researchers working in the phraseological tradition (see e.g. Cowie, 1981, 1994; Benson,

Benson & Ilson, 1997; Nesselhauf, 2003, 2005) have attempted to create phraseological

typologies and frameworks for classifying collocations alongside other word combinations.

One of these attempts is Howarth’s Continuum Model (1996, 1998). In the model Howarth

proposes four categories within the broader class of lexical composites (see Table 1).

*****TABLE 1 NEAR HERE*****

Free combinations, e.g. pay a bill, are word combinations in which the lexical elements are

used in their literal sense. Each component may be substituted without affecting the meaning

of the other. Restricted collocations, as in pay a visit, are word combinations in which one of

the constituent words is used in a figurative, technical or delexical sense only found in the

context of a limited number of collocates, whereas the other constituent appears in its literal

sense. To the right in the continuum, there are two further categories: figurative idioms and

pure idioms, with pay the price and pay the piper, respectively, as examples. While the former

type of word combination can appear both with a holistic metaphorical meaning and also a


current literal interpretation, the latter is wholly non-compositional and the most fixed type.

Howarth has argued that “the continuum model has great descriptive value and perhaps

psychological validity in representing the degrees of stability with which expressions are

stored in the mental lexicon.” (1996: 23). Although there is reason to accept Howarth’s claim

regarding the descriptive value of the model, the claim regarding psychological validity is

speculative and needs empirical validation. To the best of our knowledge, however, only one

study has investigated potential differences between these types of lexical composites, in

terms of how users perceive them. Columbus (2013), using human ratings of idioms and

collocations, found that the mean ratings for familiarity, semantic transparency and frequency

could differentiate between these two word combination categories (and lexical bundles as

well). No online processing task was used however.

Since the Continuum Model is a descriptive, typological model of word combinations,

in order to arrive at feasible predictions in terms of how these word combinations are likely to

be processed and represented, we need to also take into account more processing-oriented

models and frameworks for lexical phrases. In the absence of an explicit model for L2

collocational processing, we need to draw on a number of perspectives. Wray (2002), based

on work by Granger (1998) and Howarth (1998), has argued for a difference in the way

collocations are handled by native speakers compared to non-native speakers. In a discussion

on whether collocations are formulaic or not, where formulaic refers to holistic storage in the

mental lexicon, Wray claims that collocations are formulaic for native speakers but not for

non-native speakers. She argues that native speakers’ default mode is holistic processing, with

analysis into component parts used only when necessary. This means that a sequence like

major catastrophe will be remembered and stored as a sequence. The adult non-native

speaker, on the other hand, upon initial encounters with the sequence, is believed to break it

down into two components: major and catastrophe, and to store these separately, making later


activation of the pairing less efficient. Durrant and Schmitt (2010) have argued against this

position based on their own empirical findings, claiming that learners’ potential deficit in

collocational knowledge is likely to be the result of insufficient exposure to the language

rather than a language acquisition process that is fundamentally distinct from that of native

speakers. However, Durrant and Schmitt’s experimental study captures a very short treatment

time frame, with no delayed post-test, whereas Wray’s claim should probably be interpreted

as applying to a more long-term acquisition process.

The two types of processing discussed by Wray (2002) are captured in dual route

models (Wray, 2008; Van Lancker Sidtis, 2012). A dual route model postulates two

processing routes for language comprehension: ‘holistic memory retrieval’ and

‘computation/analysis’. In terms of lexical phrases, holistic memory retrieval has to do with

the process of accessing representations of these stored phrases in the mental lexicon. The

assumption is that frequently occurring phrases are stored holistically, and that these can be

retrieved when needed from memory. The computation/analysis route has to do with the

application of a bottom-up, rule-based approach. It is assumed that this type of processing

takes place when we as language users encounter novel language sequences, i.e., unfamiliar

phrases, or when there is something in the processing situation that calls for a compositional

analysis of familiar phrases in the input. The characteristics of the dual route processes are

shown in Table 2.


It is worth noting that the dual route model is basically a model for L1 processing, and that it

says nothing about compositionality and semantic transparency. What seems to matter is


whether a phrase is a familiar phrase or not. In terms of empirical support for the model, a

number of studies have investigated L1 processing of idioms vis-à-vis novel sequences (e.g.,

Swinney and Cutler, 1979; Tabossi, Fanari & Wolf, 2009). The main finding in these studies

is that idioms enjoy a processing facilitation over matched novel sequences. For example, an

idiom like bury the hatchet is processed faster than bury the axe (a phrase with supposedly

very low frequency). Conklin and Schmitt (2008) found that phrases like take the bull by the

horns were processed faster both by native speakers and non-native speakers in an idiomatic

and a literal reading (‘attack a problem’ vs. ‘wrestle an animal’) compared to matched novel

phrases. However, this processing facilitation for idiomatic phrases in an L2 has not always

been observed. For example, Siyanova-Chanturia, Conklin, and Schmitt (2011) observed no

processing advantages for non-native speakers reading passages with embedded idioms

compared to matched control phrases. Thus, the question is whether the dual route model can

be straightforwardly applied to an L2 context, and whether its predictions hold also for less

idiomatic phrases like collocations.

In research on collocation processing, in addition to the factor of cross-linguistic

influence observed in studies briefly reviewed in the introduction (Yamashita & Jiang, 2010;

Wolter & Gyllstad, 2011, 2013), frequency has also been found to have a strong effect. In an

L1 context, Ellis, Frey & Jalkanen (2009), adopting a frequency-based definition of

collocation, investigated whether the word recognition of L1 speakers of English would be

sensitive to collocation frequency. They found that there was a clear tendency whereby the

higher the collocational frequency in the language, the faster the participants recognized letter

strings as real English words. Also adopting a frequency-based definition of collocation,

Siyanova-Chanturia, Conklin & van Heuven (2011) made similar observations for both L1

and L2 in an eyetracking study using binomials (e.g. bride and groom, heart and soul) as

target items embedded in sentence contexts. Furthermore, Wolter & Gyllstad (2013) observed


clear frequency effects in L1 and L2 using a phrasal judgement task where participants were

asked to judge whether adjective + noun combinations (e.g. good news, mutual trust) were

commonly used in English or not.

In the above review we have seen that collocations are expected to be formulaic for

native speakers but not for non-native speakers, and that dual route models predict quicker

processing for phrases that are familiar and frequent, at least in an L1. Furthermore, there is a

clear processing facilitation effect for semantically opaque phrases like idioms over novel

language in an L1 context, but the picture is less clear in an L2. We have also seen that

frequency and congruency are important factors to take into account when it comes to

collocational processing in an L2. However, most studies have defined collocation along the

lines of the frequency-based tradition, and consequently factors like semantic transparency in

word combinations have not been taken into account. A line of research where this has been

investigated quite extensively is compound processing, and findings from such research

warrant a brief review. In compound processing studies, it is common to manipulate

semantic transparency in terms of how constituents contribute straightforwardly to the

meaning of the compound. Typically, items in four conditions are used (see e.g. Frisson,

Niswander-Klement & Pollatsek, 2008; El-Bialy, Gagné & Spalding, 2013): fully transparent

items (TT), e.g. eyesight; partially opaque items (OT), e.g., eyetooth, where the opaque

constituent comes first, or (TO), e.g., sugarcane, where the opaque constituent comes last;

and fully opaque items (OO), e.g., catwalk. Interestingly, in an eyetracking study in which

participants were asked to read English sentences containing compounds of the four

mentioned types, Frisson, Niswander-Klement & Pollatsek (2008) found that when

compounds were presented with a space between their constituents (e.g. honey moon (OO), dish washer

(TT), god child (OT), and stair case (TO)) there was a transparency effect: transparent items

were read faster than opaque or partially opaque items. If we extrapolate these findings to the


categories in Howarth’s Continuum Model and how they may be processed, then free

combinations would be akin to the TT type of compound, and collocations would be similar

to the partially opaque types (OT or TO). However, even though a collocation like pay a visit

has a verb in a non-core sense, it may still not be correct to say that pay in pay a visit is

semantically opaque in the same way that god is opaque in god child. Still, the finding from

Frisson, Niswander-Klement & Pollatsek (2008) is relevant enough to inform our predictions.

With these points in mind, we wanted to carry out a study that looked at the processing

of collocations where collocation was carefully operationalized according to the phraseological

approach to word combinations, and compared against free combinations, rather than novel

sequences. Novel sequences are typically very low in frequency, and we have seen that

frequency is a factor that will influence processing. By comparing collocations instead with

free combinations, and matching them in terms of frequency, we are in a better position to

find out whether the semantic specialization in the components of a collocation makes a

difference in processing compared to the free combinations.

The following research questions were used for the study:

1. For advanced NNSs of English, is there a processing cost for collocations compared to free

combinations in terms of reaction time (RT) and error rate (ER) values?

2. Is the pattern the same or different for native speakers?

3. Is Howarth’s descriptive distinction between free combinations and collocations in the

continuum model reflected in processing differences?

Based on our review of theoretical positions, processing models and previous empirical

studies, and assuming conditions that are matched for frequency, we predicted that NNSs

would process collocations more slowly than free combinations, as the lower level of transparency

in collocations would come with a cost. Furthermore, we expected that NSs would process


collocations and free combinations roughly at a similar speed, as no totally opaque items were

used, and since collocations and free combinations that are matched for frequency are equally

likely to be stored as wholes and available for direct retrieval. Finally, we predicted that NSs

generally would process both types of word combinations faster than NNSs, simply by virtue

of being native speakers.

Methodology

Item development

To address our research questions, we used a semantic judgement task (see Jiang, 2012) to

assess response times (RTs) and error rates (ERs) to items in three critical conditions. A key

assumption underlying the task is that if there is a processing cost for collocations for NNSs,

then we should expect this to be revealed through slower RTs, and possibly higher ERs, in

comparison to free combinations. However, we should not expect to see such differences for

NSs, who could presumably store both commonly-occurring free combinations and

commonly-occurring collocations as single entries in their lexicon.

With these assumptions in mind, the conditions included in the task were: 1) free

combinations (n = 27), 2) collocations (n = 27), and 3) baseline items (n = 54). All items

consisted of verb + (object) noun combinations, with determiners added where necessary to

ensure grammaticality. Our operationalization of semantic transparency follows from

Howarth’s (1996) definitions of free combinations and collocations. He described free

combinations as word combinations consisting of “two or more words in which the elements

are used in their literal sense. Each component may be substituted without affecting the

meaning of the other.” (1996, p. 47). This entails a higher degree of semantic transparency.

Collocations, on the other hand, have “one component … used in its literal meaning, while the

other is used in a specialized sense. The specialized meaning of one element can be figurative,


delexical or in some way technical and is an important determinant of limited collocability at

the other.” (1996, p. 47). Collocations are for this reason believed to have a lower degree of

semantic transparency. In the present study, both principal word elements in the free

combination category appeared in their literal senses, e.g. combinations like write a letter,

kick a ball or sing a song, whereas in the collocations the verb elements appeared in a

specialized sense and the nouns in their literal senses, e.g. phrases like run a risk, draw a

conclusion, or serve a purpose (see endnote i). To ensure validity in the item classification, the fifty-four

items in the two critical conditions were given to two additional raters, linguists specializing

in phraseology, whose task it was to categorize the items in a randomized list according to

Howarth’s definitions. Using both Krippendorff's alpha and Fleiss's kappa (Hayes &

Krippendorff, 2007), inter-rater reliability for the three ratings was observed at .804 and .802,

respectively. In those cases where disagreement was found (n = 8; see endnote ii), the classification chosen

by a majority of raters was followed. All items in the two conditions (i.e. free combinations

and collocations) were checked for other factors that might influence RTs and/or ERs on the

task. These included length of expression (i.e. number of letters and spaces), frequency, and

the number of cognates. Frequency was checked using a number of different measures with

data obtained from the Corpus of Contemporary American English (COCA, Davies, 2008-).

Here, we checked the raw frequency (i.e. total number of occurrences in the corpus) of the

verbatim phrases (log-adjusted using natural logs) as they appeared on the task (e.g. kick a

ball) as well as the log-adjusted frequency (again using natural logs) of the phrases with a

lemmatized form of the verb in the items (i.e. kick / kicks / kicked / kicking a ball). A statistical

comparison of the items in the two critical conditions (a Mann-Whitney U test) revealed no

significant differences in any of the comparisons (see Table 3). Finally, it should be

noted that all items in both conditions were ‘congruent’ collocations. Thus, all items could be

translated word-for-word into the NNSs’ L1 (Swedish, see below) with no loss of meaning.


For example, the English expressions write a letter and run a risk have the translation

equivalents of skriva ett brev and löpa en risk in Swedish. This was done to eliminate the

possible confounding effects of congruency versus incongruency, as previous studies have

consistently shown that congruent collocations tend to be processed faster (and oftentimes

more accurately) than incongruent collocations (e.g. Wolter & Gyllstad, 2011, 2013;

Yamashita & Jiang, 2010). As our focus in the present study was semantic transparency rather

than cross-linguistic influences, we thus decided to use only congruent word combinations.
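The inter-rater agreement check and the majority-vote rule described above can be illustrated with a minimal sketch. It computes Fleiss's kappa only (not Krippendorff's alpha), and the four toy ratings, the category labels and the function names are hypothetical; they are not the study's data or scripts.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss's kappa for a list of items, each a list with one label per rater.

    Assumes every item was rated by the same number of raters and that
    more than one category occurs overall (otherwise 1 - p_e is zero).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    agreement_sum = 0.0
    for labels in ratings:
        counts = Counter(labels)
        totals.update(counts)
        # Proportion of rater pairs that agree on this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement_sum / n_items  # mean observed agreement
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())  # chance agreement
    return (p_bar - p_e) / (1 - p_e)


def majority_label(labels):
    """Label chosen by most raters (unambiguous with three raters and two categories)."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical ratings from three raters for four items.
TOY_RATINGS = [
    ["free", "free", "free"],
    ["collocation", "collocation", "collocation"],
    ["free", "free", "collocation"],
    ["collocation", "collocation", "free"],
]

KAPPA = fleiss_kappa(TOY_RATINGS)
FINAL_LABELS = [majority_label(item) for item in TOY_RATINGS]
```

With three raters and two categories a majority always exists, which is why no tie-breaking rule is needed in this sketch.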


The baseline items were included not only for comparison purposes, but also to ensure that

participants did not develop a familiarity effect for the task. Baseline items were constructed

by randomly recombining verbs and nouns from the items in the free combination and

collocations conditions. All recombined items were checked against the COCA to ensure

novelty. If items occurred with any regularity in the corpus, they were recombined and

checked again. Ideally, baseline items should not be attested in the corpus at all (i.e. should

have a frequency of 0). In practice, however, this is difficult to achieve with a large corpus

such as the COCA. Therefore, we allowed for a small number of occurrences in the corpus if

we could verify that the actual occurrences were idiosyncratic enough to be deemed highly

unlikely constructions. In total, there were only four such items: carry a car (1 instance), play

fruit (2), make a risk (2), and set patients (1). The full list of items in all three conditions is

shown in the Appendix.
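The recombination procedure for baseline items can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the frequency lookup is a hypothetical dictionary standing in for corpus queries, determiners are omitted, and the sketch reshuffles the whole set whenever any pair is attested, whereas the procedure described above rechecked individual items and tolerated a few idiosyncratic corpus hits after manual inspection.

```python
import random

# Verb and noun lists built from example items mentioned in the text (determiners omitted).
VERBS = ["write", "kick", "sing", "run", "draw", "serve"]
NOUNS = ["letter", "ball", "song", "risk", "conclusion", "purpose"]

# Hypothetical frequency lookup: only the original pairings are attested here.
# Pairs missing from the dictionary are treated as having a frequency of 0.
TOY_COUNTS = {pair: 100 for pair in zip(VERBS, NOUNS)}


def recombine(verbs, nouns, counts, max_count=0, seed=1, max_tries=10_000):
    """Pair each verb with one noun so that no pair exceeds max_count in the lookup."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(nouns)
    for _ in range(max_tries):
        rng.shuffle(shuffled)
        pairs = list(zip(verbs, shuffled))
        if all(counts.get(pair, 0) <= max_count for pair in pairs):
            return pairs
    raise RuntimeError("no acceptable recombination found")


BASELINE = recombine(VERBS, NOUNS, TOY_COUNTS)
```

Raising `max_count` above zero would mimic allowing a small number of corpus occurrences, although the manual check for idiosyncratic uses has no counterpart in this sketch.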

Participants and experiment administration


The participants in this study consisted of learners of English as an L2 (L1 Swedish, n = 27)

and L1 English speakers (n = 38). The NNSs were all undergraduate students at a university

in Sweden. The NS group consisted of 35 undergraduate and 3 graduate students studying at a

university in North America. Full biographical data for the participants are provided in Table

4.


RTs and ERs to the items in the three conditions were assessed using a semantic judgement

task. The task was administered using DMDX software. All items were presented in an

individually randomized order. Participants were asked to press the ‘yes’ key if they felt an

item was “meaningful and natural” in English, and the ‘no’ key if they felt an item was not

meaningful and natural in English. They were also instructed to answer as quickly and

accurately as possible.

The presentation procedure is shown in Figure 1. The test began with a practice

session prior to actual data collection so that participants could familiarize themselves with

the task. Participants were also allowed a short break in the middle of the test. Most

participants finished the whole task in 5-7 minutes.

*****FIGURE 1 NEAR HERE*****


In addition to the semantic judgement task, both groups of participants were also administered

the Y_Lex test of vocabulary size (Meara, 2005) as a proxy for general proficiency. The

Y_Lex test assesses receptive knowledge of vocabulary between the 5-10K levels using a

yes/no response format. The test includes a number of pseudo-words, and automatically

adjusts the scores of test takers who erroneously identify these words as actual English words.

The Y_Lex format was chosen on the assumption that participants in both groups would

likely know all of the words in the X_Lex (0-5k) version of the test. Finally, the NNS group

was also administered a self-rating questionnaire to assess their perceptions of their English

ability in the areas of speaking, listening, reading, and writing, and to provide data on first

exposure to English.

Results

To begin with, mean scores were calculated for the Y_Lex test for both the NS and the NNS

group. Two scores from this measure were compared, the raw scores and the adjusted scores.

Raw scores do not take into account erroneous responses to the pseudo-word items (see

above) while adjusted scores do. Independent samples Mann-Whitney tests revealed that the

NSs scored significantly higher than the NNSs in terms of raw scores (4,354 vs. 3,870), U =

267.50, p = .001, r = .41, but not for adjusted scores (3,146 vs 3,333), U = 458.00, p = .464, r

= .09. These results support our initial suggestion that this was indeed a highly proficient

NNS group.
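The relation between the reported U statistics and the effect sizes can be made explicit with a short worked sketch. It assumes that r was obtained as |Z| / √N from the normal approximation to U, without tie or continuity correction; the text does not state this, so the formula is an assumption, and the function name is hypothetical. Under that assumption, the group sizes of 38 and 27 yield values that round to the reported r = .41 and r = .09.

```python
import math


def mann_whitney_effect(u, n1, n2):
    """Return (z, r, two-sided p) for a Mann-Whitney U value.

    Uses the normal approximation without tie or continuity correction,
    and the effect size r = |z| / sqrt(n1 + n2) (assumed, not stated in the text).
    """
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    r = abs(z) / math.sqrt(n1 + n2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return z, r, p


# U values as reported for the Y_Lex raw and adjusted scores (NS n = 38, NNS n = 27).
RAW = mann_whitney_effect(267.50, 38, 27)
ADJUSTED = mann_whitney_effect(458.00, 38, 27)
```

The approximate p-values from this sketch (about .001 and .464) are also in line with those reported, which suggests that the assumed formula is a reasonable reading of the analysis.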

Of much more importance to the current study were the RTs and ERs within and

across the two groups for items in the three critical conditions. The mean RTs and ERs for the

three critical conditions for both groups are shown in Table 5, whereas visualizations of the

RT and ER results are shown in Figures 2 and 3, respectively. For the statistical analyses, we


used the lme4 package (Bates, Maechler, Bolker, & Walker, 2014) in the R statistical

platform (R Core Team, 2012) to construct mixed-effects models comparing RTs and ERs (see endnote iii).

The model development procedure progressed as follows. The first step was to design

a ‘core’ model that included participant and item as crossed random effects along with

independent variables that were of central importance, specifically item type (free

combination, collocation, or baseline), group (NS or NNS) and the interaction between these

two variables. This core model also included random slopes to allow for individual and group

variations. The effect of item type was included as a random slope by participant, and group

was included as a random slope by item. The next step was to create a complex model that

included a large number of covariates that might affect RTs or ERs. These included the

following: item length (number of letters), trial (the order in which the item appeared for the

subject), log normalized (natural log) verb and noun frequencies (using total occurrence

counts in COCA), and first versus second occurrence of both the verb and the noun (i.e.

whether a participant was seeing a particular word for the first time or the second time owing

to the fact that the baseline items consisted of random recombinations of words from the free

combinations and collocations). All interval covariate values were centered and standardized

before analysis, while the first versus second occurrence variables were treated as categorical.

This complex model also initially included a number of random slopes. For example, trial and

length were included as random slopes that varied by participant. However, these additional

random slopes caused convergence errors in lme4, so they were removed to improve model fit

(see endnote iv). After this, we used a backward stepwise procedure to eliminate variables that did not

contribute to the fit of the model. The process was to eliminate the covariate with the lowest t-

value and then refit the model. This process continued until only the core model and

covariates with a t-value of at least two remained, at which point a full, simultaneous

information-theoretic model comparison was done using the model.sel command in the


MuMIn package (Barton, 2014) in R. Model.sel provides estimates of the corrected Akaike

information criterion (AICc) which we used to determine the best model (i.e. the most

parsimonious model in terms of covariates that also included the core model). Finally, effect

sizes were calculated for the model that was ultimately selected, also using the MuMIn

package. MuMIn provides R² values for the fitted mixed model in two forms: marginal and

conditional. Marginal R² values are associated with the fixed effects, while conditional R²

values reflect both the fixed and the random effects combined.
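To make the AICc-based selection step concrete, the following minimal Python sketch computes the corrected Akaike information criterion from a log-likelihood and ranks candidate models by it. The model labels, log-likelihoods, parameter counts, and number of observations are all hypothetical, and the sketch is not a reimplementation of model.sel.

```python
def aicc(log_lik, k, n):
    """Corrected AIC: AIC plus a small-sample penalty term.

    log_lik: maximized log-likelihood of the model
    k:       number of estimated parameters
    n:       number of observations
    """
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical candidates: (label, log-likelihood, number of parameters).
candidates = [
    ("core only", -1520.4, 12),
    ("core + trial + length", -1498.9, 14),
    ("core + trial + length + noun frequency", -1498.5, 15),
]
n_obs = 3000  # hypothetical number of observations

# Lower AICc is better; sort ascending and report differences from the best.
ranked = sorted((aicc(ll, k, n_obs), label) for label, ll, k in candidates)
best_aicc, best_label = ranked[0]
deltas = [(label, score - best_aicc) for score, label in ranked]
```

In this invented example the extra covariate in the third model improves the log-likelihood too little to offset its penalty, which is the sense in which the procedure favors the most parsimonious model.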




The first group of models was constructed according to the above procedure using RT

as the response variable. Before constructing the models, however, we first had to prepare the

data. The initial step in this process was eliminating correct responses that were faster than

450 milliseconds (based on the assumption that these were erroneous key presses) and

incorrect responses (i.e. ‘no’ responses to free combinations and collocations, and ‘yes’

responses to baseline items) as well as items that had timed out at 4000 milliseconds. Then,

we log-transformed the remaining RTs using natural logs. The final RT model derived from

the procedure is shown in Table 6. Details regarding the other models developed in the

stepwise procedure are shown in Table 7. As can be seen in Table 6, the final model

included the core variables with trial and length included as covariates. The results revealed

no significant differences between the NS and NNS groups either in terms of overall mean


RTs or with respect to group by item type interactions (see Note v). To further validate this finding

regarding interactions, we ran a final model that simply eliminated the interaction between

group and item type. We then compared this model with the model containing the interaction

using a log likelihood ratio test to see if the inclusion of the interaction was justified (i.e.

produced a significantly better fitting model). The results suggested that there was not a

significant difference, further supporting the conclusion that NSs and NNSs responded to

items on the task in a very similar manner with respect to RTs.
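The RT data preparation described at the start of this paragraph can be sketched as follows in plain Python. The trial records are hypothetical, and representing a timeout as a missing response is an assumption made for illustration only.

```python
import math

MIN_RT_MS = 450    # faster correct responses are treated as erroneous key presses
TIMEOUT_MS = 4000  # trials without a response by this point timed out

def expected_response(item_type):
    """'yes' is expected for attested combinations, 'no' for baseline items."""
    return "no" if item_type == "baseline" else "yes"

def prepare_rts(trials):
    """Keep correct, non-timed-out trials of at least 450 ms and add a log RT."""
    kept = []
    for t in trials:
        if t["response"] is None or t["rt_ms"] >= TIMEOUT_MS:
            continue  # timed out (coding of timeouts is assumed)
        if t["response"] != expected_response(t["item_type"]):
            continue  # incorrect response
        if t["rt_ms"] < MIN_RT_MS:
            continue  # implausibly fast
        kept.append({**t, "log_rt": math.log(t["rt_ms"])})  # natural log
    return kept

# Hypothetical trials (illustration only).
trials = [
    {"item_type": "free", "response": "yes", "rt_ms": 812},
    {"item_type": "collocation", "response": "no", "rt_ms": 1340},  # error
    {"item_type": "baseline", "response": "no", "rt_ms": 390},      # too fast
    {"item_type": "baseline", "response": None, "rt_ms": 4000},     # timeout
    {"item_type": "collocation", "response": "yes", "rt_ms": 975},
]
clean = prepare_rts(trials)
```

Only the first and last of these invented trials survive the filter; the remaining log RTs are what would enter the RT models.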



The next set of models, the ER models, was constructed using a mixed modeling

logistic regression procedure with ER treated as a binomial response variable. Responses were

classified as erroneous if the response was other than that expected (i.e. a ‘no’ response for a

free combination or a collocation and a ‘yes’ response for a baseline item) or if a timeout

occurred. The model identified through the backward stepwise procedure (and subsequent

model comparison procedure) is shown in Table 8. Table 9 shows the details regarding the

other models assessed in the backwards stepwise procedure. As can be seen in Table 8,

there were two main effects that significantly affected ERs. One was the difference in item

type, which indicated that free combinations generated significantly fewer errors than

collocations. The other, somewhat surprisingly, was the first versus second presentation of the

nouns in the test items. Specifically, these results indicated that participants were more likely

to produce an unexpected (i.e. erroneous) response the second time the noun appeared in an


item. As for the NSs versus the NNSs, there was no significant difference in main effect

between the two groups, nor were there any significant interactions between group and item

type, despite the obvious differences in patterns shown in Figure 3. In order to obtain direct

comparisons between the free combination ERs and the baseline ERs, we re-leveled the data

using free combinations as the reference level and ran the selected ER model once more. Not

surprisingly, this revealed significantly more errors for baseline items than for free combinations,

z = 3.87, Pr(>|z|) = .0001. Nonetheless, the additional interaction revealed through this re-

leveling, NS (vs. NNS) x baseline (vs. free combination), was again not significant, z = 1.20,

Pr(>|z|) = .2293. The lack of significant differences (p < .05) in interactions notwithstanding,

it is still worth noting that NSs tended to produce more erroneous responses (i.e. ‘yes’

responses to items that did not appear in the corpus) to the baseline items than the NNSs. In

brief, NSs were more permissive of items that were not attested in the corpus than were NNSs.
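For readers who wish to relate the z statistics reported above to their p-values, the following minimal Python sketch converts a z value into a two-tailed p-value under the standard normal distribution. It is offered only as an illustration; because the published z values are rounded, the results should be close to, but need not exactly match, the reported Pr(>|z|) values.

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a standard normal test statistic.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))

# z values as reported in the text (rounded to two decimals).
p_baseline_vs_free = two_tailed_p(3.87)
p_interaction = two_tailed_p(1.20)
```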



Finally, we constructed one more set of RT-based models and one more set of

ER-based models that took into account the frequency of the entire free combinations and entire

collocations (as opposed to the frequency of the individual words alone) to see if this would

also have a significant effect on processing. As noted above, previous research (e.g. Wolter &

Gyllstad, 2013) indicated that corpus frequency for collocations was linked in a predictable

way to collocational processing, both for NSs and advanced NNSs. In order to conduct these

analyses we first needed to eliminate the baseline items, since by definition these almost

always had frequency values of zero. Thus, the frequency models were constructed using only

the data for the free combinations and collocations. Since we had two measures of frequency

that were closely related (verbatim phrasal frequency and lemmatized frequency, see above),


we decided to construct two models for each response variable (i.e. RT and ER), and then

compare the models using a log likelihood ratio test. This process indicated a significantly

better fitting model for both the RTs and ERs using the lemmatized frequency values, so these

are the models we retained. The starting point for both the RT and the ER models was the

final models identified using the full set of data described above. Since group x item type

interactions were not found to be significant in either of the initial RT or ER models, these

interactions were eliminated for these models.

The results for the final RT model are shown in Table 10 with the final ER model

results shown in Table 11. As can be seen in Table 10, lemmatized frequency values affected

RTs in a predictable way: higher frequencies were associated with faster responses.

Furthermore, item type remained a significant predictor of RTs. Somewhat unexpected,

however, was the fact that trial and length ceased to be significant predictor variables. This

suggests that the effect these covariates had may have been more pronounced in respect to the

baseline items. Once the baseline items were removed, the effect was no longer significant.

Group was still not a significant predictor variable. In fact, a log-likelihood comparison of a

number of models revealed that the best fitting model included only two predictor variables:

lemmatized frequency and item type. As for the ER models, lemmatized frequency was not a

significant predictor of errors (Table 11). The final model was similar to the initial model in

that it included only two significant predictor variables: item type and noun occurrence.
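The log-likelihood comparisons of nested models referred to in this section rest on a likelihood ratio test. The sketch below shows the calculation in plain Python: twice the difference in log-likelihood is referred to a chi-square distribution whose degrees of freedom equal the number of parameters dropped. The log-likelihood values and the choice of two dropped parameters are hypothetical, and the sketch assumes both models were fitted by maximum likelihood to the same data.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum exp(-h) * sum(h**i / i!).
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    # Odd df: start from df = 1 and step up two degrees of freedom at a time.
    q = math.erfc(math.sqrt(h))
    term = math.sqrt(h) * math.exp(-h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        q += term
        term *= h / (i + 0.5)
    return q

def likelihood_ratio_test(loglik_reduced, loglik_full, df_diff):
    """Return the test statistic and p-value for two nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_diff)

# Hypothetical example: the fuller model adds two parameters to the reduced one.
stat, p = likelihood_ratio_test(-1499.6, -1498.9, df_diff=2)
```

A non-significant result in such a comparison indicates that the additional parameters do not improve the fit enough to justify retaining them, which is the logic behind dropping non-contributing predictors from the final models.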



The overall results of the study have some clear implications for our research

questions. To begin with, there did seem to be a processing cost for collocations when


compared to free combinations. However, this cost appeared to be unrelated to one’s status as

a NS or NNS; both groups demonstrated this tendency both in terms of faster RTs and lower

ERs on free combinations compared to collocations. This finding was bolstered by the fact

that group by item type interaction results were also non-significant. Furthermore, the results

of the analyses using lemmatized phrasal frequency were consistent with previous research, at

least as far as RTs were concerned, in that higher frequency was significantly associated with

faster RTs.

Discussion

We wanted to investigate whether there is a processing cost for collocations compared to free

combinations for advanced non-native speakers of English, and whether the pattern is the

same for native speakers. The study took Howarth’s (1996, 1998) phraseological

classifications of word combinations from the Continuum Model as its point of departure. An

overall aim was to investigate if the descriptive categories in this model would be reflected in

processing. The results showed that there was slower processing for collocations compared to

free combinations, not only for NNSs but also for NSs, and that there was also a higher error

rate. As such, the results lend support to the distinction made in the Continuum Model in

terms of expected differences in processing, at least as far as free combinations and

collocations are concerned. The results also showed that NS and NNS processing was very

similar in terms of reaction time for the two word combination conditions. The question is

how these results can be explained.

Based on our review of relevant processing models and previous research on

phraseological processing, our predictions were that NNSs would process collocations slower

than free combinations, whereas for NSs there would be no difference. However, these


predictions were not borne out. As we have seen, Wray (2002) has claimed that collocations

are formulaic for native speakers (NS) but not for non-native speakers (NNS). Additionally,

according to a dual route model (Wray, 2008; Van Lancker Sidtis, 2012), a formulaic

sequence has a stored mental representation in the mental lexicon, and therefore allows direct

retrieval. This provides a quicker route than computation, which is used for sequences that do

not have a stored representation. However, we did not observe this difference between the two

groups. The question is: What caused the processing cost for the collocations for both groups

of participants? We would like to venture the following explanation. For a delexical verb like

make, which in isolation is relatively void of specific meaning, there is initially processing

competition between a sense that projects as its object the concrete creation of a physical

object, and a sense which projects as its object a more abstract accomplishment. As the

processing continues to the noun, for example progress, it is only at this point that a

resolution can be made as to whether the more concrete or abstract sense should be supported.

This sense resolution is believed to come with a processing cost. Wittenberg & Piñango

(2011) observed a processing cost for light verb constructions (which would be the same kind

of construction as the delexical verb + noun sequence used in the present study) compared to

matched non-light constructions. They argued that the processing cost comes from the sense

interpretation being “combinatorial and built in real-time” (p. 406) rather than retrieved whole

from memory. The same kind of ambiguity cost would apply for figurative verb uses. For

example, in the processing sequence of a collocation like break a promise, an assumption is

that the word break does not appear in what could be called its prototypical semantic value

(Verspoor & Lowie, 1993), sometimes also called core meaning (Bahns, 1993; Nesselhauf,

2005). This meaning would be the literal ‘separating or cause to separate into pieces as a

result of a blow, shock, or strain’, prototypically having to do with physical objects. It stands

to reason that context plays an important role in sense resolution. However, in the absence of


any context that could bias a more figurative reading, that is, where the object of the breaking

is an abstract entity, like ‘law’ or ‘promise’, either a literal sense reaches higher activation

levels and subsequently has to be revised if the noun argument is an abstract entity, or a

literal and a figurative sense are activated simultaneously and compete with each other.

In both cases, there is an expected concomitant processing cost.

Interestingly, the observed processing cost for collocations in comparison to free

combinations can be seen to mirror the results obtained by El-Bialy, Gagné & Spalding

(2013) for compounds. These authors investigated the processing of different types of

compound nouns and found that compounds consisting of two semantically transparent

constituents (TT) or two semantically opaque constituents (OO) were processed differently

(quicker) than partially opaque compounds consisting of one transparent and one opaque

constituent (TO or OT). Thus, only partial opacity came with a processing cost. If we

extrapolate their semantic classification of compounds to the categories in Howarth’s (1996,

1998) model, then an item with two transparent constituents would correspond to ‘free

combinations’, and an item with one transparent and one opaque constituent would be akin to

the phraseological definition of ‘collocation’. Thus, in an initial comparison, our collocations

seem to have characteristics that are reminiscent of partially opaque compounds, and the

processing cost for these compared to the fully transparent category of free combination can

be seen to corroborate findings in the literature on compound processing (we may also note in

passing that a fully opaque item (OO) would be most similar to a pure idiom in Howarth’s

(1996, 1998) model). However, the compounds administered in El-Bialy, Gagné & Spalding’s

study appeared as contiguous letter strings, e.g., sugarcane, and previous studies on

compound processing (Libben, Gibson, Yoon & Sandra, 2003) have found that the insertion

of a space between constituents (intra-compound spacing), as in sugar cane, makes access

to the constituents’ meanings easier, resulting in computation (along the lines of the dual route


model presented in Section 1) and thus slower processing. The same observation is made in

Frisson, Niswander-Klement & Pollatsek (2008), who argue that introducing intra-compound

spacing leads to forced computation (assembly), in which case transparency matters. There is

also another difference between the partially opaque compounds used in studies of compound

processing and the collocations used in the present study; whereas one of the constituents in a

partially opaque compound is seen not to contribute to the meaning of the whole compound,

e.g. straw in strawberry, in a figurative verb use like break a promise, the meaning of the verb

must still be seen to contribute to the meaning of the whole word combination. It is therefore

necessary to exercise caution when it comes to seeing the two cases as fully comparable in

terms of processing. Nonetheless, in the absence of models specifically designed to explain

collocational processing, our obtained results are in line with the previous results from

research on compound processing.

Even though our approach to collocations in this study was rooted in the

phraseological tradition, the question of frequency still merits a brief discussion, more

specifically in relation to familiarity, i.e. how well known a particular word combination is to

an individual. Since frequency has been shown to affect language processing at all levels of

representation (Ellis, 2002), by using corpus data we made sure to control for the frequency of

the word combinations used in our experiment; matched lists of items were used in the

experimental conditions. In the initial models in the Results section (Tables 6 and 8) we did

not include any phrasal frequency values. When including such values in our final RT model,

however (Table 10), we observed that both item type and item frequency were significant

predictors of RT. Thus, not only semantic transparency but also phrasal frequency seems to

matter. Our frequency values were taken from a corpus and therefore do not represent specific

data on how familiar the items in the experiment were to our particular participants. However,

even though it is clear that corpus frequencies are proxies, our findings nevertheless have a


bearing on the effect of item familiarity on processing. Notably, familiarity has previously

been observed to affect recognition of individual words (Connine, Mullennix, Shernoff &

Yelen, 1990), and Tabossi, Fanari and Wolf (2009), in a study investigating processing of

idioms (non-compositional) and clichés (compositional) through a semantic judgement task,

found that familiarity had a large part in determining reaction times, irrespective of items

being compositional or not. Individual item familiarity is consequently a factor that needs to

be taken into account in future studies.

A further point for discussion is the NNS performance in relation to the NS control

group. The NNSs, Swedish learners of English, processed free combinations and collocations

with the same RTs as the native speaker group. At first blush, this may seem surprising.

However, the high proficiency level of the NNS group is seen both in the self-rated

proficiency estimates and the vocabulary size scores (see Note vi). These NNSs scored a corrected mean

of 3,333 out of a maximum of 5,000 (the test targets the 6-10K bands). On the assumption that

the first five thousand words are known, this would correspond to a total mean vocabulary

size of about 8,300 lemmas. This is a sizeable L2 vocabulary and clearly indicative of high

proficiency (see Note vii). In terms of collocational processing, the high level of NNS performance has

been observed also in previous studies. In both Wolter & Gyllstad (2011) and Wolter &

Gyllstad (2013), advanced Swedish university-level learners of English processed verb +

noun combinations (in a primed lexical decision task) and adjective + noun combinations (in

an acceptability judgement task), respectively, with about the same processing speed as

university-level native speakers of English. A very important point to note, though, is that this

only seems to apply to congruent collocations, i.e. collocations in which a word-by-word

translation of the constituent words in one language yielded a perfectly acceptable collocation

in the other. For incongruent collocations, on the other hand, no such similar processing

performance was observed in the two previous studies (Wolter & Gyllstad, 2011, 2013). In


terms of the present study, only using congruent items thus seems to have leveled the playing

field, as it were, enabling the non-native speakers to process L2 structures without any

potential negative cross-linguistic influence, and on a par with native speakers.

In conclusion, we have seen that semantic transparency affects processing of word

combinations, both for non-native and native speakers; more specifically, when defined along

the lines of the phraseological tradition, collocations were processed slower than free

combinations. We also saw that phrasal frequency predicted RTs. These results come with a

number of important implications. Theoretically, the results lend partial support to the

descriptive classifications made in Howarth’s (1996, 1998) phraseological continuum model.

In terms of future studies on word combination processing, these should also include idioms,

in addition to free combinations and collocations, in order to investigate how fully opaque

word combinations are processed in comparison (Gyllstad, in preparation). This is needed to

further validate the continuum model, and also since the results in L2 idiom research to date

are generally inconclusive. Another implication is that future studies on collocational

processing and learning should take semantic transparency into account, as a purely

frequency-based approach will fail to capture this important variable. Together with results

from previous studies on L2 collocational processing, where factors like congruency (Wolter

& Gyllstad, 2011, 2013) and frequency of input (Wolter & Gyllstad, 2013) have been shown

to be relevant, there is now an emerging knowledge base that can inform future studies. By

and large, it would seem that lack of congruency between the L1 and the L2 is the biggest

stumbling-block in L2 collocational processing, at least for more native-like learners. In

general, what seems to be needed are more comprehensive and empirically based models of

word combination processing and representation in an L2, in which learner proficiency

level is also taken into account. Finally, on a more practical and pedagogical level, we agree with

Webb, Newton & Chang (2013) in their call for studies that investigate learning of


collocations that vary in semantic transparency. As a first step, we have shown in our study

that semantic transparency affects processing of collocations, and future studies should shed

light on whether it also affects their acquisition, and as part of that, their comprehension.


Notes

i It should be noted that our operationalization in effect harmonizes with that of Laufer & Waldman

(2011), who place collocations on a continuum between free combinations and idioms, but who

refer to collocations as having “relative transparency of meaning” (in relation to idioms whose

meaning is much less transparent and very often opaque) and free combinations as word combinations

“in which the individual words are easily replaceable following the rules of grammar” (p. 648-9). See

Webb, Newton & Chang (2013) for a purely frequency-based operationalisation that does not take

semantic transparency into account.

ii One reviewer questioned the classification of some of the free combinations. We acknowledge that it

is sometimes difficult to distinguish between a word’s literal and more figurative or technical

meanings. Dictionary entries are not always reliable in this regard, as different dictionaries use

different criteria for ordering senses, and it is not invariably the case that the first listed meaning is a

literal one. We therefore relied on specialist rater judgements.

iii Mixed-effects models were chosen because they allow for the inclusion of both participant and item

as crossed random effects. This allows the researcher to account for individual differences in participants

(e.g. slow versus fast responders) and in items. It also eliminates the need for separate analyses with

participants as a random variable and items as a random variable (so-called F1 and F2 analyses).

iv In fact any random slope introduced into the model created convergence errors. However, the

random slopes for the core variables of item type and group were retained in the model to ensure

generalizability.

v P-values were calculated using a method described in Baayen (2008, p. 248). Under this method, a

degrees of freedom estimate is obtained by taking the number of observations and subtracting the

number of fixed factors included in the model.

vi We used a vocabulary size test as a proxy for overall proficiency, as vocabulary size has been shown

to correlate at a high level with measures of writing, listening, and reading skills in a language

(Alderson, 2005).

vii For similar performance levels, see Gyllstad (2011, 2012).


Acknowledgements

We would like to thank the anonymous reviewers and the journal editors for constructive

comments and suggestions on earlier drafts of the paper. Thanks also go to the audiences at

the Vocab@Vic conference in Wellington, New Zealand, 2013; the AAAL conference in

Portland, Oregon, US, 2014; and the Mental Lexicon conference in Niagara-on-the-Lake,

Canada, 2014, from whom we received helpful comments on earlier presentations of the study.

The responsibility for any remaining errors lies with the authors. Finally, we would like to

thank Sara Farshchi for her help with some of the data collection, and the participants of the

study for taking part.

Work on this study was supported by the Swedish Research Council by means of a project

grant (VR 2012 – 906) awarded to Henrik Gyllstad.

Correspondence concerning this article should be addressed to Henrik Gyllstad, English

Studies, Centre for Languages and Literature, Lund University, Box 201, 22100 Lund,

Sweden. E-mail: [email protected]


References

Abel, B. (2003). English idioms in the first language and second language lexicon: a dual

representation approach. Second Language Research, 19(4), 329–358.

Alderson, C. (2005). Diagnosing foreign language proficiency: The interface between

learning and assessment. London: Continuum.

Baayen, H. (2008). Analyzing linguistic data. Cambridge: Cambridge University Press.

Bahns, J. (1993). Lexical collocations: A contrastive view. ELT Journal, 47, 56–63.

Barfield, A. & Gyllstad, H. (2009). Introduction: Researching L2 collocation knowledge

and development. In A. Barfield & H. Gyllstad (Eds.), Researching collocations in

another language: Multiple interpretations (pp. 1–18). Basingstoke, UK: Palgrave

Macmillan.

Barton, K. (2014). MuMIn: Multi-model inference (Version 1.10.5) [computer software].

Bates, D., Maechler, M., Bolker, B. & Walker, S. (2014). lme4: Linear mixed-effects models

using Eigen and S4 (Version 1.1-7) [computer software].

Benson, M., Benson, E. and Ilson, R. (1997). The BBI dictionary of English word

combinations. Amsterdam: John Benjamins.

Biskup, D. (1992). L1 influence on learners’ renderings of English collocations: A

Polish/German empirical study. In P. Arnaud & H. Béjoint (Eds.), Vocabulary and

applied linguistics (pp. 85–93). Amsterdam: John Benjamins.

Cieślicka, A. (2006). Literal salience in on-line processing of idiomatic expressions by second

language learners. Second Language Research, 22(2), 115–144.

Columbus, G. (2013). In support of multiword unit classifications: Corpus and human rating

data validate phraseological classifications of three different multiword unit types.

Yearbook of Phraseology, 4, 23-43.


Connine, C. M., Mullennix, J., Shernoff, E., & Yelen, J. (1990). Word familiarity and

frequency in visual and auditory word recognition. Journal of Experimental

Psychology: Learning, Memory, and Cognition, 16(6), 1084-1096.

Conklin, K. and Schmitt, N. (2008). Formulaic sequences: Are they processed more quickly

than non-formulaic language by native and non-native speakers? Applied Linguistics, 29,

72–89.

Cowie, A.P. (1981). The treatment of collocations and idioms in learners’ dictionaries.

Applied Linguistics 2(3). 223-235.

Cowie, A.P. (1994). Phraseology. In R. E. Asher (ed.), The Encyclopedia of Language and

Linguistics, 3168-3171. Oxford: Pergamon.

Davies, M. (2008–). The corpus of contemporary American English: 450 million words,

1990–present. Retrieved from http://corpus.byu.edu/coca.

Durrant, P. & Schmitt, N. (2010). Adult learners’ retention of collocations from exposure.

Second Language Research, 26, 163–188.

El-Bialy, R., Gagné, C. L., & Spalding, T. L. (2013). Processing of English compounds is

sensitive to the constituents’ semantic transparency. The Mental Lexicon, 8(1), 75-95.

Ellis, N. C. (2002). Frequency effects in language processing. Studies in Second Language

Acquisition, 24(2), 143-188.

Ellis, N. C., Frey, E. & Jalkanen, I. (2009). The psycholinguistic reality of collocation and

semantic prosody (1): Lexical access. In U. Römer & R. Schulze (Eds.), Exploring the

lexis-grammar interface (pp. 89–114). Amsterdam: John Benjamins.

Farghal, M. & Obiedat, H. (1995). Collocations: A neglected variable in EFL. IRAL, 33(4),

315-331.


Frisson, S., Niswander-Klement, E. & Pollatsek, A. (2008). The role of semantic transparency

in the processing of English compound words. British Journal of Psychology, 99(1),

87–107.

Gibbs, R. and Gonzales, G. (1985). Syntactic frozenness in processing and remembering

idioms. Cognition, 20, 243–259.

Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and

formulae. In A. P. Cowie (ed.), Phraseology: Theory, analysis, and applications (pp.

145-160). Oxford: Oxford University Press.

Granger, S. & Meunier, F. (Eds.) (2008). Phraseology: An interdisciplinary perspective.

Amsterdam: John Benjamins.

Gyllstad, H. (2011). A short longitudinal study of foreign language vocabulary size – the case

of advanced Swedish learners of English. Paper presented at the 21st EUROSLA

conference in Stockholm, Sweden, 7-10 September, 2011.

Gyllstad, H. (2012). Validating the Vocabulary Size Test – A Classical Test Theory Approach.

Poster presented at the 9th EALTA conference in Innsbruck, Austria, 31 May – 3 June,

2012.

Hayes, A. F. and Krippendorff, K. (2007). Answering the call for a standard reliability

measure for coding data. Communication Methods and Measures, 1(1), 77–89.

Henriksen, B. (2013). Research on L2 learners’ collocational competence and development –

a progress report. In C. Bardel, C. Lindqvist & B. Laufer (Eds.), L2 vocabulary

acquisition, knowledge and use - new perspectives on assessment and corpus analysis

(pp. 29-56). Eurosla Monograph Series 2. Eurosla.

Henriksen, B. & Stenius Staehr, L. (2009). Processes in the development of L2 collocational

knowledge – A challenge for language learners, researchers and teachers. In A. Barfield


& H. Gyllstad (Eds.), Researching collocations in another language: Multiple

interpretations (pp. 224-231). Basingstoke and New York: Palgrave Macmillan.

Howarth, P. (1996). Phraseology in English academic writing. Tübingen: Max Niemeyer

Verlag.

Howarth, P. (1998). Phraseology and second language proficiency. Applied Linguistics, 19(1),

24–44.

Jiang, N. (2012). Conducting reaction time research in second language studies. London:

Routledge.

Kuiper, K. (2010). Editorial. Yearbook of Phraseology, 1(1), IX-X.

Laufer, B. & Waldman, T. (2011). Verb-noun collocations in second language writing: A

corpus analysis of learners’ English. Language Learning, 61(2), 647-672.

Libben, G., Gibson, M., Yoon, Y., & Sandra, D. (2003). Compound fracture: The role of

semantic transparency and morphological headedness. Brain and Language, 84(1), 50–

64.

Meara, P. (2005). Y_lex. Swansea: Lognostics.

Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some

implications for teaching. Applied linguistics, 24(2), 223-242.

Nesselhauf, N. (2004). What are collocations? In D. J. Allerton, N. Nesselhauf, &

P. Skandera (Eds.), Phraseological units: Basic concepts and their application (pp. 1–21).

Basel, Switzerland: Schwabe.

Nesselhauf, N. (2005). Collocations in a learner corpus. Amsterdam: John Benjamins.

R Core Team. (2012). R: A language and environment for statistical computing. Vienna,

Austria: R Foundation for Statistical Computing. (Version 3.1.2). [Computer software].

Schmitt, N. (2010). Researching vocabulary: A vocabulary research manual. Basingstoke:

Palgrave Macmillan.


Siyanova-Chanturia, A., Conklin, K. and Schmitt, N. (2011). Adding more fuel to the fire: An

eye-tracking study of idiom-processing by native and non-native speakers. Second

Language Research, 27(2), 251–272.

Siyanova-Chanturia, A., Conklin, K. and van Heuven, W. J. B. (2011). Seeing a phrase “time

and again” matters: The role of phrasal frequency in the processing of multiword

sequences. Journal of Experimental Psychology, 37(3), 776–784.

Siyanova, A. and Schmitt, N. (2008). L2 learner production and processing of collocation: A

multi-study perspective. Canadian Modern Language Review, 64(3), 429–458.

Skandera, P. (2004). What are idioms? In D. J. Allerton, N. Nesselhauf & P. Skandera (Eds.),

Phraseological units: Basic concepts and their application (pp. 23–36). Basel,

Switzerland: Schwabe.

Swinney, D. and Cutler, A. (1979). The access and processing of idiomatic expressions.

Journal of Verbal Learning and Verbal Behavior, 18, 523–534.

Tabossi, P., Fanari, R. and Wolf, K. (2009). Why are idioms recognized fast? Memory and

Cognition, 37, 529–540.

Underwood, G., Schmitt, N. and Galpin, A. (2004). The eyes have it: An eye-movement study

into the processing of formulaic sequences. In N. Schmitt (Ed.), Formulaic Sequences

(pp. 153–172). Amsterdam: John Benjamins.

Van Lancker Sidtis, D. (2012). Two-track mind: Formulaic and novel language support a

dual-process model. In M. Faust (Ed.), The handbook of the neuropsychology of

language (pp. 342–367). Chichester, UK: Wiley-Blackwell.

Verspoor, M. & Lowie, W. (1993). Making sense of polysemous words. Language Learning,

53, 547–586.

Wittenberg, E. & Piñango, M. M. (2011). Processing light verb constructions. The Mental

Lexicon, 6(3), 393–413.


Wolter, B. & Gyllstad, H. (2011). Collocational links in the L2 mental lexicon and the

influence of L1 intralexical knowledge. Applied Linguistics, 32(4), 430–449.

Wolter, B. & Gyllstad, H. (2013). Frequency of input and L2 collocational processing: A

comparison of congruent and incongruent collocations. Studies in Second Language

Acquisition, 35(3), 451–482.

Wolter, B. & Yamashita, J. (2014). Processing collocations in a second language: A case of

first language activation? Applied Psycholinguistics, 1-29. First view. DOI:

http://dx.doi.org/10.1017/S0142716414000113

Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University

Press.

Wray, A. (2008). Formulaic language: Pushing the boundaries. Oxford: Oxford University

Press.

Wray, A. (2012). What do we (think we) know about formulaic language? An evaluation of

the current state of play. Annual Review of Applied Linguistics, 32, 231–254.

Yamashita, J. & Jiang, N. (2010). L1 influence on the acquisition of L2 collocations: Japanese

ESL users and EFL learners acquiring English collocations. TESOL Quarterly, 44(4),

647–668.


APPENDIX

List of experimental items

Free combinations Collocations Baseline items

rent a car

write a letter

pick flowers

cause confusion

express an opinion

win a race

kick a ball

pay tax

exchange presents

sing a song

obey the law

light a fire

build houses

cross the river

solve a problem

catch a fish

lay eggs

feel pain

speak a language

eat meat

stop crying

hear voices

play the piano

lock the door

waste time

read a book

treat patients

change money

run a risk

break a promise

plant a bomb

draw conclusions

coin a phrase

suffer damage

breathe a word

throw light

give an order

keep a secret

carry the burden

open an account

seek asylum

hold a meeting

serve a purpose

seize the opportunity

bear fruit

set sail

cast doubt

fall victim

cut costs

take shape

make progress

shake hands

do business

have sex

cross a victim

shake money

catch crying

suffer an opinion

sing houses

solve a conclusion

plant a ball

feel sail

speak meat

eat a promise

break a letter

coin a race

draw law

lock a song

read a bomb

rent fire

carry a car

seize a river

light an account

play fruit

obey flowers

throw time

cut a meeting

cast a door

write a light

bear books

serve damage

pick confusion

express a piano

breathe a purpose

lay sex

hear a business

kick voices

win presents

take burden

open a pain

do a phrase

fall word

pay a problem

build tax

waste order

treat doubt

change a secret

make a risk

stop eggs

cause progress

set patients

run a shape

exchange costs

hold a language

keep opportunity

give hands

seek fish

have asylum


Tables and Figures

Table 1. Categories of lexical word combinations in Howarth’s Continuum Model (1998: 28); here, with examples of V+NP structures.

Free combinations | Restricted collocations | Figurative idioms | Pure idioms
pay a bill | pay a visit | pay the price | pay the piper


Table 2. Key characteristics of a dual route processing model.

Processing route | Used for | How | Characteristics
Holistic memory retrieval | Stored familiar phrases | Access of whole forms or simultaneous activation of component parts of a phrase | Quicker processing
Computation/analysis | Novel phrases (or stored familiar phrases, if required) | A word-and-rules approach | Slower processing


Table 3. Summary of test item means.

Item type | Free combinations | Collocations | Baseline items | Statistical comparison***
phrase frequency (lbf)†* | 5.6 (1.0) | 5.5 (1.5) | - | 1.19 .233
phrase frequency (vf)††* | 5.0 (.8) | 4.7 (1.5) | 0.13 (0.44)** | 1.24 .216
verb frequency* | 10.4 (1.0) | 10.9 (1.6) | 10.7 (1.3) | -1.32 .186
noun frequency* | 10.7 (1.0) | 10.4 (1.0) | 10.5 (1.0) | .926 .355
phrase length (in letters) | 12.2 (2.8) | 12.5 (2.9) | 12.0 (2.4) | -.349 .727

Note. Standard deviations are provided within parentheses.

† Frequency for whole phrase based on searches with the lemma of the verb (lbf).

†† Verbatim form of whole phrase (vf).

* Log-adjusted values (higher values represent greater frequency).

** Not log-adjusted value.

*** Mann-Whitney U test comparing the means from the free combination and collocation conditions. The values reported are the standardized test statistic and significance.


Table 4. Biographical data for participants.

Group | N | Age | Sex (M/F) | S* | L* | R* | W* | AoA** (yrs) | Y_Lex*** (raw) | Y_Lex*** (adj)
NNS | 27 | 23.8 (4.8) | 9/18 | 7.5 (1.4) | 8.4 (1.2) | 8.4 (1.5) | 7.4 (1.4) | 5.5 (2.6) | 3,870 (626) | 3,333 (772)
NS | 38 | 24.0 (8.9) | 15/23 | n.a. | n.a. | n.a. | n.a. | n.a. | 4,354 (426) | 3,146 (857)

Note. Standard deviations are provided within parentheses.

* Self-rated proficiency scores: 1 = none, 10 = nativelike; S = speaking, L = listening, R = reading, W = writing.

** Mean age of onset of acquisition.

*** Test of vocabulary size targeting the 6-10 K frequency bands (maximum score = 5,000).


Table 5. Mean response times in ms, standard deviations (in parentheses), and error rates [in brackets].

Group | Free combinations | Collocations | Baseline items
NNS (N = 27) | 1123 (461) [2.1%] | 1296 (663) [14.1%] | 1536 (731) [11.8%]
NS (N = 38) | 1164 (488) [2.5%] | 1283 (668) [15.7%] | 1551 (829) [19.9%]
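
As a purely descriptive aid, the raw processing cost for collocations relative to free combinations can be read off the group means in Table 5. The Python sketch below is an illustration only: the dictionary and function names are hypothetical, and the differences are computed from unadjusted means, not from the mixed-model estimates reported in Table 6.

```python
# Mean RTs in ms, copied from Table 5.
mean_rt = {
    "NNS": {"free": 1123, "collocation": 1296, "baseline": 1536},
    "NS": {"free": 1164, "collocation": 1283, "baseline": 1551},
}


def collocation_cost(group_means):
    """Raw difference in mean RT: collocations minus free combinations."""
    return group_means["collocation"] - group_means["free"]


# Hypothetical summary: one raw cost (ms) per participant group.
costs = {group: collocation_cost(means) for group, means in mean_rt.items()}

for group, cost in costs.items():
    print(f"{group}: {cost} ms slower on collocations than on free combinations")
```

Both groups show a positive difference, in line with the direction of the effect reported in the abstract; whether the difference is reliable is a question for the inferential models, not for this arithmetic.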


Table 6. Selected mixed model (after backward stepwise procedure) comparing NS and NNS RTs to collocations, free combinations, and baseline items. R² marginal = .15. R² conditional = .44.

Fixed effect | Estimate | Std. error | t-value | p-value
(Intercept) | 7.118115 | 0.036283 | -- | --
NS (versus NNS) | -0.007050 | 0.042132 | 0.17 | .87
Baseline (versus collocation) | 0.171771 | 0.036521 | 4.70 | <.0001
Free combination (versus collocation) | -0.137120 | 0.029505 | 4.65 | <.0001
Trial | -0.023730 | 0.003564 | 6.66 | <.0001
Length | 0.041358 | 0.009382 | 4.41 | <.0001
NS (vs. NNS) x baseline (vs. collocation)† | 0.032894 | 0.039606 | 0.83 | .41
NS (vs. NNS) x free comb. (vs. collocation)†† | 0.032230 | 0.023175 | 1.39 | .16

† Indicates differences in NS and NNS groups’ RTs when baseline means are compared to collocation means.

†† Indicates differences in NS and NNS groups’ RTs when free combination means are compared to collocation means.
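
The estimates in Table 6 appear to be on a logarithmic scale: the intercept of 7.118 corresponds to roughly 1,200 ms if exponentiated, which is in the range of the means in Table 5. Under that assumption (natural-log RTs, with NNS and collocation as reference levels and the Trial and Length covariates held at zero), the following Python sketch shows how the fixed effects could be translated back into milliseconds and proportional changes. The constant and function names are hypothetical, and the transformation itself is an assumption rather than something stated in the table.

```python
import math

# Fixed-effect estimates copied from Table 6.
INTERCEPT = 7.118115         # assumed reference cell: NNS, collocation
FREE_VS_COLL = -0.137120     # free combination versus collocation
BASELINE_VS_COLL = 0.171771  # baseline versus collocation


def to_ms(log_rt):
    """Back-transform a log RT to milliseconds (assumes natural log)."""
    return math.exp(log_rt)


def ratio(estimate):
    """Multiplicative change in RT implied by a log-scale estimate."""
    return math.exp(estimate)


reference_ms = to_ms(INTERCEPT)
free_ratio = ratio(FREE_VS_COLL)
baseline_ratio = ratio(BASELINE_VS_COLL)

print(f"reference cell (NNS, collocation): {reference_ms:.0f} ms")
print(f"free combination / collocation RT ratio: {free_ratio:.3f}")
print(f"baseline / collocation RT ratio: {baseline_ratio:.3f}")
```

Read this way, the free combination estimate would imply RTs roughly 13% shorter than for collocations in the reference group. Because the back-transformed values are conditional on the covariates and random effects, they need not match the raw means in Table 5.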


Table 7. Variables included in RT models derived from backwards stepwise procedure (with t-values).

Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5
NS (versus NNS) | 0.16 | 0.16 | 0.16 | 0.18 | 0.17
Baseline (versus collocation) | 4.74 | 4.74 | 4.71 | 4.70 | 4.70
Free combination (versus collocation) | 4.47 | 4.57 | 4.67 | 4.68 | 4.65
Trial | 2.86 | 2.86 | 2.87 | 4.35 | 6.66
Length | 4.20 | 4.3 | 4.4 | 4.41 | 4.41
Verb frequency (log adjusted) | 0.25 | -- | -- | -- | --
Noun frequency (log adjusted) | 0.87 | 0.93 | -- | -- | --
Verb occurrence (first vs. second) | 1.77 | 1.77 | 1.77 | -- | --
Noun occurrence (first vs. second) | 1.85 | 1.84 | 1.84 | 1.83 | --
NS (vs. NNS) x baseline (vs. collocation)† | 0.80 | 0.80 | 0.8 | 0.83 | 0.83
NS (vs. NNS) x free comb. (vs. collocation)†† | 1.43 | 1.43 | 1.44 | 1.44 | 1.39


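† Indicates differences in NS and NNS groups’ RTs when baseline means are compared to collocation means.

†† Indicates differences in NS and NNS groups’ RTs when free combination means are compared to collocation means.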




Table 8. Selected mixed model (after backward stepwise procedure) comparing NS and NNS ERs on collocations, free combinations, and baseline items. R² marginal = .17. R² conditional = .51.

Fixed effect | Estimate | Std. error | z-value | Pr(>|z|)
(Intercept) | -2.69690 | 0.34618 | -- | --
NS (versus NNS) | 0.22218 | 0.31494 | 0.705 | .48
Baseline (versus collocation) | -0.20212 | 0.42706 | 0.473 | .63
Free combination (versus collocation) | -2.21792 | 0.52814 | 4.199 | <.0001
Noun occurrence (second vs. first) | 0.18456 | 0.08658 | 2.132 | .03
NS (vs. NNS) x baseline (vs. collocation)† | 0.68334 | 0.39888 | 1.713 | .09
NS (vs. NNS) x free comb. (vs. collocation)†† | 0.04110 | 0.47355 | 0.087 | .93

† Indicates differences in NS and NNS groups’ ERs when baseline means are compared to collocation means.

†† Indicates differences in NS and NNS groups’ ERs when free combination means are compared to collocation means.
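
The z-values and Pr(>|z|) column in Table 8 suggest a logistic mixed model, in which case the estimates would be log-odds of an error. On that assumption (logit link, NNS and collocation as reference levels, the noun occurrence covariate and all random effects held at zero), the Python sketch below illustrates how the fixed effects map onto error probabilities and an odds ratio. The names are hypothetical and the model family is inferred from the table layout, not confirmed by it.

```python
import math

# Fixed-effect estimates copied from Table 8 (log-odds scale assumed).
INTERCEPT = -2.69690     # assumed reference cell: NNS, collocation
FREE_VS_COLL = -2.21792  # free combination versus collocation


def inv_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Conditional error probabilities for the assumed reference group.
p_collocation = inv_logit(INTERCEPT)
p_free = inv_logit(INTERCEPT + FREE_VS_COLL)

# Odds of an error on free combinations relative to collocations.
odds_ratio = math.exp(FREE_VS_COLL)

print(f"P(error | collocation): {p_collocation:.3f}")
print(f"P(error | free combination): {p_free:.4f}")
print(f"odds ratio (free vs. collocation): {odds_ratio:.3f}")
```

These conditional probabilities describe a typical participant and item in the model, so they are expected to be lower than the raw error rates in Table 5, which average over all participants and items.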



Table 9. Variables included in ER models derived from backwards stepwise procedure (with z-values).

Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6
NS (versus NNS) | 0.71 | 0.71 | 0.72 | 0.72 | 0.74 | 0.71
Baseline (versus collocation) | 0.48 | 0.47 | 0.49 | 0.48 | 0.46 | 0.47
Free comb. (versus collocation) | 4.17 | 4.19 | 4.20 | 4.20 | 4.19 | 4.20
Trial | 0.44 | 0.44 | 0.46 | 0.44 | -- | --
Length | 0.21 | 0.21 | 0.19 | -- | -- | --
Verb frequency (log adjusted) | 0.03 | -- | -- | -- | -- | --
Noun frequency (log adjusted) | 0.21 | 0.21 | -- | -- | -- | --
Verb occurrence (first vs. second) | 1.35 | 1.35 | 1.37 | 1.35 | 1.32 | --
Noun occurrence (first vs. second) | 1.8 | 1.86 | 1.86 | 1.87 | 2.46 | 2.13
NS (vs. NNS) x baseline (vs. collocation)† | 1.7 | 1.7 | 1.7 | 1.7 | 1.69 | 1.71
NS (vs. NNS) x free comb. (vs. collocation)†† | 0.09 | 0.09 | 0.09 | 0.1 | 0.08 | 0.09


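† Indicates differences in NS and NNS groups’ ERs when baseline means are compared to collocation means.

†† Indicates differences in NS and NNS groups’ ERs when free combination means are compared to collocation means.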




Table 10. Selected mixed model for RTs including lemmatized frequency ratings for free combinations and collocations. R² marginal = .03. R² conditional = .25.

Fixed effect | Estimate | Std. error | t-value | p-value
(Intercept) | 7.259741 | 0.063572 | -- | --
NS (versus NNS) | -0.109573 | 0.134963 | -0.81 | .42
Free combination (versus collocation) | -0.070538 | 0.024769 | 2.85 | .0045
Lemmatized phrasal frequency | -0.033348 | 0.010059 | 3.32 | .0009
Trial | -0.003635 | 0.009140 | 0.40 | .69
Length | 0.004807 | 0.011795 | 0.41 | .68


Table 11. Selected mixed model for ERs including lemmatized frequency ratings for free combinations and collocations. R² marginal = .15. R² conditional = .91.

Fixed effect | Estimate | Std. error | z-value | Pr(>|z|)
(Intercept) | -11.172 | 6.5481 | -- | --
NS (versus NNS) | 9.5260 | 6.4801 | 1.470 | .14
Free combination (versus collocation) | -2.1628 | 0.7592 | 2.849 | .0044
Lemmatized phrasal frequency | -0.3037 | 0.1966 | -1.545 | .12
Noun occurrence (second vs. first) | 0.6403 | 0.3304 | 1.938 | .0526


Figure 1. Sequence of presentation for items in the semantic judgement task.


Figure 2. Reaction time (RT) results on the semantic judgement task. FC = free combinations; COLL = collocations; BASELINE = baseline items.

[Figure: RT (ms) on the y-axis (1000–1600) by Item Type (FC, COLL, BASELINE) on the x-axis, plotted separately for NS and NNS.]


Figure 3. Error rate (ER) results on the semantic judgement task. FC = free combinations; COLL = collocations; BASELINE = baseline items.

[Figure: Error Rate on the y-axis (0%–25%) by Item Type (FC, COLL, BASELINE) on the x-axis, plotted separately for NS and NNS.]
