Syntactic flexibility and competition in sentence production: the case of English and Russian

Running Head: COMPETITION IN SENTENCE PRODUCTION

Andriy Myachykov, Christoph Scheepers, Simon Garrod, Dominic Thompson, and Olga Fedorova

Author note

Andriy Myachykov, Department of Psychology, Northumbria University, Newcastle upon

Tyne.

Christoph Scheepers, Simon Garrod, and Dominic Thompson, Institute of Neuroscience and

Psychology, University of Glasgow.

Olga Fedorova, School of Psychology, Moscow State University.

This research was supported by the ESRC grants PTA-026-27-1579 awarded to Andriy

Myachykov and RES-062-23-2009 awarded to Christoph Scheepers.

The authors gratefully acknowledge Victor Shklovsky and Maria Ivanova at the Russian National

Center of Speech Pathology and Neurorehabilitation for their help in data collection for Study

2 and Oliver Garrod at the University of Glasgow for his help with creating the script for automatic eye-

voice span extraction.

Correspondence concerning this article should be addressed to Andriy Myachykov,

Department of Psychology, Northumbria University, Northumberland Building, Newcastle upon

Tyne, NE1 8ST, United Kingdom, Tel.: +44-191-227-31-58, Fax: +44-191-227-45-15, e-mail:

[email protected]

Word Count: 8264

Abstract

We analyzed how syntactic flexibility influences sentence production in two different languages –

English and Russian. In Study 1, speakers were instructed to produce as many structurally different

descriptions of transitive-event pictures as possible. Consistent with the syntactically more flexible

Russian grammar, Russian participants produced more descriptions and used a greater variety of

structures than their English counterparts. In Study 2, a different sample of participants provided

single-sentence descriptions of the same picture materials while their eye-movements were

recorded. In this task, English and Russian participants almost exclusively produced canonical SVO-

active-voice structures. However, Russian participants took longer to plan their sentences, as

reflected in longer sentence onset latencies and eye-voice spans for the sentence-initial Subject

noun. This cross-linguistic difference in processing load diminished toward the end of the sentence.

Stepwise GLM analyses showed that the greater sentence-initial processing load registered in Study

2 corresponded to the greater amount of syntactic competition from available alternatives (Study 1),

suggesting that syntactic flexibility is costly regardless of the language in use.

167 words

Keywords: syntactic flexibility, competition, sentence production, English, Russian


Syntactic Flexibility and Competition in Sentence Production

This paper addresses two important issues in sentence production: (1) whether speakers necessarily

activate the inventory of structural alternatives available for the description of a given event in the

grammar of their language and (2) whether activating these structural alternatives leads to

competition between them in the speaker’s mind. These questions are motivated by the fact that

syntactic planning may involve selection among syntactic alternatives that are equally felicitous

with regard to a given event’s semantics, but highlight its properties differently. Theoretically,

the availability of structural alternatives enables speakers to convey subtle event parameters, for

example, to promote some of an event’s referents and demote others. Each language provides its speakers

with a different inventory of available structural choices. Consider how speakers of two

morphologically distinct languages – Russian and English – could describe the transitive event

portrayed in Figure 1.

Research suggests that speakers of both languages strongly prefer the canonical active-voice

SVO (subject-verb-object) frame (for Russian: e.g., Baylin, 1995; Bivon, 1971; Timberlake, 2004,

for English: e.g., Svartvik, 1966). Hence, in a “neutral” context (e.g., when the event is completely

novel and no prior context is provided), both English and Russian speakers are likely to describe it

using sentences such as “A cowboy is punching a boxer” (English) and “Kovboj b’jot boksera”

(Russian). If, however, the English speaker wants to promote the boxer and demote the cowboy, she

might use a passive voice construction (e.g., A boxer is (being) punched by a cowboy). In the

presence of a strongly biasing semantic context, an English speaker could also describe the event by

using, for example, a cleft construction, such as “It is the boxer that the cowboy is punching” or “It

is the boxer who is (being) punched by the cowboy”. Cleft sentences, however, are extremely rare in

naturally occurring speech (Collins, 1991; Roland, Dick, & Elman, 2007). They typically require a


strongly biasing contrastive context in which the true agent or patient of the described event is

selected among several alternative ones (Collins, 1991; Nelson, 1997).

In contrast to English, a speaker of Russian has a much wider range of structural alternatives

available to her if she wants to describe the same transitive event. First, as in English, she can

choose between active- and passive-voice frames, even though the Russian passive tends to be used

more rarely than in English (e.g., Krylova & Khavronina, 1988; Zemskaja, 1973). Rather, she may

decide to scramble the linear order of constituents, thereby changing their positions in a sentence.

Scrambling makes any permutation (SVO, SOV, OVS, OSV, VSO, or VOS) grammatical.1 Studies

of Russian language corpora indicate that canonical SVO structures are most likely to be produced;

at the same time, alternative word orders are also commonly found (Bivon, 1971; Timberlake,

2004). Russian nouns in nominative case represent the morphological base form but the assignment

of other cases typically requires overt inflexion (e.g., bokser [nominative]; bokser-a [accusative];

bokser-u [dative]). Due to explicit case marking, constituents in a Russian sentence can be

positioned relatively freely. The syntactic contrast between Subject and Object, for example, is

determined by morphological case inflexion on the Object. English, on the other hand, is a language

in which syntactic functions of constituents are mostly defined in terms of their relative positioning

in a sentence (overt case marking is only observed in pronouns).
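The six word orders listed above are simply the permutations of the three constituents S, V, and O. As a minimal illustrative sketch (not part of the original study; the function name is hypothetical), they can be enumerated as follows:

```python
from itertools import permutations

def constituent_orders(constituents="SVO"):
    """Return every linear order of the given constituents as a string."""
    return ["".join(order) for order in permutations(constituents)]

# Three constituents yield 3! = 6 possible linear orders.
orders = constituent_orders()
print(orders)
```

The sketch only counts linear orders; it says nothing about how frequent or how pragmatically marked each order is in actual Russian usage.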

Hence, while English and Russian share comparable canonical frames (SVO active), they

provide different degrees of syntactic flexibility to their speakers: At least grammatically, Russian

speakers have more structural options available to them than English speakers. But do Russian

speakers actively make use of their wider structural inventory, or are those alternatives only

activated in very circumscribed scenarios? Indeed, available corpus data suggest that non-canonical

structures are used quite regularly in Russian. First, English speakers seem to use canonical SVO-

1 Cleft constructions are also possible in Russian. However, as with English, such constructions are extremely unlikely to be considered by Russian speakers in the absence of strong contextual constraints.

active constructions more frequently (94%, e.g., Svartvik, 1966) than Russian speakers (79%, e.g.,

Bivon, 1971). Second, as far as the distribution of non-canonical alternatives is concerned, passive-

voice constructions typically account for ~5-6% of agent-patient structures in English (Roland,

Dick, & Elman, 2007; Svartvik, 1966), while some studies on Russian report that sentences with

non-canonical word orders may account for up to 50% of agent-patient sentences. The data vary

substantially depending on the corpus used. Bivon (1971), for example, reports the following

frequencies from a Russian transitive-sentence corpus: SVO 79%, OVS 11%, OSV 4%, VOS 2%,

SOV 1%, Passive Voice <1%. A more recent study (Timberlake, 2004) reports the following

distribution: SVO 46%, SOV 30%, OVS 14%, OSV 1.7%, VSO 3.6%, VOS 4.7%, Passive Voice

0%. In any case, alternatives to the canonical SVO-active construction seem to account for a

minimum of ~20% in Russian, and for only ~5% in English, suggesting greater syntactic flexibility

in Russian.
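The arithmetic behind these estimates can be made explicit. The following sketch uses only the percentages quoted above and treats the non-canonical share as the complement of the SVO share; the variable and function names are hypothetical, and Bivon's passive-voice share is omitted because only "<1%" is reported.

```python
# Percentages as quoted in the text. Bivon's passive share (<1%) is left out
# because no exact value is given.
bivon_1971 = {"SVO": 79, "OVS": 11, "OSV": 4, "VOS": 2, "SOV": 1}
timberlake_2004 = {"SVO": 46, "SOV": 30, "OVS": 14,
                   "OSV": 1.7, "VSO": 3.6, "VOS": 4.7}

def non_canonical_share(distribution, canonical="SVO"):
    """Share (in %) of all structures other than the canonical one,
    computed as the complement of the canonical share."""
    return round(100 - distribution[canonical], 1)

print(non_canonical_share(bivon_1971))       # lower estimate for Russian
print(non_canonical_share(timberlake_2004))  # upper estimate for Russian
```

On these figures the Russian non-canonical share ranges from roughly one fifth to roughly one half of transitive sentences, which is the contrast with the English figure of about 5-6% drawn in the text.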

Further evidence for a more flexible use of non-canonical alternatives in Russian comes

from a recent study by Vasilyeva and Waterfall (2012). They employed a structural priming

paradigm to investigate the differential properties of passive-voice priming in English versus

Russian children and adults. Rarely considered cleft constructions aside, passive-voice is virtually

the only way to promote the patient and demote the agent in English, while Russian speakers can

employ passive-voice, active-voice constructions with fronted patients (OVS, OSV), imperfective

passives (e.g., dom stroilsja – The house was being built), or impersonal patient-promoting actives

(e.g., dom postroili – (They) house built) to the same effect. The findings showed that Russian and

English speakers responded differently to exposure to passive-voice primes: While English speakers

displayed a classic passive-voice priming effect (more passive-voice uses in the target after a

passive-voice prime), Russian speakers showed a much wider distribution of different patient-

promoting structures in the target, with scrambled patient-initial constructions (16%) actually

accounting for a higher percentage of responses than passive-voice constructions (6%) (Vasilyeva &

Waterfall, 2012, Experiment 3). This confirms that Russian speakers actively employ a wider range

of patient-promoting syntactic alternatives than English speakers.

Given that the two languages display different degrees of syntactic flexibility, the question

arises how this would affect the process of sentence planning and the actual time course of sentence

production in English versus Russian. Theoretically, there are two possibilities. On the one hand,

greater flexibility might lead to more syntactic competition, thus slowing down sentence

production. Alternatively, it could make incremental sentence production easier and therefore faster.

A competition account (e.g., Dell & O’Seaghdha, 1994; McClelland & Rumelhart, 1981;

Stallings, MacDonald, & O’Seaghdha, 1998) assumes that alternative syntactic plans become

simultaneously pre-activated and compete with one another. According to this view, the more

alternatives the speaker has, the more time it should take to choose between them because the

speaker does not only need to select the preferred structure among competitors, she also needs to

inhibit the latter. For example, if at the point of Subject selection, Speaker A has more alternative

continuations to consider than Speaker B, then Speaker A should be slower to make a final

commitment to one of such continuations.

In support of this account, experimental studies have found that speech errors and hesitations

are more likely to occur at the beginning of an utterance than at the end (Barr, 2001; Beattie, 1979;

MacKay, 1970; Maclay & Osgood, 1959). Such sentence-initial disfluencies may indicate higher

cognitive load due to the necessity to make syntactic choices during early stages of sentence

planning. Also, the likelihood of hesitations and pauses in the production flow appears to be affected

by the complexity of the word choices made by the speaker. In one study (Schachter et al., 1991),

the authors analyzed the number and time-courses of pauses made by lecturers during their classes.

The results showed that an increase in the overall number of choices available to speakers was

associated with an increase in the frequency of pauses and hesitations. Pauses and hesitations are

also more likely to occur at the beginning of long sentences than at the beginning of shorter ones

(Clark & Fox Tree, 2002; Oviatt, 1995; Shriberg, 1996). Other studies point to the fact that complex

syntactic structures take longer to plan than simpler ones (e.g., Allum & Wheeldon, 2007;

Nottbusch, 2010; Konopka, 2012; Smith & Wheeldon, 1999), that initial verb selection may occur

before the onset of the noun-verb complex (Kempen & Huijbers, 1983; Lindsley, 1975), and that the

scope of advance structural planning is not fixed but flexible, as it can be expanded under increased

cognitive load (Wagner, Jescheniak, & Schriefers, 2010). Put together, this evidence suggests that

the overall complexity of the planned sentence affects the cognitive load experienced at the initial

stages of sentence planning and that a significant part of sentence planning (accompanied by a

higher processing load) happens before speakers start articulating the sentence.

An opposite view is advocated by what we will refer to as the “opportunistic” account (e.g., V.

Ferreira, 1996). An opportunistic account assumes that sentences are constructed in a piecemeal

fashion with a limited amount of global pre-planning. According to the opportunistic view, having

more options available at any given point should facilitate production, making speakers’ choices

easier and faster. In support of this claim, V. Ferreira (1996) demonstrated that a wider range of

syntactic choices facilitates generation of English ditransitive sentences. In this study, participants

completed sentence fragments containing an alternating or a non-alternating verb:

(a) I gave…

(b) I donated…

The use of the verb gave in (a) leaves two possible continuations: a Prepositional Object (PO)

continuation (e.g., I gave the toys to the children) or a Double Object (DO) continuation (e.g., I gave

the children the toys). A verb like donate only allows for a PO continuation (e.g., I donated the toys

to the children). Hence, the two verbs differ in syntactic flexibility, with gave being more flexible

than donated. Ferreira demonstrated that English speakers were faster (and less error prone) to

complete sentences containing gave than sentences containing donated. This result supports an

opportunistic view of sentence generation, according to which sentences are constructed in a

piecemeal fashion without mandatory consideration of syntactic alternatives. However, it is

important to note that in V. Ferreira’s study, participants were instructed to produce sentences (1) as

quickly as possible and (2) without producing mistakes and disfluencies. A more recent study by F.

Ferreira and Swets (2002) actually showed that speakers tend to plan sentences in full (and,

therefore, weigh their global syntactic choices) when they are producing sentences in the absence of

any time pressure constraints, whereas under time pressure, sentence formulation proceeded in a

more opportunistic, “race-based” fashion (“the first horse over the line wins”). Hence, the scope of

syntactic planning and the associated competition may depend on task demands such as time

pressure.

It is also unclear whether syntactic competition during sentence generation is universal

across structures and languages. Ultimately, opportunistic accounts might assume that piecemeal

sentence formulation is a fundamental and universal feature of the speaker’s strategy. Therefore, it

should occur with production of syntactic structures other than the ditransitive sentences used in

Ferreira (1996) and in languages other than English. In other words, a speaker of a language more

structurally flexible than English should also be faster, and more accurate, in making syntactic

choice decisions than her English counterpart because her language makes more syntactic

alternatives available to her. At first approximation, this does not seem to be the case.

First, data from Odawa, a free word-order language with a wider syntactic inventory than

English, provided evidence against radically opportunistic views of language production

(Christianson & Ferreira, 2005). Odawa is a language with fully flexible word order for which

‘radically’ versus ‘mildly’ incremental production models generate different predictions. The former

do not assume much pre-planning of global syntactic structure. An at least implicit prediction from

radically incremental models is therefore that the most easily accessible referent would be the first

to be lexicalized without necessarily prescribing its constituent role in the sentence. In a situation

where the most accessible referent of a transitive event is the patient, speakers of Odawa have

multiple options available to them, including direct and inverse word order and also a passive voice

form comparable to English. Although the former two frames are generally more frequent in Odawa,

participants in the Christianson & Ferreira (2005) study actually preferred to promote the patient to the

Subject of a passive-voice sentence, suggesting a degree of global planning beyond simply using the

most accessible referent as the sentence-initial NP.

Second, a study by Myachykov & Tomlin (2008) found that speakers of Russian were

slower than their English counterparts in initiating both canonical SVOs and scrambled sentences

when describing transitive events in the same simple perceptual-priming production task. Hence,

at least in these two languages (with greater freedom of syntactic choice than English), greater

flexibility did not facilitate production, but rather hampered it.

Of course, the results by V. Ferreira (1996) and by Myachykov & Tomlin (2008) are not

directly comparable because they used different syntactic structures (ditransitive vs. transitive) and

different experimental paradigms (sentence completion vs. perceptually cued event description). In a

sentence completion task, initial constituents are already available whereas in a perceptual cueing

task they are not. This procedural difference is important because syntactic flexibility is likely to

change over time during incremental sentence generation. The lack of directly comparable data from

English and Russian speakers performing the same production task motivates the two studies

reported in this paper. Using a task that explicitly ‘encouraged’ structural flexibility, Study 1

investigated whether Russian speakers would produce more structural alternatives than English

speakers when describing the same set of depicted events. Study 2 employed the same stimuli in a

free picture description task combined with eye-tracking. Here, we were interested in the cognitive

effort associated with sentence production in each language, as measured in sentence-onset latencies

and referent-related eye-voice spans (EVS). Moreover, we combined the structural flexibility data

from Study 1 and the latency data from Study 2 to establish (on a by-item basis) whether and to

what extent the former can predict the latter.


To generate testable predictions, it is useful to start with an illustration of the structural

choices that are (at least theoretically) available to Russian versus English speakers when they want

to describe a picture such as Figure 1. Figure 2 displays the set of available structural choices in

Russian, and Figure 3 the set of available choices in English. The point S in each figure represents

the sentential starting point. As explained earlier, Russian is a free word-order language, and so the

Russian grammar provides six options to start with at this point, whereas English (realistically) has

only two options available. Hence, at least when encouraged to produce a wide range of structurally

different descriptions, Russian speakers should be able to produce more such alternatives than

English speakers. This will be our working hypothesis for Study 1.

According to opportunistic accounts, greater structural flexibility should benefit Russian

speakers because, regardless of how they start the sentence, they would always have a considerable

range of choices available during incremental production. By contrast, a competition account

predicts that a Russian speaker would have to entertain several competing syntactic frames in

parallel, which would slow down the final selection process. Let us assume that both English and

Russian speakers would conceptualize the event in Figure 1 as “agent-driven”. In English, this

commitment leaves only one available option – SVO active (see Figure 3). The Russian grammar,

on the other hand, is far more flexible in that speakers can choose between four different options if

they intend to start with the agent: SVO active, SOV active, OVS passive, and OSV passive (cf.

Figure 2). If Russian speakers further commit themselves to assigning nominative case to the initial

agent-NP, then this still leaves a choice between SVO active and SOV active; the more word-order

and case marking commitments that are being made, the fewer the options that are left available.

Hence, according to opportunistic accounts, greater structural flexibility during initial stages of

sentence formulation should benefit Russian speakers, leading to faster sentence-onset latencies

and shorter eye-voice spans for the initial constituent. Also, as this flexibility diminishes down the

production stream, Russian eye-voice spans should increase. According to competition accounts, the


opposite pattern should be observed: Russian speakers should experience more load at the beginning

of their sentences (measurable in slower sentence-onset latencies and longer eye-voice spans at the

initial constituent) than English speakers. The eye-voice spans should incrementally decrease as

competition-related load diminishes towards the end of the sentence. This set of predictions

motivates alternative hypotheses for Study 2.

Study 1

In this study, 12 Russian and 12 English speakers were asked to describe a set of

transitive event pictures (see Figure 1), producing as many structurally different (but semantically

appropriate) descriptions of each picture as possible within a given time frame (15 seconds per

picture). Examples of such structural alternatives were provided in the instructions (6 for the

Russian speakers, and 6 for the English speakers). The question was whether Russian speakers

would display greater structural flexibility per item than English speakers when encouraged to be

syntactically creative.

Participants

Twenty-four participants (18 female; 6 male) were tested in individual sessions, each lasting

approximately 45 minutes. Twelve participants were native speakers of Russian, and twelve were

native speakers of English. All received subject payment or course credits for their participation.

The Russian-speaking participants were undergraduates at Moscow State University and the

English-speaking participants were undergraduates at the University of Glasgow. The mean age of

participants was 21.5 years.


Materials

The stimuli were cartoon-like black and white line drawings showing various human characters in

different activities or events (see Figure 1). The target pictures had 17 different human characters

acting as protagonists in five different transitive events: pulling, punching, pushing, touching, and

shooting. There were 40 critical target items (eight different protagonist-pairings per transitive

event) and 82 filler pictures. The latter were pictures of intransitive events that always involved only

one character. Pictorial materials were controlled for size and position of referent. Half of the critical

target items showed the agent on the left and the other half showed the agent on the right of the

patient; orientation per item was counterbalanced across subjects.

The items were presented in a fixed quasi-random order. There were four filler pictures at

the beginning of each session and each target picture was preceded by a minimum of two fillers. All

items were displayed centrally on the screen. Russian and English participants were given the same

instructions (in their native language) and were presented with the same set of picture materials.
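The item counts and ordering constraints described above fit together arithmetically. The sketch below builds one trial sequence that satisfies them. It is an illustration only: the actual quasi-random order is not reported, and placing exactly two fillers before each later target is an assumption (the text states only a minimum of two). All names are hypothetical.

```python
N_TARGETS = 40           # 5 transitive events x 8 protagonist pairings
N_FILLERS = 82           # intransitive filler pictures
LEAD_IN_FILLERS = 4      # fillers opening each session
FILLERS_PER_TARGET = 2   # assumed: exactly the stated minimum before later targets

def build_sequence():
    """Return a list of 'F' (filler) / 'T' (target) labels that respects
    the stated ordering constraints."""
    sequence = ["F"] * LEAD_IN_FILLERS + ["T"]
    for _ in range(N_TARGETS - 1):
        sequence += ["F"] * FILLERS_PER_TARGET + ["T"]
    return sequence

sequence = build_sequence()
print(sequence.count("T"), sequence.count("F"))
```

Under this assumption the sequence uses 4 + 39 × 2 = 82 fillers, which matches the number of filler pictures reported.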

Apparatus and Procedure

The experiment was implemented in Microsoft PowerPoint. Experimental materials were presented

on a 17" LCD monitor. Participants used the spacebar to initiate each trial. A second computer was

concurrently used by the experimenter to code the participant responses.

Participants sat in front of the display computer throughout the experiment. Before the main

experimental session, each participant was run through a practice session consisting of two parts.

First, in order to familiarize them with the protagonists they would encounter in the main session (as

well as their labels), participants were presented with pictures of the individual referent characters

together with their names written at the bottom of each picture. Participants were instructed to read

out the referent names and to remember them for the following task.


In the second part of the practice session, participants were presented with a screen

displaying a transitive event (different from those in the main session) and six syntactically different

ways of describing that event printed underneath. Russian participants were given the canonical

SVO-active structure (e.g., Kovboj b’jot boksera), plus five scrambled alternatives of the same

sentence – OVS, VSO, SOV, OSV, and VOS, all in present tense active voice. Clearly, it is more

difficult to come up with as many naturally occurring structural alternatives in English. However, in

order to make the procedure maximally comparable, we gave English participants six different

examples as well, including the canonical SVO-active structure (e.g., A cowboy is punching a

boxer), passive-voice (e.g., A boxer is being punched by a cowboy), a clefted-patient active voice

structure (e.g., It is a boxer who a cowboy is punching), a clefted-agent active voice structure (e.g.,

It is a cowboy who is punching a boxer), a clefted-verb active voice structure (e.g., Punching a

boxer is what the cowboy is doing), and a clefted-verb passive voice structure (e.g., Getting punched

by the cowboy is the boxer). After reading these examples aloud, participants described ten practice

event pictures (different from those in the main session), each in as many different ways as they

could think of. Four of these practice pictures were transitive events (comparable to the critical

stimuli) and six were intransitive events (comparable to the filler materials). Importantly,

participants were not limited to using only the structures suggested in the six examples at the

beginning; they were explicitly encouraged to be as creative as possible.

The instruction for the experimental session was to produce as many structurally different

descriptions of each picture as possible within the allocated time limit (15 seconds per picture). Each

description should be a non-truncated single sentence in present tense. Each description for a given

event should make consistent use of a single verb (i.e., Cowboy hits the boxer and Cowboy punches

the boxer would not count as different variants). Each trial began with the presentation of a central

fixation dot. Participants initiated the picture display for each trial by pressing the spacebar. Each

item was presented for a maximum of 15 seconds (both targets and fillers) during which participants


orally produced event descriptions. Participants had the option of moving on to the next trial by

pressing the spacebar if they felt they had exhausted the range of possible descriptions for the

current item.

Results and Discussion

Participants’ descriptions of the critical trials were coded in terms of syntactic structure. Target

descriptions that did not conform to the experimental instructions (e.g., There are a boxer and a

swimmer in the picture) accounted for less than 2% and were excluded from further analyses. Also,

we included only distinct (i.e., unrepeated) structural alternatives produced in each individual trial,

and variants that only differed in the use of adjectives or adverbs (e.g., a cowboy punches a boxer

versus a grumpy cowboy punches a boxer) were counted as the same structure.

Table 1 shows the distribution of different syntactic forms produced by the English and

Russian participants in Study 1. While canonical SVO-active was the most frequent response in both

English and Russian, this type of description accounted for a higher percentage of responses in

English than in Russian (in line with the corpus data discussed in the introduction). Also, Russian

speakers produced a minimum of three different non-canonical description types with a frequency of

more than 10%, whereas for the English speakers, passive voice clearly dominated the range of non-

canonical alternatives produced. This shows that Russian participants made active use of a wider

inventory of structural alternatives than their English counterparts.

Table 2 displays by-item means of (i) numbers of syntactically different sentence types

produced (Types), (ii) numbers of canonical tokens produced (NC), (iii) numbers of non-canonical

tokens produced (NN), and (iv) the log-ratio of non-canonical over canonical tokens (ln(NN/NC)).

Also shown are the results of within-items t-tests examining the effect of language in each measure,

together with 95% CIs for the cross-linguistic difference. Note that dividing the NC and NN values

by 12 yields average counts per participant. As can be seen, the Russian participants produced a


greater variety of different syntactic types in their picture descriptions than the English participants.

There was no reliable difference in the number of canonical tokens (each participant produced about

one canonical description per item), but a very clear difference in the number of non-canonical

tokens per item, which was about twice as high in Russian as in English (indeed, English

participants were more likely to cut the trial short as they were running out of ideas). The latter is

also reflected in the average non-canonical over canonical ratio per item, the logarithm of which

(ln(NN/NC)) was used as a measure of competition in the correlational analyses reported as part of

Study 2. Although they were given comparable instructions (including the same number of non-

canonical examples), the same practice session, and the same time limit per item, Russian speakers

were much more productive than their English counterparts, apparently due to the greater syntactic

flexibility of the Russian language.
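
The by-item measures reported in Table 2 can be illustrated with a short sketch. It is not the authors' analysis code: the function name, the structure labels other than SVO-active, and the token counts are hypothetical, and it assumes that the log-ratio is computed from token counts pooled over participants for each picture.

```python
import math
from collections import Counter

def flexibility_measures(descriptions, canonical="SVO-active"):
    """Return (types, NC, NN, ln(NN/NC)) for one picture.

    `descriptions` is a list of structure labels pooled over participants
    (a hypothetical data format chosen for this illustration).
    """
    counts = Counter(descriptions)
    n_types = len(counts)                      # syntactically different types
    nc = counts[canonical]                     # canonical tokens
    nn = sum(n for label, n in counts.items() if label != canonical)
    if nc == 0 or nn == 0:
        raise ValueError("log-ratio is undefined when a count is zero")
    return n_types, nc, nn, math.log(nn / nc)

# Made-up counts for one picture: 12 canonical and 24 non-canonical tokens.
item = (["SVO-active"] * 12 + ["passive"] * 10
        + ["OVS-active"] * 8 + ["cleft"] * 6)
types, nc, nn, log_ratio = flexibility_measures(item)
```

With these made-up counts, the non-canonical tokens outnumber the canonical ones two to one, so the log-ratio equals ln(2); an item with equal counts would yield 0, and an item dominated by canonical descriptions would yield a negative value.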

Study 2

In this study, we employed a free picture description task combined with eye-tracking to address the

main question of this paper: Does greater structural flexibility in Russian incur higher processing

costs (competition between structural alternatives for selection) or is greater flexibility actually

beneficial to the speaker (opportunistic production)?

Participants

Fifteen native speakers of English (7 female) and 15 native speakers of Russian (10 female)

participated in the study. All participants had normal or corrected-to-normal vision. English

participants (mean age 21.4 years) were undergraduate students at the University of Glasgow.

Russian participants (mean age 27.1 years) were members of staff at the National Center of Speech

Pathology and Neurorehabilitation, Moscow.


Design and Materials

In Study 2, we implemented a within-item/between-participant design (with Language as a quasi-

experimental factor) similar to Study 1. The dependent variables were (1) the probability of

producing an SVO-active structure, (2) the temporal lag between picture onset and the onset of the

verbal description (henceforth called sentence onset latency), and (3) the temporal lag between

having finished visual inspection of a referent and producing that referent’s name (henceforth

referred to as eye-voice span). Picture materials and randomization procedures were the same as in

Study 1.

Apparatus and Procedure

The Russian data were collected at the National Center of Speech Pathology and

Neurorehabilitation, Moscow, using an SMI iView remote eye tracker. The English data were

collected at the University of Glasgow using an SMI EyeLink I head-mounted eye tracker. Materials

were always presented on a 17" CRT running at 75 Hz refresh rate. The speech data were recorded

on a SONY DAT digital recorder. To extract eye-voice span data (see below), we pre-coded two

interest areas in each of the target pictures: one for the Agent and one for the Patient. These included

the corresponding referent and a surrounding area of approximately 2° of visual angle.

Participants were told that the main purpose of the study was to analyze how people talk

about events. They were seated in front of the monitor at an approximate distance of 60 cm between

the eyes and the monitor. The experiment always began with a practice session lasting for about 15

minutes. During the practice session, participants first saw pictures of single referents (some of

which would also occur during the main experiment) and read out their names printed underneath.

Then, they practiced describing event pictures – one for each event so that each of the five

actions (pulling, punching, pushing, touching, and shooting) was described once. After that, the

participants had to name pictures of the individual referents (e.g., boxer, cowboy, etc.) that would


later appear in the target events; this time, the naming onset latencies were recorded and analyzed.

This analysis revealed that although it took Russian participants slightly longer than the English

participants to name the individual referents (958 ms vs. 911 ms), this difference was not reliable

(t(28) = 1.59, p > .1, two-tailed). The main purpose of this practice session was to familiarize

participants with the referent and event pictures as well as their names. Thereby, we not only

minimized potential differences in how familiar Russian versus English speakers were with the

kinds of pictures that would occur during the main session, but also potential differences in how

familiar Russian versus English speakers were with the lexical labels required to describe the

depicted referents and events.

After the practice session, the event description phase (main session) followed. Participants

viewed and described pictures of transitive events one at a time with no specific instructions as to

how to describe the event pictures, except that they were encouraged to make reference to all the

characters they saw in the pictures (this was to avoid production of truncated passives). Figure 4

illustrates the presentation sequence per experimental trial.

Upon the presentation of the central fixation mark, a displaced fixation mark, equally distant

from the interest areas, appeared on the screen. This ensured that participants always had to perform

a saccade to inspect the subsequently presented target picture. The onset of the target display was

contingent on fixating the displaced fixation mark for a minimum of 200 ms. Then the

participant described the target picture and pressed the space bar to initiate the next trial. Target

picture presentation was timed-out after 7700 ms, which provided sufficient time for responding

(participants were therefore not under time pressure). A unique audio signal accompanied the

presentation of each target picture; this enabled us to identify the relevant picture onsets in the sound

recordings, and consequently, to synchronize participants’ eye-movements with their verbal

responses.


Participants were individually interviewed after completing the experimental session about

difficulties they had in perceptual identification of the experimental materials, uncertainty in

selecting verbs for description, or providing their descriptions. No such difficulties were reported.

Results and Discussion

Participants’ descriptions of the critical pictures were coded in terms of syntactic structure,

considering the range of possibilities illustrated in Figure 2 (Russian) and Figure 3 (English). Target

descriptions that did not conform to the experimental instructions (e.g., There are a boxer and a

swimmer in the picture) were counted as missing values (2.5% in the Russian data and 1.8% in the

English data). It turned out that both English and Russian speakers were heavily biased towards

producing canonical SVO-active structures in this free description task (contrasting with Study 1

where participants were encouraged to be creative), accounting for 98% of the descriptions in

English and 99% of the descriptions in Russian. Since proportions of alternative non-canonical

structures were negligible, all further analyses were based on trials in which participants produced

canonical SVO-active picture descriptions.

Before analyzing sentence onset latencies and eye-voice spans, we had to ensure that there

were no systematic cross-linguistic differences in the numbers of syllables per constituent. For this

purpose, we went through the actual sound recordings (one per experimental trial) and noted down

the numbers of syllables of the nouns and verbs produced by our participants.

The relevant means (broken down by Constituent Position and Language) are shown in

Table 3. Between-subjects/within-items t-tests in each constituent position confirmed that English

and Russian responses were, on average, comparable in terms of numbers of syllables (all ps > .1).

Any cross-linguistic differences in sentence onset latency or eye-voice span therefore cannot

plausibly be attributed to differences in phonological length.


For the sentence-onset latency analysis, we subtracted the time when the picture appeared on

the screen (as indicated by a unique audio signal in the sound recordings) from the time when

participants started to articulate the Subject noun. This was done separately for each trial. Because

Russian (unlike English) does not have determiners before nouns, all sentence onset latencies,

including the English ones, were coded relative to the onset of the Subject noun in the considered

SVO-active picture descriptions. This eliminated the possibility that sentence onset latencies in

English would be faster just because English speakers would, say, always start with an easily

accessible determiner and produce the actual Subject noun after some delay (e.g., “The.. [uhm]..

cowboy is punching the boxer”), an option that would not be available to Russian speakers. By

always coding sentence onset latencies relative to the onset of the Subject noun (as in the present

analyses), the two languages became maximally comparable.

In cases where the so-defined sentence onset latencies exceeded 5000 ms, or undercut 300

ms, the relevant trials were excluded from analysis (this resulted in 2.9% data loss overall). The

resulting average sentence onset latencies were 1470 ms for the English speakers and 1771 ms for

the Russian speakers. Ninety-five percent confidence intervals indicated that the difference was

significant by participants (301 ± 187 ms) as well as by items (301 ± 78 ms). Hence, Russian

speakers took reliably longer to plan their responses than English speakers even though they were

describing identical sets of picture materials and, importantly, cross-linguistic differences in the use

of determiners were accounted for.
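
The latency coding and trimming rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the function name and the example onset times are hypothetical, and only the 300 ms and 5000 ms cut-offs are taken from the text.

```python
def sentence_onset_latency(picture_onset_ms, subject_noun_onset_ms,
                           lower_ms=300, upper_ms=5000):
    """Latency from picture onset to Subject-noun onset.

    Returns None when the latency falls outside the trimming window,
    so that the trial can be excluded from analysis.
    """
    latency = subject_noun_onset_ms - picture_onset_ms
    if latency < lower_ms or latency > upper_ms:
        return None
    return latency

# Made-up (picture onset, Subject-noun onset) pairs in ms, one per trial.
trials = [(1000, 2450), (1000, 1200), (1000, 6500), (500, 2250)]
latencies = [sentence_onset_latency(p, s) for p, s in trials]
kept = [x for x in latencies if x is not None]
mean_latency = sum(kept) / len(kept)
```

In this toy example the second trial (200 ms) and the third trial (5500 ms) are excluded, and the mean is computed over the remaining two trials.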

Analysis of eye-voice spans was performed using the procedure described in Griffin & Bock

(2000). The eye-voice span (henceforth EVS) was defined as the temporal lag between the onset of

the last fixation to a referent immediately preceding the production of its name and the onset of the

spoken name itself.2 EVS was originally used in research on oral reading as a chronometric measure

2 In both Russian and English, the onset of the spoken name was determined as the onset of the relevant noun in the spoken response.


of how far the eyes are ahead of the voice (Levin, 1979). It was later used in picture description

experiments, using the definition provided above (Griffin & Bock, 2000). As such, it is claimed to

be sensitive to formulation processes following the stage of rapid apprehension, during which the

“gist” of a depicted event is perceived. The initial suggestion was that in picture description, EVS

values represent “fixed” signatures of constituent-related lexical access (Griffin & Bock, 2000).

However, more recent research has shown that, e.g., in cases of referential ambiguity, speakers tend

to re-fixate already mentioned referents for additional conceptual (re)analysis with corresponding

eye-voice span values gradually deflating (Coco & Keller, 2010). Similarly, Myachykov (2007)

demonstrated that EVS values become progressively shorter as more conceptual, lexical, and

structural information about an event becomes available. Hence, EVS values may not only reflect

lexical access (processing difficulty associated with relating visual referents to their names), but also

processes related to grammatical role assignment and conceptual reanalysis.

A Perl-based script was used to extract EVS automatically. The script used the text files

containing a participant’s data for name and gaze onset latencies for each trial. Each gaze onset

corresponded to one of the pre-coded interest areas: Agent, Event, or Patient. The name onsets were

marked as corresponding to Subject, Verb, or Object of the event; each produced sentence also

received a word order code, e.g., SVO. The script used this set of markers in order to perform a loop

search and eventually match a particular name onset to the relevant onset of the last fixation to the

corresponding interest areas and calculate the corresponding EVS value. When there was no fixation

to the referent or when no name was produced, the EVS value was coded as missing and replaced

with the corresponding mean value. This affected less than 3% of the data in each condition. Table 4

summarizes mean EVS values for the Subject and Object constituents.3 Table 5 presents the results

of two-factorial ANOVAs on those data, including Language (Russian vs. English) as between-

3 Since it was difficult to identify a unique ‘event region’ in the pictures, we refrained from calculating EVS values for the verb constituent.


subjects/within-items factor and Constituent Position (Subject vs. Object) as within-subjects/within-

items factor (F1 for analyses by participants, F2 for analyses by items).
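
The matching logic of the EVS extraction can be sketched in a few lines. The original script was written in Perl and is not reproduced here; the Python sketch below is only an approximation of the described steps. The data format, the function names, and the role-to-area mapping are hypothetical, and the sketch assumes that a missing value is replaced by the mean of the observed values in the same condition.

```python
# Hypothetical mapping from grammatical role to interest area,
# valid for SVO-active descriptions only.
ROLE_TO_AREA = {"Subject": "Agent", "Object": "Patient"}

def eye_voice_span(fixations, name_onset_ms, area):
    """Name onset minus the onset of the last fixation on `area`
    that started before the name was spoken; None if there is none."""
    onsets = [onset for onset, fixated in fixations
              if fixated == area and onset < name_onset_ms]
    if not onsets:
        return None
    return name_onset_ms - max(onsets)

def impute_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Made-up trial: (fixation onset in ms, interest area) and name onsets in ms.
fixations = [(200, "Agent"), (650, "Patient"), (900, "Agent"), (1800, "Patient")]
name_onsets = {"Subject": 1500, "Object": 2500}
evs = {role: eye_voice_span(fixations, onset, ROLE_TO_AREA[role])
       for role, onset in name_onsets.items()}
```

For this made-up trial, the last Agent fixation before the Subject noun starts at 900 ms and the last Patient fixation before the Object noun starts at 1800 ms, giving spans of 600 ms and 700 ms, respectively.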

As can be seen from Table 5, there was a main effect of Language: the Russian EVS values

(668 ms) were on average 61 ms longer than the English EVS values (607 ms). The main effect of

Constituent Position was not reliable. However, there was a significant interaction between

Language and Constituent Position which can be decomposed as follows. For the English sample,

ninety-five percent confidence intervals indicated reliably longer EVS values in Object than in

Subject position (76 ± 75 ms for the difference by participants; 76 ± 50 ms for the difference by

items). Conversely, for the Russian sample, there were reliably longer EVS values in Subject than in

Object position (61 ± 34 ms for the difference by participants, 61 ± 27 ms for the difference by

items). Opposing trends were also present in the Language contrasts per Constituent Position: in

Subject position, EVS values were 129 ms (± 53 ms by participants, ± 33 ms by items) longer for

the Russian rather than the English sample; in Object position, EVS values were 7 ms longer for the

English rather than the Russian sample (which was, however, not a significant difference).

As for the EVS values in Subject position, we needed to address one potential confound:

Shorter EVS values for the Subject noun in English might reflect the presence of an easily

accessible auxiliary verb in the upcoming verb phrase. Given that production of the Subject

constituent often coincides with partial pre-planning of the subsequent verb or verb phrase (cf.

Lindsley, 1975), English speakers might have an advantage over Russian speakers in formulating

the sentence-initial Subject noun, not because they need to consider fewer syntactic alternatives at

this point, but because the subsequent verb phrase often starts with an easily accessible auxiliary

verb in English. Russian speakers, on the other hand, would always have to pre-plan a notional verb

at this point because there are no auxiliaries in Russian. To account for this potential problem, we

conducted an additional analysis on the EVS values in Subject position, this time only considering

instances where English speakers did not produce an auxiliary after the Subject noun (e.g., a cowboy


punching a boxer), which was the case in 67% of the English responses. The resulting mean EVS

value in Subject position amounted to 564 ms for the English speakers. For the Russian speakers,

the corresponding mean EVS value remained unchanged (698 ms, see Table 4). The mean

difference was still significant at 134 ms (± 52 ms by participants, ± 32 ms by items). It is therefore

safe to conclude that faster Subject eye-voice spans in English were not due to pre-planning of

upcoming auxiliaries.

To summarize, analyses of eye-voice spans indicated that Russian participants experienced

more processing difficulty in formulating the Subject rather than the Object constituent, whereas the

opposite was true for the English participants. Moreover, in comparison to English speakers,

Russian speakers displayed significantly prolonged eye-voice spans while formulating the sentence-

initial (Subject) constituent, which is in line with the corresponding cross-linguistic effect in

sentence-onset latency. This suggests that the wider range of syntactic choices in Russian

(particularly at the beginning of sentence formulation) is detrimental to the fluency of production, in

line with the competition hypothesis. To establish whether syntactic competition truly provides a

viable explanation of these cross-linguistic differences in production latency, we performed a series

of correlation and stepwise GLM analyses, combining the data from Study 1 with those from Study

2.

Syntactic Flexibility as a predictor of production latencies.

The ln(NN/NC) values obtained from Study 1 were used as a continuous predictor of the sentence-

onset latencies, the eye-voice spans for the Subject noun, and the eye-voice spans for the Object

noun in Study 2. The question was whether by-item variability (within and across languages) in

those latency variables would predictably correspond to varying degrees of syntactic flexibility, as

measured by the ln(NN/NC) metric. Recall that higher ln(NN/NC) values indicate higher ratios of

non-canonical over canonical descriptions per item, reflecting greater ease of access to non-


canonical options and thus higher syntactic flexibility, potentially resulting in greater competition

with the canonical SVO-active structure.

Correlations were computed across all 40 (pictures) × 2 (languages) = 80 item-language

combinations (Table 6). As can be seen, there were significant positive correlations with sentence-

onset latency (explaining about 23% of the variance) and with eye-voice spans for the Subject noun

(explaining about 43% of the variance), but not with eye-voice spans for the Object noun (the latter

did not reliably differ across languages). Figure 5 shows the relevant scatter plots.
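
For readers who wish to relate the reported proportions of explained variance to correlation coefficients, the following sketch computes a Pearson correlation and its square. The data points are invented for illustration and the function name is hypothetical. The last two lines merely take the square root of the rounded percentages given above, so the resulting values are approximations and not the coefficients from Table 6.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented item-language combinations: ln(NN/NC) and latency in ms.
flexibility = [-1.0, -0.5, 0.0, 0.5, 1.0]
latency = [1400, 1500, 1450, 1700, 1750]
r = pearson_r(flexibility, latency)
variance_explained = r ** 2

# Correlations implied by "about 23%" and "about 43%" of explained variance.
implied_r_sol = math.sqrt(0.23)
implied_r_subject_evs = math.sqrt(0.43)
```

Under this reading, 23% of explained variance corresponds to a correlation of roughly .48, and 43% to a correlation of roughly .66.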

The next, more important question we asked was whether syntactic flexibility (ln(NN/NC))

can contribute more to the explanation of the sentence-initial latency data than the categorical

partitioning by language (English vs. Russian) alone. To answer this question, a series of more

sophisticated stepwise GLM analyses was performed across the previously considered 80 item-

language combinations. As predictors, we included Item (N=40) as categorical random factor,

Language (English vs. Russian) as categorical fixed factor, and Flexibility (ln(NN/NC)) as

continuous predictor (covariate). The dependent variables were the sentence-onset latencies and

Subject eye-voice spans from Study 2 (given that Object eye-voice spans were largely unaffected by

Language and/or Flexibility, they were not considered further). Using hierarchical (Type-I) variance

decomposition4, the three predictors were entered incrementally in two different orders, referred to

as I-L-F and I-F-L model, respectively. In both models, the random factor Item was always entered

first, thereby accounting for potential random variation in how difficult different pictures are to

describe regardless of language and/or syntactic flexibility (e.g., due to variation in visual

recognisability of the depicted protagonists and actions). Next, either Language (I-L-F model) or

Flexibility (I-F-L model) was added, and lastly, the remaining predictor (Flexibility or

Language, respectively) was included in the model. The logic behind the different sequences of

4 This results in an ordered incremental modelling approach, contrasting with standard Type-III decomposition which results in simultaneous testing of model effects. Note that simultaneous testing is unsuitable for present purposes because the two most critical predictors (Language and Flexibility) are highly correlated with one another (simultaneous testing would not be able to reliably estimate each predictor’s unique contribution to the model fit).


testing was to find out whether Flexibility has any effects “above and beyond” Language

(suggesting that Flexibility is the more informative predictor) or vice versa (suggesting that

Language is the more informative predictor).
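
The logic of entering two correlated predictors in different orders can be illustrated with a simplified sketch. It is not the GLM reported here: the random factor Item is omitted, both predictors are treated as continuous, and the three correlations are invented. The sketch uses the standard formula for the squared multiple correlation with two predictors and reports how much each predictor adds when it is entered second.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of y with two predictors, given the
    two predictor-outcome correlations and the predictor intercorrelation."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def sequential_r2(r_y_first, r_y_second, r_12):
    """Variance explained by the first-entered predictor, and the
    increment contributed by the second-entered predictor."""
    first = r_y_first ** 2
    total = r2_two_predictors(r_y_first, r_y_second, r_12)
    return first, total - first

# Invented correlations: latency with Flexibility, latency with Language,
# and Flexibility with Language (the two predictors are highly correlated).
r_lat_flex, r_lat_lang, r_flex_lang = 0.6, 0.5, 0.8

# Language first, then Flexibility (analogous to the L-F order).
lang_first, flex_after_lang = sequential_r2(r_lat_lang, r_lat_flex, r_flex_lang)
# Flexibility first, then Language (analogous to the F-L order).
flex_first, lang_after_flex = sequential_r2(r_lat_flex, r_lat_lang, r_flex_lang)
```

With these invented values, Flexibility adds about 11% of explained variance after Language has been entered, whereas Language adds only about 0.1% after Flexibility has been entered. This asymmetry is the kind of pattern that would mark Flexibility as the more informative predictor.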

As can be seen in Table 7, the results were fairly clear. With sentence onset latency (SOL) as

dependent variable, Flexibility still contributed significantly to the fit of the I-L-F model (i.e. after

accounting for the categorical effect of Language), while the categorical predictor Language did not

reliably contribute to the fit of the I-F-L model (i.e. after accounting for the continuous effect of

Flexibility). With Subject eye-voice spans (S-EVS) as dependent variable, results were less

compelling but still pointing in the same direction; clearly, there was no indication that Language

would yield a better explanation of the data than Flexibility.

Taken together, it appears that syntactic flexibility (quantified by the ln(NN/NC) metric) not

only provides a viable, but indeed a better explanation of the sentence-initial latency effects in Study

2 than language “per se”. This is likely because the ln(NN/NC) metric not only captures between-

language variability, but also within-language variability in syntactic flexibility (potentially related

to the different actions and/or the verbs used to describe them), thus resulting in a better fit of the

latency data. This lends further support to the syntactic competition hypothesis.

General Discussion

In this paper, we investigated whether Russian speakers have more structural alternatives available

to them than English speakers when describing the same transitive events (Study 1), and whether

this greater structural flexibility in Russian leads to an increase or decrease in associated processing

load (Study 2).

In line with grammatical considerations (Figures 2 and 3), corpus data (e.g., Svartvik, 1966;

Bivon, 1971), and prior psycholinguistic research (e.g., Vasilyeva & Waterfall, 2012), Study 1

showed that Russian speakers were able to actively use more—and more diverse—syntactic


alternatives to the canonical SVO-active structure than English speakers when describing the same

set of pictures following a “flexibility-encouraging” instruction. Study 2 combined a free single-

sentence picture description task with eye-tracking, using the same set of stimuli as before (but

different participants). Here, we found that both English and Russian speakers predominantly chose

canonical SVO-active structures to describe the pictures; most importantly, however, Russian

speakers displayed reliably increased processing load, particularly during initial stages of sentence

planning (sentence-onset latency and eye-voice spans for the sentence-initial Subject constituent), an

effect that diminished towards the end of sentence production (no significant cross-linguistic

difference in eye-voice spans for the Object constituent).

By-item correlation analyses indicated that greater syntactic competition with the canonical

SVO-active structure (as established via the log-ratio of non-canonical over canonical descriptions

in Study 1) reliably corresponded to higher sentence-initial processing load for the canonical

structures produced in Study 2, thus explaining most of the observed cross-linguistic differences in

production latency.

As illustrated in Figures 2 and 3, prior to producing the descriptions, Russian speakers are

confronted with a much wider range of possible syntactic choices compared to English speakers,

who realistically consider only two such options.5 Competition accounts would therefore predict that

Russian speakers experience greater cognitive load at this point due to a partial activation of the

available alternatives than English speakers. The data from Study 2 (particularly when correlated

with the data from Study 1) strongly suggest that, even when they are not explicitly produced, non-

canonical alternatives to the SVO-active structure become (at least) partially activated and compete

for structural selection before the dominant canonical structure reaches the selection threshold: first,

cross-linguistic differences in sentence onset latencies confirmed that Russian speakers took longer

5 Indeed, Study 1 revealed that English speakers infrequently produced options other than SVO-active or passive voice, even though such alternatives were clearly indicated to them in the instructions.


to initiate their canonical picture descriptions than English speakers; second, cross-linguistic

differences in eye-voice spans suggested that Russian speakers experienced greater cognitive load

particularly during formulation of the sentence-initial constituents. Thus, although speakers of both

languages eventually selected the same canonical frame, it appears that the Russian speakers

experienced greater syntactic competition prior to making that choice. This load diminished as a

function of diminishing structural options: Following the production of an agentive Subject

constituent, Russian speakers are still left with two continuation options as compared to only one for

English speakers (cf. Figures 2 and 3). It is only at the point of choosing the final constituent (i.e.,

the Object) that speakers of the two languages have only a single option left. In accordance with

this, eye-voice spans for English and Russian speakers no longer differed at this point in production.

In conclusion, our results are consistent with competition accounts of sentence generation

(e.g., Dell & O’Seaghdha, 1994; McClelland & Rumelhart, 1981; Stallings, MacDonald, &

O’Seaghdha, 1998) but not with fully opportunistic accounts (e.g., V. Ferreira, 1996), although they

may be compatible with limited (or extended) opportunistic accounts such as the one in Christianson

& F. Ferreira (2005).

One aspect of Russian sentence production that we have not directly addressed up to this

point is the role of morphological case marking. Overt case marking is an important property of

languages such as Russian, as it enables scrambling (and thus structural flexibility) in the first place.

It could be that Russian speakers in Study 2 took longer during initial stages of sentence generation

due to the need to perform an extra case-assigning operation via morphological inflexion. However,

this would not explain the correlations we found between the structural flexibility data in Study 1

and the latency data in Study 2. Moreover, the necessity to assign morphological case in Russian

emerges only after the sentential Subject is determined, and therefore (potentially) after the point of

initial structure selection. This is because, for nouns in nominative case (i.e., the initial Subject

nouns produced by the Russian participants in Study 2), Russian does not require explicit



morphological inflexion (the nominative is the morphological base-form in Russian). Hence, at the

sentential starting point, Russian speakers are not much different from their English counterparts as

far as (implicit or explicit) case assignment at the Subject noun is concerned. In the same context,

note that there was no reliable cross-linguistic difference in eye-voice span for the Object

constituent, although the latter does require morphological inflexion for accusative case in Russian.

In the light of these findings (most notably, the correlations between Study 1 and Study 2), we

believe that differences in case marking are not responsible for the observed latency differences

between English and Russian.

Other alternative explanations of the cross-linguistic difference in processing load (Study 2)

seem equally infelicitous. For example, one might argue that Russian speakers were less familiar

with the pictures and/or their labels than the English speakers (after all, none of the pictures showed

a dancing bear or a balalaika), and that the picture-name familiarization phase at the beginning of

each session was largely ineffective. Again, such a claim would provide no explanation for the fact

that the latency data from Study 2 were reliably correlated with the flexibility data from Study 1,

nor would it explain why Russian participants were actually more productive than their English

counterparts when describing the pictures in Study 1. In conclusion, we believe that structural

competition is the only real contender to plausibly explain the reported findings.

This leaves us with the important question of why the findings by V. Ferreira (1996) led to

conclusions that are diametrically opposite to the ones suggested here. Recall that in Ferreira’s

study, speakers were found to be consistently slower to respond when prompted to complete

sentences containing non-alternating verbs (e.g., I donated…) as compared to sentences containing

alternating verbs (e.g., I gave…), which apparently speaks against competition in sentence

formulation. As discussed in the introduction, one possibility might be that, in comparison to our

own Study 2, speakers were under more time pressure in V. Ferreira’s (1996) experiments, which

might have induced a more opportunistic sentence formulation strategy (cf. F. Ferreira & Swets,


2002). Another possibility could be that the verbs used in V. Ferreira (1996) differed in respects

other than just syntactic flexibility. Indeed, when we looked up V. Ferreira’s (1996) alternating and

non-alternating verbs in the Corpus of Contemporary American English (COCA; Davies, 2009), we

found that the alternating verbs (e.g., gave) had a mean log10 lexical frequency per million of 1.27,

compared to 0.75 for the non-alternating verbs (e.g., donated); the difference was significant by

items (N = 24; 95% CI = 0.52 ± 0.46). Likewise, the average number of syllables was lower for the

alternating verbs (1.2) than for the non-alternating verbs (2.3), again resulting in a significant

difference (1.1 ± 0.3 syllables). This suggests that at least part of V. Ferreira’s (1996) results may be

due to the fact that the non-alternating verbs in that study were both less frequent and

phonologically longer than the alternating verbs.6

In conclusion, while we acknowledge that further research is necessary to ultimately resolve

the debate, we believe that competition accounts of sentence formulation cannot be easily dismissed,

particularly when the present cross-linguistic comparisons between English and Russian are

considered.

References

Allum, P.H. & Wheeldon, L. (2007). Planning scope in spoken sentence production: the role of

grammatical units. Journal of Experimental Psychology: Learning, Memory, and Cognition,

33, 791–810.

Barr, D.J. (2001). Trouble in mind: Paralinguistic indices of effort and uncertainty in

communication. Oralité et gestualité, communication multimodale, intéraction, ed. by S. Santi,

I. Guaïtella, C. Cave, and G. Konopczynski, 597-600. Paris : L’Harmattan.

6 Similar claims can be made for the pronoun manipulations in Experiment 2 of Ferreira (1996). For example, we found that gave followed by a “non-constraining” pronoun (him or her) is about 3.6 times more likely to occur in the corpus than gave followed by a “constraining” pronoun (it). Again, syntactic flexibility seems to be confounded with a frequency advantage.








Bailyn, J. (1995). Underlying Phrase Structure and “Short” Verb Movement in Russian. Journal of

Slavic Linguistics, 3(1), 13-58.

Beattie, G.W. (1979). Planning units in spontaneous speech: Some evidence from hesitation in

speech and speaker gaze direction in conversation. Linguistics, 17, 61-78.

Bivon, R. (1971). Element Order, Studies in the Modern Russian Language 7, CUP: Cambridge,

UK.

Bock, J.K. & Levelt, W.J.M. (1994). Language production: grammatical encoding. In M.

Gernsbacher (Ed.), Handbook of Psycholinguistics (pp. 945-984), New York: Academic Press.

Clark, H.H., & Fox Tree, J.E. (2002). Using uh and um in spontaneous speaking. Cognition, 84,

73-111.

Coco, M.I. & Keller, F. (2010). Sentence Production in Naturalistic Scenes with Referential

Ambiguity. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society

(pp. 1070-1075). Portland, USA.

Collins, P.C. (1991). Cleft And Pseudo-Cleft Constructions In English. London: Routledge.

Davis, M. (2009). The 385+ million word corpus of contemporary American English (1990-2008+):

design, architecture, and linguistic insights. International Journal of Corpus Linguistics, 14,

159-190.

Dell, G.S., & O’Seaghdha, P.G. (1994). Inhibition in interactive activation models of linguistic

selection and sequencing. In D. Dagenbach & T. H. Carr (Eds.), Inhibitory Processes in

Attention, Memory and Language (pp. 409–453). San Diego: Academic Press.

Ferreira, F., & Swets, B. (2002). How incremental is language production? Evidence from the

production of utterances requiring the computation of arithmetic sums. Journal of Memory

and Language, 46, 57–84.

Ferreira, V.S. (1996). Is it better to give than to donate? Syntactic flexibility in language production.

Journal of Memory and Language, 35, 724-755.


Griffin, Z.M. & Bock, K. (2000). What the eyes say about speaking. Psychological Science, 11(4),

274-279.

Konopka, A.E. (2012). Planning ahead: How recent experience with structures and words changes

the scope of linguistic planning. Journal of Memory and Language, 66, 143-162.

Krylova, O.A. & Khavronina, S.A. (1988). Word Order In Russian. Moscow & Chicago: Russkiy

Yazyk Publishers.

Levin, H. (1979). The eye-voice span. Cambridge, MA: MIT Press.

Lindsley, J.R. (1975). Producing simple utterances: How far ahead do we plan? Cognitive

Psychology, 7, 1–19.

MacKay, D.G. (1970). Spoonerisms: The structure of errors in the serial order of speech.

Neuropsychologia, 8, 323-350.

Maclay, H., & Osgood, C.E. (1959). Hesitation phenomena in spontaneous speech. Word, 15, 19-44.

McClelland, J.L., & Rumelhart, D.E. (1981). An interactive activation model of context effects in

letter perception: Part I. An account of basic findings. Psychological Review, 88, 375–407.

Myachykov, A. (2007). Integrating perceptual, semantic, and syntactic information in sentence

production. PhD manuscript. University of Glasgow.

Myachykov, A. & Tomlin, R.S. (2008). Perceptual priming and structural choice in Russian

sentence production. Journal of Cognitive Science, 6(1), 31-48.

Nelson, G. (1997). Cleft constructions in spoken and written English. Journal of English Linguistics,

25, 340-348.

Nottbusch, G. (2010). Grammatical planning, execution, and control in written sentence production.

Reading And Writing, 23(7), 777-801.

Oviatt, S. (1995). Predicting spoken disfluencies during human-computer interaction. Computer

Speech and Language, 9, 19-35.


Roland, D., Dick, F., & Elman, J.L. (2007). Frequency of basic English grammatical structures: A

corpus analysis. Journal of Memory and Language, 57, 348-379.

Schachter, S., Christenfeld, N., Ravina, B., & Bilous, F. (1991). Speech disfluency and the structure

of knowledge. Journal of Personality and Social Psychology, 60, 362-367.

Shriberg, E. (1996). Disfluencies in Switchboard. Proceedings, International Conference on Spoken

Language Processing, Addendum, 11-14. Philadelphia.

Smith, M., & Wheeldon, L.R. (1999). High level processing scope in spoken sentence production.

Cognition, 73, 205-246.

Stallings, L.M., MacDonald, M.C., & O’Seaghdha, P.G. (1998). Phrasal ordering constraints in

sentence production: Phrase length and verb disposition in heavy-NP shift. Journal of Memory

and Language, 39, 392-417.

Svartvik, J. (1966). On voice in the English verb. The Hague: Mouton and Co.

Timberlake, A. (2004). A Reference Grammar of Russian. Cambridge: Cambridge University Press.

Tomlin, R.S. (1995). Focal Attention, Voice, and Word Order. In P. Downing & M. Noonan (Eds.),

Word Order in Discourse (pp. 517-552). Amsterdam: John Benjamins.

Vasilyeva, M. & Waterfall, H. (2012). Beyond syntactic priming: Evidence for activation of

alternative syntactic structures. Journal of Child Language, 39, 258-283.

Wagner, V., Jescheniak, J. D., & Schriefers, H. (2010). On the flexibility of grammatical advance

planning during sentence production: Effects of cognitive load on multiple lexical access.

Journal of Experimental Psychology: Learning, Memory, and Cognition, 36, 323-340.

Yokoyama, O. (1986). Discourse and Word Order. Amsterdam/Philadelphia: John Benjamins.

Zemskaja, E.A. (1973). Russkaja Razgovornaja Reč’. Moskva: Nauka.


Table 1

Percentages of different syntactic structures (in ranked order) used to describe the target

pictures in Study 1. S = Subject; V = Verb; O = Object (hence, SVO = Subject-Verb-

Object word order); AV = Active Voice; PV = Passive Voice; CA = Clefted Agent; CP =

Clefted Patient; CV = Clefted Verb.

Language

English                Russian

SVO (AV)    40%        SVO (AV)    25%
PV          35%        OVS (AV)    22%
CP (PV)      8%        VOS (AV)    18%
CA (AV)      7%        VSO (AV)    17%
CV (AV)      7%        OSV (AV)    10%
other        3%        SOV (AV)     8%


Table 2

Average per-item counts in Study 1. Types = numbers of syntactically different sentence types

produced; NC = numbers of canonical (SVO-active) tokens produced; NN = numbers of non-

canonical tokens produced.

Measure      English   Russian   t(39)     p      95% CI (diff)

Types        4.7       5.0       -2.22     .04    0.3 ± 0.3
NC           11.9      11.9      -0.18     .86    0.0 ± 0.3
NN           18.0      36.2      -29.26    .001   18.2 ± 1.3
ln(NN/NC)    0.40      1.11      -22.20    .001   0.71 ± 0.06


Table 3

Average by-trial numbers of syllables in Study 2.

           Average length in syllables

           Subject   Verb   Object

English    1.8       2.7    1.8
Russian    2.0       2.6    2.2


Table 4

Eye-voice spans (ms) in Study 2.

           Eye-voice span

           Subject   Object

English    569       644
Russian    698       637


Table 5

ANOVA results for the eye-voice spans in Study 2.

Effect                    F1(1,28)   p1     F2(1,39)   p2

Language                  6.48       .02    19.05      .001
Constituent               < 1        ns     < 1        ns
Language × Constituent    10.45      .003   26.57      .001


Table 6

Pearson r and Spearman rho correlation coefficients (obtained across all 80 item × language

combinations) using competition (ln(NN/NC)) as a predictor of sentence-onset latency (SOL),

Subject eye-voice span (Subject-EVS), and Object eye-voice span (Object-EVS) in the main

experiment; r2 refers to the proportion of variance explained.

DV            Pearson r   p      Spearman rho   p      r2

SOL           .474        .001   .425           .001   .225
Subject-EVS   .657        .001   .611           .001   .432
Object-EVS    .009        .94    .032           .78    .000
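
The r2 column in Table 6 is the square of the Pearson coefficient. The short Python sketch below recomputes it from the rounded r values in the table; the variable and function names are illustrative and are not taken from the original analysis scripts.

```python
# Pearson r values as reported in Table 6 (already rounded to three decimals).
reported_r = {"SOL": 0.474, "Subject-EVS": 0.657, "Object-EVS": 0.009}


def variance_explained(r: float) -> float:
    """Proportion of variance explained by a single linear predictor (r squared)."""
    return r * r


# Round to three decimals to match the precision used in the table.
r_squared = {dv: round(variance_explained(r), 3) for dv, r in reported_r.items()}

for dv, value in r_squared.items():
    print(f"{dv}: r2 = {value:.3f}")
```

Because the inputs are themselves rounded, agreement with the tabled r2 values is only expected to the third decimal place.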


Table 7

Results from stepwise GLM analyses with Item (N = 40; random factor), Language (English vs.

Russian; fixed factor) and Flexibility (ln(NN/NC); covariate) as predictors of sentence-onset

latency (SOL) and Subject eye-voice span (S-EVS) in Study 2. The random factor Item was

always entered first (Step I), followed by either Language (I-L-F model) or Flexibility (I-F-L

model) at Step II before adding the remaining factor (Flexibility or Language, respectively) at

Step III. The table shows F-values, degrees of freedom, p-values, and partial eta-squares (Pη2, a

unit-independent measure of effect size) for each effect in each type of analysis.

                I-L-F Model                              I-F-L Model

DV      Step    Factor    F       df      p      Pη2     Factor    F       df      p      Pη2

SOL     I       Item      2.28    39,38   .01    .70     Item      2.28    39,38   .01    .70
        II      Lang      51.64   1,38    .001   .58     Flex(1)   56.76   1,38    .001   .60
        III     Flex(1)   5.18    1,38    .03    .12     Lang      0.06    1,38    .81    .00

S-EVS   I       Item      1.09    39,38   .41    .53     Item      1.09    39,38   .41    .53
        II      Lang      58.02   1,38    .001   .60     Flex(1)   58.97   1,38    .001   .61
        III     Flex(1)   1.64    1,38    .21    .04     Lang      0.69    1,38    .41    .02

(1) Consistent with the correlation analyses, slope-parameters for the continuous predictor were always positive.
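
As a reading aid for the Pη2 column, partial eta-squared can be recovered from an F-value and its degrees of freedom using the standard identity Pη2 = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies it to a few rows of Table 7; the function name is illustrative, and the original values were presumably produced by the statistics package used for the GLM analyses rather than by this code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared from an F statistic and its degrees of freedom.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error) and
    partial eta-squared = SS_effect / (SS_effect + SS_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (label, F, df_effect, df_error) for selected SOL rows of Table 7.
rows = [
    ("Item, Step I", 2.28, 39, 38),
    ("Lang, Step II (I-L-F)", 51.64, 1, 38),
    ("Flex, Step III (I-L-F)", 5.18, 1, 38),
    ("Lang, Step III (I-F-L)", 0.06, 1, 38),
]

for label, f_value, df_effect, df_error in rows:
    value = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta-squared = {value:.2f}")
```

Rounded to two decimals, these inputs give .70, .58, .12, and .00, matching the corresponding table entries.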


Figure 1. Example picture showing a “transitive event”.


Figure 2. Grammatically permissible choices for (non-truncated) transitive event

descriptions in Russian. NP = noun phrase; [nom] = nominative case; [acc] = accusative

case; [inst] = instrumental case (case marking is morphologically overt in Russian and

morphologically covert in English).


Figure 3. Grammatically permissible choices for (non-truncated) transitive event descriptions in

English.


Figure 4. Presentation sequence per trial in Study 2.

[Panel labels in the figure: target (time-out 7700 msec); central fixation; displaced fixation; central fixation.]


Figure 5. Scatterplots of (a) sentence-onset latencies, (b) eye-voice spans for the Subject noun, and

(c) eye-voice spans for the Object noun as a function of competition (ln(NN/NC)). English data are

represented by open circles and Russian data by asterisks. Linear regression lines are also shown.

