Arguments for adjuncts
Jean-Pierre Koenig a,c,*, Gail Mauner b,c, Breton Bienvenue b,c
a Linguistics Department, University at Buffalo, The State University of New York, Buffalo, NY, USA
b Psychology Department, University at Buffalo, The State University of New York, Buffalo, NY, USA
c Center for Cognitive Science, University at Buffalo, The State University of New York, Buffalo, NY, USA
Received 22 May 2001; revised 18 September 2002; accepted 12 March 2003
Abstract
It is commonly assumed across the language sciences that some semantic participant information
is lexically encoded in the representation of verbs and some is not. In this paper, we propose that
semantic obligatoriness and verb class specificity are criteria which influence whether semantic
information is lexically encoded. We present a comprehensive survey of the English verbal lexicon,
a sentence continuation study, and an on-line sentence processing study which confirm that both
factors play a role in the lexical encoding of participant information.
© 2003 Elsevier Science B.V. All rights reserved.
Keywords: Verb; Arguments; Adjuncts; Argument structure; Sentence processing; Lexical semantics; Lexicon;
Instruments; Locations
1. Introduction
Most utterances describe situations. One immediate task of addressees is to determine
who participated in those situations and the nature of this participation. To successfully
complete this task, addressees can rely on their knowledge of situations and of their
language’s lexicon and grammar. A goal of linguistics, psycholinguistics, and
computational linguistics is to determine the contribution of each of these sources to
the process of understanding an utterance. It is common among linguists to assume that a
significant portion of the situational information which addressees retrieve is associated
with the sentence’s verb(s). In a sentence such as Mary washed her hands, for instance, the
type of situation being described, the number of entities that it must include, as well as
0022-2860/03/$ - see front matter © 2003 Elsevier Science B.V. All rights reserved.
doi:10.1016/S0010-0277(03)00082-9
Cognition 89 (2003) 67–103
www.elsevier.com/locate/COGNIT
* Corresponding author. Department of Linguistics, 609 Baldy Hall, University at Buffalo, Buffalo, NY 14260-
1030, USA. Tel.: +1-716-645-2177, ext. 717; fax: +1-716-645-3825.
E-mail address: [email protected] (J.P. Koenig).
their modes of participation in the situation are assumed to be encoded in the lexical entry
for the verb washed, such that upon encountering washed, readers will include in their
representations this schematic information. In contrast, other kinds of schematic
situational participants are claimed to have a different status; they are not included in
the lexical entry of particular verbs. Participants whose presence and mode of participation
in the situation are associated with particular verbs are typically called arguments; those
which are not are called adjuncts. Unfortunately, while most linguists agree that the
distinction between arguments and adjuncts is real, no consensus currently exists as to its
basis, the boundary between the two classes, or its role in grammar. In particular, there is
no generally agreed upon answer to the following question: what are the criteria that
determine which semantic dependents are included in the representation of particular
lexical entries? In this paper, we propose a preliminary answer to this question and provide
quantitative and experimental evidence that supports it.
2. The argument/adjunct distinction
It has been common since at least Tesnière (1959) to distinguish between two classes of
dependents of verbs like cut in sentence (1):
(1) Mary cuts out paper dolls with her embroidery scissors for her children on the porch every week-end
The noun phrases (NPs) Mary and paper dolls are assumed to correspond to schematic
participant information (roughly speaking, agent and patient roles, respectively)1 encoded
in the entry for cut, and are typically called arguments of the verb. All the prepositional
phrases (PPs) in sentence (1), on the other hand, are typically excluded from the lexical
entry of cut and are typically (although not universally) called adjuncts. The intuition
behind this classification of schematic participant information contributed by verbs is that
the required presence of two schematic participants – and two NPs which express them –
is a property of cut. In contrast, the presence of other participants in the situation (and PPs
which express them, italicized in sentence (1)) is neither required nor depends on the
particular verb the speaker chose. These participants could co-occur with most other verbs.
The same distinction is sometimes characterized with the pair of terms complement and
adjunct/modifier. To avoid terminological confusion, we will reserve the dichotomy
between argument and adjunct for the semantic side of the purported distinction illustrated
in sentence (1) and complement and modifier for its syntactic correlate. Finally, the set that
comprises both arguments and adjuncts we call semantic dependents whereas we talk of
syntactic dependents for the set which includes both complements and modifiers.
The distinction between arguments and adjuncts is crucial to most current linguistic
frameworks (see, among others, Bresnan, 1982; Chomsky, 1981, 1986; Foley & Van
1 We use here the traditional names of agent and patient for mere convenience, despite the fact that, as we point
out later, there is no single coherent notion of agent (or patient, for that matter), but only a collection of various
agent (and patient) roles.
J.-P. Koenig et al. / Cognition 89 (2003) 67–103
Valin, 1984; Pollard & Sag, 1987). In all these approaches, argument information included
in the representations of lexical entries drives the construction of clauses. Thus, the
(required) occurrence of subject and direct object NPs in sentence (1) and the participant
roles they bear follows from the inclusion of some syntactic and/or semantic information
in the entry for cut which encodes the fact that it describes situations in which an agent and
patient participate. As a consequence, the syntactic structure of many sentences is mostly
or entirely determined by the information about situation participants included in lexical
entries of verbs.
The distinction between arguments and adjuncts is not only important to current
linguistic theorizing, but also to research on human sentence processing. Speer and Clifton
(1998) and Schütze and Gibson (1999), for example, have adduced empirical evidence to
support the claim that the human sentence processing mechanism gives precedence to
arguments in building the representation of a sentence (see also Boland &
Boehm-Jernigan, 1998; Liversedge, Pickering, Branigan, & Van Gompel, 1998). The
former provide data which suggest that readers read PPs faster when they express
arguments of a verb than when they express adjuncts, even when the two PPs are equated
for their relative plausibility. The latter suggest that upon encountering a phrase which can
ambiguously attach in the current structural representation of the sentence, the parser
prefers attachments which make the ambiguous phrase an argument. Finally, the
distinction between arguments and adjuncts bears on the processing of implicit participant
information discussed in Mauner, Tanenhaus, and Carlson (1995), Koenig and Mauner
(1999), and Mauner and Koenig (2000). This research suggests that as soon as the passive
verb is encountered, e.g. sold in sentence (2a), an agent participant role is included in the
reader’s representation of the sentence. By contrast, no agent role is included in a reader’s
representation following the access of the middle verb sold in sentence (2b).
(2) a. The vase was sold to collect money for the charity.
b. *The vase sold to collect money for the charity.
Because of the presence of an implicit agent in sentence (2a), the unexpressed subject
of the rationale clause finds an antecedent and readers do not experience processing
difficulties at collect; by contrast, the unavailability of a lexically encoded agent creates
processing difficulties for readers of sentence (2b). The fact that implicit agent
information, which can be optionally expressed through a by-phrase, is lexically encoded
and affects the processing of subsequent rationale clauses raises the possibility that some
of the italicized optionally expressed constituents in sentence (1) are also lexically
encoded and may affect processing of dependent expressions or of displaced fillers. But,
which ones?
Despite the linguistic and psycholinguistic importance of the distinction between
argument and adjunct participant information, we do not believe that reliable criteria to
distinguish between these two sorts of information currently exist. Many behavioral
distinctions that are used to distinguish between arguments and adjuncts are not reliable, as
has been pointed out in several papers (see Miller, 1997; Schütze, 1995; Vater, 1978).
Furthermore, the criteria which have been proposed are, at best, epiphenomenal and of
relatively low frequency (e.g. verb phrase (VP) ellipsis, constraints on extraction). They do
not constitute properties that are reliably present in the environment and that could cause
some participant information to become encoded in speakers’ mental lexicon. The next
section proposes a set of semantic properties which (i) reliably determine the degree to
which a participant role is lexically encoded in a verb’s lexical representation and (ii) are
easily observable and might therefore cause some participant information to be lexically
encoded.
3. The semantic basis of the argument/adjunct distinction
Our saying that information is “included in” lexical entries should not be taken too
literally. Although throughout this paper we use traditional linguistic terminology,
according to which information is metaphorically said to be “included in” lexical entries,
our proposal does not depend on the accuracy of this metaphor. It can, for example, be
recast as “strong associations” between a phonological representation, a predicate
meaning, and participant “slots” which are in turn operationalized through weighted
connections among distributed nodes. A detailed picture of what lexical encoding amounts
to is more than we can do in this paper. We can, however, outline our representational
assumptions regarding the nature and organization of lexical knowledge in order to
motivate our criteria for the lexical encoding of participant information.
Simply put, we assume that the organization of the mental lexicon can be described as a
multidimensional hierarchy of categories, very much along the lines of a traditional
semantic network (see Collins, Quillian, & Ross, 1970; Quillian, 1968). Each category can
include a combination of syntactic, semantic, and morphological information. Words
which are members of these categories are linked to the most specific categories of which
they are members and each category can itself be linked to more general categories. The
situation is illustrated informally in Fig. 1.
Each node in the figure represents a cluster of semantic, syntactic, or graphemic/phonological
information. Thus, the semantic representation of the transitive use of break is
represented by the node labeled breakcaus. It inherits schematic participant information
from two general semantic classes, the class of situations which include a cause,
represented by the node labeled cause-relation and the class of situations which include an
affected entity, represented by the node labeled affected-relation. The word break also
Fig. 1. A schematic “network”-based representation of lexical information. Dashed lines indicate the presence of
intermediary nodes between two categories; pointed arrows indicate that the association between two categories
is more complex than strict inheritance.
inherits from the node labeled transitive-verb which summarizes information which
characterizes the syntactic category of transitive verbs. Because this syntactic category is
itself associated with the semantic nodes representing situations which include causes and
affected entities, as argued by various scholars (see Goldberg, 1995 for a review), the
nodes cause-relation and affected-relation are also activated by the category transitive-
verb.2 In fact, the salience of the activation of the schematic information summarized in
the cause-relation and affected-relation nodes might partially be the result of the relevance
of this information to syntactic processes.
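The way breakcaus obtains its schematic participant information from more general categories can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the node names are taken from Fig. 1, but the role labels ("cause", "affected entity"), the dictionaries, and the function name are hypothetical, and only strict inheritance is modeled (not the more complex association between transitive-verb and the semantic nodes).

```python
# Toy inheritance network. Node names follow Fig. 1; the role labels
# ("cause", "affected entity") are illustrative assumptions.
PARENTS = {
    "breakcaus": ["cause-relation", "affected-relation"],
    "cause-relation": [],
    "affected-relation": [],
}

# Participant information stated directly on a node (not inherited).
LOCAL_ROLES = {
    "cause-relation": {"cause"},
    "affected-relation": {"affected entity"},
}


def inherited_roles(node):
    """Return the roles stated on `node` plus those of all its ancestors."""
    roles = set(LOCAL_ROLES.get(node, set()))
    for parent in PARENTS.get(node, []):
        roles |= inherited_roles(parent)
    return roles


print(sorted(inherited_roles("breakcaus")))  # ['affected entity', 'cause']
```

In this sketch nothing is stored on breakcaus itself; its participant information is entirely shared with the categories it belongs to, which is the sense in which lexical information is "not encapsulated".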
This view of lexical knowledge underlies most directly syntactic research within Head-
driven Phrase Structure Grammar (see Pollard & Sag, 1987, 1994 for an introduction, and
Koenig, 1999 for a recent detailed analysis of lexical knowledge along these lines). But
several frameworks, particularly within computational linguistics, assume such an
organization of lexical knowledge (see Briscoe, Copestake, & de Paiva, 1993 for a
survey). It is not crucial to our analysis of the factors which determine lexical encoding of
participant information that lexical knowledge be truly encoded in the mental lexicon in
the form of an inheritance network. As Rumelhart and Todd (1993) show, an organization
of knowledge very similar, if not isomorphic to, semantic networks may emerge out of a
distributed representational system. What is crucial for our hypothesis regarding lexical
encoding of participant information is that upon encountering a word, readers and hearers
access a vast amount of semantic and syntactic information which is not encapsulated, but
rather shared across words. In such a model, lexical encoding means that the information
accessed/activated upon the recognition of a word includes information about categories
to which the word belongs. Lexical encoding of participant information thus reduces to
semantic categories accessed/activated upon word recognition. This model leads us to
propose that two criteria affect lexical encoding (and thus determine the semantic
argument status of a participant role), semantic obligatoriness and semantic specificity.
The first criterion selects participant categories that are present in all situations described
by a verb lemma and that recognition of a verb is likely to activate. The second criterion
selects participant categories that are most relevant to the meaning and syntactic properties
of a verb lemma and that recognition of a verb is also likely to activate.
3.1. Obligatoriness
If a type of participant is required by the class of situations described by a verb,
information relative to this participant is activated upon recognition of the verb. Put
otherwise, the semantic representation of a verb is linked to semantic classes which
characterize the properties of obligatory participants (through a set of features or
microfeatures). In more formal terms, the Semantic Obligatoriness Criterion (SOC) can be
stated as follows:
2 Note that the association between the class of transitive verbs and cause-relation and affected-relation is more
complex than that between breakcaus and cause-relation. Whereas the meaning of the transitive version of break
(breakcaus) is a subtype of causal relations (cause-relation), causal relations (cause-relation) are not a subtype of
transitive verbs (transitive-verb).
Semantic Obligatoriness Criterion (SOC): If r is an argument participant role of
predicate P, then any situation that P felicitously describes includes the referent of the
filler of r.
The SOC has already been proposed as a criterion for argumenthood (Dowty, 1982),
but has wrongly, to us, been dismissed. It says that an argument constitutes an obligatory
participant of the eventualities described by a verb. The fact that the subject of John slept
corresponds to an argument of sleep partly stems from the fact that no sleeping event can
occur without a participant who is sleeping. But, the SOC only provides a necessary
condition on argumenthood. Location semantic dependents – more specifically those
locations we call event locations – and time semantic dependents are entailed of almost all
situations, despite the fact that they are widely believed to be adjuncts, as Bresnan (1982)
points out. For instance, if you knit, you must knit somewhere; in other words, any
situation described by the predicate corresponding to the English word knit includes a
location in which the event occurred. Thus, if the SOC were the only criterion for
argumenthood, the semantic expression denoted by in his office in sentence
(3) would qualify as a semantic argument of the predicate denoted by knit.
(3) Marc knits in his office during lunch.
The same reasoning holds for semantic dependents that encode the time at or during
which an event occurred. Any situation the word knit felicitously describes must occur at a
particular interval of time. Again, if the SOC were the sole determinant of argumenthood,
the denotations of time expressions such as during lunch would qualify as semantic
arguments, a conclusion contradicting most linguists’ intuitions. This is why our notion of
lexical encoding also requires that the participant information be specific to a restricted
class of verbs. We refer to this additional requirement as the specificity criterion.
3.2. Specificity
The notion of specificity applies at two different levels of granularity, individual verbs
and semantically determined verb classes. At a fine-grained level, if a type of participant is
unique to a verb, it will be strongly activated upon recognition of that verb. At a coarse-
grained level, if a type of participant information is associated with a syntactic process
which targets a class of verbs, that information will be activated upon recognition of one of
these verbs. Readers should note that our formulation of specificity is a mixture of two
kinds of measure. While specificity is an intensional semantic information measure (a
measure of how many more general categories subsume the encoded information), class-
restrictedness is an extensional measure of how many verbs or situation types this
information fits. Of course, the two measures are correlated: more specific information is
typically true of fewer entities. Because quantifying class membership is easier than
quantifying degrees of generality of information, throughout this paper we will
consistently estimate participant information specificity by the cardinality of the class
of verbs it fits and use specificity and class-restrictedness interchangeably.
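Estimating specificity by the cardinality of the class of verbs a role fits can be illustrated with a minimal sketch. The verb annotations below are a hypothetical toy sample, not the survey data reported in this paper, and the function name is our own.

```python
def class_restrictedness(role, verb_roles):
    """Fraction of verbs in the sample whose required roles include `role`.

    A low value means the role fits few verbs (more specific);
    a value near 1 means it fits almost all verbs (less specific).
    """
    requiring = [verb for verb, roles in verb_roles.items() if role in roles]
    return len(requiring) / len(verb_roles)


# Hypothetical annotations for four verbs (illustration only).
VERB_ROLES = {
    "knit": {"agent", "event location", "time"},
    "sing": {"agent", "event location", "time"},
    "send": {"agent", "recipient", "event location", "time"},
    "sleep": {"event location", "time"},
}

print(class_restrictedness("recipient", VERB_ROLES))       # 0.25
print(class_restrictedness("event location", VERB_ROLES))  # 1.0
```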
Our notion of semantic specificity is based on two observations. First, most events
occur at a certain location, at a certain time, and for the benefit of somebody. Constituents
which encode event locations, times, or beneficiaries can co-occur with most event-
denoting verbs (or nouns). In contrast, PPs expressing recipients, for example, cannot as
easily combine with verbs (or nouns). In fact, we know of no verb which can combine with
phrases which express recipients but not with phrases which express beneficiaries, event
locations, and times. The set of verbs which can combine with constituents expressing
beneficiaries, event locations, or time adverbials is thus significantly larger than the set of
verbs which can combine with recipients.
Secondly, participant roles corresponding to arguments (or, that are specific) take on
additional properties for particular event-types. Consider the words sing and write, for
example. The agent of a singing event carries additional properties beyond those which
all agents (more precisely, all causal effectors) carry. One property, discussed in Dowty
(1989), is that the agent must adduct its vocal folds in any event that sing felicitously
describes. This property is clearly not true of the agent of write. Conversely, the agent
of write is required to produce written exemplars of words or sentences, which is not
true of the agent of sing. The presence of obligatory, additional properties for the
agents of sing and write contrasts with the absence of verb specific properties of the
(event) location at which events of singing or writing occur. Whether one sings or
writes in one’s office, no property of the location is entailed aside from the
characteristic property of the event location role, namely the fact that the location is
where the event occurs. This difference between event locations and agents of the verbs
write or sing generalizes, we believe, to most verbs. Participants in the events
felicitously described by each verb (or restricted class of verbs) are lexically required
to bear additional properties above those characteristic of the general semantic role
they instantiate if they correspond to arguments, but not if they correspond to adjuncts.
Because providing reliable estimates of the average number and kind of additional
properties argument participant roles bear is much more difficult than estimating the
sheer number of verbs requiring those roles, we do not provide direct empirical
validation of this aspect of the notion of specificity in this paper.
Two differences between (semantically obligatory) arguments and (semantically
obligatory) adjuncts emerge from these observations: (i) adjunct participant roles are
common to most verbs; (ii) argument participant roles are lexically required to bear
additional properties aside from those which are characteristic of the role. We summarize
these two differences between (obligatory) argument roles and (obligatory) adjunct roles
by saying that argument roles are specific to both an individual event-type and a restrictive
class of event-types to which the meaning of the verb belongs. We define the Semantic
Specificity Criterion for argumenthood (hereafter, SSC) as follows:
Semantic Specificity Criterion (SSC): If r is an argument participant role of predicate
P denoted by verb V, then r is specific to V and a restricted class of verbs/events.
The hypothesis that semantic specificity correlates with lexical encoding of participant
information accords well with recent research on the nature of semantic roles and research
on the semantic basis of syntactic subcategorization. Dowty (1989), for instance, argues
that semantic roles, such as agent and patient, are the intersection of properties of
participants in the described situations shared among a set of verbs. To use his
terminology, semantic role types (our participant roles) are categories of individual
roles, where the latter are defined as all the necessary properties of a given
participant in a situation-type. For example, the individual role of writer consists of
the properties all writers bear; similarly, the individual role of singer consists of the
properties all singers bear. The participant role of agent (assuming for now there is
such a role), on the other hand, is defined as the intersection of the properties of
writers, singers, and so forth. The fact that participants in events bear roles of
various grain size (from the most specific individual roles to the most general
semantic roles types) plays an important part in delineating a verb’s meaning and
the argument status of its participant roles.
To illustrate the role individual participant roles (more generally, finer-grained roles
than semantic roles) play in distinguishing verb meaning, let’s again compare sing and
write. Information relative to the agent and patient participants of sing must include
information which pertains to the singer and what is sung, not just the properties they
share with all agents and patients. That its agent must adduct its vocal cords is part of
what makes an eventuality an event of singing rather than of writing. By contrast, the
properties of the location in which singing events occur do not help distinguish the
meaning of sing from that of write. Let’s call A, P, and L the sets of properties which
the agent, patient, and location participants of any singing event bear. The sets A and
P on the one hand, and L on the other, differ in one crucial respect. Whereas the sets
A and P include properties which help distinguish the meaning of sing from that of
other words (the singer adducts his/her vocal cords, the product of the activity is
audible, …), the set L does not include such properties. In other words, A and P
contain members which are not part of the corresponding sets for write, knit, and so
forth. In contrast, all members of L are shared with most, if not all verbs (in particular,
with the corresponding set for write, knit, and so forth). They are properties of all
spatial locations in which eventualities occur, whatever their type. The SSC posits that
this difference between the sets A and P, on the one hand, and L, on the other, is
crucial to lexical encoding of schematic participant information.3
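The contrast between the sets A and L can be made concrete with ordinary set operations. The sketch below is illustrative only: the property strings paraphrase the examples given in the text, the sets are deliberately tiny, and the variable names are hypothetical.

```python
# Properties borne by the agent of each verb (paraphrased from the text).
SING_AGENT = {"causally effects the event", "adducts vocal cords"}
WRITE_AGENT = {"causally effects the event", "produces written exemplars"}

# Properties borne by the event location of each verb.
SING_LOCATION = {"is where the event occurs"}
WRITE_LOCATION = {"is where the event occurs"}

# A general role type as the intersection of individual roles (after Dowty, 1989).
agent_role_type = SING_AGENT & WRITE_AGENT

# Properties that help tell sing apart from write.
sing_specific_agent = SING_AGENT - WRITE_AGENT
sing_specific_location = SING_LOCATION - WRITE_LOCATION

print(agent_role_type)         # {'causally effects the event'}
print(sing_specific_agent)     # {'adducts vocal cords'}
print(sing_specific_location)  # set()
```

The agent set retains a verb-specific residue after the shared properties are removed, whereas the location set is left empty; this is the difference the SSC treats as relevant to lexical encoding.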
Our specificity criterion also accords well with current research on the interface
between lexical semantics and syntax which holds that the predictability of the syntactic
realization of semantic dependents must make reference to event- or verb-classes (see
Davis & Koenig, 2000; Goldberg, 1995; Jackendoff, 1990; Levin, 1993; Levin &
Rappaport-Hovav, 1995; Pinker, 1989; Van Valin & LaPolla, 1997; Wechsler, 1995
among others). This research has shown that reference to classes of verbs which are
defined semantically (by the similarity of the event-type their members denote) is critical
to the statement of the constraints which relate (1) semantic and syntactic dependents
(so-called linking constraints) and (2) the variants of verbs with multiple subcategorization
3 The relevance of participant role specificity to the individuation of verb meaning has an intriguing parallel in
the notion of term specificity in document retrieval research. The more documents a query term appears in, the
less characteristic of any document it is and, therefore, the more likely it is that a query that contains it will return
irrelevant documents (Sparck-Jones, 1972). Similarly, the more verb meanings a participant role must be part of,
the less characteristic of any of these verbs’ meanings it is (see also Resnick, 1999).
frame possibilities (so-called valence alternations). Our claim that the distinction between
argument and adjunct roles is sensitive to the size of the class of verbs whose meanings
include this role is a natural extension of this research. Building on the observation that
linking constraints and valence alternation conditions which target participant roles do not
indiscriminately apply to most verbs, but only to the restricted class of verbs whose
meanings include that role, we hypothesize that a participant role is treated as an argument
of a verb only if it targets a restrictive class of verbs, that is, if the participant role is not
true of the denotation of most verbs.
3.3. The Lexical Encoding Hypothesis
We summarize our hypothesis regarding lexical encoding of participant roles as
follows:
Lexical Encoding Hypothesis (LEH): A participant role is a (semantic) argument of a
verb if and only if it satisfies both the SOC and SSC, that is, if its presence is required of
all situations described by that verb and if it is required of the denotation of only a
restricted set of verbs.
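Read as a decision rule, the LEH is a conjunction of the two criteria. The following sketch shows one possible way to state it, assuming that specificity is estimated by the share of verbs requiring a role. The class name, field names, and in particular the cutoff max_share=0.5 (a stand-in for "not true of most verbs") are hypothetical and are not values proposed in this paper.

```python
from dataclasses import dataclass


@dataclass
class RoleProfile:
    obligatory: bool       # SOC: present in every situation the verb describes
    share_of_verbs: float  # proportion of verbs that require the role


def is_argument(profile, max_share=0.5):
    """LEH as a conjunction: obligatory (SOC) and class-restricted (SSC).

    `max_share` is an assumed cutoff for "a restricted set of verbs".
    """
    specific = profile.share_of_verbs < max_share
    return profile.obligatory and specific


# Obligatory and restricted: argument.
print(is_argument(RoleProfile(obligatory=True, share_of_verbs=0.1)))   # True
# Obligatory but required by nearly all verbs: adjunct.
print(is_argument(RoleProfile(obligatory=True, share_of_verbs=0.98)))  # False
# Restricted but not obligatory: adjunct.
print(is_argument(RoleProfile(obligatory=False, share_of_verbs=0.1)))  # False
```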
Three points are worth noting regarding the LEH. First, our hypothesis is that semantic
specificity and obligatoriness determine the set of lexically encoded semantic arguments.
Whether all semantic arguments are syntactically active arguments – what most scholars
call arguments simpliciter – depends on the form of the verb (see the contrast between
middles and passives discussed in Mauner & Koenig, 2000). Second, arguments are
required to meet two conditions. Hence participant roles can be non-arguments or adjuncts
for either of two reasons. They are either not obligatory or they are not specific to a
restricted class of verbs. As we discuss in the next section, one can indeed find examples of
all four combinations of the two factors we have isolated (±obligatory and ±specific).
Third, the LEH is, strictly speaking, a hypothesis about factors that determine the lexical
encoding of participant information, not of what underlies all notions of argument that one
may find in the literature. This last point has important consequences for the interpretation
of our experimental evidence. What this evidence suggests is that participant roles which
satisfy the conditions of the LEH behave as if they were lexically encoded and are
therefore arguments by the definition of argument we gave in the introduction. But, we do
not know whether all lexically encoded participants behave the same. First, semantically
and syntactically obligatory participant roles might behave differently from semantically
obligatory, but syntactically optional roles. Second, some roles (causes or affected entities,
or those that Langacker (1987) calls participants) might be more cognitively salient than
others (participant locations or those that Langacker (1987) calls settings). Any or all of
these additional factors might lead to behavioral differences among the set of lexically
encoded participant roles. We leave the study of these possible further differences for
future research.
4. A quantitative survey of the English verbal lexicon
4.1. Confirming the SSC
According to our two semantic criteria, participants that are arguments should be
both semantically obligatory and associated with restricted classes of verbs. In
contrast, participants corresponding to adjuncts are associated with most verbs or are
not semantically obligatory. In this section, we present the results of a
comprehensive survey of the English verbal lexicon that suggests that these two
criteria allow us to quantitatively and empirically define notions similar or identical
to those on which linguists have intuitively relied to characterize the differential
behavior of participant roles.
In order to gather a reliable estimate of how many verbs lexically require various
participant roles, we had three pairs of raters judge, for each verb they knew out of a list
of 6100 verbs obtained from the MRC psycholinguistics database (see Coltheart, 1981),
whether it requires various participant properties. Each
pair judged a different set of participant roles. The initial list contained all irregular
verb forms as well as a number of exclusively British entries. Once the non-base forms
of irregular verbs and exclusively British entries were excluded and the remaining
items were cross-referenced with the American Heritage Dictionary (3rd edition), 5542
verbs remained. (A few verbs which were in the American Heritage Dictionary, but not
in the MRC database were added. It is likely that the American Heritage Dictionary
includes a few additional verbs of very low frequency.) Raters were Linguistics
graduate students and a research technician working in the Psycholinguistics
laboratory. All raters received extensive instructions on how to assess lexical
entailments that define various participant roles. Our surveys made an important
simplifying assumption. We counted verbs on the basis of their morphological identity:
all senses of a lemma counted as one entry in our database. This decision had the
following consequence. When a verb had several entries, only one of which entailed
the presence of a given participant role, the verb was marked as requiring this
participant role. Thus, the verb hide was marked as requiring an obligatory participant
location (the location where the pictures end up in sentence (4a)), although it only
requires one in the use illustrated in sentence (4a), not in the one illustrated in sentence
(4b).
(4) a. Bill hid many compromising pictures in his desk, (while) at school.
b. Sue does not want her children to hide their feelings from her.
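The lemma-level counting rule just described can be sketched as follows. This is a minimal illustration, not the procedure the raters actually followed: the three labels mirror the must/can/cannot wording of the rating questions, the function name `lemma_status` is hypothetical, and the label given to the use of hide in (4b) is an assumption (the text only states that this use does not require a participant location).

```python
from typing import Dict, List

# Hypothetical sense-level judgments for the participant location role.
# "must" = the sense requires the role, "can" = it allows it, "cannot" = it excludes it.
SENSES: Dict[str, List[str]] = {
    # Use (4a) requires a participant location; the label for use (4b) is assumed.
    "hide": ["must", "can"],
    # Illustrative lemma with a single sense that merely allows the role.
    "sleep": ["can"],
}


def lemma_status(sense_labels: List[str]) -> str:
    """Collapse the labels of all senses of a lemma into one lemma-level label."""
    if "must" in sense_labels:
        return "requires"  # at least one sense requires the role
    if "can" in sense_labels:
        return "allows"    # no sense requires it, but at least one allows it
    return "excludes"      # no sense requires or allows it


statuses = {lemma: lemma_status(labels) for lemma, labels in SENSES.items()}
```

Under this rule hide counts as requiring a participant location even though only one of its uses does, which is the simplification discussed above.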
Each rater independently determined whether each verb had at least one sense which
required the presence of a participant bearing a certain participant property and, if not, whether it
had at least one sense which allowed the presence of a participant bearing that
participant property. Rater pairs also met to resolve any differences in the criteria they
applied in assessing whether a verb required or allowed a given participant role.
Disagreements fell into three categories: (1) raters disagreed on whether two uses of a verb
constituted one or two distinct verb senses; (2) one rater of a rater pair failed to imagine a
J.-P. Koenig et al. / Cognition 89 (2003) 67–10376
context of use in which a verb did not require the presence of a participant bearing a certain
property; (3) raters simply disagreed on whether a verb could be used in a given context.
After the first two kinds of disagreement were resolved, only a small proportion of true
disagreements were left.
4.2. Rating study 1: agent roles
In our first survey, one pair of raters judged agent roles because they are the most
undisputed of argument types and are likely to be the most frequently occurring of
argument categories. For each verb raters had to answer the following questions, shown in
example (5), which correspond to the three most important proto-Agent properties
mentioned in Dowty (1991):4
(5) a. CAUSAL FORCE: Does the verb describe situations which must, can, or cannot
include a participant or force that causes a change of state (e.g. Marc in Marc finally
cooked the fish)?
b. INTENTIONALITY: Does the verb describe situations which must, can, or cannot
include a participant who is volitionally involved in the situation (e.g. Martha in
Martha jogged last night)?
c. NOTION: Does the verb describe situations which must, can, or cannot include a
participant who has a mental representation of another participant in the situation
(e.g. Marc in Marc thought of the beach)?
Our motivation for assessing whether described situation-types must include a
participant bearing properties such as causality, volition, and having a notion of another
participant, rather than whether the described situation-types simply required the presence
of an agent, is based on semantic considerations. Although the term agent is sometimes
used in the literature to cover participants that bear any of the three properties we
examined (or even a larger number), there is no semantic correlate of that use of the term
agent. It is merely a cover term for a disjunction of semantic categories (agents are
volitional entities or causers or cognizers …). To the extent this disjunction does not
represent a semantically natural class it does not serve to specify a coherent class of verb
meanings and is not a participant role, semantically.5 Furthermore, even if the label agent
had some semantic unity, the fact that we distinguish subtypes of agents (but not, of say,
event locations or times) that are restricted to relatively small classes of verbs counts as
evidence that, all else being equal, agents (and not event locations or time) are arguments.
This is because the existence of recognizable subcategories of a participant role is what
makes that role useful for distinguishing one verb meaning from another. Finally, the fact
that some of these properties are specifically targeted by grammatical constructions (e.g.
intentionality is required of the controller of the unexpressed subject of rationale clauses,
4 Because we follow Dowty (1989) and assume semantic roles are defined through necessary properties shared
by participants in situations described by verbs, we will talk interchangeably of agent roles or agent properties.
5 Dowty (1991) discusses the grouping of the properties listed in (5) for interfacing the lexical semantics of
verbs and their co-occurrence properties. But, crucially, he does not discuss the grouping on semantic grounds
per se.
as shown by the ungrammaticality of *John collapsed to wake his mother up) suggests the
grammatical relevance of individual entailments and further justifies the selection of this
semantic grain-size. For these reasons, we believe that the semantically consistent
participant properties listed in (5) should be tabulated independently when assessing the
degree to which our LEH matches traditional, intuitive classifications of arguments; thus
Table 1 tabulates each agent property separately.
In computing the percentage of verbs whose semantics require that a participant bear a
certain property, we considered verbs which were known by one pair of raters, i.e. 3909
verbs in all.
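The percentage computation can be illustrated with a short sketch. It assumes that verbs unknown to the raters are simply left out of the denominator; the verb names and labels are placeholders rather than survey data, and the function name `percentages` is hypothetical.

```python
from typing import Dict, Optional

# Placeholder lemma-level ratings for one agent property.
# None marks a verb the raters did not know (an assumed encoding).
RATINGS: Dict[str, Optional[str]] = {
    "verb_a": "requires",
    "verb_b": "allows",
    "verb_c": "allows",
    "verb_d": "excludes",
    "verb_e": None,
}


def percentages(ratings: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Percentage of known verbs that require or allow the property."""
    known = [label for label in ratings.values() if label is not None]
    total = len(known)  # unknown verbs do not count toward the denominator
    return {
        status: round(100.0 * known.count(status) / total, 1)
        for status in ("requires", "allows")
    }


result = percentages(RATINGS)
```

With the four known placeholder verbs, one requires the property and two allow it, so the sketch returns 25.0 and 50.0.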
As Table 1 shows, agent properties target between 15% and 30% of verbs known to
college-educated English speakers. Because the various agent properties are the most
frequent arguments, we can use them as a rough estimate of an upper limit for the
specificity hypothesis. Our results on the selective nature of agent properties (they are
required to be true of less than a third of lemmas) accord well with the restrictiveness
of a few valence alternations in both English and French. The English ditransitive
alternation targets about 5% of the verbs known to college-educated speakers (A.
Goldberg, p.c.), while the French inchoative and dative predication alternations target
about 16% and 2%, respectively, of the verbs known to college-educated speakers
(Koenig, 1994). Like required agent properties, valence alternations semantically target
a restricted set of verbs.
4.3. Rating study 2: time, locations, instrument, and beneficiary
With this upper limit of class-size in hand, we exhaustively examined the English
verbal lexicon for five additional participant properties. The surveys of these five
properties involved two different pairs of raters, one pair for beneficiary, instrument, and
time, another for participant and event locations. As was the case for Study 1, each rater
independently determined whether each verb had at least one sense which required the
presence of a participant bearing a certain participant property and, if not, whether it had at least
one sense which allowed the presence of a participant bearing that participant
property. We chose to contrast those participant properties which are widely held to be
adjunctive in nature (event location, time, and beneficiaries) and those syntactically
optional participant properties which pass some traditional tests of argumenthood and are
therefore sometimes classified as arguments or quasi-arguments (e.g. semantically
Table 1
Percentages of verbs judged to require or allow various agent roles
Agent property    Verbs requiring role (%)    Verbs allowing role (%)
Causal force      29.8                        18.1
Volition          23.6                        70.8
Notion            14.2                        71.5
All inter-rater agreements were above 99%. Since a single participant can bear more than one property (e.g.
about 3% of verbs describe events which include participants that are both causal forces and volitionally
involved), the percentages do not sum to 100%.
obligatory instruments and participant locations).6 The argument status of instruments is
controversial. Some linguists treat instruments as adjuncts on the basis of the fact that
instruments do not participate in valence alternations in English (see Carlson &
Tanenhaus, 1988) or that they are syntactically optional (see Dowty, 1989). Others argue
on the basis of several syntactic criteria that instruments behave like arguments or quasi-
arguments and differ from event locations or time participants (see Schutze, 1995; Van
Valin & Lapolla, 1997). It is possible that the inconclusiveness of previous studies stems
from treating instruments as a unified category and from not recognizing the existence of
two classes of verbs, those which allow, but do not require the presence of instruments in
denoted situations and those which do require their presence. Consider the contrast
between the verbs chop and eat. Chop describes situations which must include an
instrument; in contrast, eat describes situations which can, but do not necessarily, include
an instrument. The SSC predicts that syntactically optional participant roles which pattern
syntactically with arguments (i.e. obligatory instruments and participant locations) will be
required of the denotations of less than 30% of English verbs whereas syntactically
optional participant roles which do not pattern syntactically like arguments (event
locations and time) will be required of the denotations of significantly more than 30% of
English verbs. Finally, we also compared event locations and times to the seldom
discussed category of participant locations (i.e. of locations which indicate where a
participant in the event is or ends up) which are exemplified by the PPs in her notebook and
in his desk in sentences (6a) and (6b), respectively. Participant locations meet many of the
syntactic criteria which Schutze (1995) uses to argue that instruments behave like
arguments.
(6) a. Johanna wrote the address in her notebook, (while) in her office.
b. Bill hid many compromising pictures in his desk, (while) at school.
One pair of raters assessed which verbs require, allow, or exclude a time participant, an
instrument participant, or a beneficiary participant by answering the following questions:
(7) a. INSTRUMENT: Does the verb describe situations in which one participant must,
can, or cannot use another participant to perform an action (e.g. Marc poked the frog
requires Marc to have used something)?
b. TIME: Does the verb describe situations which must, can, or cannot occur at a
certain time interval (e.g. the time of the writing in Marc wrote down the address)?
c. BENEFICIARY: Does the verb describe situations which must, can, or cannot be
performed for the benefit of someone or something? (e.g. Mary in Marc cooked a
wonderful dinner for Mary)?
A second pair of raters assessed which verbs require, allow or exclude an event location
participant or a participant location participant, by answering the following questions:
6 Although traditional (syntactic) criteria are not reliable for determining the argument status of a participant
role, we use the fact that, as Schutze (1995) suggests, not all syntactically optional participant roles behave
identically to assess the role the SSC plays in discriminating between participant categories whose participant
status has traditionally been problematic.
(8) a. EVENT LOCATION: Does the verb describe situations which include a location in
which all the participants must, can, or cannot be located and in which the event
as a whole takes place (e.g. the location in which the writing occurs in Marc wrote the
address)?
b. PARTICIPANT LOCATION: Does the verb describe situations which must, can, or
cannot include a location in which one, but not necessarily all participants are
(e.g. the notebook in Martha wrote down the address in her notebook)?
As was the case for the rating of agent properties, raters were given several examples of
each category. But because the categories of instruments and participant locations were
not as well-known, our raters were also given instructions on how to resolve difficult cases.
For instruments, body-parts could count as instruments only if objects other than body-
parts could be used as instruments as well (thus, mouth is not an instrument of eating,
according to this definition) and causally responsible entities in intermediate causal events
that do not differ from the ultimate causally responsible entity did not count as instruments
(thus, Bob is not the instrument of the event denoted by the by-phrase in Bob persuaded
Billy to wash the dishes by suggesting he would not otherwise go to the ballgame).
Similarly, explicit instructions were given to cover difficult cases involving participant
and event locations. In particular, to count as an event location all participants had
to be in that location during the entire duration of the event. Thus, because John, in John
abandoned Bill on the island, is present on the island at the initial but not final interval of
the event, on the island was classified as a participant location. The results of the survey
are summarized in Table 2.
The percentages in Tables 1 and 2 should be considered as estimates, since, as
mentioned before, all senses of a lemma counted as one entry. Counting verb lemmas with
at least one entry that requires a particular participant role rather than directly counting
entries most probably leads to an overestimation of the percentage of verb entries whose
denotations require an obligatory participant location or an obligatory instrument. Our
motivation for simplifying our counting procedure was prompted by practical
considerations. We had no practical means of distinguishing verb senses. However, the
consequences of our misestimation do no harm to our hypothesis regarding class size,
since, if anything, it overestimates the percentage of verbs which take obligatory
instrument or participant location roles.
The results shown in Tables 1 and 2 provide empirical support for our hypothesis:
Table 2
Percentages of verbs judged to require or allow five syntactically optional participant roles
Participant role         Verbs requiring role (%)    Verbs allowing role (%)
Instruments              12                          35
Event locations          98.2                        0
Participant locations    7                           1.5
Time                     99.8                        0
Beneficiary              0                           94.1
All inter-rater agreements were above 92%.
semantically obligatory participants, which are traditionally judged to be arguments,
co-occur with a small set of verbs (about 30% or less), whereas semantically obligatory
participants, which are traditionally judged to be adjuncts, occur with most verbs (above
90%). They are also consonant with the judgments of linguists such as Van Valin (1993)
and Schutze (1995) who treat syntactically optional, but semantically obligatory,
instruments as semi-arguments rather than adjuncts. Finally, our results suggest that the
less studied participant location roles, like instruments, are semantic arguments of verbs
whose denotations require them.
Aside from confirming our class-size hypothesis, our survey results bring to the fore
several theoretical issues. For reasons of space, we only mention one here, namely the
independence of the notions of semantic obligatoriness and participant role type. Consider
the following sentences:
(9) a. The policeman poked the body (with a stick).
b. The policeman sipped his iced tea (with a straw).
Whereas the presence of an entity serving as instrument in the described event is
necessary irrespective of the presence of the with-PP in sentence (9a), the presence of an
instrument is dependent on there being a PP in sentence (9b). Thus, whereas the use of
poke illustrated in sentence (9a) requires the presence of something which the policeman
used to poke the body, the use of sip illustrated in sentence (9b) does not entail that
something was used by the policeman to sip his tea. Nonetheless, the referent of the object
of with plays the same participant role in both sentences. In other words, the verbs poke
and sip both describe events which include the same participant roles in sentences (9a) and
(9b), when they include a with instrument PP, even though only poke lexically requires
the presence of an instrument. The contrast between obligatory instruments and non-
obligatory instruments suggests that we need to (partly) dissociate the semantic type of a
participant role from its obligatoriness and that we cannot define arguments as members of
a predefined list of participant roles.
The contrast between a semantically obligatory and a semantically optional variant of a
single role type is endemic to the instrument role. As shown in Table 2, there are hundreds
of verbs which semantically require instrument participants and many more verbs that
merely allow an instrument participant in the situations they denote. Instruments thus
provide good evidence of the need to separate the issue of role type from that of semantic
obligatoriness. But, other participant roles illustrate this same contrast. Some verbs (more
precisely, verb lemmas) require that the situations they felicitously describe include
participant locations, while others merely allow their presence. Consider the following two
sentences:
(10) a. He heated the wax quickly in a small pan.
b. He boiled the water quickly in a small pan.
Whereas both locations are participant locations and not event locations (as the ability
to add an event location such as in the kitchen attests), the participation of the pan (or some
other container) is only required of boiling events; it is not required of heating events. (The
presence of the adverb quickly excludes an NP attachment interpretation of the locational
PP.) One can heat wax by letting the sun’s rays hit it for an hour. The situation is similar to
what we just discussed for instruments, except that optionality is quite rare in the case of
participant locations. Very few verbs are like heat in describing situations which allow, but
do not require, participant location participant roles. The independence of semantic
obligatoriness from participant role type means that our two criteria of lexical encoding of
participant information are orthogonal: all combinations of obligatoriness and specificity
are indeed attested, as shown in Table 3.
5. Behavioral evidence for the SSC and SOC
Our survey results provide some initial confirmation for the hypothesis that semantic
obligatoriness and semantic specificity correlate with behavior traditionally deemed
symptomatic of lexical encoding. Instrument as well as participant location participants
which pass most of the (imperfect) traditional tests of argumenthood (Schutze, 1995) are
indeed specific to a restricted set of verbs and thus contrast with event location and time
participants. But these data do not provide direct evidence for the lexical encoding of this
information. The two experiments discussed in this section directly address this issue.
5.1. Experiment 1: testing the SSC
We adopted the following experimental logic to test the SSC. We assumed that
participant information that is lexically encoded is retrieved upon recognition of a word.
Because this information is activated, it is more likely to be used to continue a sentence.
This higher activation of lexically encoded information need not be reflected in higher
frequency in written texts of constituents expressing that information over participant
information which is not lexically encoded. Many other factors come into play in deciding
whether to express syntactically optional semantic information, including the relevance of
that information or its typicality. But, when explicitly asked to finish a sentence by adding
a constituent, we expect subjects to add a constituent expressing lexically encoded
participant information more often than a constituent expressing a participant that is not
lexically encoded, since, by hypothesis, only the former is activated by the recognition of
the sentence’s main verb. If the SSC is correct, we predict constituents expressing
semantically obligatory participant locations to be added: (i) more frequently after verbs
which lexically encode them than after verbs which do not lexically encode them; and (ii)
more frequently than constituents which express event locations, times, or any other
Table 3
Examples of ±specific and ±obligatory participant roles
                     Obligatory role                                             Non-obligatory role
Specific role        The participant location of boil; the instrument of poke   The participant location of heat
Non-specific role    The event location of boil                                  The beneficiary of poke
participant information which is not lexically encoded, since only semantically obligatory
and specific participant information is activated upon recognition of verbs.
We tested these predictions in an experiment in which participants read sentences such
as those shown in (11).
(11) a. The collectible doll was advertised
b. The collectible doll was sold
The verb advertise in sentence (11a) semantically requires the denoted situations to
include both event and participant locations; that is, a location where the doll was
advertised; in contrast, sell in sentence (11b) requires the presence of an event location, but
does not permit a participant location.
5.1.1. Method
5.1.1.1. Participants. Seventy-two native English speaking undergraduates from the
University at Buffalo received partial course credit for participating in this experiment.
5.1.1.2. Materials. Thirty-six sentence pairs such as those in example (11) were
constructed (see Appendix A for a complete list of stimuli). Each pair included an
experimental sentence whose main verb semantically required both a participant location
and an event location (e.g. advertise in sentence (11a)) and an identical control sentence
except for the fact that its main verb required the presence of an event location but
did not allow a participant location (e.g. sell in sentence (11b)). Each of these
sentences allowed several continuations, as the sentences in (11) illustrate: participant
locations (e.g. in the magazine), manner adverbials (e.g. successfully), temporal adverbials
(e.g. last month), and so forth. The experimental sentences were distributed across two
presentation lists such that one member of each pair appeared on each list, and across the
two lists each pair member appeared only once. Thus, each participant saw 18 experimental
sentences and 18 control sentences. Sixty distractor sentences were intermixed
with experimental sentences. Like our experimental and control sentences, the 60
distractor sentences allowed multiple completions. For example, the distractor sentence
Jordan drank a soda can be followed by an instrument PP with a straw, a location PP in the
dining room, a temporal PP during dinner, a manner adverbial slowly, and so forth.
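The counterbalancing of sentence pairs across the two presentation lists can be sketched as below. The text does not say how pairs were assigned to lists, so the alternation by pair index is an assumption, as are the function name `build_lists` and all sentence strings other than the pair taken from example (11).

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (condition, sentence)


def build_lists(pairs: List[Tuple[str, str]]) -> Tuple[List[Item], List[Item]]:
    """Split (experimental, control) pairs across two lists, one member per list."""
    list_a: List[Item] = []
    list_b: List[Item] = []
    for index, (experimental, control) in enumerate(pairs):
        if index % 2 == 0:
            # Even-numbered pairs: experimental member on list A, control on list B.
            list_a.append(("experimental", experimental))
            list_b.append(("control", control))
        else:
            # Odd-numbered pairs: the assignment is reversed.
            list_a.append(("control", control))
            list_b.append(("experimental", experimental))
    return list_a, list_b


# 36 pairs; only the first comes from example (11), the rest are placeholders.
pairs = [("The collectible doll was advertised", "The collectible doll was sold")]
pairs += [(f"experimental sentence {i}", f"control sentence {i}") for i in range(2, 37)]
list_a, list_b = build_lists(pairs)
```

Each list then holds 18 experimental and 18 control sentences, and no list contains both members of a pair. Distractor items and randomization of presentation order are left out of the sketch.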
5.1.1.3. Procedure and coding. Participants were instructed to add something to the end of
each sentence that made sense and was grammatical, but without spending too much time
on any one item. Prior to conducting this study, a pilot study was conducted to determine
what categories to assign to completions (see Bienvenue, 1997). Twenty-two categories of
completions emerged. Some categories elicited very few continuations in this study and
were collapsed into a single Other category. In the end, 11 categories were used for inter-rater
reliability and statistical analysis. The assignment of completion type for experimental
and control items to one of these categories was performed by two raters blind to
the purpose of the study. After initially rating all the sentences, both raters met and
resolved all disagreements.
5.1.2. Results
Fig. 2 presents the average percentage of completions for the experimental sentence
pairs. As the figure indicates, sentences whose main verb semantically required a
participant location elicited many more participant location completions than any other
completion type (47%). Crucially, event location phrases, the second most frequent
completion, elicited less than a fourth as many responses (11%).7 Four separate χ²
analyses were conducted on the summed number of completions across all items and all
participants. The first showed that verbs semantically requiring a participant location
elicited more participant location completions than expected by chance (since there were
11 continuation categories – the Ungrammatical category being excluded – chance was
defined as 100% ÷ 11 = 9.1%) for experimental items (χ²(11) = 58.12, P < 0.001), but
not control items (χ²(11) = 0.11, P > 0.05). The second indicated that verbs semantically
requiring a participant location elicited more participant location than event location
completions (and a fortiori other completions, since event location phrases were
the second most frequent type of completion) for experimental items (χ²(11) = 41.01,
P < 0.001), but not for controls (χ²(11) = 0.95, P > 0.1). The third indicated that verbs
semantically requiring a participant location elicited more participant location
completions than verbs that do not (χ²(1) = 4.21, P < 0.05). The fourth indicated that
completions did not differ across experimental and control items for any other categories,
since they did not differ for the category with the second largest difference score, manner
(χ²(1) = 1.23, P > 0.1).
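The chance level and the general form of a Pearson χ² statistic can be sketched as follows. The text does not specify how the observed and expected counts were arranged for each test, so this is a generic goodness-of-fit illustration and does not reproduce the reported values; the counts are hypothetical and the function names are illustrative.

```python
from typing import Sequence


def chance_level(n_categories: int) -> float:
    """Chance percentage when completions are spread evenly over the categories."""
    return 100.0 / n_categories


def pearson_chi_square(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chance = chance_level(11)  # 11 continuation categories, so roughly 9.1%

# Hypothetical counts (not the study's data): target category vs. all other categories.
total = 1100
observed = [470, 630]
expected = [total / 11, total * 10 / 11]  # counts expected if the target were at chance
statistic = pearson_chi_square(observed, expected)
```

A large statistic relative to the critical value for the relevant degrees of freedom indicates that the target category departs from the chance expectation.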
5.1.3. Discussion
The results of this study suggest that participant location verbs, which semantically
require participant locations, are highly associated with participant location roles or
include in their representation a participant location role. But the same verbs are not highly
associated with event location participant roles or do not include in their representation an
event location role, despite the fact that they semantically require them. The SSC explains
this difference: event locations are not specific to a restricted class of verbs and are thus not
7 As a reviewer pointed out, it is surprising that there was 11% (139 out of 1296) of participant location
completions for control items which were meant to include verbs that did not allow a participant location. Several
factors contributed to this unexpectedly high percentage. First, some continuations which were classified as
participant continuations were actually instances of PP modifiers extraposed from the subject position. Second,
some continuations classified as participant locations required the passive verb to receive either an adjectival or
result passive interpretation. Since such interpretations involve only the “patient” participant, the location
continuations are actually event locations. Both of these misclassifications involved difficult linguistic judgments
which we had not anticipated and which our raters were not trained to make. Third, some continuations classified
as participant location continuations followed a particle that changed the verb lemma involved (for example,
many participant location continuations followed the addition of the preposition over after turn). All in all, after
these continuations were removed, only 5.1% of true participant location continuations remained. Two-thirds of
these followed verbs that were rated as allowing participant locations by at least one of our English verbal lexicon
survey raters. Given that at least one of our survey raters had judged the verbs as allowing participant locations,
the presence of participant location completions is not that surprising. The remaining 1.7% of true participant
location completions involved verbs that neither of our survey raters judged to allow participant locations.
All such continuations involved infrequent uses of the control verbs which raters in our survey simply did
not anticipate (13 of these remaining 22 completions, for example, followed the verb tested, as in The new
memory was tested on the latest computer on the market.).
highly associated with – or specific to – any verb. By contrast, participant locations are
semantically obligatory for only a fraction of the English verbal lexicon and thus are
highly associated with these verbs. Although these results support the SSC, a competing
explanation is that the obtained differences are not due to a semantic contrast between
experimental and control verbs or between purported arguments and adjuncts, but rather to
the sheer surface frequency of PPs expressing the various participant properties.
According to this competing hypothesis, participant locations constitute more frequent
continuations than event locations, simply because participant location verbs syntactically
co-occur more frequently with PPs which express such participant roles than with PPs
which express event locations. Of course, this competing hypothesis does not explain the
source of the differences in the surface frequencies of PPs. We surmise that differences in lexical
encoding of schematic participant information play a role in the frequency of occurrence
of PPs which express that information.
To assess the plausibility of this competing hypothesis, we searched the Brown corpus
for occurrences of participant location verbs and control verbs, and counted the number of
times each verb occurred with various PPs. Numbers of occurrences of these PPs across
experimental and control verbs are summarized in Table 4. Present in the table refers to
the number of times the verb did co-occur with a given PP category; possible refers to the
number of times each verb occurred in a use that, by the judgments of our raters, licensed
the occurrence of a particular kind of PP, although none co-occurred with the verb in the
Fig. 2. Percentages of completions of various semantic types for participant location and control verbs
(chance = 9.1%).
Brown corpus. For example, in the Brown corpus sentence And he would sleep, sleep does
not co-occur with a PP expressing a participant location, but our raters thought it could
(e.g. in his chair can be added). We chose as the denominator for each participant role the
number of times a verb occurs in the Brown corpus in a use that licenses the occurrence of
a PP expressing that role (possible in Table 4) rather than simply the number of times a
verb occurs in the Brown corpus for the following reason. Many verbs have some uses
which do not license the expression as a PP of a particular kind of participant role.
Including all uses of a verb in the denominator would have therefore unduly reduced the
chances of finding a correlation, as the percentages of occurrence of PPs expressing any
participant role would have been rather small when compared to all uses of the verb in the
corpus. By choosing a smaller denominator, we increased the likelihood of finding a
correlation that would be counter to the experimental hypothesis. Because control verbs
were not predicted to allow participant locations to occur, Table 4 groups together
participant locations and stationary locations into a location supercategory. (Note that
since more than one optional PP can co-occur with any given verb, the percentages do not
sum up to 100%.)
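The choice of denominator can be made concrete with a small sketch. It assumes that the denominator for a role is every corpus use of the verb that licenses a PP expressing that role, whether or not such a PP is actually present; the record format and the function name `percent_present` are hypothetical, and all records are invented except the first, which mirrors the Brown corpus sentence And he would sleep.

```python
from typing import List, Optional, Tuple

# (verb, role, pp_present): one record per corpus use that licenses a PP of that role.
USES: List[Tuple[str, str, bool]] = [
    ("sleep", "location", False),  # "And he would sleep": PP possible but absent
    ("sleep", "location", True),   # invented record
    ("sleep", "location", False),  # invented record
    ("sleep", "location", False),  # invented record
    ("sleep", "temporal", True),   # invented record
]


def percent_present(
    uses: List[Tuple[str, str, bool]], verb: str, role: str
) -> Optional[float]:
    """Percentage of licensing uses of `verb` in which a PP expressing `role` occurs."""
    flags = [present for v, r, present in uses if v == verb and r == role]
    if not flags:
        return None  # the verb has no use that licenses this role
    return 100.0 * sum(flags) / len(flags)


location_rate = percent_present(USES, "sleep", "location")
```

Uses that do not license a given role never enter that role's denominator, which keeps the percentages from being diluted by irrelevant uses of the verb.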
We conducted two correlation analyses on the labeled corpus data. First, we examined
whether there was a correlation between the percentage of times a verb occurred with a PP
expressing any location (directional or non-directional) in the Brown corpus and whether
the verb denoted situations that included a participant location. A significant correlation
(r(48) = 0.561, P < 0.01) was indeed found across all items between location
occurrences in the Brown corpus and location completions in our study. (Items with
fewer than four occurrences in the corpus were excluded, leaving 25 pairs from the
original 36.) Second, we examined whether the percentage of participant location
continuations for a participant location verb was correlated with the percentage of times
each verb co-occurred with a PP expressing a participant location in the Brown corpus.
We found that the correlation was not significant, and moreover, was negative
(r(23) = −0.192, P = 0.36). The scatterplot in Fig. 3 illustrates this second analysis.
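For readers who want to see the form of the statistic, a Pearson correlation with n − 2 degrees of freedom can be sketched as below. The function names are illustrative and no study data are used; the degrees of freedom reported above are consistent with 50 items (25 pairs) in the first analysis and 25 participant location verbs in the second, although that reading is an inference from the reported values.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def degrees_of_freedom(n_items: int) -> int:
    """Degrees of freedom for testing a Pearson correlation over n_items points."""
    return n_items - 2


df_all_items = degrees_of_freedom(50)         # 25 pairs, both members included
df_location_verbs = degrees_of_freedom(25)    # participant location verbs only
```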
Table 4
Percentages of occurrence of syntactically optional PPs in the Brown corpus grouped by semantic role for
participant location verbs and control verbs in Experiment 1
                                                 Inst   Loc    Src   Goal   Path   Rec    Ben   Temp   Rat   Man
% present/possible (participant location verbs)  0.5    24.2   2.9   37.8   25     7.7    0.8   4.1    2.8   2.6
% present/possible (control verbs)               1.1    5.9    6.1   17.9   7.1    22.7   0.8   5.6    2.3   2.8
Inst, instrument; Loc, locations; Src, source; Rec, recipient; Ben, beneficiary; Temp, temporal; Rat, rationale;
Man, manner.
The presence of a correlation in the first analysis suggests that syntactic co-occurrence
frequencies might indeed be part of what underlies the increase in location completions for
participant location verbs. But, the absence of a correlation in the second analysis militates
against assuming that all of the effect observed in our continuation study can be attributed
to co-occurrence frequencies. How frequently a participant location verb co-occurs with a
PP expressing a participant location in the Brown corpus is not a good predictor of the
percentage of completions this verb received in our continuation study. Note that it is not
particularly surprising that we found a difference between the frequency of expression of
syntactically optional semantic arguments in our continuation studies and in (non-
laboratory generated) ordinary texts (in particular, that participant locations occurred more
frequently as completions than in the corpus). As we noted earlier, the fact that a verb
lexically encodes a syntactically optional semantic participant but another does not is no
guarantee that speakers or writers will express that semantic participant more often with
the former. Multiple factors influence the expression of syntactically optional semantic
arguments, including (i) the verb specific restrictiveness of the selectional constraints on
fillers of that participant role (see Resnik, 1997) or (ii) the discourse role that this filler will
play (see Givon, 1984). These lexical and discourse factors can override any difference in
the rate of PP expression. In contrast, our experimental participants were explicitly asked
to provide continuations and thereby express syntactically optional phrases. We
hypothesize that in such a situation the aforementioned factors for not expressing
participant information are severely attenuated, such that the lexical
encoding of participant information becomes more important.
Overall, the results of our corpus study suggest that (i) surface co-occurrence with
locations partly underlies the higher rate of location completions for verbs which require
them than for verbs which do not require them, but (ii) the results of the continuation study
cannot be entirely attributed to surface co-occurrence information, since the corpus
frequency of co-occurrence of a participant location verb with a participant location
constituent in the Brown corpus does not correlate with the frequency with which
participants completed an experimental item containing that verb with a constituent
expressing a participant location.
Fig. 3. Scatterplot of the correlation between percentage of a participant location post-verbal constituent in the
Brown corpus and percentage of participant location completion for experimental items.
5.2. Experiment 2: testing the SOC
Our second study was conducted to determine whether differences in the semantic
obligatoriness of participant information that is associated with syntactically optional
constituents has an immediate influence on parsing. A large number of studies have
provided evidence, in the form of processing disruptions, for the early use of lexically-
encoded syntactic and semantic information associated with the required constituents of
verbs in the processing of filler-gap sentences (Boland, 1997; Boland & Tanenhaus, 1990;
Boland, Tanenhaus, Garnsey, & Carlson, 1989; Clifton & Frazier, 1989; Crain & Fodor,
1985; Stowe, 1986; Stowe, Tanenhaus, & Carlson, 1991; Tanenhaus, Boland, Mauner, &
Carlson, 1993; Tanenhaus & Carlson, 1989; Tanenhaus, Garnsey, & Boland, 1990;
Traxler & Pickering, 1996). We reasoned that the sensitivity of verbs to co-occurring
constituents that are syntactically obligatory and express lexically encoded participant
information might also extend to lexically encoded participant information that is only
optionally expressed. To test this hypothesis, we compared the processing of sentences
whose main verbs, according to the results of our participant property survey, either
semantically required (e.g. behead in sentences (12a) and (12c)), or merely permitted (e.g.
kill in sentences (12b) and (12d)) an instrument participant in the events they described.
(12) a. Which sword | did the rebels | behead | the traitor king with [ ] | during the
rebellion?
b. Which sword | did the rebels | kill | the traitor king with [ ] | during the
rebellion?
c. With which sword | did the rebels | behead the traitor king [ ] | during the
rebellion?
d. With which sword | did the rebels | kill the traitor king [ ] | during the
rebellion?
The logic of this filler-gap paradigm depends on there being a WH-filler that has been
extracted from an indirect object position that is, in addition, syntactically or semantically
inappropriate as a sentence’s direct object. When a reader encounters the verb in a sentence like
(12a) or (12b), they should experience processing difficulty if they attempt to integrate the filler
into the developing sentence representation as a direct object, which they are likely to do unless
the verb provides some semantic or syntactic evidence that the filler could be plausibly
associated with an indirect object gap. We compared the processing of filler-gap sentences with
NP WH-fillers (e.g. (12a) and (12b)) to control sentences whose WH-fillers included a
preposition (e.g. (12c) and (12d)) which provided readers with clear syntactic evidence that the
WH-filler could not be a direct object of the verb. Assuming that processing involves constraint-
satisfaction, we expect the ease of integration of the filler to be affected both by the presence of
syntactic evidence of the role of the filler (which the preposition in the PP filler conditions
provides) and the presence of lexical information as to the semantic role of the filler (which the
verb in the behead conditions provides). In other words, we expect PP filler sentences to be
easier to process than NP filler sentences, since the preposition provides readers with useful
information about the semantic role of the filler (with is often associated with an instrument
role). We also expect verbs that semantically require an instrument to be easier to process than
verbs that do not require an instrument, at least if our LEH is correct. Other things being equal,
we would expect an interaction between filler and verb types. But other factors can obscure that
interaction. In particular, whether an interaction is found depends on two factors: (i) the
respective size of the processing penalty incurred for the absence of syntactic vs. lexical
evidence of the filler’s semantic role; and (ii) the size of the increased benefit that the syntactic
cue provides because of its earlier occurrence (five word positions prior to the lexical evidence).
Since we do not currently have any way of assessing either factor, our predictions in this paper
focus instead on region specific main effects. Because filler-gap studies using verbs which
subcategorize for more than a direct object (i.e. dative and object control verbs) typically find
anomaly effects not at the main verb but rather at the next word position (cf. discussion in
Tanenhaus et al., 1993), we predicted that our filler-gap sentences, whose verbs optionally
subcategorize for a PP, would also elicit anomaly effects at the region immediately following the
main verb. In particular, we expect to see effects of both syntactic and lexical cues at this region.
5.2.1. Method
5.2.1.1. Participants. A total of 105 native English speaking undergraduates from the
University at Buffalo participated in this experiment for partial course credit. Data from
five participants were omitted from analyses because they did not meet an 85% accuracy
criterion on comprehension questions.
5.2.1.2. Materials. Twenty-four sentence quadruples such as those in example (12) were
constructed (see Appendix A for a complete list of stimuli). Each quadruple included a pair
of sentences whose main verb was hypothesized to semantically require an instrument
(e.g. behead in sentences (12a) and (12c)) while the main verb shared by the other pair
permitted but did not require an instrument participant (e.g. kill in sentences (12b) and
(12d)). Within each of these pairs of sentences sharing a verb, one sentence began with an
NP WH-filler. The filler was syntactically possible as either a direct object or indirect
object, but was semantically appropriate only as the indirect object of the stranded
preposition with. The other sentence within each pair began with a PP WH-filler that was
syntactically impossible as a direct object and thus must be associated with a later indirect
object gap. To avoid obscuring expected effects with end-of-sentence wrap-up effects (Just
& Carpenter, 1987), all sentences included a sentence-final prepositional or adverbial
phrase. Aside from the verb and filler type (NP or PP), the content of the sentences within a
quadruple was identical. For presentation and analysis, sentences were segmented into five
regions which are indicated by the vertical bars (|) in example (12).
The 24 experimental sentences were counterbalanced across four presentation lists in a
Latin square. Because of an error in counterbalancing, two conditions on each presentation
list had five stimulus sentences while the other two conditions had seven stimulus
sentences. Despite this imbalance, there was no overall effect of list (F1(3, 96) = 0.58,
P = 0.63; F2(3, 16) = 0.68, P = 0.57). Experimental sentences were intermixed with 72
distractor sentences. To prevent participants from developing systematic processing
strategies based on surface characteristics that might be associated with experimental
items, these distractor sentences were all sentences that took the form of other types
of questions including Yes-No questions with various types of auxiliary verbs, and
WH-questions in which the filler could be a bare WH-word (e.g. Who or What), a WH-PP filler
(e.g. At what time), a WH-NP filler (e.g. What color shirt), or a quantifier phrase WH filler (e.g.
How many apples). WH-fillers in distractor sentences were extracted from both pre- and post-
verbal argument positions and adjunct positions. Thus, they expressed a wide variety of
participant roles. Distractor sentences were segmented into presentation regions in a fashion
similar to that of experimental items. For 32 distractor sentences, comprehension questions
were constructed to make sure that participants were comprehending the stimulus sentences.
For half of these questions, the correct answer was Yes. Comprehension questions queried
readers about what a distractor sentence was asking or about the plausibility of a particular
answer. For example, for the distractor sentence When is the repairman coming to fix the
refrigerator? the comprehension question was Is last Monday a possible answer?
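For concreteness, a balanced Latin-square assignment of 24 items to four conditions across four presentation lists can be sketched as follows. This is an illustrative rotation scheme, not the script used in the experiment (which, as noted above, produced an unbalanced split of five and seven items per condition); the condition labels and the function name are assumptions.

```python
from collections import Counter

# Hypothetical labels for the four Verb type x Filler type conditions.
CONDITIONS = ["behead-NP", "kill-NP", "behead-PP", "kill-PP"]


def latin_square_lists(n_items, conditions):
    """Build one item-to-condition assignment per list by rotating conditions."""
    k = len(conditions)
    if n_items % k != 0:
        raise ValueError("item count must be a multiple of the condition count")
    lists = []
    for list_idx in range(k):
        # Item i appears in condition (i + list_idx) mod k on this list, so
        # every item is seen in every condition exactly once across lists.
        assignment = {
            item: conditions[(item + list_idx) % k] for item in range(n_items)
        }
        lists.append(assignment)
    return lists


lists = latin_square_lists(24, CONDITIONS)
for number, assignment in enumerate(lists, start=1):
    print(number, dict(Counter(assignment.values())))
```

With 24 items and four conditions, each list contains six items per condition, which is the balance the design was meant to achieve.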
Materials selected as stimulus sentences were extensively normed. We conducted two
types of plausibility norming studies, one norming study for grammaticality, and a corpora
study. The first norming study evaluated a set of WH-fillers that had been generated by the
experimenters and a separate group of participants for their plausibility as instruments of
behead and kill verb sentence pairs following the norming procedure outlined in McRae,
Ferretti, and Amyote (1996). Participants were asked to rate, on a 7-point Likert scale, how
likely each of a set of participant- and experimenter-generated NP fillers were as instruments
of active declarative behead and kill verb sentences (7, highly likely; 1, highly unlikely).
Participants rated the likelihood of these fillers for active declarative versions of experimental
sentences in which the object of the preposition was left blank (e.g. The rebel knight beheaded/
killed the traitorous king with __ during the rebellion.). For each pair of kill and behead verb
sentences, the NP filler selected had a mean rating that was above the mean for all of the fillers
that had been generated for a pair. This mean rating was always above the midpoint on the
Likert scale. Moreover, whenever a selected NP’s rating for kill and behead members of a
sentence frame differed, this difference never exceeded 0.55 points on the Likert scale and the
selected NP was rated as more plausible in the kill verb member of a sentence pair for 63% of
the items. Mean plausibility ratings for NP instrument fillers for behead and kill verb sentences
were 5.89 (SE = 0.13) and 5.94 (SE = 0.14), respectively.
Another group of participants rated the selected NPs for their plausibility as direct objects
of kill and behead verbs in active declarative versions of filler-gap sentences like The rebel
knight beheaded/killed the __ during the rebellion on a 7-point Likert scale with the same
anchors as in the previous study. All of the selected NP fillers received low plausibility ratings
as direct objects (behead: M = 1.91 (SE = 0.17), kill: M = 1.97 (SE = 0.15)).
A third norming study examined the grammaticality of behead and kill verb sentences with
NP and PP fillers. This study was conducted to prospectively rule out the possibility that
differences between kill and behead verbs could be attributed to putative violations of island
constraints rather than to differences in the argument structures of behead and kill verbs. Note
that this possibility is only a concern if one subscribes to the view that any constituent that
corresponds to an argument of a verb must be subcategorized for (see footnote 8). Under this assumption, a
concern arises regarding our materials. Our sentences involve WH-movement of putative
arguments and adjuncts whose semantic status could be reflected syntactically as differences
in subcategory frames for behead and kill verbs. Under this view, the extraction of an NP from
a PP adjunct WH-filler (e.g. (12b)) should result in an island constraint violation while
extraction of the same NP from a PP argument WH-filler (e.g. (12a)) should not (Chomsky,
1986; Huang, 1982). Thus, if sentences such as (12b) elicit anomaly effects relative to their PP
controls and sentences such as (12a) do not, this could be attributed to differences in their
respective syntactic grammaticality rather than argument status of NP fillers.
8 We do not subscribe to this view and in fact argue that a participant’s argument and subcategory status are
only weakly correlated, at least for syntactically optional constituents.
The guiding assumption underlying this norming study was that any grammatical violation
should reliably elicit a judgment of ungrammaticality. Many types of extractions that have
been identified as examples of island violation phenomena elicit judgments of ungrammaticality
and do so quite palpably (e.g. extraction from WH-islands as in Who does John
wonder why Lisa likes __? or Complex-NP islands as in What do you believe the claim
that Lisa bought__?). Note that in contrast to these examples, it is difficult to discern much
difference in the acceptability of our experimental sentences with putative argument (12a) and
adjunct (12b) extractions. Of course, the greater unacceptability of extractions from WH- and
Complex-NP islands could be the result of more than one grammatical constraint being
violated. But note that at least intuitively, extraction from a putative adjunct PP like (12b) is far
more acceptable than a subject–verb agreement (SVA) violation (e.g. The key for the
cabinets are on the table), which arguably violates only one constraint.
We confirmed these intuitive judgments by asking participants to rate experimental
sentences like those shown in (12a)–(12d) and sentences with SVA violations and extractions
from WH- and Complex-NP islands like those illustrated above on a 7-point Likert scale in
which the anchors 1 and 7 were described respectively as sentences of English that you would
be either highly unlikely or highly likely to hear or read. The results of this study are
summarized in Table 5. WH-island violations were rated significantly lower than either
Complex-NP or SVA violations (Ps < 0.002). Crucially, grammaticality ratings to sentences
with NP fillers did not differ from each other or from their PP controls. In contrast, NP filler
arguments, which received the lowest rating of our experimental sentences, were rated as
significantly better than SVA violations, the category of clear grammatical violation that
received the highest acceptability ratings (P < 0.0001). Given these results, it is unlikely that
any awkwardness that is sometimes associated with sentences with extractions from adjunct
PPs like (12b) is due to island violations.
Finally, we conducted a corpora study to examine the possibility that any differences we
might observe in the processing of behead and kill verb sentences with NP fillers relative to
their controls could be due to the frequency with which behead and kill verbs co-occur with
"with + instrument" phrases. For each verb, we computed the proportion of "with + instrument"
phrases which occurred with behead and kill verbs in the combined Brown and Wall Street
Journal corpora. There was no difference in the proportion of instrument PPs co-occurring
with behead (M = 0.06, SE = 0.1) and kill (M = 0.08, SE = 0.12) verbs. The results of this
study suggest that any differences between the processing of filler-gap sentences with kill and
behead verbs are unlikely to be due to stored co-occurrence information.
All in all, the extensive norming we performed resulted in sentences which were
equally grammatical and whose fillers were equally plausible across behead and kill verbs.
Furthermore, overall verb frequencies favored kill verbs (M = 186.1) over behead verbs
(M = 26.8) (Kucera & Francis, 1967). Because all sentences were grammatical and
equally plausible, we expect differences in reading times to be subtle and relatively small
compared to those of previous filler-gap experiments whose materials were not as
extensively normed and which compared incommensurate participant categories (say,
patient and adjuncts of argument NPs).
5.2.1.3. Procedure. This study used a region-by-region reading time paradigm. Sentences
were presented across two lines of a video monitor of an IBM PC clone using a moving
window procedure. Each trial began with two rows of dashes that were left-aligned on the
monitor’s screen. The first and second rows of dashes corresponded to all of the non-white-
space characters of the stimulus sentence. Participants pressed the spacebar of a keyboard
to reveal each successive region of text. For all experimental sentences, the first four
critical presentation regions and at least one word of the last presentation region were
presented on the first line of text. Participants were instructed to read each sentence at their
normal rate. Reading times were collected for each region. Before beginning the experiment,
participants read instructions that described the task. They then completed ten practice
trials to familiarize themselves with the task and the response keys.
Table 5
Mean ratings and standard errors (in parentheses) for instrument and non-instrument verb filler-gap sentences
with PP and NP WH-fillers, and sentences with Complex-NP, WH-island, and subject–verb agreement violations
Sentence type                          Mean rating (SE)
Instrument arguments, PP filler        5.7 (0.2)
Instrument arguments, NP filler        5.6 (0.2)
Instrument adjuncts, PP filler         5.9 (0.2)
Instrument adjuncts, NP filler         5.8 (0.2)
Grammatical violations, SVA            2.0 (0.2)
Grammatical violations, C-NP island    1.6 (0.1)
Grammatical violations, WH-island      1.1 (0.1)
5.2.2. Results
Prior to analysis, participants’ answers to comprehension questions were examined.
Accuracy for each participant was at least 85%. Outliers were identified in a two-stage
process following the procedure outlined in Zurif, Swinney, Prather, Solomon, and
Bushell (1993) (see footnote 9). First, after inspecting the distribution of reading times it was determined
that a few scores, namely those that were shorter than 250 ms or exceeded 3000 ms, most
likely reflected off-task behavior (e.g. accidental triggering of a response key or
inattention). The selected values did not sample more from the distributions of slow or fast
readers than from the distributions of average readers. The affected reading times, which
accounted for 1.3% of the data, were removed prior to computing outlier replacement
values. In the remaining data, outliers for the distribution of each participant’s reading
time within a given region were replaced by a boundary value equivalent to the mean plus
or minus 2.5 standard deviations. Replaced values accounted for an additional 1.5% of the
reading times.
9 A number of procedures for dealing with outliers are common in the literature. Some researchers identify a
cut-off value, applied to all participant scores, to exclude extreme scores (e.g. Clifton, 1993; McKoon, Albritton,
& Ratcliff, 1996). A potential problem with this approach is that it is likely to differentially affect data from the
slowest readers for long cut-off values and the fastest readers for short cut-off values. An alternative approach,
adopted by many researchers (e.g. Boland, 1997; Trueswell & Kim, 1998) is to replace extreme values with
boundary values that are based on the mean plus or minus some multiple of the standard deviation of the
distribution of scores for each participant. This procedure also has shortcomings under some conditions. When the
number of observations per participant is small, as it was in this study, this procedure may fail to identify
potentially extreme values in a participant’s distribution of scores because the presence of another extreme score
in the opposite direction or an even more extreme score in the same direction considerably inflates the variance.
Such extreme scores are often not likely to be accurate reflections of reading time. For example, it is highly
unlikely that participants could recognize and integrate into a developing sentence representation a one- or two-
word phrase of average length in a word-by-word or phrase-by-phrase reading task in 250 ms or less. Typically,
such short reading times are due to the accidental early triggering of a response key. Similarly, extremely long
scores are also unlikely to reflect just word recognition and integration processes. In this experiment, to identify
outliers, we adopted a hybrid approach advocated by some researchers (e.g. Zurif et al., 1993).
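The two-stage outlier procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and each call is assumed to receive one participant's reading times for one region.

```python
import statistics


def trim_absolute(rts, low=250.0, high=3000.0):
    """Stage 1: drop reading times (ms) shorter than `low` or longer than `high`."""
    return [rt for rt in rts if low <= rt <= high]


def replace_with_boundary(rts, n_sd=2.5):
    """Stage 2: replace values beyond mean +/- n_sd SDs with the boundary value."""
    if len(rts) < 2:
        return list(rts)  # an SD cannot be estimated from a single value
    mean = statistics.mean(rts)
    sd = statistics.stdev(rts)  # sample SD: an assumption, see lead-in
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return [min(max(rt, lower), upper) for rt in rts]


def clean_cell(rts):
    """Apply both stages to one participant's reading times for one region."""
    return replace_with_boundary(trim_absolute(rts))


# Hypothetical reading times in ms (illustration only).
example = [120.0, 480.0, 510.0, 530.0, 495.0, 3400.0]
print(clean_cell(example))
```

Note that the boundary values are computed only after the absolute cut-offs have been applied, mirroring the order described in the text, so that off-task scores do not inflate the standard deviation.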
Two other details are relevant to the analyses we performed. After data collection was
completed, we realized that four of our instrument verbs were eponymous with potential
fillers (e.g. drill, till, whip, and whisk). Because the reading times for these items could
have biased results in favor of our hypothesis, we excluded them prior to analysis. The
aforementioned error in counterbalancing resulted in unbalanced lists for the analysis by
items. ANOVAs on the means from the resultant unbalanced lists were conducted using
GANOVA 4 (Brecht, Woodward, & Bonett, 1982–1987) which employs computational
procedures based on a cell means ANOVA model. ANOVAs conducted using this model
yield results that are computationally equivalent to a Type III reparameterization
adjustment strategy for unbalanced lists in statistical packages, such as SAS, whose
computational procedures are based on a less than full rank ANOVA model (Woodward,
Bonett, & Brecht, 1990: p. 219 and references therein).
Because of baseline differences in the frequencies of occurrence of kill verbs and
behead verbs (Kucera & Francis, 1967), reading times to kill and behead verb sentences
with NP fillers cannot be compared to each other at the verb region. Instead, we compared
NP filler sentences with kill and behead verbs to their respective PP filler controls.
Moreover, string lengths systematically differed between NP and PP filler sentences in the
crucial direct object (DO) + gap region since experimental sentences included the
preposition with while control sentences did not. We therefore computed residual reading
times to partial out the influence of string length on reading times following procedures
outlined in Ferreira and Clifton (1986) and Trueswell, Tanenhaus, and Garnsey (1994).
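A residual reading time is the difference between an observed reading time and the time predicted from string length. The sketch below fits a least-squares line for one participant and returns the residuals. It assumes that the regression is computed separately for each participant, with region length measured in characters, which the text does not spell out; the function names and example values are hypothetical.

```python
def fit_line(lengths, rts):
    """Ordinary least-squares fit of rt = intercept + slope * length."""
    n = len(lengths)
    if n < 2 or n != len(rts):
        raise ValueError("need at least two paired observations")
    mean_x = sum(lengths) / n
    mean_y = sum(rts) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("all regions have the same length; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_rts(lengths, rts):
    """Observed minus length-predicted reading time for each region."""
    intercept, slope = fit_line(lengths, rts)
    return [y - (intercept + slope * x) for x, y in zip(lengths, rts)]


# Hypothetical data for one participant: region lengths (characters)
# and raw reading times (ms). Illustration only.
lengths = [5, 10, 15, 20]
raw_rts = [400.0, 520.0, 610.0, 760.0]
print([round(value, 1) for value in residual_rts(lengths, raw_rts)])
```

A positive residual means a region was read more slowly than its length alone would predict, which is what allows regions that differ in length (here, with and without the preposition with) to be compared.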
Mean residual reading times for NP and PP behead and kill filler-gap sentences across
subject NP (e.g. did the rebels), verb (e.g. behead/kill), and direct object (±with) regions
are shown in Fig. 4. As one can see, residual reading times to NP filler sentences were
longer than those to PP filler sentences at the verb and direct object positions. Similarly, kill
sentences elicited longer reading times at the direct object than behead sentences. Finally,
whereas NP filler sentences with behead verbs did not differ from their PP filler control at
the direct object region, NP filler sentences with kill verbs elicited longer residual reading
times than control sentences with PP fillers at both the verb and direct object regions. To
provide a point of comparison with other reading time studies, raw reading times for these
regions are shown in Table 6 and Fig. 5.
Residual reading times were submitted to two separate 4 (List) × 2 (Verb type) × 2
(Filler type) analyses of variance with participants and items as random variables for two
regions of interest: the verb and the direct object (+with) region.
5.2.2.1. Verb region. Sentences with NP fillers elicited longer reading times at the verb
than sentences with PP fillers in analyses by participant and items (F1(1, 96) = 7.66,
MSe = 10399, P < 0.01; F2(1, 16) = 7.31, MSe = 1974, P < 0.02). This difference was
consistent across verb types, emerging in both sentences with semantically obligatory
instrument verbs like behead (F1(1, 96) = 3.87, MSe = 9710, P < 0.05;
F2(1, 16) = 2.02, MSe = 3396, P = 0.17), and sentences with semantically optional
instrument verbs like kill (F1(1, 96) = 9.89, MSe = 9228, P < 0.01; F2(1, 16) = 6.98,
MSe = 1104, P < 0.05). No other effects were significant.
5.2.2.2. Direct object region. As predicted, sentences with semantically obligatory
instrument verbs like behead elicited faster reading times than sentences with semantically
optional instrument verbs like kill (F1(1, 96) = 13.09, MSe = 29186, P < 0.01;
F2(1, 16) = 10.91, MSe = 102273, P < 0.01). This effect of Verb type was consistent
across the two types of filler sentences. Reading times were longer for kill verb sentences
with NP fillers (F1(1, 96) = 11.86, MSe = 24900, P < 0.01; F2(1, 16) = 11.17,
MSe = 4828, P < 0.01), and for sentences with PP fillers (F1(1, 96) = 6.59,
MSe = 16580, P < 0.01; F2(1, 16) = 4.04, MSe = 4945, P ≤ 0.06). We also observed
a main effect of Filler type. Overall, sentences with NP fillers elicited longer reading times
than sentences with PP fillers. These differences were significant in the analysis by
participants (F1(1, 96) = 8.28, MSe = 9247, P ≤ 0.005), and trended in the analysis by
items (F2(1, 16) = 3.29, MSe = 4253, P ≤ 0.09). However, the effect of filler type was
not consistent across verb type. Sentences with NP fillers elicited longer reading times
than sentences with PP fillers when their main verb was an optional instrument verb like
kill (F1(1, 96) = 11.86, MSe = 24900, P < 0.01), but not when their main verb was an
obligatory instrument main verb like behead (F1&2 ≤ 0.60).
Fig. 4. Mean residual reading times for behead and kill verbs with NP and PP fillers across three regions.
Fig. 5. Mean raw reading times for behead and kill verbs with NP and PP fillers across three regions.
Table 6
Raw reading times to behead and kill verb sentences with NP and PP fillers across the subject, verb, and direct
object (DO) regions
Verb type   Filler type   Subject   Verb   DO (+ with)
Behead      NP            735       585    842
Behead      PP            763       554    679
Kill        NP            725       607    912
Kill        PP            749       579    729
5.2.3. Discussion
The results of this study provide psychological evidence supporting the obligatoriness
criterion for argument status. Filler-gap sentences with NP fillers were easier to process
when the semantic representation of a main verb included an obligatory instrument
participant than when it did not. More specifically, shorter reading times were found at the
direct object regions of behead verb sentences (which provided lexical evidence for the
integration of the filler as an instrument of the event) than kill sentences. Furthermore,
longer reading times were found at the direct object regions of kill verb sentences with NP
fillers than with PP fillers, which provided syntactic evidence that the filler could not be a
direct object. In contrast, there were no differences in reading times at the direct object
position of behead verb sentences as a function of filler type. This pattern of data is
consistent with the hypothesis that readers used instrument participant information that is
associated with behead but not kill verbs to predict upcoming indirect object gaps for WH
fillers. In addition, these results also confirm that argument status cannot be adjudicated on
the basis of participant category. Constituents expressing instrument participants are
neither uniformly adjuncts nor arguments. Rather, their argument status depends on their
obligatoriness for particular verbs. Although we cannot entirely rule out the possibility that
these results could be attributed to differences in the plausibility of WH-fillers as indirect
objects of behead and kill verbs or differences in the implausibility of fillers as direct
objects, the fact that fillers were selected on the basis of plausibility norms to be equivalent
or even to favor the kill verb sentences suggests that this possibility is remote. Similarly,
these results could have arisen from differences in the frequency with which instrument
PPs co-occur with kill and behead verbs. But we found no such differences in a corpora
study. Finally, one might argue that these differences could be due to differences in the
grammaticality of filler-gap sentences with NP extracted from arguments (behead) and
adjuncts (kill). However, the results of our grammaticality judgment study demonstrate
that readers similar to the participants in this on-line study found NP extractions from both
kill and behead verb sentences as acceptable as extractions of PP fillers in the same
sentence frames. Thus, the most plausible explanation for the observed pattern of results is
that behead verbs provide obligatory instrument participant information for the
interpretation of NP fillers while kill verbs do not.
More generally, the pattern of results we observed receives a natural explanation within
constraint-based models of processing. Recall that when fillers are ultimately associated
with a position other than the direct object, anomaly effects typically do not appear until
after the verb is processed. We predicted and found that anomaly effects would not emerge
at the verb but instead would be found at the next region of analysis. However, we also
observed that sentences with PP fillers were read faster than sentences with NP fillers in
the verb region. This is most likely due to combined effects of readers knowing that the PP
fillers were syntactically impossible as direct objects and the implausibility of PP fillers as
direct objects. A different situation arises with the NP filler sentences. NP fillers were
syntactically possible and only semantically implausible as direct objects. Thus, these two
sources of information conflict with each other, leading to longer reading times; the conflict
is resolved only in the next region. Finally, in the direct object region, an interesting pattern
emerges. Note first that there is no difference in the reading times to NP and PP filler
sentences with behead verbs. In contrast, NP filler sentences elicited significantly longer
reading times than sentences with PP filler controls in kill verb sentences. Moreover, PP
filler sentences with kill verbs required more time to read than either NP or PP filler
sentences with behead verbs. This pattern can best be explained in terms of the quantity
and consistency of the information readers have available to integrate WH-fillers into
sentence representations at this point. For both NP and PP fillers in behead verb sentences,
readers have available not only the syntactic information from the presence of the direct
object that indicates that the filler must be an indirect object, but two sources of semantic
information from the verb: the implausibility of the filler as a direct object, and instrument
participant information that provides positive semantic information regarding the
interpretation of the filler. In contrast, kill verb sentences with PP fillers have
syntactic and implausibility information at the earlier verb region indicating that the filler
cannot be a direct object but neither the verb nor the direct object region provides any
semantic participant information that could further assist readers in interpreting the PP
filler. Finally, when readers encounter kill verb sentences with NP fillers, they receive only
implausibility information from the preceding verb region and must resolve both the
syntactic category and semantic interpretation of the filler when they encounter the direct
object region. The presence of the preposition with disambiguates the syntactic category at
this point but does not provide unambiguous semantic information that would be of
assistance in interpreting the filler. In conclusion, the pattern of reading times across these
two regions can be explained in terms of the kinds of syntactic and semantic evidence
readers have available to them at different points in processing for parsing and interpreting
a sentence with a WH-filler.
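The account above can be restated as a simple tally. The following Python sketch is illustrative bookkeeping only, not a processing model proposed in this paper: the condition labels, the source labels, and the idea of ranking conditions by the number of available information sources are assumptions introduced for exposition. It records which sources the prose describes as available for each condition by the direct object region and groups the conditions accordingly.

```python
# Information sources described in the prose as available for integrating the
# WH-filler by the direct object region. The labels are illustrative
# assumptions, not terms defined in the study.
SOURCES = {
    ("behead", "NP"): {"syntactic", "implausibility", "instrument_role"},
    ("behead", "PP"): {"syntactic", "implausibility", "instrument_role"},
    ("kill", "PP"): {"syntactic", "implausibility"},
    ("kill", "NP"): {"implausibility"},
}


def rank_by_available_information(sources):
    """Group conditions by number of available sources, most informed first."""
    by_count = {}
    for condition, available in sources.items():
        by_count.setdefault(len(available), []).append(condition)
    return [sorted(by_count[n]) for n in sorted(by_count, reverse=True)]


ranking = rank_by_available_information(SOURCES)
for tier, conditions in enumerate(ranking, start=1):
    print(tier, conditions)
```

The tiers this produces follow the ordering of reading times described above for the direct object region: the two behead conditions pattern together, kill verb sentences with PP fillers come next, and kill verb sentences with NP fillers come last.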
6. Conclusion
It is commonly assumed across the language sciences that only some semantic
participant information is lexically encoded. Despite the large number of extant proposals,
no set of necessary and sufficient criteria has yet been proposed as the basis for the
distinction between arguments and adjuncts. In this paper, we have argued that lexical
encoding of participant information reduces to two semantic criteria: (1) whether
participant information is semantically obligatory; and (2) whether participant information
is specific to a verb or to a restricted verb class to which a verb belongs. Those criteria
fall out naturally from a hierarchical, distributed model of lexical knowledge, whether
it is implemented in a multiple inheritance semantic network or in a more parallel,
distributed representational schema. We have shown, through a comprehensive survey of
approximately 4000 English verbs and eight participant roles, that participant roles that
correspond to the traditional notion of argument as well as more controversial roles such as
semantically obligatory instruments and participant locations do display class specificity,
as predicted. The results of a sentence continuation study and an on-line filler-gap study
support the psychological relevance of both semantic obligatoriness and semantic
specificity as conditions for lexical encoding of participant information. Verbs which,
according to the semantic specificity condition, lexically encode a participant location are
more likely to lead to continuations which express that argument than they are to
continuations which express adjuncts. Similarly, verbs which, according to the SOC,
lexically encode an instrument role facilitate the integration of a WH-filler in a filler-gap
dependency, when compared to verbs whose denotations allow, but do not require, an
instrument role. Whether semantic specificity likewise affects the on-line processing of
filler-gap dependencies remains to be explored in future research.
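The two criteria can be pictured as a joint test on a verb-role pairing. The sketch below is a minimal Python illustration under stated assumptions: the class and function names are hypothetical, the two criteria are assumed to apply conjointly, and the three example entries (including the event time role) carry values chosen for illustration rather than taken from the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticipantRole:
    """One verb-role pairing; all field values used below are illustrative."""
    verb: str
    role: str
    obligatory: bool      # every situation the verb describes includes the participant
    class_specific: bool  # role is specific to the verb or to a restricted verb class


def is_lexically_encoded(p: ParticipantRole) -> bool:
    """Assumed conjunction: both criteria must hold for lexical encoding."""
    return p.obligatory and p.class_specific


# Hypothetical entries for illustration only.
examples = [
    ParticipantRole("behead", "instrument", obligatory=True, class_specific=True),
    ParticipantRole("kill", "instrument", obligatory=False, class_specific=True),
    ParticipantRole("kill", "event time", obligatory=True, class_specific=False),
]

results = [is_lexically_encoded(p) for p in examples]
for p, encoded in zip(examples, results):
    print(f"{p.verb:7} {p.role:11} encoded={encoded}")
```

On these assumed values, only the instrument of behead passes both tests; the instrument of kill fails obligatoriness, and a role that holds of every verb fails class specificity.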
Acknowledgements
This research was supported (in part) by research grant number 1 R01 MH60133-01
from the National Institute of Mental Health, National Institutes of Health and by grants
from the Center for Cognitive Science at the University at Buffalo. We gratefully
acknowledge Beaumont Brush, Tom Graves, Cori Grimm, Anne Joergensen, Leana
Longley, Alissa Melinger, and Chris Phipps for their work on the English verbal lexicon
project, Cerenity Dickerson, Lauren DiMaria, Brian Dugan, Nicole Enzinna, Kathy
Conklin, Min Ju, Leana Longley, Cameron Stelmach, and Jeffrey Wescott for assistance in
data collection and analysis, and David Braun for his perceptive comments. All remaining
errors are ours.
Appendix A
A.1. Stimuli for Experiment 1
Participant location sentences (a) and control sentences (b).
1a The collectible doll was advertised
1b The collectible doll was sold
2a The veneer was applied
2b The veneer was prepared
3a The juicy steak was barbecued
3b The juicy steak was turned
4a The new puppy was bathed
4b The new puppy was washed
5a The soup was boiled
5b The soup was stirred
6a The flavored coffee was brewed
6b The flavored coffee was warmed
7a The pitbull was caged
7b The pitbull was tranquilized
8a The microfilm was concealed
8b The microfilm was detected
9a The steer were confined
9b The steer were castrated
10a The days’ profits were deposited
10b The days’ profits were checked
11a The bags of cocaine were discovered
11b The bags of cocaine were reported
12a The new painting was displayed
12b The new painting was straightened
13a The hole was drilled
13b The hole was measured
14a The deposit was entered
14b The deposit was verified
15a The defect was found
15b The defect was noticed
16a The ornaments were hung
16b The ornaments were broken
17a The evidence was hidden
17b The evidence was destroyed
18a The acorns were hoarded
18b The acorns were eaten
19a The cooked pasta was immersed
19b The cooked pasta was seasoned
20a The new memory was inserted
20b The new memory was tested
21a The important passage was located
21b The important passage was discussed
22a The car was parked
22b The car was damaged
23a The heroin was planted
23b The heroin was transported
24a The data points were plotted
24b The data points were examined
25a The reading list was posted
25b The reading list was revised
26a The molten steel was poured
26b The molten steel was cooled
27a The discount price was printed
27b The discount price was calculated
28a The advertisement was published
28b The advertisement was purchased
29a The physics lectures were recorded
29b The physics lectures were collected
30a The turkey was roasted
30b The turkey was sliced
31a The salad dressing was spilled
31b The salad dressing was made
32a The caviar was spread
32b The caviar was tasted
33a The sailor was stationed
33b The sailor was commended
34a The automatic weapons were stored
34b The automatic weapons were arranged
35a The sailing gear was stowed
35b The sailing gear was inspected
36a The defective bolt was unscrewed
36b The defective bolt was seen
A.2. Stimuli for Experiment 2
NP and PP filler sentences with instrument and non-instrument verbs. An asterisk (*)
marks eponymous items removed from analyses.
1 (With) What kind of cloth did the mother bathe/wipe the tiny baby (with) last night?
2 (With) Which sword did the rebels behead/kill the traitor king (with) during the
rebellion?
3 (With) What type of knife did the butcher chop/separate the chicken legs (with)
yesterday afternoon?
4 (With) Which bulldozer did the man dig/fill the hole (with) on Thursday?
5 (With) What type of liquid did the fireman extinguish/contain the street fire (with)
last week?
6 (With) What type of spear did the pygmies jab/attack the angry lion (with) in the
documentary?
7 (With) What type of needle did the doctor lance/drain the patient’s boil (with)
last month?
8 (With) Which tractor did the farmer plow/prepare the wheat field (with) on Tuesday?
9 (With) Which stick did the children poke/tease the poisonous snake (with) this
morning?
10 (With) What kind of toy did the child prod/bother the angry cat (with) cruelly in the
yard?
11 (With) Which needle did the child puncture/burst the big balloon (with) at the fair?
12 (With) Which spoon did the cook scoop/sample the ice cream (with) for the party?
13 (With) Which key did the teenagers scratch/vandalize the girl’s new car (with) last
night?
14 (With) Which spatula did the chef spread/decorate the chocolate icing (with) at the
restaurant?
15 (With) What type of weapon did the knight stab/intimidate the fiery dragon (with) in
the famous story?
16 (With) Which spoon did the cook stir/serve the hot soup (with) this afternoon?
17 (With) What kind of baton did the policeman strike/threaten the violent protester
(with) during the riot?
18 (With) What type of rope did the kidnapper tie/restrain the little kid (with) during the
robbery?
19 (With) Which hoe did the gardener till/work the dry soil (with) last weekend?
20 (With) Which card did the security guard unlock/deactivate the automatic door
(with) this morning?
21 *(With) Which whisk did the maid whip/make the tasty cream (with) last night?
22 *(With) Which leash did the woman whip/train the frisky dog (with) this morning?
23 *(With) Which fork did my sister whisk/eat the scrambled eggs (with) on the
counter?
24 *(With) Which tool did the builder drill/widen the tiny hole (with) during the
renovations?
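Each item above abbreviates four sentence versions. As a reading aid, the following Python sketch expands one template, on the assumption that the parenthesized "with" marks the fronted (PP filler) versus stranded (NP filler) position of the preposition and that the first verb of each slashed pair is the instrument verb. The function name and the condition labels are hypothetical.

```python
import re


def expand_item(template: str) -> dict:
    """Expand one A.2 template into its four sentence versions."""
    # Collapse line breaks and repeated spaces from the printed layout.
    text = " ".join(template.split())
    match = re.fullmatch(r"\(With\) (.+)", text)
    if match is None:
        raise ValueError("template must start with '(With) '")
    body = match.group(1)
    # The verb pair is the only slash-separated token, e.g. "behead/kill".
    pair = re.search(r"(\w+)/(\w+)", body)
    if pair is None:
        raise ValueError("template must contain a verb pair such as 'behead/kill'")
    versions = {}
    labelled_verbs = (("instrument", pair.group(1)), ("non-instrument", pair.group(2)))
    for verb_type, verb in labelled_verbs:
        sentence = body[:pair.start()] + verb + body[pair.end():]
        # NP filler: the preposition is stranded after the direct object.
        versions[(verb_type, "NP")] = sentence.replace(" (with)", " with")
        # PP filler: the preposition is fronted and the stranded slot is dropped.
        fronted = "With " + sentence[0].lower() + sentence[1:]
        versions[(verb_type, "PP")] = fronted.replace(" (with)", "")
    return versions


item = ("(With) Which sword did the rebels behead/kill the traitor king (with) "
        "during the rebellion?")
for condition, sentence in expand_item(item).items():
    print(condition, sentence)
```

The leading item number and any asterisk would need to be stripped before calling the function; that step is omitted here for brevity.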
References
Bienvenue, B. (1997). Comprehension of unexpressed locations. Unpublished Honor’s thesis, University at
Buffalo, Buffalo, NY.
Boland, J. (1997). The relationship between syntactic and semantic processes in sentence comprehension.
Language and Cognitive Processes, 12, 423–484.
Boland, J., & Boehm-Jernigan, H. (1998). Lexical constraints and prepositional phrase attachment. Journal of
Memory and Language, 39, 684–719.
Boland, J., & Tanenhaus, M. K. (1990). Evidence for the immediate use of verb control information in sentence
processing. Journal of Memory and Language, 29, 413–432.
Boland, J. E., Tanenhaus, M. K., Garnsey, S., & Carlson, G. (1989). Verb argument structure in parsing and
interpretation: evidence from WH-questions. Journal of Memory and Language, 34, 774–806.
Brecht, M. L., Woodward, J. A., & Bonett, D. G. (1982–1987). GANOVA 4, textbook version. Redmond, WA:
Microsoft Corporation.
Bresnan, J. (1982). Control and complementation. In J. Bresnan (Ed.), The mental representation of grammatical
relations (pp. 292–390). Cambridge, MA: MIT Press.
Briscoe, T., Copestake, A., & de Paiva, A. (1993). Inheritance, defaults, and the lexicon. Cambridge: Cambridge
University Press.
Carlson, G., & Tanenhaus, M. (1988). Thematic roles and language comprehension. In Thematic relations
(Syntax and semantics, Vol. 21, pp. 263–291). New York: Academic Press.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky, N. (1986). Knowledge of language. New York: Praeger.
Clifton, C. (1993). Thematic roles in sentence processing. Canadian Journal of Experimental Psychology, 47,
222–246.
Clifton, C., Jr., & Frazier, L. (1989). Comprehending sentences with long-distance dependencies. In G. N.
Carlson, & M. K. Tanenhaus (Eds.), Linguistic structure in language processing (pp. 273–317). Dordrecht:
Kluwer.
Collins, A., Quillian, M., & Ross, M. (1970). Facilitating retrieval from semantic memory: the effect of repeating
part of an inference. Acta Psychologica, 33, 304–314.
Coltheart, M. (1981). The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology, 33,
497–505.
Crain, S., & Fodor, J. (1985). How can grammars help parsers? In L. Karttunen, D. Dowty, & A. Zwicky (Eds.),
Natural language processing: psychological, computational, and theoretical perspectives. New York:
Cambridge University Press.
Davis, A., & Koenig, J.-P. (2000). Linking as constraints on word classes in a hierarchical lexicon. Language, 76,
56–91.
Dowty, D. (1982). Grammatical relations and Montague grammar. In P. Jacobson, & G. Pullum (Eds.), The
nature of syntactic representations (pp. 79–130). Dordrecht: Reidel.
Dowty, D. (1989). On the semantic content of the notion of ‘thematic role’. In G. Chierchia, B. Partee, & R. Turner
(Eds.), Properties, types, and meaning (Vol. 2, pp. 69–129). Dordrecht: Kluwer.
Dowty, D. (1991). Thematic proto-roles and argument selection. Language, 67, 547–619.
Ferreira, F., & Clifton, C. (1986). The independence of syntactic processing. Journal of Memory and Language,
25, 348–368.
Foley, W., & Van Valin, R. (1984). Functional syntax and universal grammar. Cambridge: Cambridge
University Press.
Givón, T. (1984). Syntax: a functional-typological introduction. Amsterdam: John Benjamins.
Goldberg, A. (1995). Constructions: a construction grammar approach to argument structure. Chicago, IL:
Chicago University Press.
Huang, J. (1982). Logical relations in Chinese and the theory of grammar. PhD thesis, MIT, Cambridge, MA.
Jackendoff, R. (1990). Semantic structures. Cambridge, MA: MIT Press.
Just, M., & Carpenter, P. (1987). The psychology of reading and language comprehension. Needham Heights,
MA: Allyn & Bacon.
Koenig, J.-P. (1994). Lexical underspecification and the syntax/semantics interface. PhD thesis, University of
California at Berkeley, Berkeley, CA.
Koenig, J.-P. (1999). Lexical relations. Stanford, CA: CSLI.
Koenig, J.-P., & Mauner, G. (1999). A-definites and the semantics of implicit arguments. Journal of Semantics,
16(3), 207–236.
Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI:
Brown University Press.
Langacker, R. (1987). Foundations of cognitive grammar (Vol. 1). Stanford, CA: Stanford University Press.
Levin, B. (1993). English verb classes and alternations. Chicago, IL: Chicago University Press.
Levin, B., & Rappaport-Hovav, M. (1995). Unaccusativity at the syntax-lexical semantics interface. Cambridge,
MA: MIT Press.
Liversedge, S., Pickering, M., Branigan, H., & Van Gompel, R. (1998). Processing arguments and adjuncts in
isolation and in context: the case of by-phrase ambiguities in passives. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 24, 461–475.
Mauner, G., & Koenig, J.-P. (2000). Linguistic vs. conceptual sources of implicit agents in sentence
comprehension. Journal of Memory and Language, 43, 110–134.
Mauner, G., Tanenhaus, M., & Carlson, G. (1995). Implicit arguments in sentence processing. Journal of Memory
and Language, 34, 357–382.
McKoon, G., Albritton, D., & Ratcliff, R. (1996). Sentential context effects on lexical decisions with a cross-
modal instead of an all-visual procedure. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 22, 1494–1497.
McRae, K., Ferretti, T. Y., & Amyote, L. (1996). Thematic roles as verb-specific concepts. Language and
Cognitive Processes (Special Issue: Lexical Representations in Sentence Processing), 12, 137–176.
Miller, P. (1997). Compléments et circonstants: une distinction syntaxique ou sémantique? In J.-C. Souesme
(Ed.), Actes du 37ème Congrès de la SAES (pp. 91–103). Nice: Presses Universitaires de Nice.
Pinker, S. (1989). Learnability and cognition: the acquisition of argument structure. Cambridge, MA: MIT Press.
Pollard, C., & Sag, I. (1987). Information-based syntax and semantics (Vol. 1). Stanford, CA: CSLI.
Pollard, C., & Sag, I. (1994). Head-driven phrase-structure grammar. Chicago, IL: Chicago University Press.
Quillian, M. R. (1968). Semantic memory. In M. Minsky (Ed.), Semantic information processing (pp. 216–270).
Cambridge, MA: MIT Press.
Resnik, P. (1997). Selectional constraints: an information-theoretic model and its computational realization. In
M. Brent (Ed.), Computational approaches to language acquisition (pp. 127–159). Cambridge, MA: MIT
Press.
Resnik, P. (1999). Semantic similarity in a taxonomy: an information-based measure and its application to
problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11, 95–130.
Rumelhart, D., & Todd, P. (1993). Learning and connectionist representations. In D. Meyer, & S. Kornblum
(Eds.), Attention and performance XIV: Synergies in experimental psychology, artificial intelligence, and
cognitive neuroscience (pp. 3–30). Cambridge, MA: MIT Press.
Schütze, C. (1995). PP attachment and argumenthood. In Papers on language processing and
acquisition (Vol. 26, pp. 95–151). Cambridge, MA: MIT Press.
Schütze, C., & Gibson, E. (1999). Argumenthood and English prepositional phrase attachment. Journal of
Memory and Language, 40, 409–431.
Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of
Documentation, 28, 11–21.
Speer, S. R., & Clifton, C., Jr. (1998). Plausibility and argument structure in sentence comprehension. Memory
and Cognition, 26, 965–978.
Stowe, L. A. (1986). Parsing WH-constructions: evidence for on-line gap location. Language and Cognitive
Processes, 1, 227–245.
Stowe, L., Tanenhaus, M. K., & Carlson, G. (1991). Filling gaps on-line: use of lexical and semantic information
in sentence processing. Language and Speech, 34, 319–340.
Tanenhaus, M., Boland, J., Mauner, G., & Carlson, G. (1993). More on combinatory lexical information. In
G. Altmann, & R. Shillcock (Eds.), Cognitive models of speech processing: the second Sperlonga meeting
(pp. 297–319). Hillsdale, NJ: Erlbaum.
Tanenhaus, M., & Carlson, G. (1989). Lexical structure and language comprehension. In W. Marslen-Wilson
(Ed.), Lexical representation and process (pp. 529–561). Cambridge, MA: MIT Press.
Tanenhaus, M. K., Garnsey, S. M., & Boland, J. (1990). Combinatory lexical information and language
comprehension. In G. T. Altmann (Ed.), Cognitive models of speech processing (pp. 383–408). Cambridge,
MA: MIT Press.
Tesnière, L. (1959). Éléments de syntaxe structurale. Paris: Klincksieck.
Traxler, M., & Pickering, M. (1996). Plausibility and the processing of unbounded dependencies: an eye-tracking
study. Journal of Memory and Language, 35, 454–475.
Trueswell, J., & Kim, A. (1998). How to prune a garden path by nipping it in the bud: fast priming of verb
argument structure. Journal of Memory and Language, 39, 102–123.
Trueswell, J., Tanenhaus, M., & Garnsey, S. (1994). Semantic influences on parsing: use of thematic role
information in syntactic ambiguity resolution. Journal of Memory and Language, 32, 285–318.
Van Valin, R. (1993). A synopsis of role and reference grammar. In R. Van Valin (Ed.), Advances in role and
reference grammar (pp. 1–164). Amsterdam: John Benjamins.
Van Valin, R., & LaPolla, R. (1997). Syntax: form, meaning, and function. Cambridge: Cambridge University
Press.
Vater, H. (1978). On the possibility of distinguishing between complements and adjuncts. In Valence, semantic
case, and grammatical relations (pp. 21–45). Amsterdam: John Benjamins.
Wechsler, S. (1995). The semantic basis of argument structure. Stanford, CA: CSLI.
Woodward, J. A., Bonett, D. G., & Brecht, M. L. (1990). Introduction to linear models and experimental designs.
San Diego, CA: Harcourt Brace Jovanovich/Academic Press.
Zurif, E., Swinney, D., Prather, P., Solomon, J., & Bushell, C. (1993). An on-line analysis of syntactic processing
in Broca’s and Wernicke’s aphasia. Brain and Language, 45, 448–464.