Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+13), pages 71–83, Umeå, Sweden, September 4–6, 2017. © 2017 Association for Computational Linguistics

Parsing with Dynamic Continuized CCG

Michael White and Jordan Needle
Department of Linguistics
The Ohio State University
Columbus, OH 43210
[email protected] [email protected]

Simon Charlow
Department of Linguistics
Rutgers University
New Brunswick, NJ 08901
[email protected]

Dylan Bumford
Department of Linguistics
New York University
New York, NY 10003
[email protected]

Abstract

We present an implemented method of parsing with Combinatory Categorial Grammar (CCG) that for the first time derives the exceptional scope behavior of indefinites in a principled and plausibly practical way. The account implements Charlow’s (2014) monadic approach to dynamic semantics, in which indefinites’ exceptional scope taking follows from the way the side effect of introducing a discourse referent survives the process of delimiting the scope of true quantifiers in a continuized grammar. To efficiently parse with this system, we extend Barker and Shan’s (2014) method of parsing with continuized grammars to only invoke monadic lifting and lowering where necessary, and define novel normal form constraints on lifting and lowering to avoid spurious ambiguities. We also integrate Steedman’s (2000) CCG for deriving basic predicate-argument structure and enrich it with a method of lexicalizing scope island constraints. We argue that the resulting system improves upon Steedman’s CCG in terms of theoretical perspicuity and empirical coverage while retaining many of its attractive computational properties.

1 Introduction

A long-standing puzzle in natural language semantics has been how to explain the exceptional scope behavior of indefinites. For example, (1a) has a reading where there’s a specific relative (a steel magnate, say) such that if she dies, the speaker will be rich. By contrast, (1b) has no analogous reading where the universal takes wide scope: this sentence cannot mean that every relative is such that if that particular relative dies, I’ll be rich.¹ If one takes the antecedent of a conditional to be a scope island (as suggested by the < . . . > bracketing), then it’s not surprising that the universal in (1b) is blocked from taking wide scope; what instead requires explanation is how the indefinite in (1a) can exceptionally take scope out of this island.

(1) a. If <a relative of mine dies>, I’ll inherit a fortune. (∃ > if)

    b. If <every relative of mine dies>, I’ll inherit a fortune. (* ∀ > if)

Charlow (2014) has recently shown that the exceptional scope behavior of indefinites can be derived from their role of introducing discourse referents in a dynamic semantics. To do so, he showed that (1) a monadic approach to dynamic semantics can be seamlessly integrated with Barker and Shan’s (2014) approach to scope taking in continuized grammars, and (2) once one does so, the exceptional scope of indefinites follows from the way the side effect of introducing a discourse referent survives the process of delimiting the scope of true quantifiers, such as those expressed with each and every.

To date, computationally implemented approaches to scope taking² have not distinguished indefinites from true quantifiers in a way that accounts for their exceptional scope taking. In Bos’s (2003) implementation of Discourse Representation Theory (Kamp and Reyle, 1993, DRT), for example, scope taking is independent of how indefinites are treated. Although Steedman (2012) has developed an account of indefinites’ exceptional scope taking in a non-standard static semantics for Combinatory Categorial Grammar (Steedman, 2000, CCG), this treatment has not been fully implemented (to our knowledge); moreover, as Barker and Shan point out, Steedman’s theory appears to undergenerate by not allowing true quantifiers to take scope from medial positions.

¹ That is, (1b) cannot mean the same thing as If any relative of mine dies, I’ll inherit a fortune; see Barker and Shan (2014) for a compatible treatment of negative polarity items such as any.

² See e.g. (Copestake et al., 2005; Koller et al., 2003; Gardent and Kallmeyer, 2003; Nesson and Shieber, 2006; Pogodalla and Pompigne, 2012), inter alia.

Barker and Shan offer a brief sketch of how a parser for their continuized grammars can be implemented, including how lifting can be invoked lazily to ensure parsing terminates. In this paper, we show how their approach can be seamlessly combined with Steedman’s CCG and extended to include Charlow’s monadic dynamic semantics, thereby providing the first computational implementation of a system that accounts for the exceptional scope behavior of indefinites in a principled and plausibly practical way. To efficiently parse with this system, we devise rules to only invoke monadic lifting and lowering where necessary, and define novel normal form constraints on lifting and lowering to avoid spurious ambiguities. We also integrate a method of lexicalizing scope island constraints (Barker and Shan, 2006), as Charlow’s account does not provide a practical and empirically satisfactory means of enforcing such constraints. We argue that the resulting system improves upon Steedman’s CCG in terms of theoretical perspicuity (insofar as it builds upon an account of dynamic semantics that is independently necessary) and empirical coverage, in that it allows quantifiers to take scope from medial positions and from some subordinate clauses. At the same time, it also retains many of CCG’s attractive computational properties; in particular, it respects Steedman’s Principle of Adjacency, only combining overtly realized adjacent constituents, thereby making it easy to use with well-studied parsing algorithms. An open source prototype implementation, suitable for testing out grammatical analyses, is available online.³

2 Are Scope Islands Real?

Steedman (2012) observes that although the empirical status of scope islands remains unsettled in the linguistics literature (Farkas and Giannakidou, 1996; Reinhart, 1997; Ruys and Winter, 2011; Syrett and Lidz, 2011; Syrett, 2015), the possible scopings of true quantifiers appear to be much more limited than commonly assumed in computational approaches to scope taking, arguing therefore in favor of a surface-compositional approach that aims to capture all and only the attested readings; in particular, Steedman takes as his working hypothesis that scope inversion should be subject to syntactic island constraints. While we are sympathetic to Steedman’s point of view, we are skeptical of his working hypothesis, as it appears to incorrectly predict that quantifiers should never be able to take scope from subjects of finite complement clauses. Acknowledging that universals sometimes appear to do so, Steedman appeals to Fox & Sauerland’s (1996) illusory scope analysis, where the quantificational force is argued to stem from a main clause generic. However, Farkas and Giannakidou (1996) provide numerous examples in English and Greek of episodic sentences such as (2) where the universal takes extra-wide scope.

³ https://github.com/mwhite14850/dyc3g

(2) Yesterday, a guide made sure that <every tour to the Louvre was fun>. (∀ > ∃)

By contrast, a corpus analysis given in the appendix suggests that conditionals and relative clauses plausibly represent cases where scope islands should be treated as hard constraints. As such, in this paper we adopt the working hypothesis that scope island constraints can be given an accurate lexicalized treatment. Alternatively, one could pursue an approach based solely on soft constraints, where a probabilistic model simply makes scope taking beyond finite clause boundaries very unlikely. Even in this scenario, we contend that the approach to exceptionally scoping indefinites implemented here will greatly simplify the learning task, since the ability of indefinites to take exceptional scope would not need to be learned.

3 Continuized CCG

A continuized grammar is one where the meaning of an expression can be defined as a function on a portion of its surrounding context, or continuation (Barker, 2002; Shan and Barker, 2006; Barker and Shan, 2014). To make it easier to reason about continuized grammars, Barker & Shan devised the “tower” notation illustrated in Figure 1.⁴ For example, everyone has a tower category with NP on the bottom and two Ss on top; reading counter-clockwise from the bottom, this category represents a constituent that acts locally as an NP, takes scope over an S, and returns an S. The semantics is λk.∀y.ky, a function from a continuation k of type e → t to a universally quantified expression of type t. A continuized meaning of this form can be abbreviated by representing the location where the continuation argument applies with [ ] and putting the argument to the continuation on the bottom of the tower, as shown.

⁴ Semantic types are suppressed in this and subsequent figures, except where essential for understanding.

[Figure 1: Continuized CCG Derivations for someone loves everyone. (a) Surface Scope with Explicit Lifting, deriving ∃x.∀y.love(x, y). (b) Inverse Scope with Integrated Lifting, deriving ∀y.∃x.love(x, y).]

In a continuized grammar, all expressions can potentially be given continuized meanings via the Lift (↑) operation. This is illustrated in Figure 1a where the category for loves is lifted, taking on the semantics λk.k(λyx.love(x, y)). Lifting the category for loves allows it to combine with that of everyone using scopal combination. The way in which scopal combination works in the tower notation is shown in Figure 2b (left): on the tower top, the continuized functions g[ ] and h[ ] compose in surface order, yielding g[h[ ]]; meanwhile, recursing on the tower bottom, the expressions a and b combine as they normally would in CCG (using the combinators in Figure 2a), yielding c.⁵ In the example, [ ] and ∀y.[ ] compose to again yield ∀y.[ ] on the tower top, with λyx.love(x, y) applying to y and yielding λx.love(x, y) on the bottom.

As Barker & Shan observe, the explicit lifting step seen in Figure 1a can be integrated with the scopal combination step, as shown in the other recursively defined rules in Figure 2b, thereby avoiding an infinite regress when applying the lifting rule. Figure 1b shows how Lift Left (↑L) and Lift Right (↑R) can be applied in sequence (as part of a single parsing step combining adjacent signs) to create a three-level tower where everyone ends up taking inverse scope over the subject:⁶ first, in applying Lift Left, the entire tower for someone is matched as A, while the bottom of the tower for loves everyone, S\NP, is matched as B, and then the rules are reapplied with A and B as inputs; next, Lift Right is applied, with the bottom of the tower for someone, NP, matched as A, and S\NP again matched as B, and the rules are reapplied once more; this time, the categories can combine directly using Backward Application (<), ending the recursion; as the rules unwind, the three-level tower for someone loves everyone is constructed, with inverse scope semantics, as shown. The final representations are derived by collapsing the towers using the recursively defined Lower (↓) operation in Figure 2c, which repeatedly applies the continuized semantics to the identity continuation λk.k.

⁵ The combinator for combining two scopal terms m and n is λmnk.m(λx.n(λy.k(xy))), assuming forward application on the tower bottom. Formulating the rules recursively allows the base combinator to be factored out while also generalizing to multi-level towers.

⁶ Though everyone is right peripheral in the example, nothing would prevent it from taking inverse scope from medial position, in contrast to Steedman’s (2012) approach.

[Figure 2: Continuized CCG. (a) Base CCG Combinators (not exhaustive): Forward Application (>), Backward Application (<), Forward Composition (>B), Forward Type Raising (>T). (b) Combination with Lifting: Combine, Lift Left (↑L), Lift Right (↑R). (c) Lowering (base and recursive).]

4 Monadic Dynamic Semantics

Charlow’s (2014) dynamic semantics makes use of the State.Set monad (Hutton and Meijer, 1996), which combines the State monad for handling side effects with the Set monad for non-determinism. The State monad pairs ordinary semantic values with a state, which is threaded through computations. The Set monad models non-deterministic choices as sets, facilitating a non-deterministic treatment of indefinites. For example, the dynamic meaning of a linguist swims appears in (3): here, the proposition that x swims, where x is some linguist, is paired with a state that augments the input state s with the discourse referent x.

(3) λs.{〈swim(x), sx〉 | linguist(x)}

More formally, the State.Set monad is defined as in (4). For each type α, the corresponding monadic type Mα is a function from states of type s to sets pairing items of type α with such states. The η function injects values into the monad, simply yielding a singleton set consisting of the input item paired with the input state. The bind operation ⊸ sequences two monadic computations pointwise, feeding each result of m applied to the input state s into π and unioning the results.⁷ Less formally, the ⊸ operation can be thought of as “run m to determine v in π.”

(4) State.Set Monad

    Mα = s → α × s → t
    a^η = λs.{〈a, s〉}
    m_v ⊸ π = λs. ⋃_{〈a,s′〉 ∈ m s} π[a/v] s′
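As a rough illustration of the definitions in (4), the following Python sketch models a monadic value as a function from a state to a set of (value, state) pairs. The names `eta` and `bind`, the representation of states as tuples of discourse referents, and the toy model (`LINGUISTS`, `SWIMMERS`) are all assumptions made for this example; note also that propositions are evaluated to truth values in the model rather than kept as formulas as in (3).

```python
# Hypothetical toy model: who is a linguist and who swims.
LINGUISTS = ["ann", "bo"]
SWIMMERS = {"ann"}

def eta(a):
    """Injection: pair the value a with the unchanged input state."""
    return lambda s: {(a, s)}

def bind(m, pi):
    """Sequencing: run m on s, feed each (value, state) result into pi,
    and union the resulting sets."""
    return lambda s: {out for (a, s1) in m(s) for out in pi(a)(s1)}

def a_linguist(s):
    """Indefinite: one alternative per linguist, each extending the state
    with a new discourse referent."""
    return {(x, s + (x,)) for x in LINGUISTS}

def swim(x):
    """Lifted predicate: returns the truth value of swim(x), leaving the state alone."""
    return eta(x in SWIMMERS)

# "a linguist swims": run a_linguist to determine the argument of swim.
a_linguist_swims = bind(a_linguist, swim)
result = a_linguist_swims(())   # start from the empty state
print(result)
```

Each alternative in the result pairs a truth value with a state recording the linguist that was introduced, so the discourse referent remains available to later computations regardless of how the value is subsequently used.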

Since the only operation on states that we will be concerned with in this paper is adding discourse referents, it suffices to leave the states implicit in the implementation, only explicitly representing the new discourse referents, much as in computational implementations of Discourse Representation Theory (Bos, 2003), where assignments are not explicitly represented in Discourse Representation Structures. Consequently, we will represent (3) as (5), which can be translated to first-order logic in much the same way as with DRT.⁸

⁷ Note that the notation m_v ⊸ π is just syntactic sugar for m ⊸ λv.π, which may be more familiar.

(5) {〈swim(x), x〉 | linguist(x)} ⇝ ∃x.linguist(x) ∧ swim(x)

The definition of State.Set sequencing allows us to define a sequence reduction operation where the value of m is substituted into π for v and the discourse referents and conditions are combined. For example, the representation of a linguist can be sequenced with that of swim and simplified as in (6).

(6) {〈x, x〉 | linguist(x)}_y ⊸ {〈swim(y), ε〉} ⇝ {〈swim(x), x〉 | linguist(x)}
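The sequence reduction in (6) can be mimicked symbolically. The sketch below is an assumption-laden illustration rather than the representation used in the actual implementation: `Comp` is a hypothetical record holding a value, the new discourse referents, and the conditions; terms are nested tuples; and substitution is naive (it does not guard against variable capture and assumes predicate names never coincide with variable names).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Comp:
    """{<value, drefs> | conds}: a monadic value with implicit states."""
    value: object
    drefs: tuple = ()
    conds: tuple = ()

def subst(term, var, val):
    """Replace every occurrence of the variable `var` in a nested-tuple term."""
    if term == var:
        return val
    if isinstance(term, tuple):
        return tuple(subst(t, var, val) for t in term)
    return term

def reduce_seq(m, var, pi):
    """Sequence reduction: substitute m's value for `var` in pi and pool
    the discourse referents and conditions of both computations."""
    return Comp(
        value=subst(pi.value, var, m.value),
        drefs=m.drefs + pi.drefs,
        conds=m.conds + subst(pi.conds, var, m.value),
    )

# {<x, x> | linguist(x)} sequenced (binding y) with {<swim(y), ε>}
a_linguist = Comp(value="x", drefs=("x",), conds=(("linguist", "x"),))
swim_y = Comp(value=("swim", "y"))
reduced = reduce_seq(a_linguist, "y", swim_y)
print(reduced)
```

The reduced value carries swim(x) together with the discourse referent x and the condition linguist(x), matching the right-hand side of (6).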

As in DRT, negation in Charlow’s dynamic semantics is defined in a way that captures discourse referents, making them inaccessible for subsequent reference. Conditionals and universals are defined in terms of negation, thereby explaining their effects on discourse referent accessibility; for representational simplicity, we will instead assume directly defined meanings for conditionals and universals, as in DRT.

5 Dynamic Combinatory Rules

The rules for combining signs in Dynamic Continuized CCG appear in Figures 3 and 4, augmenting those in Figure 2. We first give an overview of these rules and then illustrate with examples.⁹

As Charlow (2014) explains in detail, continuized grammars can be reconceptualized as operating over an underlying monad, where monadic lifting is identified with applying the underlying monad’s sequencing operator (⊸) and monadic lowering with applying the injection function (η). Accordingly, we include the rules for monadic lifting and lowering in Figure 3a-b. The lifting rule takes a category A with monadic value a of type Mα and sequences it with a new continuation, yielding a function λk.a_v ⊸ kv of type (α → Mβ) → Mβ for a tower category with A on the bottom. Monadic lowering is defined recursively, with the two base cases on the left and the two recursive cases on the right. The base cases apply η to the value a on the tower bottom before filling it in for the continuation; the second base case enables lowering to apply to the CCG categories S/NP and S\NP used in relative clauses. The recursive cases on the right again enable multi-level towers to be lowered in one fell swoop, with the second rule enabling towers with tower-result categories on the bottom to be fully lowered.

⁸ Explicitly representing the states could simplify the treatment of discourse referent accessibility; we leave investigating this alternative for future work.

⁹ The side conditions for these rules (preceded by ‘if’) sometimes serve to define the rules recursively, as in the earlier Figure 2, and sometimes serve to specify sub-cases of interest. Rules for anaphora resolution are left for future work.

To implement scope islands, the rules in Figure 3c together with the unary type constructor 〈·〉 enable categories to specify that their arguments must be scope delimited by undergoing a reset (i.e. lower then re-lift) before combination is permitted.¹⁰ Figure 3d enables a double-continuation analysis of determiners by triggering a lowering when two categories combine to yield a category with a lowerable tower result. Finally, the rules in Figure 4 apply when the functor category expects a monadic value on the tower bottom; the rules in Figure 4a use η to coerce the input to the right type, while the ones in Figure 4b invoke lowering to do so.

An example illustrating exceptional scope for an indefinite appears in Figure 5a. Even though the category for if requires the category for its antecedent someone complains to be reset prior to combination, the side effect of discourse referent introduction survives the reset operation, enabling a wide-scope reading of the indefinite, irrespective of whether sequence reduction is carried out immediately, as in Figure 5c. (Figure 8 in the appendix shows how side effects are unaffected by reset in the general case, using the monadic identity and associativity laws.) Figure 5b shows how the narrow scope reading for the indefinite can be derived instead using monadic type–driven lowering. By contrast, Figure 6 shows why the narrow scope reading is the only one available for the universal in if everyone complains, since the reset operation closes off the scope of the universal, as illustrated in detail in Figure 6b. The appendix gives two further examples: Figure 10a illustrates result tower lowering in an inverse linking example (including the possibility of medial scoping, which is not possible with Steedman’s CCG), while Figure 9 shows by contrast how universals are trapped in relative clause scope islands.

¹⁰ The Delimit rules must apply first to ensure that entire towers are reset. This is accomplished using a cut in the Prolog implementation; alternatively, these rules could be defined at the level of signs rather than categories.

[Figure 3: Dynamic Continuized CCG: Monadic Lifting and Lowering. (a) Monadic Lifting. (b) Monadic Lowering (base and recursive). (c) Delimiting Scope with Reset: Delimit Right (DR) and Delimit Left (DL). (d) Result Lowering.]
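To make the contrast between indefinites and universals under reset concrete, here is a minimal Python sketch. It is not the paper's rule system: the conditional is simplified to a static implication over truth values (the paper's conditional takes monadic arguments), the model (`DOMAIN`, `COMPLAINERS`, `VINCENT_QUITS`) is hypothetical, and `reset` is simply monadic lowering followed by monadic lifting as described above.

```python
# Hypothetical toy model: only "a" complains, and Vincent does not quit.
DOMAIN = ["a", "b"]
COMPLAINERS = {"a"}
VINCENT_QUITS = False

def eta(a):
    """Injection into the State.Set monad."""
    return lambda s: {(a, s)}

def bind(m, pi):
    """Sequencing in the State.Set monad."""
    return lambda s: {out for (a, s1) in m(s) for out in pi(a)(s1)}

def lift(m):
    """Monadic lifting: sequence m with a new continuation k."""
    return lambda k: bind(m, k)

def lower(tower):
    """Monadic lowering (base case): fill in eta for the continuation."""
    return tower(eta)

def reset(tower):
    """Scope delimiting: lower, then re-lift."""
    return lift(lower(tower))

def indefinite(s):
    """One alternative per individual, each adding a discourse referent."""
    return {(x, s + (x,)) for x in DOMAIN}

def true_in(m, s):
    """A monadic proposition holds at s if some alternative is true."""
    return any(v for (v, _) in m(s))

def universal(k):
    """Tests the scope for every x and exports no discourse referents."""
    return lambda s: {(all(true_in(k(x), s) for x in DOMAIN), s)}

# Towers for the two antecedents.
someone_complains = lambda k: bind(indefinite, lambda u: k(u in COMPLAINERS))
everyone_complains = lambda k: universal(lambda x: k(x in COMPLAINERS))

# The rest of the conditional as a continuation over the antecedent's value p
# (static implication, a simplifying assumption of this sketch).
rest = lambda p: eta((not p) or VINCENT_QUITS)

indef_plain = someone_complains(rest)(())
indef_reset = reset(someone_complains)(rest)(())
univ_plain = everyone_complains(rest)(())
univ_reset = reset(everyone_complains)(rest)(())
print(indef_reset, univ_plain, univ_reset)
```

In this sketch, resetting the indefinite's tower leaves the outcome unchanged: there is still one alternative per individual, each carrying its discourse referent, which corresponds to the wide-scope reading. For the universal, the unreset tower evaluates the whole conditional under ∀ (false in this model), whereas the reset tower evaluates ∀ within the antecedent only (true in this model), so reset does close off the universal's scope.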

6 Prototype Implementation

Barker & Shan suggest that the rules in Figure 2 can form the basis of a practical parser. While the worst-case complexity of parsing with such rules has yet to be investigated, the way in which towers can grow to arbitrary heights is apt to at least limit the utility of dynamic programming in practice, potentially posing efficiency problems even when using aggressive statistical pruning (Clark and Curran, 2007). However, recent work on parsing with global neural network models has moved away from dynamic programming solutions, as the global models are incompatible with dynamic programming locality requirements. In particular, Lee et al. (2016) have shown that global neural models can be used with A* search to obtain a new state-of-the-art in CCG parsing accuracy while maintaining impressive speed, even though the search space is exponential. As such, given that our approach respects Steedman’s Principle of Adjacency, we suggest that it may be possible to extend CCG statistical parsing methods to the current setting, thereby resolving scope ambiguities the same way as other derivational ambiguities, rather than in a post-process as in earlier computational work on scope taking. While we are aware of no large-scale scope-annotated corpora at present, small-scale corpora do exist that would enable this conjecture to be tested in future work, such as the corpora used in work on CCG semantic parsing (Artzi and Zettlemoyer, 2013, inter alia).

Towards that end, we have implemented a prototype shift-reduce parser in Prolog that uses the unary and binary combination rules defined in Figure 2 together with additional rules defined in Section 5 for implementing Charlow’s monadic dynamic semantics.¹¹ When implementing the parser, we found it paid off to only invoke lowering where necessary, as discussed in Section 5, rather than simply invoking lowering whenever possible. Moreover, in Section 7, we will see how enforcing scope islands keeps tower heights under control, while also providing an opportunity to define normal form constraints that limit spurious ambiguity, another important practical consideration. To test the implementation, we developed an initial test suite of 40 examples of average length 6.7 words, roughly comparable in size and complexity to Baldridge’s (2002) OpenCCG¹² test suite. With the normal form constraints, parsing time was 60ms per item on a laptop, similar to OpenCCG on the same hardware. By contrast, with the normal form constraints turned off, the parsing time increased to 4.6s per item, nearly two orders of magnitude slower.

¹¹ https://github.com/mwhite14850/dyc3g

[Figure 4: Dynamic Continuized CCG: Monadic Arguments. (a) Base CCG Combinators with Monadic Type Coercion (not exhaustive): Forward Application with η, Backward Application with η. (b) Monadic Type–Driven Lowering: Lower Right (↓R), Lower Left (↓L).]
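For readers unfamiliar with shift-reduce parsing over CCG categories, the following Python sketch shows the general idea on the base level only. It is a hypothetical stand-in, not a port of the Prolog prototype: it implements just forward and backward application (no towers, lifting, lowering, or normal form constraints), the `Slash` type and `LEXICON` are assumptions made for the example, and it exhaustively explores shift and reduce moves rather than using any statistical guidance.

```python
from typing import NamedTuple, Union

class Slash(NamedTuple):
    """A functor category: result/arg looks right, result\\arg looks left."""
    result: "Cat"
    direction: str
    arg: "Cat"

Cat = Union[str, Slash]

# Hypothetical lexicon; "loves" has category (S\NP)/NP.
LEXICON = {
    "someone": "NP",
    "everyone": "NP",
    "loves": Slash(Slash("S", "\\", "NP"), "/", "NP"),
}

def reduce_pair(left, right):
    """Try forward (>) and backward (<) application on two adjacent categories."""
    results = []
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        results.append((left.result, ">"))
    if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
        results.append((right.result, "<"))
    return results

def parse(words):
    """Return every rule sequence that reduces `words` to a single S."""
    found = []

    def step(stack, i, rules):
        if i == len(words) and stack == ("S",):
            found.append(rules)
        if len(stack) >= 2:  # reduce the top two (adjacent) stack items
            for cat, rule in reduce_pair(stack[-2], stack[-1]):
                step(stack[:-2] + (cat,), i, rules + (rule,))
        if i < len(words):   # shift the next word's category
            step(stack + (LEXICON[words[i]],), i + 1, rules)

    step((), 0, ())
    return found

derivations = parse(["someone", "loves", "everyone"])
print(derivations)
```

Because reductions only ever combine the top two stack items, the sketch combines only adjacent, overtly realized constituents, in the spirit of the Principle of Adjacency; with application alone, the example sentence has a single derivation.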

7 Normal Form Constraints

A normal form parse is the simplest parse in an equivalence class of parses yielding the same interpretation. Normal form constraints can play an important role in practical CCG parsing by eliminating derivations leading to spurious ambiguities without requiring expensive pairwise equivalence checks on λ-terms (Eisner, 1996; Clark and Curran, 2007; Hockenmaier and Bisk, 2010; Lewis and Steedman, 2014). Continuized CCG can employ existing CCG normal form constraints at the base level. The main additional source of spurious ambiguity is illustrated in Figure 7.¹³ In the figure, the two towers at the upper left are combined via ↑R and ↑L to yield a three-level tower, which potentially allows an operator to subsequently take scope between any scopal elements present in the left and right input signs. However, if this three-level tower is subsequently lowered without any operator taking intermediate scope, the derivation will yield an interpretation that is equivalent to the one yielded by the simpler derivation that just combines the two signs in their surface scope order (i.e., without yielding a three-level tower).¹⁴ As such, the lowering operations triggered by scope islands or sentence boundaries provide an opportunity to recursively detect and eliminate such non–normal form derivations, as follows:

Trigger: If a sign is created using a lower operation, check the input sign for a spurious ↑R, ↑L combination.

Base: A sign constructed via . . . , ↑R, ↑L, . . . has a spurious ↑R, ↑L combination.

Non-Scopal: A sign that is derived from a non-scopal input sign (i.e., one whose category has no tower top) has a spurious ↑R, ↑L combination if the other input sign has a spurious ↑R, ↑L combination. This case is illustrated in Figure 7, where H is such a non-scopal input sign.

Inversion: A sign that is derived by a ↑L, ↑R inversion has a spurious ↑R, ↑L combination if either input sign has a spurious ↑R, ↑L combination.

Recurse Right: A sign that is derived by a C, ↑L has a spurious ↑R, ↑L combination if the right input sign has a spurious ↑R, ↑L combination. Note that ↑L, C can derive intermediate scope for the left input sign.

Recurse Left: A sign that is derived by a ↑R, C has a spurious ↑R, ↑L combination if the left input sign has a spurious ↑R, ↑L combination. Note that C, ↑R can derive intermediate scope for the right input sign.

¹² http://openccg.sourceforge.net/

¹³ Spurious ambiguity can also arise from the inversion of two indefinites; we leave this issue for future work.

¹⁴ As noted in Section 5, it remains for future work to add the lowering rules for multi-level towers that enable Charlow’s treatment of selective exceptional scope; the normal form constraints will need to be augmented accordingly.

[Figure 5: Conditional with Indefinite Example, for if someone complains Vincent quits. (a) Wide Scope Indefinite, deriving {〈complain(x)^η → quit(v)^η, x〉} ⇝ ∃x.(complain(x) → quit(v)). (b) Narrow Scope via Type-Driven Lowering, deriving ({〈complain(x), x〉} → quit(v)^η)^η ⇝ ∀x.(complain(x) → quit(v)). (c) Resetting someone complains: the tower {〈x, x〉}_u ⊸ [ ] over complain(u) is lowered to {〈x, x〉}_u ⊸ {〈complain(u), ε〉} ≡ {〈complain(x), x〉}, then lifted to {〈complain(x), x〉}_p ⊸ [ ] over p.]

[Figure 6: Conditional with Universal Example. (a) Narrow Scope for Universal: derivation of "if everyone complains Vincent quits" with if of category S/〈S〉/〈S〉 and semantics λxy.(x → y)^η; the steps DR,↑L,> and DR,↑R,> followed by lowering (↓) yield ((∀x complain(x)^η)^η → quit(v)^η)^η ⇝ (∀x.complain(x)) → quit(v). (b) Resetting everyone complains: the tower (∀x [ ])^η over complain(x) lowers to (∀x complain(x)^η)^η, which lifts to the tower [ ] over ∀x complain(x)^η.]

[Figure 7: Non–Normal Form Derivation. Schematic derivation over signs A through J with successive steps labeled ↑R,↑L,… (marked ***), ↑R,… and ↓,…; H is the non-scopal input sign.]

These rules have been tested for safety in the reference implementation by ensuring that all six (3!) desired interpretations result from a ditransitive verb combined with three scopal arguments, and all 4! desired interpretations result from a 4-argument verb in combination with four scopal arguments. With the ditransitive verb, all spurious ambiguity is eliminated, reducing 78 derivations in an otherwise unambiguous sentence down to just the six normal form derivations. The rules are not quite complete though, as six spuriously equivalent derivations remain with the 4-argument verb, where 525 derivations are whittled down to 30; safely filtering the remaining six spuriously equivalent derivations would require more complex rules that track the level at which the ↑R, ↑L operations apply in the base case, which may not be worth the added complexity in practice.15
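The arithmetic behind these counts can be checked with a short Python sketch. It assumes, as the text suggests, that the desired interpretations are the n! relative scope orders of n scopal arguments. The argument names and the `reported` table are illustrative; the derivation counts are the ones quoted above, not recomputed by a parser.

```python
from itertools import permutations
from math import factorial


def scope_orders(args):
    """Enumerate every relative scope order of the given scopal arguments."""
    return list(permutations(args))


# Three scopal arguments of a ditransitive verb: 3! = 6 desired readings.
three = scope_orders(["subject", "indirect_object", "direct_object"])
assert len(three) == factorial(3) == 6

# (all derivations, derivations kept by the normal form rules), as reported.
reported = {3: (78, 6), 4: (525, 30)}

leftover = {}
for n, (total, kept) in reported.items():
    desired = factorial(n)
    leftover[n] = kept - desired   # spuriously equivalent derivations remaining
    print(f"{n} arguments: {total} derivations, {kept} kept, "
          f"{desired} desired, {leftover[n]} spurious remaining")
```

With the reported figures, nothing spurious remains for three arguments, and six derivations beyond the 24 desired ones remain for four arguments, matching the counts stated in the text.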

8 Conclusion

We have presented a method of parsing with Dynamic Continuized CCG that for the first time derives the exceptional scope behavior of indefinites in a principled and plausibly practical way. Our approach (i) extends Barker and Shan’s (2014) method of parsing with continuized grammars to only invoke Charlow’s (2014) monadic lifting and lowering where necessary, (ii) integrates Steedman’s (2000) CCG for deriving basic predicate-argument structure and enriches it with a practical method of lexicalizing scope island constraints, and (iii) takes advantage of the resulting scope islands in defining novel normal form constraints for efficient parsing. We have argued that the account (i) improves upon Steedman’s (2012) approach to quantifier scope in terms of theoretical perspicuity by taking advantage of a dynamic semantics for indefinites independently needed for anaphora, and (ii) offers better empirical coverage by allowing quantifiers to take scope from medial positions and some subordinate clauses. At the same time, by respecting the Principle of Adjacency, only combining overtly realized adjacent constituents, our approach is easy to use with well-studied parsing algorithms, as with Steedman’s CCG. Although the normal form constraints are quite effective in small-scale experiments, it remains for future work to verify quantitatively whether these constraints suffice for practical parsing in conjunction with statistical filtering techniques. It also remains for future work to computationally explore the novel analyses made possible by this framework, including order-sensitivity in negative polarity items (Barker and Shan, 2014) and selective exceptional scope for indefinites and focus alternatives (Charlow, 2014). Towards that end, we have made available online an open source prototype implementation suitable for testing out grammatical analyses.

15 Normal form constraints need not be complete to be practically useful, as any remaining ambiguity can be handled by pairwise checks.


Acknowledgments

We thank Carl Pollard, Scott Martin, Mark Steedman, the OSU Clippers and Synners Groups, the Midwest Speech and Language Days 2016 audience and the anonymous reviewers for helpful comments and discussion. This work was supported in part by a Targeted Investment in Excellence Grant from OSU Arts & Sciences and by NSF grant IIS-1319318.

References

Yoav Artzi and Luke Zettlemoyer. 2013. Weakly supervised learning of semantic parsers for mapping instructions to actions. Transactions of the Association for Computational Linguistics 1(1):49–62.

Jason Baldridge. 2002. Lexically Specified Derivational Control in Combinatory Categorial Grammar. Ph.D. thesis, University of Edinburgh.

Chris Barker. 2002. Continuations and the nature of quantification. Natural Language Semantics 10(3):211–242.

Chris Barker and Chung-chieh Shan. 2006. Types as graphs: Continuations in type logical grammar. Journal of Logic, Language and Information 15:331–370.

Chris Barker and Chung-chieh Shan. 2014. Continuations and Natural Language. Oxford Studies in Theoretical Linguistics.

Johan Bos. 2003. Implementing the binding and accommodation theory for anaphora resolution and presupposition projection. Computational Linguistics 29(2):179–210. https://aclweb.org/anthology/J/J03/J03-2002.pdf.

Simon Charlow. 2014. On the semantics of exceptional scope. Ph.D. thesis, New York University.

Stephen Clark and James R. Curran. 2007. Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models. Computational Linguistics 33(4):493–552. https://aclweb.org/anthology/J/J07/.

Ann Copestake, Dan Flickinger, Carl Pollard, and Ivan Sag. 2005. Minimal recursion semantics: An introduction. Research on Language and Computation 3:281–332.

Jason Eisner. 1996. Efficient normal-form parsing for Combinatory Categorial Grammar. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Santa Cruz, California, USA, pages 79–86. https://doi.org/10.3115/981863.981874.

Donka F. Farkas and Anastasia Giannakidou. 1996. How clause-bounded is the scope of universals? In Proceedings of Semantics and Linguistic Theory. Cornell University, volume 6, pages 35–52.

Danny Fox and Uli Sauerland. 1996. Illusive scope of universal quantifiers. In Proceedings of the North Eastern Linguistic Society (NELS). volume 26, pages 71–86.

Claire Gardent and Laura Kallmeyer. 2003. Semantic construction in feature-based TAG. In Proceedings of EACL-03.

Julia Hockenmaier and Yonatan Bisk. 2010. Normal-form parsing for Combinatory Categorial Grammars with generalized composition and type-raising. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010). Coling 2010 Organizing Committee, Beijing, China, pages 465–473. http://www.aclweb.org/anthology/C10-1053.

Graham Hutton and Erik Meijer. 1996. Monadic Parser Combinators. Technical Report NOTTCS-TR-96-4, Department of Computer Science, University of Nottingham.

Hans Kamp and Uwe Reyle. 1993. From Discourse to Logic: An Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and DRT. Kluwer, Dordrecht, The Netherlands.

Alexander Koller, Joachim Niehren, and Stefan Thater. 2003. Bridging the gap between underspecification formalisms: Hole semantics as dominance constraints. In Proceedings of EACL-03.

Richard Larson. 1985. Quantifying into NP. MIT Manuscript.

Kenton Lee, Mike Lewis, and Luke Zettlemoyer. 2016. Global neural CCG parsing with optimality guarantees. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, pages 2366–2376. https://aclweb.org/anthology/D16-1262.

Roger Levy and Galen Andrew. 2006. Tregex and Tsurgeon: tools for querying and manipulating tree data structures. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06). European Language Resources Association (ELRA). http://aclweb.org/anthology/L06-1311.

Mike Lewis and Mark Steedman. 2014. Improved CCG parsing with semi-supervised supertagging. Transactions of the Association of Computational Linguistics 2:327–338. http://aclweb.org/anthology/Q14-1026.

Rebecca Nesson and Stuart M. Shieber. 2006. Simpler TAG semantics through synchronization. In Proceedings of the 11th Conference on Formal Grammar.


Sylvain Pogodalla and Florent Pompigne. 2012. Controlling extraction in abstract categorial grammars. In Formal Grammar. Springer, pages 162–177.

Tanya Reinhart. 1997. Quantifier scope: How labor is divided between QR and choice functions. Linguistics and philosophy 20(4):335–397.

E.G. Ruys and Yoad Winter. 2011. Quantifier scope in formal linguistics. In Handbook of philosophical logic, Springer, pages 159–225.

Chung-chieh Shan and Chris Barker. 2006. Explaining crossover and superiority as left-to-right evaluation. Linguistics and Philosophy 29(1):91–134.

Mark Steedman. 2000. The syntactic process. MIT Press, Cambridge, MA, USA.

Mark Steedman. 2012. Taking Scope: The Natural Semantics of Quantifiers. MIT Press, Cambridge, MA, USA.

Kristen Syrett. 2015. Experimental support for inverse scope readings of finite-clause-embedded antecedent-contained-deletion sentences. Linguistic Inquiry.

Kristen Syrett and Jeffrey Lidz. 2011. Competence, performance, and the locality of quantifier raising: Evidence from 4-year-old children. Linguistic Inquiry 42(2):305–337.


Fact 4.1 (Reset and monadic programs).

$$\left(\frac{m_\nu \multimap [\ ]}{f\,\nu}\right)^{\downarrow\uparrow} = \frac{m_\nu \multimap [\ ]}{f\,\nu}$$

Proof.

$$
\begin{aligned}
\left(\frac{m_\nu \multimap [\ ]}{f\,\nu}\right)^{\downarrow\uparrow}
&= \left(m_\nu \multimap (f\,\nu)^{\eta}\right)^{\uparrow} && \downarrow \\
&= \frac{\left(m_\nu \multimap (f\,\nu)^{\eta}\right)_u \multimap [\ ]}{u} && \uparrow \\
&= \frac{m_\nu \multimap (f\,\nu)^{\eta}{}_u \multimap [\ ]}{u} && \text{Assoc} \\
&= \frac{m_\nu \multimap [\ ]}{f\,\nu} && \text{LeftID}
\end{aligned}
$$

More generally, because ↑ is identified with ⊸, any side effects that survive evaluation are free to take scope post-evaluation: sequencing an evaluated, impure program π with a new, post-evaluation continuation/scope k means that any side effects in π inevitably influence the evaluation of k. Thus, for example, Reset has no effect on e.g. a linguist left, see (4.2). Evaluation brings us to a fully evaluated expression, and Lifting brings us back into a scopal expression. Nothing, however, has changed. The indefiniteness retains scope over its continuation.

Fact 4.2 (Resetting a linguist left).

$$\left(\frac{\mathbf{a.ling}_x \multimap [\ ]}{\mathbf{left}\,x}\right)^{\downarrow\uparrow} = \frac{\mathbf{a.ling}_x \multimap [\ ]}{\mathbf{left}\,x}$$

Similarly for she left. Reset has no effect:

Fact 4.3 (Resetting she left).

$$\left(\frac{\mathbf{pro}_x \multimap [\ ]}{\mathbf{left}\,x}\right)^{\downarrow\uparrow} = \frac{\mathbf{pro}_x \multimap [\ ]}{\mathbf{left}\,x}$$

However, crucially, the situation is quite different for every linguist left, see Fact 4.4. Evaluating the derived meaning for this sentence yields a trivial Mt program which just bottles up the truth condition that every linguist left and lacks either nondeterministic or state-changing effects. Evaluation thus utterly discharges the scope-taking ability of the universal. Even after Lifting back into a tower, the new continuation scopes over the universal: the quantificational force that inhered in the pre-evaluated meaning is relegated to a truth condition on the bottom of the derived tower, and that is where its role ends.

Figure 8: Side Effects Not Affected By Reset (Charlow, 2014, Fact 4.1)

A Supplemental Material

A.1 Exceptional Scope in the Penn Treebank

As noted in Section 2, the empirical status of scope islands remains unsettled, with further corpus-based and experimental work necessary to adequately characterize the distribution of true quantifiers. Nevertheless, a search of the Penn Treebank reveals that if scope islands do not represent hard constraints, then violations are at least very rare. We used Tregex (Levy and Andrew, 2006) to search the Wall Street Journal portion of the Penn Treebank with the pattern

SBAR << /MD|VBD|VBP|VBZ/ << /^every|^each/

and found that only 385 finite subordinate clauses contain (a form of) every or each, including 80 relative clauses and just 9 conditionals, with none showing clear evidence of the universal scoping out of the finite clause. There were, however, a couple of potential counter-examples, such as (7), that appear amenable to an analysis involving functional readings, rather than exceptional scope; these deserve further study.

(7) Tandy said its experience during the shortage didn’t merit the $5 million to $50 million investment_i <U.S. Memories is seeking from each_i investor>.

By contrast, exceptionally scoping indefinites are quite easy to find.
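To make the intent of the search pattern concrete, the following Python sketch approximates it over Penn-style bracketed trees. It is not Tregex and does not use the Tregex API. It assumes that `<<` means "dominates", that a bare label such as SBAR must match exactly, that slash-delimited patterns are unanchored regular expressions over node labels, and that words count as leaf nodes. The helper names and the toy trees are hypothetical.

```python
import re


def parse(text):
    """Parse one bracketed tree into (label, children); leaves are strings.

    Assumes every bracket has a label, so unlabeled treebank root brackets
    would need to be stripped first.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                       # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                       # skip ")"
        return (label, children)

    return node()


def label(n):
    return n if isinstance(n, str) else n[0]


def descendants(n):
    """Yield every node properly dominated by n, including words."""
    if isinstance(n, str):
        return
    for child in n[1]:
        yield child
        yield from descendants(child)


FINITE = re.compile(r"MD|VBD|VBP|VBZ")    # modal or finite verb tags
UNIVERSAL = re.compile(r"^every|^each")   # words starting with every or each


def finite_clauses_with_universal(tree):
    """Return SBAR nodes dominating both a finite verb tag and every/each."""
    hits = []
    for n in [tree, *descendants(tree)]:
        if label(n) != "SBAR":
            continue
        labels = [label(d) for d in descendants(n)]
        if any(FINITE.search(x) for x in labels) and any(
            UNIVERSAL.search(x) for x in labels
        ):
            hits.append(n)
    return hits


conditional = parse(
    "(S (SBAR (IN if) (S (NP (DT every) (NN relative)) (VP (VBZ dies))))"
    " (NP (PRP I)) (VP (MD will) (VP (VB inherit))))"
)
main_clause = parse("(S (NP (DT every) (NN dog)) (VP (VBZ barks)))")
print(len(finite_clauses_with_universal(conditional)))
print(len(finite_clauses_with_universal(main_clause)))
```

Like the original pattern, this only finds candidate clauses; deciding whether the universal actually scopes out of the clause still requires manual inspection.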

A.2 Side Effects and Reset

[Figure 9: Relative Clause Example. Derivation of "senator who everyone likes": who has category N\N/〈S/NP〉 and semantics λqpx.px ∧ qx; DR,↑L,> combines it with everyone likes (category S/NP, tower (∀y [ ])^η over λx.like(y, x)) to give N\N with λpx.px ∧ ∀y like(y, x)^η; ↑L,< then yields N with λx.senator(x) ∧ ∀y like(y, x)^η.]

Figure 8 reproduces Charlow’s (2014) proof that in the general case, side effects in an underlying monad are not affected by reset if the lift and lower operations in the continuized grammar are identified with the monad’s sequencing (⊸) and injection (η) operators.
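Fact 4.1 can also be checked on a toy model. The Python sketch below is a minimal illustration under stated assumptions, not the paper's implementation: a monadic program is modeled as a function from a state (a tuple of discourse referents) to a set of value and state pairs, `lift` is sequencing, and `lower` applies a tower to η. The domain, the `someone` and `everyone` meanings, and the sample continuation `negate` are hypothetical, and `everyone` is a simplified static universal rather than Charlow's definition.

```python
def eta(x):
    """Injection (η): return x and leave the state unchanged."""
    return lambda s: frozenset({(x, s)})


def bind(m, f):
    """Sequencing (⊸): run m, feed each value to f, pool all outcomes."""
    return lambda s: frozenset(out for (x, s1) in m(s) for out in f(x)(s1))


def tower(m, f):
    """The tower with m ⊸ [ ] on top and f ν below: λk. m ⊸ λν. k (f ν)."""
    return lambda k: bind(m, lambda v: k(f(v)))


def lower(t):
    """↓: evaluate a tower by applying it to the trivial continuation η."""
    return t(eta)


def lift(m):
    """↑: identified with sequencing, λk. m ⊸ k."""
    return lambda k: bind(m, k)


def reset(t):
    """Reset: lower, then lift back into a tower."""
    return lift(lower(t))


PEOPLE = ("ann", "bo")
LEFT = {"ann"}


def left(x):
    return x in LEFT


def someone(s):
    """Indefinite: pick any x nondeterministically and record it in the state."""
    return frozenset((x, s + (x,)) for x in PEOPLE)


def everyone(k):
    """Simplified universal: a bare truth condition with no lasting effect."""
    return lambda s: frozenset(
        {(all(any(v for (v, _) in k(x)(s)) for x in PEOPLE), s)}
    )


someone_left = tower(someone, left)
everyone_left = lambda k: everyone(lambda x: k(left(x)))


def negate(p):
    """A sample post-evaluation continuation."""
    return eta(not p)


# The indefinite is unchanged by reset; the universal is not.
print(reset(someone_left)(negate)(()) == someone_left(negate)(()))
print(reset(everyone_left)(negate)(()) == everyone_left(negate)(()))
```

In this model the indefinite still scopes over the new continuation after reset, while the reset universal ends up inside it: with one of two people leaving, the unreset universal yields "everyone did not leave" (false) and the reset one yields "not everyone left" (true).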

A.3 Relative Clauses and Inverse Linking

Figure 9 gives an example of a relative clause scope island. The category for the relative pronoun requires its clausal argument to be delimited, triggering a reset via the Delimit Right (DR) rule, which closes off the semantic scope for everyone. Not shown is the derivation of the base category S/NP for everyone likes, which can be derived using standard CCG rules on the bottom without invoking empty string elements. The lowering rule for incomplete clauses is required in order for this base category to be lowerable.

By contrast, Figure 10 shows how inverse scope goes through for the nominal PP in every state, since the preposition category does not require its argument to be delimited. One-fell-swoop result lowering implements Larson’s (1985) constraint barring interleaved inverse scope out of NPs while preserving the ability of the universal to bind subsequent pronouns. Although Steedman’s (2012) account handles examples such as the one in Figure 10, where the inversely linked PP is right peripheral, his treatment, unlike the present one, cannot handle examples such as few voters_i [in every state] who_i supported Trump participated in the protests, where the inversely linked PP is in medial position. (Note that the relative clause here must be interpreted restrictively, and thus is not tenable as an appositive, contra Steedman’s suggested analysis of related examples.)


[Figure 10: Inverse Linking Example. (a) Wide Scope for Universal: derivation of "a voter in every state protests", in which the tower (∀y state(y)^η [ ])^η for every state is carried up through in (↑L,>), voter (↑L,<), a (↓,↑L,↑R,>) and protests (↑R,<); final lowering (↓) yields (∀y state(y)^η {〈protest(x), x〉 | voter(x) ∧ in(x, y)})^η ⇝ ∀y.state(y) → ∃x.voter(x) ∧ in(x, y) ∧ protest(x). (b) Lowering Result Tower: the result tower for "a voter in every state" lowers to the NP meaning λk.(∀y state(y)^η {〈x, x〉 | (voter(x) ∧ in(x, y))^η}_u ⊸ k u)^η.]

Figure 10: Inverse Linking Example
