
Determiners, Probability, Intermediate Quantification

Corina Strößner ([email protected]), University of Konstanz

Abstract. The paper suggests a modal predicate logic which employs determiners from generalized quantifier theory. The determiners yield quantifiers when they are applied to predicates and modal operators when they are applied to sentences. We will pay special attention to intermediate quantification, i.e. operations which are less than universal but stronger than particular, and to its connection to probabilistic inferences. These inferences, read as “therefore, probably”, are defined by means of Carnap’s symmetrical probability measures. The given probabilistic inference relates intermediate quantification to singular statements: though “Mostly A” does not logically entail “A”, it is rational to conclude that A is probably the case.

Keywords: Generalized quantifier theory, Determiners, Logical probability, Adverbs of quantification, Probabilistic logic

Introduction

When Ray Reiter established default logic in (Reiter, 1980) he stated: “A good deal of what we know about a world is ‘almost always’ true, with a few exceptions. Such facts usually assume the form ‘Most P’s are Q’s’ or ‘Most P’s have property Q’” (Reiter, 1980). The default rules he introduced, like

BIRD(x) : M FLY(x)
------------------
FLY(x)

(read: If x is a bird and it is consistent to assume that x flies, then conclude that x flies), have little to do with the semantics of “most” or quantities at all.¹ However, there is clearly an intuitive connection between natural language statements about majorities and single case expectations which finds expression in Reiter’s remarks.

Forecasting on the basis of information about the portion of the subjects which have a certain property is a core business of statistics, which is itself closely related to probability theory. Statistical models usually aim at very precise predictions and require exact data. The information which is given by the natural language sentence “Most S are P”, however, is not very precise but still seems to be strong enough to justify an expectation, namely that some x which is S is P. The semantics of “most” is studied in the theory of generalized quantifiers, which has provided much insight into the formal semantics of natural language determiners. The relation of some determiners to probabilities and presumptions, however, has remained unexplored. The present paper suggests a logical system which connects results from the study of generalized quantifiers with probabilistic considerations in order to model presumptions which resemble default inferences in certain respects.

The first section recapitulates some basic results from generalized quantifier theory, motivates the relation of determiners to probabilities and sketches those parts of Carnap’s inductive logic which are needed for the present purpose. In the second section the definitions of a modal predicate logic with the determiner MOST and of a probabilistic inference are given and some basic results are stated. The third section compares the system defined here with research on both generalized quantifiers and probabilistic reasoning, and gives a short conclusion.

1. Determiners and Probability

(Barwise and Cooper, 1981) pointed out that it is syntactically more appropriate to analyse “Every man is mortal” as “For every man it holds that he is mortal” than as “For everything it holds that if it is a man then it is mortal”. Determiners in natural language are usually binary. In this paper we will focus on “all”, “no” and proportional determiners, i.e. “most” or “at least n%”. It is well known that these determiners fulfil the following constraints:

− ISOM (isomorphism): The truth value of “DET S are P” is equal in isomorphic structures, namely structures which are similar with respect to the cardinalities of all sets, including unions, intersections etc.²

− EXT (extension): The truth value of “DET S are P” is equal in different domains.³

− CONSERV (conservativity): The truth values of “DET S are P” and “DET S are S and P” are equal.⁴

Every determiner that meets these constraints can be semantically characterized by two cardinalities: the number of subjects which are P and the number of subjects which are not P. Therefore, the semantics of such determiners is describable by m = |S ∩ P| and k = |S − P|.

− “ALL S are P” is true iff k = 0.

− “SOME S are P” is true iff m > 0.


− “MOST S are P” is true iff m > k.⁵

− “AT LEAST n% S are P” is true iff (100 − n) · m ≥ n · k.

With the exception of ALL, the determiners have existential import. Consequently, the apparently strongest, universal statement does not imply any of the other quantitative statements. Therefore, an alternative determiner ALLE, which resembles ALL but has existential import, may be useful (all of these truth conditions are sketched in code below):

− “ALLE S are P” is true iff k = 0 and m ≠ 0.
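Since all of these truth conditions depend only on the two cardinalities m and k, they can be written down directly as Boolean tests. The following Python fragment is merely an illustrative sketch of the conditions listed above (the function names are our own and not part of the formal system introduced later):

    # Hedged sketch: the determiner truth conditions above, expressed as tests on
    # the cardinalities m = |S ∩ P| and k = |S − P|.

    def all_(m, k):           # "ALL S are P"
        return k == 0

    def alle(m, k):           # "ALLE S are P": ALL with existential import
        return k == 0 and m != 0

    def some(m, k):           # "SOME S are P"
        return m > 0

    def most(m, k):           # "MOST S are P"
        return m > k

    def at_least_percent(n):  # "AT LEAST n% S are P"
        return lambda m, k: (100 - n) * m >= n * k

    if __name__ == "__main__":
        # 7 of 10 S are P, i.e. m = 7 and k = 3
        print(most(7, 3))                   # True
        print(at_least_percent(80)(7, 3))   # False: 70% is less than 80%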

As observed by van Benthem in (van Benthem, 1984), all determiners which can be described by m and k have a geometric representation for finite models in a so-called tree of numbers or number triangle.⁶

Every part of the tree of numbers relates to one pair of numbers. The first number represents m and the second k. Figure 1 shows this pattern.

|S| = 0:                     0,0
|S| = 1:                  0,1   1,0
|S| = 2:               0,2   1,1   2,0
|S| = 3:            0,3   1,2   2,1   3,0
|S| = 4:         0,4   1,3   2,2   3,1   4,0
...

Figure 1. Tree of numbers

When a determiner is represented in the tree of numbers, a plus is used to point out that the corresponding pair will give the value true. The representations of ALLE and MOST in a tree of numbers are given in figure 2 and figure 3.


|S| = 0:                  -
|S| = 1:               -     +
|S| = 2:            -     -     +
|S| = 3:         -     -     -     +
|S| = 4:      -     -     -     -     +
...

Figure 2. Representation of ALLE in the tree of numbers

|S| = 0:                  -
|S| = 1:               -     +
|S| = 2:            -     -     +
|S| = 3:         -     -     +     +
|S| = 4:      -     -     -     +     +
...

Figure 3. Representation of MOST in the tree of numbers
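The figures above can be generated mechanically: for each row |S| = m + k one walks through the pairs (m, k) from left to right and marks the pairs at which the determiner holds. The following Python lines are a small sketch of our own that reproduces the patterns of figures 2 and 3 from the truth conditions given earlier:

    # Hedged sketch: print the tree-of-numbers pattern of a determiner, given as
    # a predicate over the pair (m, k) with m = |S ∩ P| and k = |S − P|.

    def tree(det, rows=5):
        for size in range(rows):
            # row |S| = size lists the pairs (m, k) with m + k = size,
            # ordered as in figure 1: (0, size), (1, size - 1), ..., (size, 0)
            marks = ["+" if det(m, size - m) else "-" for m in range(size + 1)]
            print(" ".join(marks).center(2 * rows))

    alle = lambda m, k: k == 0 and m != 0
    most = lambda m, k: m > k

    tree(alle)   # a plus only at the right edge, except in the top row
    tree(most)   # a plus wherever m > k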

Determiners are related to singular sentences. The relation is very clear in the case of the classical logical determiners:

− ALL S are P and a is S. Therefore, a is P.

− NO S is P and a is S. Therefore, a is not P.

− SOME S are P and a is S. Therefore, a might be P.

These inferences deal with truth and falsehood only. The last pattern is not monotonic, but one could say that “a might be P” is just a way to state that “SOME S are P and a is S. Therefore, a is not P” is not valid.

Not only logical determiners are connected to singular statements. Intermediate quantification is intuitively connected to inferences to singular statements. “Most mice have a tail. Pinky is a mouse. Therefore, Pinky probably has a tail” can be accepted as a reasonable argument. Unlike the classical “All men are mortal. Socrates is a man. Therefore Socrates is mortal”, the inference is not logically valid in the classical sense but rather based on probabilistic considerations. This is indicated by “probably” in the conclusion, where “probably” is used, as Carnap would put it, as a prescientific, classificatory expression.⁷

Intermediate quantifiers are connected to singular statements as well. This connection is clearly not classical but probabilistic:

− MOST S are P and a is S. Therefore, a is probably P.

− AT LEAST n% of S are P and a is S. Therefore, the probability that a is P is at least n/100.

The relation of determiners to probability can be illustrated by means of the tree of numbers and Carnap’s results on direct inductive inferences. Every pair of m and k represents a statistical distribution in Carnap’s sense. A singular sentence, on the other hand, represents an individual distribution for just one individual of the domain.

The formal explication of direct inductive inference presupposes regular and symmetrical measure functions. The idea of symmetry is the ignorance of differences on the level of individuals or, as Carnap puts it, “the requirement of nondiscrimination among individuals” ((Carnap, 1962), p. 485). If all we know about a and b is that both are S, then every quantitative statement should have the same probabilistic consequences for a and b. This criterion is the probabilistic analogue of ISOM for determiners. Actually, Carnap himself speaks of isomorphic state descriptions Z in his characterization of symmetrical measure functions:

Our intention is to characterize the symmetrical m-functions as those which treat all individuals on a par. Now, two isomorphic Z differ only in their references to different individuals [. . . ]; exactly the same properties and relations which the one attributes to a, b, c, etc., the other attributes to, say, d, b, a, etc. Thus, to treat all individuals on a par amounts to treating isomorphic Z on a par. ((Carnap, 1962), p. 485)

Carnap’s symmetrical m-function for sentences is defined as the sum of the values of the symmetrical measure functions for all state descriptions in which the sentence is included (i.e. true). The symmetrical confirmation c is based on the symmetrical m-functions for sentences. That means that c(h/e), the confirmation of h by e, is m(h & e)/m(e). The confirmation remains undefined if m(e) = 0.

For the confirmation of a direct inductive inference it is supposed that M_1, M_2, ..., M_p are logically possible, mutually exclusive and jointly exhaustive attributes.⁸ The evidence e is a statistical distribution which gives for every attribute M_i the number n_i of the altogether n individuals which have this attribute. A hypothesis h states the attribute of every one of the s individuals which have been sampled, where s_i is the number of individuals which are M_i. With these formal foundations Carnap proves the following equation for symmetrical confirmations c:

− c(h, e) = (n − s)! / [(n_1 − s_1)! (n_2 − s_2)! ... (n_p − s_p)!] × [n_1! n_2! ... n_p!] / n! ⁹

We are now looking at the special case where h is a singular sentence stating that one individual from the population has a certain property: s = s_1 = 1. Writing m for n_1, this results in a simplified equation:

− c(h, e) = (n − 1)! / [(m − 1)! (n − m)!] × [m! (n − m)!] / n! = [(n − 1)! m!] / [(m − 1)! n!] = m/n

The result is highly plausible. It states that the logical probability of a single prediction equals the given frequency in the population. For every ISOM, EXT and CONSERV DET, the symmetrical confirmation yields m/(m + k) for “a is P” as hypothesis and “DET S are P & a is S” as evidence, once the pair (m, k) is fixed. Therefore the tree of numbers is a tree of confirmation as well. The pattern is given in figure 4.

|S| = 0:                (undefined)
|S| = 1:                0      1
|S| = 2:             0     1/2     1
|S| = 3:          0     1/3    2/3     1
|S| = 4:       0     1/4    1/2    3/4     1
...

Figure 4. Tree of numbers with probabilities
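The simplified equation and the values in figure 4 can be checked mechanically. The following Python fragment is a sketch of our own (not part of Carnap’s apparatus or of the system defined below): it computes the general direct-inference value and confirms that the singular case s = s_1 = 1 reduces to m/n:

    from fractions import Fraction
    from math import factorial, prod

    def direct_confirmation(ns, ss):
        # Evidence: the population counts ns = (n_1, ..., n_p) of the attributes;
        # hypothesis: a sample whose attribute counts are ss = (s_1, ..., s_p).
        n, s = sum(ns), sum(ss)
        left = Fraction(factorial(n - s),
                        prod(factorial(ni - si) for ni, si in zip(ns, ss)))
        right = Fraction(prod(factorial(ni) for ni in ns), factorial(n))
        return left * right

    # Singular case: 7 of 10 individuals are M_1 and the hypothesis says that
    # one sampled individual is M_1; the printed value is 7/10, i.e. m/n.
    print(direct_confirmation((7, 3), (1, 0)))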

If the representation of a determiner in the tree of numbers is given, the determiner can be connected to a potential probability which measures the reliability of predictions and, by that means, the logical power of the determiner. Examples are given in figure 5.

ALL:            p = 1
NO:             p = 0
SOME:           p > 0
MOST:           p > 1/2
AT LEAST n%:    p ≥ n/100

Figure 5. Determiners and probability

Not all determiners which are representable in the tree of numbers have a relevant connection to probability. Determiners like AT LEAST n or EXACTLY n with n > 0 can be defined as m ≥ n or m = n, respectively. But since we know nothing about k, this information is probabilistically not more relevant than m ≠ 0.

2. Determiner Logic

In this section PDL, a modal determiner logic with logical and probabilistic entailment, is proposed. We start with a plain modal logic and show how determiners are applied to define quantifiers and modal operators. Finally, an additional probabilistic inference is defined.

2.1. Modal Predicate Logic

2.1.1. Syntax
PDL uses the syntax of a modal predicate logic. For the time being, quantifiers and modal operators are omitted.

Definition 1. Preliminary syntactical rules of PDL

1. a, b and c are individual constants. If t is an individual constant then t′ is an individual constant. x, y and z are individual variables. If t is an individual variable then t′ is an individual variable. Nothing else is an individual variable or constant.

2. S¹, P¹ and Q¹ are unary predicate letters. If αⁿ is an n-ary predicate letter then αⁿ⁺¹ is an (n + 1)-ary predicate letter. If αⁿ is an n-ary predicate letter then α′ⁿ is an n-ary predicate letter. Nothing else is a predicate letter.

3. If αⁿ is an n-ary predicate letter and t₁, t₂, ..., tₙ are individual constants or variables then αⁿt₁t₂...tₙ is an atomic formula. p, q and r are propositional parameters. If φ is a propositional parameter then φ′ is a propositional parameter. Every atomic formula and every propositional parameter is a well formed formula. If φ and ψ are well formed formulas then ¬φ, φ ∧ ψ, φ ∨ ψ and φ → ψ are well formed formulas. Nothing else is a well formed formula.

2.1.2. Model
The model we suggest hereafter is rather simple. The domain is constant. All worlds are accessible to each other. In terms of (Priest, 2008) the model corresponds to CS5, an S5 modal predicate logic with a constant domain. The possible worlds should be understood as different cases or situations in a general sense. One could easily suppose a more sophisticated modal basis and a more restricted interpretation, e.g. as a tense logic. However, for the main purpose of introducing determiners in a modal framework, a simple and general model is convenient.

Definition 2. Model of PDL
A PDL model M is a tuple 〈W, D, I〉 where W is a non-empty set of possible worlds, D is a non-empty domain of discourse and I is an interpretation such that:

1. I(t) ∈ D if t is an individual constant. For every d ∈ D there is a constant c_d such that I(c_d) = d.

2. If φ is a propositional parameter then I_w(φ) ∈ {1, 0} for every w ∈ W.

3. If αⁿ is an n-ary predicate letter then I_w(αⁿ) ⊆ Dⁿ for every w ∈ W, where Dⁿ is the set of all n-ary tuples with members from D.

2.1.3. Semantics
The preliminary semantic constraints without quantification and modal operators are classical. (An illustrative sketch of the model and of these clauses follows Definition 3.)

Definition 3. Preliminary semantic constraints of PDL

1. If φ is a propositional parameter: M, w |= φ iff I_w(φ) = 1.

2. If αⁿt₁t₂...tₙ is an atomic formula: M, w |= αⁿt₁t₂...tₙ iff 〈I(t₁), I(t₂), ..., I(tₙ)〉 ∈ I_w(αⁿ).

3. M, w |= φ ∧ ψ iff M, w |= φ and M, w |= ψ.

4. M, w |= ¬φ iff M, w ⊭ φ.

5. M, w |= φ ∨ ψ iff M, w |= φ or M, w |= ψ.

6. M, w |= φ → ψ iff M, w ⊭ φ or M, w |= ψ.
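For concreteness, the model of Definition 2 and the classical clauses of Definition 3 can be mimicked in a few lines of Python. The encoding below (worlds and individuals as strings, formulas as nested tuples) is entirely our own choice and only meant as an illustrative sketch; quantifiers and modal operators are still missing at this stage:

    # Hedged sketch of a PDL model and the preliminary semantic clauses.

    class Model:
        def __init__(self, worlds, domain, val_prop, val_pred):
            self.W = worlds              # set of possible worlds
            self.D = domain              # constant domain of discourse
            self.val_prop = val_prop     # (parameter, world) -> True/False
            self.val_pred = val_pred     # (predicate, world) -> set of tuples over D

    def holds(model, w, phi):
        op = phi[0]
        if op == "param":                # propositional parameter
            return model.val_prop[(phi[1], w)]
        if op == "atom":                 # atomic formula alpha t1 ... tn
            return tuple(phi[2]) in model.val_pred[(phi[1], w)]
        if op == "not":
            return not holds(model, w, phi[1])
        if op == "and":
            return holds(model, w, phi[1]) and holds(model, w, phi[2])
        if op == "or":
            return holds(model, w, phi[1]) or holds(model, w, phi[2])
        if op == "imp":
            return (not holds(model, w, phi[1])) or holds(model, w, phi[2])
        raise ValueError(op)

    M = Model(
        worlds={"w1", "w2"}, domain={"a", "b"},
        val_prop={("p", "w1"): True, ("p", "w2"): False},
        val_pred={("S", "w1"): {("a",)}, ("S", "w2"): {("a",), ("b",)}},
    )
    print(holds(M, "w1", ("and", ("param", "p"), ("atom", "S", ["a"]))))   # True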

2.2. Determiners, quantifiers, modal operators

2.2.1. Reference and the domain of determiners
In PDL, determiners are treated as functions. They need an appropriate domain. This domain will be the reference of expressions. If [[MEN]] is the reference (Frege’s Bedeutung) of the term “men”, then “MOST men” denotes the set of all sets such that they contain most elements of [[MEN]]. But what about “mostly if it rains”? Here the determiner is combined with a sentence. Frege’s Bedeutung of a propositional sentence is its truth value, but this is not suitable for the application of determiners. Instead, “mostly if p” is roughly the same as “in most cases in which p is the case”. This expression denotes worlds or cases in which it rains. Our concept of reference is, therefore, extensional for predicates and intensional for sentences.

Definition 4. Reference
The reference of a unary predicate α in model M and world w is its interpretation: [[α]]^{M,w} = I_w(α). The reference of a closed sentence φ, i.e. a well formed formula with no free variable, in model M is the set of worlds in which it is true: [[φ]]^M = {w : M, w |= φ}.

2.2.2. Determiners
Determiners are defined as functions which map each set onto a subset of the powerset of that set.

Definition 5. Determiners
A determiner DET is a function which yields for a set A the set B ⊆ POW(A) such that:

− ALL(A): M ∈ B iff M ∩ A = A.

− ALLE(A): M ∈ B iff M ∩ A = A and M ≠ ∅.

− SOME(A): M ∈ B iff M ≠ ∅.

− MOST(A): M ∈ B iff |M ∩ A| > |A − M|.

The logic PDL could easily be extended by additional intermediate determiners, which are generally defined in the following way (the functions of Definition 5 and this clause are sketched in code below):

− AT LEAST n%: M ∈ B iff |M ∩ A| · (100 − n) ≥ |A − M| · n.
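For finite sets, Definition 5 and the AT LEAST n% clause can be spelled out directly: the determiner returns the family of accepted subsets of A. Again, the Python lines below are only a sketch under our own encoding (frozensets for the subsets M):

    from itertools import combinations

    def powerset(A):
        A = list(A)
        return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

    # Each determiner maps a finite set A to the accepted subsets B ⊆ POW(A).
    def ALL(A):   return [M for M in powerset(A) if M & set(A) == set(A)]
    def ALLE(A):  return [M for M in powerset(A) if M & set(A) == set(A) and M]
    def SOME(A):  return [M for M in powerset(A) if M]
    def MOST(A):  return [M for M in powerset(A) if len(M & set(A)) > len(set(A) - M)]

    def AT_LEAST_PERCENT(n):
        return lambda A: [M for M in powerset(A)
                          if len(M & set(A)) * (100 - n) >= len(set(A) - M) * n]

    A = {1, 2, 3}
    print([set(M) for M in MOST(A)])   # all subsets containing more than half of A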

2.2.3. Quantifiers and Modal Operators
Depending on the syntactical use, determiners can range over two different kinds of sets. The determiner takes a set of individuals if it applies to a predicate, or a set of worlds if it applies to a sentence. In the first case the output is a quantifier. In the second case it gives a modal operator. Though the construction of these operations differs from the common quantifiers and modal operators, which are usually primitive, they resemble them in their syntactical and semantic function.


Definition 6. Quantifiers and modal operators
If DET is a determiner and α is a unary predicate then DET_α is a quantifier. M, w |= DET_α χ φ iff there is at least one M ∈ DET([[α]]^{M,w}) such that for every d ∈ M: M, w |= [c_d/χ]φ, where [c/χ]φ is the result of substituting every occurrence of χ in φ by c.

If DET is a determiner and φ is a closed sentence then DET_φ is a modal operator. M, w |= DET_φ ψ iff there is at least one M ∈ DET([[φ]]^M) such that for every v ∈ M: M, v |= ψ.

There is just one clause each for quantifiers and for modal operators. The semantic constraints for the different kinds of quantification are given by the definition of the determiners. (Both clauses are illustrated in the sketch below.)
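Continuing the sketch from above, the two clauses of Definition 6 can be imitated for finite models: one asks whether some subset accepted by the determiner consists entirely of objects (individuals or worlds) that satisfy the embedded formula. The helper below abstracts satisfaction as a Python predicate; names and encodings are again our own illustration, not a fixed implementation:

    from itertools import combinations

    def subsets(A):
        A = list(A)
        return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

    def most(A):
        # MOST(A): the accepted subsets M with |M ∩ A| > |A − M|
        return [M for M in subsets(A) if len(M & set(A)) > len(set(A) - M)]

    def det_operator(det, reference, satisfies):
        # M, w |= DET_alpha x phi  (or  M, w |= DET_phi psi): there is at least one
        # accepted M such that every element of M satisfies the embedded formula.
        return any(all(satisfies(d) for d in M) for M in det(reference))

    # Quantifier reading: "MOST birds fly" in a toy extension where 3 of 4 birds fly.
    birds, flies = {"b1", "b2", "b3", "b4"}, {"b1", "b2", "b3"}
    print(det_operator(most, birds, lambda d: d in flies))          # True

    # Modal reading: "mostly if p, q" with [[p]] = {w1, w2, w3} and q true in w1, w2.
    p_worlds, q_worlds = {"w1", "w2", "w3"}, {"w1", "w2"}
    print(det_operator(most, p_worlds, lambda v: v in q_worlds))    # True, since 2 > 1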

2.2.4. Lambda Abstraction
In natural language one can express propositions by building complex subject terms, e.g. in “Most children who grow up without siblings tend to be self-centered”. So far, this sentence cannot be formalized in PDL. In order to allow a formal representation of complex predicates, lambda abstraction is added.

Definition 7. Lambda abstraction
If φ is a formula and χ is a variable then λχ(φ) is used like a unary predicate. Its meaning is the set of individuals which fulfill φ: d ∈ [[λχφ]]^{M,w} iff M, w |= [c_d/χ]φ.

2.2.5. Classical Operators
The classical quantifiers and modal operators can be given as special cases in which the subject is the domain and the set of possible worlds, respectively. This formal abbreviation resembles the relation of the natural language determiners “every” and “some” to “every-thing” and “some-thing”.

THEOREM 1.

∀ can be defined as ALL_{λx(p ∨ ¬p)} and ∃ can be defined as SOME_{λx(p ∨ ¬p)}.¹⁰

Proof 1. M, w |=_{CS5} ∀χφ iff for every d ∈ D: M, w |= [c_d/χ]φ, where [c/χ]φ is the result of substituting every occurrence of χ in φ by c. In PDL it holds that M, w |= ALL_α χ φ iff there is at least one M ∈ ALL([[α]]^{M,w}) such that for every d ∈ M: M, w |= [c_d/χ]φ. Now, [[λχ(p ∨ ¬p)]]^{M,w} = D and ALL(D) = {D}, since D is the only set M such that M ∩ D = D. Therefore, it holds in PDL that M, w |= ALL_{λχ(p ∨ ¬p)} χ φ iff for every d ∈ D: M, w |= [c_d/χ]φ, which resembles the semantics of ∀ in CS5 (and PL).

A proof for ∃ is omitted since it can be defined by ∀ anyway.

THEOREM 2.

□ can be defined as ALL_{p ∨ ¬p} and ♦ can be defined as SOME_{p ∨ ¬p}.

Proof 2. M, w |=_{CS5} □φ iff M, v |=_{CS5} φ for every v ∈ W. In PDL it holds that M, w |= ALL_{p ∨ ¬p} ψ iff there is at least one M ∈ ALL([[p ∨ ¬p]]^M) such that for every v ∈ M: M, v |= ψ. Now, [[p ∨ ¬p]]^M = W and ALL(W) = {W}, because W is the only set M such that M ∩ W = W. Therefore, in PDL it holds that M, w |= ALL_{p ∨ ¬p} ψ iff for every v ∈ W: M, v |= ψ, which resembles the semantics of □ in CS5 (and S5).

A proof for ♦ is omitted since it can be defined by □ anyway.

2.2.6. Relation to CS5
Everything which is expressible in CS5 with propositional parameters, and in its sublogics, e.g. PL and S5, can be expressed in PDL with the common quantifiers and modal operators as abbreviations. These logics are sublogics of PDL and of every expanded PDL system.

THEOREM 3. If φ₁, φ₂, ..., φₙ |=_{CS5} ψ then φ₁, φ₂, ..., φₙ |=_{PDL} ψ.

Obviously, the same does not hold vice versa. PDL and every enriched determiner logic have greater expressive power than CS5.

2.3. Entailment and Probability

2.3.1. Measure Functions and Confirmation
The connection between determiners and singular statements via symmetrical confirmation was pointed out in the first section of this paper. Usually, probability measures in logic are applied to possible worlds.¹¹ In modal determiner logic possible worlds were introduced, but these worlds are used as real alternative cases to formalize expressions like “in most cases”. Additionally, another kind of “world” is needed to represent logical possibilities or state descriptions. A logical possibility is a combination of an actual world and real but not necessarily actual cases.

Definition 8. Logical possibilities
E_M is the set of logical possibilities with respect to model M: E_M contains every pair 〈e, E〉 with E ∈ POW(W) and e ∈ E.


At first, the probabilistic measure is applied to the logical possibilities. Definition 9 ensures that every measure function is symmetrical. In particular, it must be guaranteed that logical possibilities which do not differ in cardinalities are measured equally high.

Definition 9. Measure function for possibilities
A measure m yields for every logical possibility 〈e, E〉 ∈ E_M a value such that:

− m(〈e, E〉) > 0

− If E_M = {〈e₁, E₁〉, ..., 〈eₙ, Eₙ〉} then m(〈e₁, E₁〉) + ... + m(〈eₙ, Eₙ〉) = 1

− m(〈e_i, E_A〉) = m(〈e_j, E_B〉) if it holds for every formula φ that [[λxφ]]^{M,e_i} = [[λxφ]]^{M,e_j} and for every world e_a ∈ E_A there is exactly one world e_b ∈ E_B such that [[λxφ]]^{M,e_a} = [[λxφ]]^{M,e_b}

The sentence measure function is defined in the common way as the sum of the measures of the possibilities in which the sentence is true.

Definition 10. Sentence measure
For every measure there is a sentence measure m. The value of m_M(φ) is the sum of the measures of all logical possibilities 〈e, E〉 ∈ E_M such that M_{E/W}, e |= φ, where M_{E/W} = 〈E, D, I〉 iff M = 〈W, D, I〉 and E ⊆ W.

Finally, the confirmation can be given in the same way as Carnap does.

Definition 11. Confirmation
For every sentence measure m there is a confirmation c: c(φ/ψ) = m(φ ∧ ψ)/m(ψ) if m(ψ) ≠ 0. Otherwise c(φ/ψ) is undefined.

2.3.2. Inference
The confirmation can be used to define an inference which is founded on probabilistic considerations. In natural language it reads as “... therefore, probably ...”. Intuitively, “A; therefore, probably B” is only valid if A makes B at least more probable than non-B. That is the case if the confirmation of B by A is higher than 1/2.¹² However, this condition is problematic if A is contradictory. In that case the logical inference from A to B is valid (ex contradictione sequitur quodlibet) but the apparently weaker probabilistic inference would be invalid, since no confirmation is defined. To avoid this consequence the probabilistic inference should be valid if the confirmation is undefined as well.


Definition 12. Probabilistic inference
φ₁, φ₂, ..., φₙ |=prob ψ iff c(ψ/φ₁ ∧ φ₂ ∧ ... ∧ φₙ) is undefined or c(ψ/φ₁ ∧ φ₂ ∧ ... ∧ φₙ) > 1/2 for every possible measure m on which c is defined, in every model M and its logical possibilities E_M.

The given formal approach to “therefore, probably” is not very demanding. The minimal confirmation is just more than a half. This fits the boundary line of the determiner MOST.

In systems with stronger determiners, like AT LEAST 95%, one could also introduce stronger probabilistic inferences in the following general way:

− φ₁, φ₂, ..., φₙ |=prob≥n ψ iff c(ψ/φ₁ ∧ φ₂ ∧ ... ∧ φₙ) is undefined or c(ψ/φ₁ ∧ φ₂ ∧ ... ∧ φₙ) ≥ n, with n > 1/2, for every possible measure m on which c is defined, in every model M and its logical possibilities E_M.

2.3.3. Results
It is easy to see that every inference which is logically valid is also probabilistically valid.

THEOREM 4. If φ₁, φ₂, ..., φₙ |= ψ then φ₁, φ₂, ..., φₙ |=prob ψ.

For logical inferences the confirmation is always 1 or not defined. Such inferences remain irrefutable when further premises are added. Purely probabilistic inferences, i.e. inferences which are not logically valid, on the other hand, are defeasible. The following list gives some important examples of purely probabilistic inferences in PDL (the prediction pattern is also illustrated numerically in the sketch after the list):

− Prediction

  • MOST_p q, p |=prob q

  • MOST_P x(Qx), Pa |=prob Qa

− Relevance of the smallest reference class

  • MOST_p ¬r, MOST_q r, p, q ⊭prob r
    MOST_p ¬r, MOST_q r, ALL_q p, p, q |=prob r
    MOST_p ¬r, MOST_{p ∧ q} r, p, q |=prob r

  • MOST_P x(¬Rx), MOST_Q x(Rx), Pa, Qa ⊭prob Ra
    MOST_P x(¬Rx), MOST_Q x(Rx), ALL_Q x(Px), Pa, Qa |=prob Ra
    MOST_P x(¬Rx), MOST_{λx(Px ∧ Qx)} x(Rx), Pa, Qa |=prob Ra
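The prediction pattern can also be checked numerically on small models, as announced above. The sketch below is an illustration under simplifying assumptions of our own: it restricts attention to propositional parameters, fixes one four-world model, and uses the uniform distribution over the logical possibilities 〈e, E〉, which is one admissible symmetrical measure in the sense of Definition 9. It then computes the confirmation of q by MOST_p q ∧ p according to Definitions 10 and 11. Since Definition 12 quantifies over all models and all symmetrical measures, such a computation illustrates the pattern but does not prove it:

    from fractions import Fraction
    from itertools import combinations

    # One fixed model: four worlds with valuations for the parameters p and q.
    worlds = ["w1", "w2", "w3", "w4"]
    val = {"w1": {"p": True,  "q": True},  "w2": {"p": True,  "q": True},
           "w3": {"p": True,  "q": False}, "w4": {"p": False, "q": False}}

    def most_op(param, operand, e, E):
        # MOST_param operand, evaluated at e with E as the set of real cases:
        # |[[param]] ∩ [[operand]]| > |[[param]] − [[operand]]| within E.
        ref = {w for w in E if val[w][param]}
        sat = {w for w in ref if operand(w, E)}
        return len(sat) > len(ref - sat)

    def evidence(e, E):    # MOST_p q and p, at e in the restricted model
        return most_op("p", lambda w, _E: val[w]["q"], e, E) and val[e]["p"]

    def hypothesis(e, E):  # q at the actual world e
        return val[e]["q"]

    # Logical possibilities <e, E> (Definition 8) and the uniform measure over them.
    possibilities = [(e, frozenset(E)) for r in range(1, len(worlds) + 1)
                     for E in combinations(worlds, r) for e in E]
    measure = Fraction(1, len(possibilities))

    def sentence_measure(phi):             # Definition 10
        return sum((measure for (e, E) in possibilities if phi(e, E)), Fraction(0))

    m_e = sentence_measure(evidence)
    m_he = sentence_measure(lambda e, E: evidence(e, E) and hypothesis(e, E))
    print("c(q / MOST_p q & p) =", m_he / m_e)   # 6/7 here, which is greater than 1/2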


3. Comparisons and Conclusion

Finally, we will compare determiner logic with other approaches which are in some respects alike or deal with the same questions. Firstly, we compare the ideas of PDL with the analysis of quantificational adverbs by David Lewis and in the theory of generalized quantifiers. Secondly, we will look at approaches to probabilistic reasoning.

3.1. Adverbs of quantification

The close connection drawn between quantity and modal expressions is an eye-catching feature of PDL. The idea that terms like “mostly” or “always” should be analyzed by means of quantification was proposed by David Lewis in Adverbs of Quantification. He gives the following classification of quantificational adverbs:

(1) Always, invariably, universally, without exception
(2) Sometimes, occasionally, [once]
(3) Never
(4) Usually, mostly, generally, almost always, with few exceptions, [ordinarily], [normally]
(5) Often, frequently, commonly
(6) Seldom, infrequently, rarely, almost never ((Lewis, 1998), p. 5).

David Lewis investigates the nature and the domain of adverbial quantifiers. After considering that they quantify over times or events, without satisfactory results, Lewis finally concludes that this kind of quantification refers to different cases. According to Lewis, quantifying adverbs can be analyzed as unselective quantifiers. While common quantifiers bind particular variables in a formula, the unselective quantifiers refer to each unbound element in their scope. In a preliminary summary Lewis states:

Our adverbs are quantifiers over cases; a case may be regarded as the ’tuple of its participants; and the participants are values of the variables that occur free in the open sentence modified by the adverb ((Lewis, 1998), p. 10).

Consequently, Lewis gives the following truth conditions for particular and universal quantificational adverbs:

∀φ is true iff φ is true under every admissible assignment of values to all variables free in φ;
∃φ is true iff φ is true under some admissible assignment of values to all variables free in φ ((Lewis, 1998), p. 9f.).

Moreover, Lewis states that sometimes a time coordinate or an additional event-variable must be introduced (cf. (Lewis, 1998), p. 11).


In a further section, Adverbs of Quantification deals with restriction by if-clauses as well. In principle there are two ways to analyze sentences with if-clauses and adverbs of quantification. The three-part construction is [ADVERB, IF] + [ANTECEDENT] + [CONSEQUENT] and the two-part construction is [ADVERB] + [CONDITIONAL]. Usually, if we say “mostly if A then B”, we mean the three-part construction. Is there a possible reduction to a two-part construction? The answer is generally negative:

Sentence (39) [the two-part construction] is true iff the conditional If ψ, φ is true in all, some, none, most, many, or few of the admissible cases - that is, of the cases that satisfy any permanent restrictions, disregarding the temporary restrictions imposed by the if-clause. But is there any way to interpret the conditional If ψ, φ that makes (39) equivalent to (38) [the three-part construction] for all six groups of our adverbs? No; if the adverb is always we get the proper equivalence by interpreting it as the truth-functional conditional ψ ⊃ φ, whereas if the adverb is sometimes or never, that does not work, and we must instead interpret it as the conjunction φ & ψ. In the remaining cases, there is no natural interpretation that works ((Lewis, 1998), p. 14).

This result can be described and explained in PDL, where DET_φ ψ is a three-part construction and DET_{p ∨ ¬p}(φ → ψ) or DET_{p ∨ ¬p}(φ ∧ ψ) are two-part constructions. Only in the case of □ (all) and ♦ (some) does the three-part construction have an equivalent two-part construction. In the case of MOST and of any determiner of the form AT LEAST n% this is not the case.

Though some main ideas are common to our modal determiner logic and Adverbs of Quantification, the formal procedure in PDL is very different from Lewis’ suggestions. For Lewis the quantifying adverb is a deviant classical quantifier which is not selective. We understand quantifiers for individuals and for cases as two different results of applying a determiner. We agree with Lewis that terms like “mostly” quantify over cases in the broadest sense. However, the way cases are specified differs a lot. For Lewis a case is a ’tuple of its participants, where sometimes an additional time or event parameter is introduced. It remains vague under which conditions an event-variable has to be added. Determiner logic is rather clear on the formal side: in PDL cases are represented by possible worlds, but it is not further specified what a case is. We take “case” as a primitive term which includes events, points in time or combinations of involved participants.

The basic approach of PDL differs insofar as restriction is the normal case. David Lewis, on the other hand, treats restriction, though not definable by unrestricted structures, as an extension of his basic adverbs of quantification. In PDL unrestricted structures are a special instance of restricted structures, namely restriction by a logical truth.

The idea of David Lewis is revived in the theory of generalized quantifiers. Peters and Westerstahl state that “in certain cases of adverbial quantification, ordinary natural language quantifiers are sometimes applied to tuples of individuals rather than to individuals” ((Peters and Westerstahl, 2006), p. 352). They give the following examples:

a. Men are usually taller than women.
b. Men seldom make passes at girls who wear glasses. (Dorothy Parker)
c. People are usually grateful to firemen who rescue them ((Peters and Westerstahl, 2006), p. 352).

In order to give a formal analysis of such sentences, the usual quantifiers of type 〈1, 1〉 are not sufficient. Not only one individual is bound by each of the two arguments but tuples of individuals. A more complex quantifier is needed. This is given by so-called resumption. The resumption Res^k(Q) of a type 〈1, 1〉 quantifier Q is a quantifier of type 〈k, k〉, so that k variables can be bound simultaneously (see (Peters and Westerstahl, 2006), p. 352).

For the given examples the resumption interpretation seems to be more appropriate than a modal reading. It is hardly obvious how these sentences are related to events, points in time or situations. However, there is no harm in formalizing the sentences in a modal approach with each world as a possible combination of relevant individuals. Peters and Westerstahl state that “if the sentences [. . . ] are taken to quantify over events or situations, these are bound to contain the very elements (men, women, girls with glasses, etc.) of their tuples that the resumptive quantifier would quantify over. So, applying e.g. most to events is in fact rather similar to applying it to ordered tuples” ((Peters and Westerstahl, 2006), p. 353). Although a modal interpretation of quantifying adverbs is not needed in these examples, it is not harmful. On the other hand there are sentences with adverbs of quantification where a reference to events or time is necessary. PDL provides the modal context and does not depend on further optional variables. It covers a wide range of utterances of quantificational adverbs.

3.2. Probabilistic Entailment

There are numerous theories which relate logical inference to probability. In PDL probability is a tool to define a non-monotonic relation of entailment. The language itself remains two-valued. No probabilities are ascribed to the premises of probabilistic inferences. This is the main difference in comparison to other approaches to probabilistic entailment.

Ernest Adams’ probabilistic approach exposes an important feature of entailment with respect to probabilities, stating that “it should be impossible for the premises of an inference to be probable while its conclusion is improbable” ((Adams, 1975), p. 1). The main idea is that sufficiently probable premises imply only conclusions which are highly probable as well. Adams’ definition of probabilistic entailment is the following:

Let L be a factual language, let A be a formula of its conditional extension [formulas of L including conditionals without logically false antecedent], and let X be a set of such formulas. Then X probabilistically entails A (abbr. ‘X p-entails A’) if for all ε > 0 there exists δ > 0 such that for all probability-assignments p for L which are proper for X and A, if p(B) ≥ 1 − δ for all B in X, then p(A) ≥ 1 − ε. ((Adams, 1975), p. 57)

This definition differs from the probabilistic inference in PDL in many respects. Firstly, PDL does not deal with uncertain premises. Secondly, the main idea of probabilistic inference in PDL is not the preservation of certainty. The point of our probabilistic inferences is in fact that certainty is lost from the premises to the conclusion, though the evidence which is given by the premises still makes it reasonable not to be indifferent about the conclusion. Another difference is that conditionals in PDL are truth functional and not defined by conditional probability.

“Probabilistic inference” is also a common term in Bayesian networks.¹³ Such networks use graphoids for “making relevance relationships explicit” ((Pearl, 1989), p. 12). Nodes in a network are connected if and only if there are dependencies. Information about variables whose nodes are not connected to the node of the variable which we consider can be ignored. This makes it possible to work with conditional probabilities without considering all accessible evidence. The computation of a posterior probability in a network is often called probabilistic inference (cf. (Cooper, 1990), p. 396). The difference from the probabilistic inference of PDL is obvious: mainly, we do not calculate probabilities but the validity of an inference. Probabilities are solely used in an intermediate step.

Bayesian networks address a problem which is very crucial to probabilistic logics, the problem of relevance. How does PDL deal with this problem? In general, the only relevant information is that which is part of the set of premises. Indeed, it is correct to conclude probabilistically that Tux can fly given that Tux is a penguin, all penguins are birds and most birds fly. Only if we add the information that most penguins don’t fly as a premise does the inference become invalid. This takes into account the general intuition that the validity of an inference does not depend on the background knowledge of an agent but solely on the information provided by the premises of the considered inference. However, there is a kind of irrelevance, since many pieces of information which can be added as premises of the PDL probabilistic inference will not influence validity.

3.3. Concluding remarks

PDL connects ideas from generalized quantifier theory and probability theory in a unifying framework. It differs from these theories in some ways. Only a small part of the determiners which are investigated in generalized quantifier theory, namely the proportional determiners, is relevant to us. The treatment of quantifying adverbs is crucially different from the resumption technique which is common in generalized quantifier theory. Probability theory gives the background for the definition of the probabilistic inference, but PDL is not a probabilistic logic in the common sense. PDL, though a logic of statistical reasoning, resembles in some respects non-monotonic logics. The specifics of this relation remain open to further investigation.

Notes

1 This is pointed out by Reiter himself, e.g. in (Reiter, 1987).

2 For details on ISOM and isomorphic structures cf. (Peters and Westerstahl, 2006), p. 98f.

3 For details cf. (Peters and Westerstahl, 2006), p. 105. Sometimes extension is called domain independence, e.g. by (Keenan, 2002).

4 For a definition cf. (Peters and Westerstahl, 2006), p. 138.

5 This is the formal analysis of the natural language term “most” which is commonly given in the textbooks. One might object that this borderline seems artificial. Though the ratio of male to female newborns is 105 to 100, one would hardly claim that most newborns are male. But this line, i.e. more than a half, is the logical minimal requirement, and it captures some significant properties of most, for example contrariety. “MOST S are P” and “MOST S are not P” are obviously not compatible. Both statements are wrong if exactly half of the S are P. In a (countably) infinite universe both sentences are wrong if infinitely many S are P and infinitely many S are not P.

6 The term “tree of numbers” is used by (van Benthem, 1984). “Number triangle” is used to refer to the same construct, e.g. by (Peters and Westerstahl, 2006).

7 Carnap distinguishes between classificatory (a is warm), comparative (a is warmer than b) and quantitative (a has a temperature of n degrees) usage of terms. According to Carnap, terms tend to be used rather quantitatively as science progresses. For Carnap’s remarks on classificatory, comparative and quantitative concepts, see (Carnap, 1962), p. 8ff. Carnap is mainly interested in the quantitative concept of probability. In studying reasoning which is based on less precise terms like the natural language determiner “most”, we need to pay special attention to the classificatory notion of probability.

8 In Carnap’s terms the attributes form a division. See (Carnap, 1962), p. 495.

9 (Carnap, 1962), p. 495.

10 We could also use ALLE in the definition. Since empty subjects are not relevant in the expression, the result would be the same.

11 Cf. e.g. (van Benthem, 2003).

12 Carnap’s investigation is focused on a quantitative term of confirmation, i.e. a determined value between 0 and 1. In order to define the probabilistic inference we stick to a qualitative notion defined by comparison.

13 Details on Bayesian networks are presented in (Pearl, 1989).

References

Adams, E. W.: 1975, The Logic of Conditionals. An Application of Probability to Deductive Logic. D. Reidel Publishing Company, Dordrecht, Boston, London.

Barwise, J. and R. Cooper: 1981, ‘Generalized quantifiers and natural language’. Linguistics and Philosophy 4, 159–219.

Carnap, R.: 1962, Logical Foundations of Probability. The University of Chicago Press, 2nd edition.

Cooper, G. F.: 1990, ‘The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks’. Artificial Intelligence 42, 393–405.

Frege, G.: 2008, Funktion, Begriff, Bedeutung. Vandenhoeck & Ruprecht, Goettingen, 2nd revised edition.

Keenan, E.: 2002, ‘Some Properties of Natural Language Quantifiers: Generalized Quantifier Theory’. Linguistics and Philosophy 25, 627–654. 10.1023/A:1020803514176.

Lewis, D.: 1998, ‘Adverbs of quantification’. In: Papers in Philosophical Logic. Cambridge University Press, pp. 5–20.

Pearl, J.: 1989, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo.

Peters, S. and D. Westerstahl: 2006, Quantifiers in Language and Logic. Clarendon Press, Oxford.

Priest, G.: 2008, An Introduction to Non-Classical Logic. Cambridge University Press.

Reiter, R.: 1980, ‘A logic for default reasoning’. Artificial Intelligence 13, 81–132.

Reiter, R.: 1987, ‘Nonmonotonic Reasoning’. Annual Review of Computer Science 2, 147–186.

van Benthem, J.: 1984, ‘Questions on Quantifiers’. The Journal of Symbolic Logic 49(2), 443–466.

van Benthem, J.: 2003, ‘Conditional Probability Meets Update Logic’. Journal of Logic, Language and Information 12, 409–421.

Veltman, F.: 1996, ‘Defaults in update semantics’. Journal of Philosophical Logic 25, 221–261.

Westerstahl, D.: 1989, ‘Quantifiers in formal and natural languages’. In: D. M. Gabbay and F. Guenther (eds.): Handbook of Philosophical Logic, Vol. IV. D. Reidel Publishing Company, Dordrecht, Boston, London, Chapt. IV.1, pp. 1–131.
