Smart Inductive Generalizations are Abductions

15
2 SMART INDUCTIVE GENERALIZATIONS ARE ABDUCTIONS John A. Josephson 2.1 A DISTINCTIVE PATTERN OF INFERENCE 2.1.1 Inference to the best explanation To postpone entanglements with the abundant confusions surrounding various uses of the term "abduction," for which Peirce himself seems to be largely responsible, and to proceed as directly as possible to engage the basic logical and computational issues, let us begin by examining a pattern of inference I will call "inference to the best explanation" and abbreviate as "IBE" ("IBEs" for the plural). 1 IBEs follow a pattern like this: 2 Dis a collection of data (facts, observations, givens), H explains D (would, if true, explain D), No other hypothesis explains D as well as H does. Therefore, H is probably correct. The strength of the conclusion depends on several considerations, including: 3 how good H is by itself, independently of considering the alternatives, how decisively H surpasses the alternatives, and 1 The phrase "inference to the best explanation" seems to originate with Gilbert Hannan (Hannan, 1965). 2 This formulation is largely due to William Lycan. 3 Piease see (Josephson, 1994, p.l4), for a more complete description of the considerations governing con- fidence and acceptance, including pragmatic considerations. 31 P.A. Flach and A.C. Kakas (eds. ), Abduction and Induction, 31-44. © 2000 Kluwer Academic Publishers.

Transcript of Smart Inductive Generalizations are Abductions

2 SMART INDUCTIVE GENERALIZATIONS ARE ABDUCTIONS

John A. Josephson

2.1 A DISTINCTIVE PATTERN OF INFERENCE

2.1.1 Inference to the best explanation

To postpone entanglements with the abundant confusions surrounding various uses of the term "abduction," for which Peirce himself seems to be largely responsible, and to proceed as directly as possible to engage the basic logical and computational issues, let us begin by examining a pattern of inference I will call "inference to the best explanation" and abbreviate as "IBE" ("IBEs" for the plural). 1

IBEs follow a pattern like this:2

Dis a collection of data (facts, observations, givens), H explains D (would, if true, explain D), No other hypothesis explains D as well as H does.

Therefore, H is probably correct.

The strength of the conclusion depends on several considerations, including:3

• how good H is by itself, independently of considering the alternatives,

• how decisively H surpasses the alternatives, and

1 The phrase "inference to the best explanation" seems to originate with Gilbert Hannan (Hannan, 1965). 2This formulation is largely due to William Lycan. 3Piease see (Josephson, 1994, p. l4), for a more complete description of the considerations governing con-fidence and acceptance, including pragmatic considerations.

31

P.A. Flach and A.C. Kakas (eds.), Abduction and Induction, 31-44. © 2000 Kluwer Academic Publishers.

32 J.R. JOSEPHSON

• how thorough was the search for alternative explanations.

I trust that my readers recognize IBE as familiar, and as having a kind of intuitively recognizable evidential force. We can observe that people quite commonly justify their conclusions by direct, or barely disguised, appeal to IBE, showing that speaker and hearer share a common understanding of the pattern. Thus, IBE appears to be part of "commonsense logic." Why this might be so makes for interesting speculation - perhaps it is somehow built into the human mind, and perhaps this is good design. These speculations aside, it seems undeniable that people commonly view IBE as a form of good thinking. Moreover, it appears that people intuitively recognize many of the considerations, such as those just mentioned, that govern the strength of the conclusions of IBEs. Beyond that, people sometimes actually come up to the standards set by IBE in their actual reasoning. When they do so, they can be reasonably said to be "behaving intelligently" (in that respect). Thus, IBE is part of being intelligent, part of being "smart."

When I say that a form of inference, is "smart" I mean that reasoning in accor-dance with it has some value in contributing to intelligence, either because inferences according to the pattern carry evidential force, as in the case of IBE, or because they have some other power or effectiveness to contribute to intelligence. I will leave "in-telligence" undefined, although I suppose that intelligence is approximately the same as western philosophers have called "reason," and that intelligence is a biological phe-nomenon, an information-processing capability of humans and other organisms, and that it comes in degrees and dimensions, with some species and individuals being more intelligent than others in some respects.

Besides everyday intelligence, we can readily see IBE in scientific reasoning, as well as in the reasoning of historians, juries, diagnosticians, and detectives. Thus, IBE seems to characterize some of the most careful and productive processes of human reasoning.4 Considering its apparent ubiquity, it is remarkable how overlooked and underanalyzed this inference pattern is by some 2,400 years of logic and philosophy.

The effectiveness of predictions enters the evaluative considerations in a natural way. A hypothesis that leads to false or highly inaccurate predictions is poor by itself, and should not be accepted, even if it appears to be the best explanation when consid-ering all the available data. Failures in predictive power count as evidence against a hypothesis and so tend to improve the chances of other hypotheses coming out as best. Failures in predictive power may also improve the margin of decisiveness by which the best explanation surpasses the failing alternatives. Thus. we see that IBEs are capable of turning negative evidence against some hypotheses into positive evidence for alternative hypotheses.

This kind of reasoning by exclusion, which is able to tum negative to positive ev-idence, can be viewed deductively as relying on the assumption that the contrast set (the set of hypotheses within which one hypothesis gets evidential support by being best and over which reasoning by exclusion proceeds) exhausts the possibilities. It

4For more extensive discussions of the epistemic virtues of this pattern of inference, see (Harman, 1965; Lipton, 1991; Josephson, 1994).

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCfiONS 33

must either exhaust the possibilities, or at least be broad enough to include all plau-sible hypotheses, i.e., all hypotheses with a significant chance of being true. If the contrast set is broad enough, the true explanation can be presumed to be included somewhere in the set of hypotheses under consideration, and the best explanation can then be brought out by reasoning by exclusion. A thorough search for alternatives, the third consideration mentioned previously, is important for high confidence in the conclusion, since a thorough search reduces the danger that the contrast set is too nar-row and that the true explanation has been overlooked.5 Note that nothing requires the alternatives in a contrast set to be mutually exclusive. In principle a patient might have both indigestion and heart trouble. Reasoning by exclusion works fine for non-exclusive hypotheses- reasoning by exclusion depends on exhaustiveness, not mutual exclusion.

I do not suggest that the description of the IBE inference pattern that I have given here is perfect, or precise, or complete, or the best possible description of it. I suggest only that it is good enough so that we can recognize IBE as distinctive, logically forceful, ubiquitous, and smart.

2.1.2 "Abduction"

Sometimes a distinction has been made between an initial process of coming up with explanatorily useful hypothesis alternatives and a subsequent process of critical eval-uation wherein a decision is made as to which explanation is best. Sometimes the term "abduction" has been restricted to the hypothesis-generation phase. Peirce him-self commonly wrote this way, although at other times Peirce clearly used the term "abduction" for something close to what I have here called "inference to the best ex-planation."6

Sometimes "abduction" has been identified with the creative generation of explana-tory hypotheses, even sometimes with the creative generation of ideas in general. Kruijff suggests that, besides the creativity of hypotheses, the surprisingness of what is to be explained is at the core of abduction's ubiquity and of its relation to reality (Kruijff, 1997). Peirce, too, sometimes emphasizes surprise. It is clear that there is much expected utility in trying to explain things that are surprising. Surprise points out just where knowledge is lacking, and when a failed expectation has distinctly pleas-ant or unpleasant effects, there may well be something of practical important to be learned. But one may also wonder about, and seek explanations for, things that are not ordinarily surprising, and which only become "surprising" when you wonder about them, when you recognize that in some way things could be different. "Why do things fall?" "Why do people get angry?" "Why do arctic foxes have white coats in winter?" None of these are unexpected, all present openings for new knowledge. Clearly, nei-ther novelty of hypothesis nor surprise at the data are essential for an IBE to establish

5For a more extensive discussion of the importance of evidence that the contrast set includes the true expla-nation, please see (Josephson, 1994, p. IS). 6For a discussion of Peirce's views on abduction, please see the opening essay by Flach and Kakas and the essay by Psillos in this volume. For a detailed scholarly examination of Peirce's writings on abduction please see (Fann, 1970).

34 J.R. JOSEPHSON

its conclusion with evidential force . "Who forgot to put the cheese away last night?" "Probably Billy. He has left the cheese out most nights this week."

While the creative generation of ideas is certainly virtuous in the right contexts, and useful for being smart, it is necessary for creative hypotheses to have some plausibility, some chance of being true, or some pursuit value, before creativity can make a gen-uine contribution to working intelligence. Generating low value creative explanatory hypotheses is in itself a dis-virtue in that time, attention, and other cognitive or com-putational resources must then be expended in rejecting these low value hypotheses so that better hypotheses may be pursued. Too much of the wrong kind of creativity is a drain on intelligence, and so is not smart. Generation of hypotheses, without critical control, is not smart. Generation of hypotheses, as a pattern of inference, in and of itself, is not smart.

Generation of plausible explanatory hypotheses, relevant to the current explanatory problem, is smart. Yet pre-screening hypotheses to remove those that are implausible or irrelevant mixes critical evaluation into the hypothesis-generation process, and so breaks the separation between the process of hypotheses generation and the process of critical evaluation. Furthermore, evaluating one or more explanatory hypotheses may require (according to IBE) that alternative explanations are generated and considered and that a judgment is made concerning the thoroughness of the search for alternative explanations. Again we see a breakdown in the separation of the processes of hypoth-esis generation from the processes of critical evaluation. Either type of process will sometimes need the other as a subprocess. The use of one process by the other might be precompiled, so that it is not invoked explicitly at run time, but instead is only implicit. A hypothesis generation mechanism might implicitly use criticism (it must use some criticism if it is to be smart), and criticism might implicitly use hypothesis generation, for example by implicitly considering and eliminating a large number of alternatives as being implausible. Thus I conclude that hypothesis generation and hy-pothesis evaluation cannot be neatly separated, and in any case, hypothesis generation by itself is not smart.

Consider another pattern of inference, which I will call "backward modus ponens," which has a pattern as follows:

p-+q q

Therefore, p.

The arrow,"--+", here may be variously interpreted, so let us just suppose it to have more or less the same meaning as the arrow used in schematizing:

p-+q p

Therefore, q.

This second one is modus ponens, and this one is smart. Modus ponens has some kind of intuitively visible logical force. In contrast, backward modes ponens is ob-viously fallacious. Copi calls it "the fallacy of affirming the consequent" (Copi and Cohen, 1998). By itself, backward modus ponens is not smart, although reasoning in

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCfiONS 35

accordance with its pattern may be smart for other reasons, and there may be special contexts in which following the pattern is smart.

It has become common in AI to identify "abduction" with backward modus po-nens, or with backward modus ponens together with syntactic or semantic constraints, such as that the- conclusion must come from some fixed set of abducibles. There is a burden on those who study restricted forms of backward modus ponens to show us the virtues of their particular forms, that is, they need to show us how they are smart. I suggest that we will find that backward modus ponens is smart to the degree that it approximates, or when it is controlled and constrained to approximate, or when it implements, inference to the best explanation.

From the foregoing discussion it appears that IBE is distinctive, evidentially force-ful, ubiquitous, and smart, and that no other proposed definition or description of the term "abduction" has all of these virtues. Thus it seems that IBE is our best candidate as a description of what is at the epistemological and information-processing core of the family of patterns collected around the idea of abduction. I therefore claim the term "abduction" for IBE, and in the remainder of this essay, by "abduction" I mean "inference to the best explanation."

Some authors characterize abduction as reasoning from effects to causes, a view to which we will return later in this essay. For now, I would just like to point out that, at least, abduction is a good way to be effective in reasoning from effects to causes. From an effect, we may generate a set of alternative causal explanations and try to determine which is the best. If a hypothesized cause is the best explanation, then we have good evidence that it is the true cause.

2. 1.3 Abductive reasoning

Until now the discussion has mainly focused on abduction as an argument pattern - as a pattern of evidence and justification - although we have briefly touched on a process-oriented view of abduction in our discussion of the separability of hypoth-esis generation and evaluation, and in other hints about what it takes to be smart. In their opening essay to this volume, Flach and Kakas have distinguished Peirce's early views on abduction from his later more mature views and have characterized these as "syllogistic" and "inferential" views of abduction, the latter being more pro-cess oriented. Oztiirk has distinguished inference "as an evidential process," which is concerned with the value of conclusions either in security or in productivity, from inference "as a methodological process," which emphasizes the role of inferences in the economy of processes of inquiry, or the uses of inferences in support of other tasks (Oztilrk, 1997).

It will be helpful for conceptual clarity to distinguish abduction as a pattern of argument or justification, from abduction as a reasoning task, from abduction as a reasoning process. An information-processing task sets up a goal to accomplish, which may be described independently of its means of accomplishment, that is, a task may be described separately from the available algorithms, mechanisms, strate-

36 J .R. JOSEPHSON

gies, implementations, and processes that will be needed to accomplish it.7 These three perspectives - justification, task, and process - are conceptually tightly inter-connected, as follows. An abductive reasoning task, prototypically, is one that has the goal of producing a satisfactory explanation, which is an explanation that can be confidently accepted. An explanation that can be confidently accepted is one that has strong abductive justification. Thus, a prototypical abductive task aims at setting up a strong abductive justification. Information processing that is undertaken for the pur-pose of accomplishing a prototypical abductive task, that is, of producing a confident explanation, may reasonably be called an "abductive reasoning process." From an information- processing perspective, it makes sense to think of abductive reasoning as comprising the whole process of generation, criticism, and possible acceptance of explanatory hypotheses.

Note that the abductive justifications set up by abductive reasoning might be ex-plicit, as when a diagnostic conclusion can be justified, or they might arise implicitly as a result of the functioning of an "abductively effective mechanism," such as, per-haps, the human visual system, or the human language understanding mechanism, or an effective neural-net diagnostic system. Note also that the conclusions of abduc-tive arguments (and correspondingly, the accomplishments of abductive tasks, and the results of abductive reasoning processes) may be either general or particular propo-sitions. Sometimes a particular patient's symptoms are explained; sometimes an em-pirical generalization is explained by an underlying causal mechanism (e.g., univer-sal gravitation explains the orbits of the planets). Sometimes an individual event is explained - "What caused the fire?" - and sometimes a recurrent phenomenon is explained- "What causes malaria?"

The account of abduction that has been sketched so far in this essay still has two large holes: (I) what is an explanation? and (2) what makes one explanation better than another? I will not attempt to fill the second hole in this chapter - the literature on the subject is vast (see (Darden, 1991, p.277 ff.) for a starting point). I will simply mention some desirable features of explanatory hypotheses: consistency, plausibility, simplicity, explanatory power, predictive power, precision, specificity, and theoretical promise. To begin to fill the first hole, let us ask: what conception of explanation is needed for understanding abduction?

2.2 WHAT IS AN EXPLANATION?

2.2.1 Explanations are not proofs

There have been two main traditional attempts to analyze explanations as deductive proofs. By most accounts, neither attempt has been particularly successful. First, Aristotle maintained that an explanation is a syllogism of a certain form that also satisfies various informal conditions, one of which is that the middle term of the syllo-gism is the cause of the thing being explained. More recently (considerably) Hempel (Hempel, 1965) modernized the logic and proposed the "covering law" or "deductive

7 See (Lucas, 1998) for a method-independent account of diagnosis as a task.

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCfiONS 37

nomological" (D-N) model of explanation.8 The main difficulty with these accounts (besides Hempel's confounding the question of what makes an ideally good explana-tion with the question of what it is to explain at all) is that being a deductive proof is neither necessary nor sufficient for being an explanation. Consider the following:

QUESTION: Why does he have bums on his hand? EXPLANATION: He sneezed while cooking pasta and upset the pot.

The point of this example is that an explanation is given, but no deductive proof, and although it could be turned into a deductive proof by including additional proposi-tions, this would amount to gratuitously completing what is on the face of it an in-complete explanation. Real explanations are almost always incomplete. Under the circumstances (incompletely specified) sneezing and upsetting the pot were presum-ably causally sufficient for the effect, but this is quite different from being deductively sufficient. For another example, consider that the flu hypothesis explains the body aches, but often people have flu without body aches, so having flu does not imply having body aches. The lesson is that an explanatory hypothesis need not deductively entail what it explains.

The case that explanations are not necessarily deductive proofs is made even stronger when we consider psychological explanations, where there is presumptively an ele-ment of free will, and explanations that are fundamentally statistical, where, for ex-ample, quantum phenomena are involved. In these cases it is clear that causal deter-minism cannot be assumed, so the antecedent conditions, even all antecedent condi-tions together, known and unknown, cannot be assumed to be causally sufficient for the effects.

Conversely, many deductive proofs fail to be explanations of anything. For exam-ple, classical mechanics is deterministic and time reversible, so an earlier state of a system can be deduced from a later state, but the earlier state cannot be said to be explained thereby. Also, q can be deduced from 'p and q' but is not thereby explained. Many mathematicians will at least privately acknowledge that some proofs establish their conclusion without giving much insight into why the conclusions are true, while other proofs give richer understanding. So it seems that, even in pure mathematics, some proofs are explanatory and some are not.

We are forced to conclude that explanations are not deductive proofs in any par-ticularly interesting sense. Although they can always be presented in the form of deductive proofs by adding premises, doing so does not succeed in capturing anything essential or especially useful, and typically requires completing an incomplete expla-nation. Thus the search for a proof of D is not the same as the search for an explanation of D. Instead it is only a traditional, but seriously flawed, approximation of it.

2.2.2 Explanations give causes

An attractive alternative view is that an explanation is an assignment of causal respon-sibility; it tells a causal story. Finding possible explanations is finding possible causes

8For a brief summary of deductive and other models of explanation please see (Bhaskar, 1981 ). For a history of more recent philosophical accounts of explanation, please see (Salmon, 1990).

38 J.R. JOSEPHSON

of the thing to be explained. It follows that abduction, as a process of reasoning to an explanation, is a process of reasoning from effect to cause. (Ideas of causality and explanation have been intimately linked for a very long time. For a well-developed historical account of the connections, see (Wallace, 1972; Wallace, 197 4).)

It appears that "cause" for abduction must be understood somewhat more broadly than its usual senses of mechanical, or efficient, or event-event causation. To get some idea of a more expanded view of causation, consider the four kinds of causes according to Aristotle: efficient cause, material cause, final cause, and formal cause (Aristotle, Physics, bk.2, chap.3). Consider the example of my coffee mug. The efficient cause is the process by which the mug was manufactured and helps explain such things as why there are ripples on the surface of the bottom. The material cause is the ceramic and glaze, which compose the mug and cause it to have certain gross properties such as hardness. The final cause is the end, or function, or purpose, in this case to serve as a container for liquids and as a means of conveyance for drinking. A final-cause explanation is needed to explain the presence and shape of the handle. Formal cause is somewhat more mysterious - Aristotle is hard to interpret here - but it is perhaps something like the mathematical properties of the shape, which impose constraints resulting in certain specific other properties. That the cross-section of the mug, viewed from above, is approximately a circle, explains why the length and width of the cross-section are approximately equal.

What the types of causation and explanation are, remains unsettled, despite Aris-totle's best efforts and those of many other thinkers over the centuries. Apparently, the causal story told by an abductive explanation might rely on any type of causation. At different times we seek best explanations of different types. Note that different types of explanations do not usually compete with each other: they answer different kinds of explanation-seeking puzzlements; they explain different aspects; and they do not belong together in the same contrast sets. Yet it seems that the various types of best-explanation reasoning, corresponding to different types of explanations, are fun-damentally similar in their reliance on the logic of reasoning by exclusion over a set of possible explanations.

When we conclude that data D is explained by hypothesis H , we say more than just that H is a cause of D in the case at hand. We conclude that among all the vast causal ancestry of D we will assign responsibility to H . Commonly, our reasons for focusing on H are pragmatic and connected rather directly with goals of producing, preventing, or repairing D. We blame the heart attack on the blood clot in the coronary artery or on the high-fat diet, depending on our interests. We can blame the disease on the invading organism, on the weakened immune system that permitted the invasion, or on the wound that provided the route of entry into the body. I suggest that it comes down to this: the things that will satisfy us as accounting for D will depend on what we are trying to account for about D, and why we are interested in accounting for it; but the only things that count as candidates are plausible parts of the causal ancestry of D according to a desired type of causation.

I have argued that explanations give causes. Explaining something, whether that something is particular or general, gives something else upon which the first thing de-pends for its existence, or for being the way that it is. The bomb explosion explains

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCTIONS 39

the plane crash. The mechanisms that connect the ingestion of cigarette smoke with effects on the arteries of the heart, explain the statistical association between smok-ing and heart disease. It is common in science for an empirical generalization, an observed regularity, to be explained by reference to underlying structure and mech-anisms. Explainer and explained, explanans and explanandum, may be general or particular. Accordingly, abductions may apply to, or arrive at, propositions that are either general or particular. Computational models of abduction that do not allow for this are not fully general, although they may be effective as special-purpose models.

As I have argued, explanations are not deductive proofs in any particularly inter-esting sense. Although they can always be presented in the form of deductive proofs, doing so seems not to capturing anything essential, or especially useful, and usually requires completing an incomplete explanation. Thinking of explanations as proofs tends to confuse causation with logical implication. To put it simply: causation is in the world, implication is in the mind. Of course, mental causation exists (e.g., where decisions cause other decisions), which complicates the simple distinction by includ-ing mental processes in the causal world, but that complication should not be allowed to obscure the basic point, which is not to confuse an entailment relationship with what may be the objective, causal grounds for that relationship. Deductive models of causation are at their best when modeling deterministic closed-world causation, but this is too narrow for most real-world purposes. Even for modeling situations where determinism and closed world are appropriate assumptions, treating causality as de-duction is dangerous, since one must be careful to exclude non-causal and anti-causal (effect-to-cause) conditionals from any knowledge base if one is to distinguish cause-effect from other kinds of inferences. [Pearl has pointed out the significant dangers of unconstrained mixing of cause-to-effect with effect-to-cause reasoning(Pearl, 1988b ).]

Per se, there is no reason to seek an implier of some given fact. The set of possible impliers includes all sorts of riffraff, and there is no obvious contrast set at that level to set up reasoning by exclusion. But there is a reason to seek a possible cause: broadly speaking, because knowledge of causes gives us powers of influence and prediction. And the set of possible causes (of the kind we are interested in) does constitute a contrast set for reasoning by exclusion. Common sense takes it on faith that everything has a cause. (Compare this with Leibniz's Principle of Sufficient Reason.) There is no (non-trivial) principle of logic or common sense that says that everything has an implier.

In the search for interesting causes, we may set up alternative explanations and reason by exclusion. Thus, IBE is a way to reason from effect to cause. Effect-to-cause reasoning is not itself the same as abduction, rather, effect-to-cause reasoning is what abduction is for.

2.3 SMART INDUCTIVE GENERALIZATIONS ARE ABDUCTIONS

This brings us to the main point of the essay, which is the argument that inductive generalizations are a special case of abductions, or they are so if they are any good. More precisely, I will argue that epistemically warranted inductive generalizations are warranted because they are warranted abductions.

40 J.R. JOSEPHSON

To begin with, let us note that the word "induction" has had no consistent use, either recently or historically. Sometimes writers have used the term to mean all inferences that are not deductive, sometimes they have specifically meant inductive generaliza-tions, and sometimes they have meant next-case inductions as in the philosophical "problem of induction" as put by David Hume. We focus on inductive generalizations, which we may describe by saying that an inductive generalization is an inference that goes from the characteristics of some observed sample of individuals to a conclusion about the distribution of those characteristics in some larger population. Examples include generalizations that arrive at categorical propositions (All A's are B's) and generalizations that arrive at statistical propositions (71% of A's are B's, Most A's are B's, Typical A's are B's.). A common form of inductive generalization in AI is called "concept learning from examples," which may be supervised or unsupervised. Here the learned concept generalizes the frequencies of occurrence and co-occurrence of certain characteristics in a sample, with the intention to apply them to a larger general population, which includes unobserved as well as observed instances.

I will argue that it is possible to treat every "smart" (i.e., reasonable, valid, strong) inductive generalization as an instance of abduction, and that analyzing inductive gen-eralizations as abductions shows us how to evaluate the strengths of these inferences.

First we note that many possible inductive generalizations are not smart.9

This thumb is mine & this thumb is mine.

Therefore, all thumbs are mine.

All observed apples are observed.

Therefore, all apples are observed.

Russell's example: a man falls from a tall building, passes the 75th floor, passes the 74th floor, passes the 73rd floor, is heard to say, "so far, so good."

Harman pointed out that it is useful to describe inductive generalizations as abduc-tions because it helps to make clear when the inferences are warranted(Harman, 1965). Consider the following inference:

All observed A's are B's

Therefore, All A's are B's

This inference is warranted, Harman writes, " ... whenever the hypothesis that all A's are B's is (in the light of all the evidence) a better, simpler, more plausible (and so forth) hypothesis than is the hypothesis, say, that someone is biasing the observed sample in order to make us think that all A's are B's. On the other hand, as soon as the total evidence makes some other competing hypothesis plausible, one may not infer from the past correlation in the observed sample to a complete correlation in the total population."

9I did not invent these examples, but I forget where I got them.

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCfiONS 41

If this is indeed an abductive inference, then "All A's are B's" should explain "All observed A's are B's." The problem is that, "All A's are B's" does not seem to explain why "This A is a B," or why A and Bare regularly associated (pointed out by (Ennis, 1968)). Furthermore, it is hard to see how a general fact could explain its instances, because it does not seem in any way to cause them.

The story becomes clearer if we are careful about what precisely is explained and what is doing the explaining. What the general statement in the conclusion explains are certain characteristics of the set of observations, not the facts observed. For example, suppose I choose a ball at random (arbitrarily) from a large hat containing colored balls. The ball I choose is red. Does the fact that all of the balls in the hat are red explain why this particular ball is red? No, but it does explain why, when I chose a ball at random, it turned out to be a red one (because they all are). "All A's are B's" cannot explain why "This A is a B" because it does not say anything at all about how its being an A is connected with its being a B. The information that "they all are" does not tell us anything about why this one is, except that it suggests that if we want to know why this one is, we would do well to figure out why they all are. Instead, all A's are B's helps to explain why, when a sample was taken, it turned out that all of the A's in the sample were B's. A generalization helps to explain some characteristics of the set of observations of the instances, but it does not explain the instances themselves. That the cloudless, daytime sky is blue helps explain why, when I look up, I see the sky to be blue, but it doesn't explain why the sky is blue. Seen this way, an inductive generalization does indeed have the form of an inference whose conclusion explains its premises.

In particular, "A's are mostly B's" together with "This sample of A's was obtained without regard to whether or not they were B's" explains why the A's that were sam-pled were mostly B's.

Why were 61% of the chosen balls yellow?

Because the balls were chosen more or less randomly from a population that was two thirds yellow, the difference from two thirds in the sample being due to chance.

Alternative explanation for the same observation:

Because the balls were chosen by a selector with a bias for large balls from a population that was only one third yellow but where yellow balls tend to be larger than non yellow

Core claim: The frequencies in the larger population, together with the frequency-relevant characteristics of the method for drawing a sample, explain the frequencies in the observed sample.

What is explained? In this example, just the frequency of characteristics in the sample is explained, not why these particular balls are yellow or why the experiment was conducted on Tuesday. In general, the explanation explains why the sample fre-quency was the way it was, rather than having some markedly different value. If there is a deviation in the sample from what you would expect, given the population and the sampling method, then you have to throw some Chance into the explanation (which is more or less plausible depending on how much Chance you have to suppose).

42 J.R. JOSEPHSON

How are frequencies explained? An observed frequency is explained by giving a causal story that explains how the frequency came to be the way it was. This causal story typically includes both the method of drawing the sample, and the population frequency in some reference class. Unbiased sampling processes tend to produce representative outcomes; biased sampling processes tend to produce unrepresentative outcomes. This "tending to produce" is a fully causal relationship and it supports ex-planation and prediction. For example, we may say that an outcome has been caused, in part, by a certain kind of sampling bias; this sampling bias will then be part of the explanation for existing data, and a basis for predictions. Similarly, outcomes causally depend partly on the population frequency, which is also explanatory and predictive.

A peculiarity is that characterizing a sample as "representative" is characterizing the effect (sample frequency) by reference to part of its cause (population frequency) . Straight inductive generalization (carrying the sample frequencies unchanged to the generalization) is equivalent to concluding that a sample is representative, which is a conclusion about its cause. Straight inductive generalization depends partly on evi-dence or presumption that the sampling process is (close enough to) unbiased. The unbiased sampling process is part of the explanation of the sample frequency, and any independent evidence for or against unbiased sampling bears on its plausibility as part of the explanation.

If we do not think of inductive generalizations as abductions, we are at a loss to explain why such an inference is made stronger or more warranted, if in collecting data we make a systematic search for counter-instances and cannot find any, than it would be if we just take the observations passively. Why is the generalization made stronger by making an effort to examine a wide variety of types of A's? The answer is that it is made stronger because the failure of the active search for counter-instances tends to rule out various hypotheses about ways in which the sample might be biased, that is, it strengthens the abductive conclusion by ruling out alternative explanations for the observed frequency. If we think that a sampling method is fair and unbiased, then straight generalization gives the best explanation of the sample frequencies. But if the sample size is small, alternative explanations, where the frequencies differ, may still be plausible. These alternative explanations become less and less plausible as the sample size grows, because the sample being unrepresentative due to chance becomes more and more improbable. Thus viewing inductive generalizations as abductions shows us why sample size is important. Again, we see that analyzing inductive generalizations as abductions shows us how to evaluate the strengths of these inferences.

2.3.1 What is the best generalization?

We have seen that alternative generalizations are parts of alternative explanations, and that for a generalization to be warranted it must be part of the best explanation in some contrast set. (It must also be best by a distinct margin and the contrast set must plausibly exhaust the possibilities.)

We have considered contrast sets where hypotheses differ with respect to the nature and degree of hypothesized bias in the sampling method, and differ correspondingly in the hypothesized parent frequency and amount of supposed chance. But generaliza-tions sometimes differ in other ways, such as differing in the choice of the reference

SMART INDUCfiVE GENERALIZATIONS ARE ABDUCfiONS 43

class for the parent frequency (e.g., is it just humans or all multi-celled animals that use insulin for glucose control?). This is the A class in "All A's are B's." Generaliza-tions may also differ in the B class, which amounts to differing on the choice of which attribute to generalize (e.g., 'Crows are black' versus 'Crows are squawky'). This example shows clearly that sometimes alternative generalizations from the same data (e.g., crow observations) are not genuinely contrastive; they do not belong together in contrast sets from which a best explanation may be drawn by abductive reasoning. It does not make sense to argue in favor of the generalization, 'Crows are black' by arguing against 'Crows are squawky.' This is not simply because they are compati-ble hypotheses; we have seen earlier that alternative explanations for abduction need not be mutually exclusive. 'Heart attack' and 'indigestion' are compatible explana-tions for the pain since it is possible for the patient to have both conditions. But in this case one may argue for one by arguing against the other; they are genuinely con-trastive. Perhaps the 'black' and 'squawky' generalizations of the crow observations are not contrastive because different aspects of the observations are explained by the generalizations, so they are not alternative ways of explaining the same things.

I am puzzled about what principles govern when explanations in general are gen-uinely contrastive, and when they are not. Explanations of different causal types, e.g., final cause, efficient cause, are not usually contrastive. Yet within the same causal type, explanations may not be contrastive, e.g., whether we blame the death on the murderer or the heart stoppage.

Peter Flach has suggested that finding the right level of generality is most of the work in forming good generalizations (mailing-list communication). It is interesting to note that alternative levels of generality do seem to be genuinely contrastive. Let us suppose that the attribute to be generalized is fixed, and we want to determine the best reference class for the parent frequency. For concreteness, let us suppose that we have noticed (and have carefully recorded data showing that) on average, week-ends are rainier than weekdays. Our observations are taken in New York. Alternative generalizations include that weekend days are rainier in New York, on the east coast of North America, on Earth. These generalizations lead to different predictions and perhaps have differing levels of plausibility based on background knowledge about plausible mechanisms. (The best generalization in this case seems to be the east coast of North America, according to a recent report in Nature.)

We have seen that contrastive alternative generalizations may differ in their degrees of plausibility, for example by supposing more or less chance or by hypothesizing kinds of sampling bias that are made more or less plausible by background knowl-edge. Alternative explanations with alternative generalizations may also differ in other virtues, such as explanatory power, predictive power, precision (e.g., in hypothesized frequency), specificity (of reference class), consistency (internal consistency of the ex-planation), simplicity (e.g., complicated versus simple account of bias), and theoretical promise (e.g., whether the generalization is both testable and theoretically suggestive).

2.4 CONCLUSION

I have argued that it is possible to treat every "smart" inductive generalization as an in-stance of abduction, and that analyzing inductive generalizations as abductions shows

44 J.R. JOSEPHSON

us how to evaluate the strengths of these inferences. Good inductive generalizations are good because they are parts of best explanations. I conclude that inductive general-izations derive their epistemic warrants from their natures as abductions. The warrant of an inductive generalization is not evident from its form as an inductive generaliza-tion, but only from its form as an abduction.

If this conclusion is correct, it follows that computational mechanisms for inductive generalization must be abductively well-constructed, or abductively well-controlled, if they are to be smart and effective.

We can easily imagine that inference to the best explanation (IBE) and adopting the best plan (ABP) might rely on the capabilities of a single underlying mechanism able to generate causal stories. Both admit of separation into two stages of: 1 - gen-erating and evaluating alternatives, and 2 - decision to adopt. (This is not the same as separation into propose and evaluate, which I argued against earlier.) However, these pieces of intelligence are functionally different: IBE leads to belief and ABP leads to decision to act. If we assume the existence of supervisory control by a problem solver that is able to keep track of goals, reason hypothetically, and make predictions based on causal stories, then an important challenge for artificial intelligence is to develop an integrated framework that unifies both general and specialized methods for generating plausible causal stories.

Acknowledgments

Parts of this essay are adapted from (Josephson and Josephson, 1994, Chapter 1 ), "Conceptual analysis of abduction" (used with permission). I especially want to thank Richard Fox for many helpful comments on an earlier draft of this chapter.

References for Smart Inductive Generalizations are Abductions John R. Josephson, (2000) in Abduction and Induction, P. A. Flach and A. C. Kakas (eds.), (pp. 31-44) Kluwer Academic Publishers, Netherlands.

Bhaskar, R. (1981). Explanation. In Bynum, W. F., Browne, E. J., and Porter, R., editors, Dictionary of the History of Science, pages 140-142. Princeton University Press.

Copi, I. M. and Cohen, C. (1998). Introduction to Logic. Prentice Hall, tenth edition.

Darden, L. (1991). Theory Change in Science: Strategies from Mendelian Genetics. Oxford University Press, New York.

Ennis, R. (1968). Enumerative induction and best explanation. Journal of Philosophy, LXV(18):523-529.

Fann, K. T. (1970). Peirce's Theory of Abduction. Martinus Nijhoff, The Hague.

Flach, P. A. and Kakas, A. C., editors (1997). Proceedings of the IJCAI’97 Workshop on Abduction and Induction in Artificial Intelligence. Available on-line at http://www.cs.bris.ac.uk/~flach/IJCAI97/.

Harman, G. (1965). The inference to the best explanation. Philosophical Review, 74:88{95.

Hempel, C. G. (1965). Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. Free Press, New York.

Josephson, J. R. (1994). Conceptual analysis of abduction. In Abductive Inference: Computation, Philosophy, Technology, pages 5-30. Cambridge University Press, New York.

Josephson, J. and Josephson, S. (1996) (ed.), Abductive Inference: Computation, Philosophy, Technology, Cambridge University Press, New York.

Kruijff, G.-J. (1997). Concerning logics of abduction - on integrating abduction and induction. In (Flach and Kakas, 1997), pages 31-36. Available on-line at http://www.cs.bris.ac.uk/~flach/IJCAI97/.

Lipton, P. (1991). Inference to the Best Explanation. Routledge, London.

Lucas, P. (1998). Analysis of notions of diagnosis. Artificial Intelligence, 105(1-2):289-337.

Öztürk, P. (1997). An AI criterion for an account of inference: how to realize a task. In (Flach and Kakas, 1997), pages 43-48. Available on-line at http://www.cs.bris.ac.uk/~flach/IJCAI97/.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA.

Salmon, W. C. (1990). Four Decades of Scientific Explanation. University of Minnesota Press, Minneapolis.

Wallace, W. A. (1972). Causality and Scientific Explanation, volume 1, Medieval and Early Classical Science. University of Michigan Press, Ann Arbor.

Wallace,W. A. (1974). Causality and Scientific Explanation, volume 2, Classical and Contemporary Science. University of Michigan Press, Ann Arbor.