Bayesian inference: Combining base rates with opinions of sources who vary in credibility

13
Journal of Personality and Social Psychology 1983, Vol. 45, No. 4, 792-804 Copyright 1983 by the American Psychological Association, Inc. Bayesian Inference: Combining Base Rates With Opinions of Sources Who Vary in Credibility Michael H. Birnbaum University of Illinois at Urbana-Champaign Barbara A. Mellers University of California, Berkeley Subjects made judgments of the probability of an event given base-rate information and the opinion of a source. Base rate and the source's hit and false-alarm rates were manipulated in a within-subjects design. Hit rate and false-alarm rate were manipulated to produce sources of varied expertise and bias. The base rate, the source's opinion, and the source's expertise and bias all had large systematic effects. Although there was no evidence of a "base-rate fallacy," neither Bayes' theorem nor a subjective Bayesian model that allows for "conservatism" due to misperception or response bias could account for the data. Responses were consistent with a scale- adjustment averaging model developed by Birnbaum & Stegner (1979). In this model, the source's report corresponds to a scale value that is adjusted according to the source's bias. This adjusted value is weighted as a function of the source's expertise and averaged with the subjective value of the base rate. These results are consistent with a coherent body of experiments in which the same model could account for a variety of tasks involving the combination of information from different sources. The question, How should humans revise their beliefs? has been studied by philosophers and mathematicians, and the question, How do humans form opinions and revise them? has been investigated by psychologists. Early research that compared the two questions concluded that Bayes' theorem was a useful starting point for the description of human inference but that humans are "conservative," or revise their probability judgments in a manner less extreme than implied by Bayes' theorem (Edwards, 1968; Peterson & Beach, 1967; Slovic & Lichtenstein, 1971). Edwards (1968) discussed three interpre- tations of conservatism: misperception, mis- aggregation, and response bias. Misperception includes the possibility that objective proba- bilities are transformed to subjective proba- bilities by a psychophysical function. Misag- gregation refers to use of a non-Bayesian rule to combine evidence. Response bias refers to nonlinearity in the judgment function relating judged probabilities to subjective likelihoods. Early experimental work attempted to separate We thank Kevin tanning, James Shanteau, and Amos Tversky for comments on an earlier draft. Requests for reprints should be sent to Michael H. Birn- baum, Department of Psychology, University of Illinois, 603 East Daniel Street, Champaign, Illinois 61820. these interpretations, but the experiments could not yield definitive conclusions (Ed- wards, 1968). Wallsten (1972) proposed a subjective ver- sion of Bayes' theorem to account for con- servatism. In his formulation, subjective prob- abilities replaced objective probabilities, the response scale was assumed to be only ordinal (allowing for nonlinear response bias), but the aggregation rule was assumed to be Bayesian. However, Shanteau (1975) found that judges in the bookbags and poker chips paradigm revise their opinions even when they are given evidence that is nondiagnostic. Shanteau con- cluded that his judges were averaging prior probability with a subjective probability rep- resenting the implication of each new sample of evidence (see also Troutman & Shanteau, 1977). Kahneman and Tversky (1973) argued that judges neglect diagnostic information such as base-rate information and the validity of sources of information. Recently there has been a debate over the claim of a "base-rate fallacy," the contention that subjects neglect base-rate information crucial to a Bayesian analysis (Ajzen, 1977; Bar-Hillel, 1980; Birn- baum, 1983; Carroll &Siegler, 1977;Fischhoff, Slovic, & Lichtenstein, 1979; Lyon & Slovic, 1976; Manis, Dovalina, Avis, & Cardoze, 1980; 792

Transcript of Bayesian inference: Combining base rates with opinions of sources who vary in credibility

Journal of Personality and Social Psychology1983, Vol. 45, No. 4, 792-804

Copyright 1983 by theAmerican Psychological Association, Inc.

Bayesian Inference: Combining Base Rates With Opinionsof Sources Who Vary in Credibility

Michael H. BirnbaumUniversity of Illinois at Urbana-Champaign

Barbara A. MellersUniversity of California, Berkeley

Subjects made judgments of the probability of an event given base-rate informationand the opinion of a source. Base rate and the source's hit and false-alarm rateswere manipulated in a within-subjects design. Hit rate and false-alarm rate weremanipulated to produce sources of varied expertise and bias. The base rate, thesource's opinion, and the source's expertise and bias all had large systematic effects.Although there was no evidence of a "base-rate fallacy," neither Bayes' theoremnor a subjective Bayesian model that allows for "conservatism" due to misperceptionor response bias could account for the data. Responses were consistent with a scale-adjustment averaging model developed by Birnbaum & Stegner (1979). In thismodel, the source's report corresponds to a scale value that is adjusted accordingto the source's bias. This adjusted value is weighted as a function of the source'sexpertise and averaged with the subjective value of the base rate. These results areconsistent with a coherent body of experiments in which the same model couldaccount for a variety of tasks involving the combination of information fromdifferent sources.

The question, How should humans revisetheir beliefs? has been studied by philosophersand mathematicians, and the question, Howdo humans form opinions and revise them?has been investigated by psychologists. Earlyresearch that compared the two questionsconcluded that Bayes' theorem was a usefulstarting point for the description of humaninference but that humans are "conservative,"or revise their probability judgments in amanner less extreme than implied by Bayes'theorem (Edwards, 1968; Peterson & Beach,1967; Slovic & Lichtenstein, 1971).

Edwards (1968) discussed three interpre-tations of conservatism: misperception, mis-aggregation, and response bias. Misperceptionincludes the possibility that objective proba-bilities are transformed to subjective proba-bilities by a psychophysical function. Misag-gregation refers to use of a non-Bayesian ruleto combine evidence. Response bias refers tononlinearity in the judgment function relatingjudged probabilities to subjective likelihoods.Early experimental work attempted to separate

We thank Kevin tanning, James Shanteau, and AmosTversky for comments on an earlier draft.

Requests for reprints should be sent to Michael H. Birn-baum, Department of Psychology, University of Illinois,603 East Daniel Street, Champaign, Illinois 61820.

these interpretations, but the experimentscould not yield definitive conclusions (Ed-wards, 1968).

Wallsten (1972) proposed a subjective ver-sion of Bayes' theorem to account for con-servatism. In his formulation, subjective prob-abilities replaced objective probabilities, theresponse scale was assumed to be only ordinal(allowing for nonlinear response bias), but theaggregation rule was assumed to be Bayesian.However, Shanteau (1975) found that judgesin the bookbags and poker chips paradigmrevise their opinions even when they are givenevidence that is nondiagnostic. Shanteau con-cluded that his judges were averaging priorprobability with a subjective probability rep-resenting the implication of each new sampleof evidence (see also Troutman & Shanteau,1977).

Kahneman and Tversky (1973) argued thatjudges neglect diagnostic information such asbase-rate information and the validity ofsources of information. Recently there hasbeen a debate over the claim of a "base-ratefallacy," the contention that subjects neglectbase-rate information crucial to a Bayesiananalysis (Ajzen, 1977; Bar-Hillel, 1980; Birn-baum, 1983; Carroll &Siegler, 1977;Fischhoff,Slovic, & Lichtenstein, 1979; Lyon & Slovic,1976; Manis, Dovalina, Avis, & Cardoze, 1980;

792

BAYESIAN INFERENCE AND BASE RATES 793

Tversky&Kahneman, 1980,1982). This workled many to the conclusion that human in-ference is best described in terms of a list ofbiases and heuristics, regarded as illusions ordiscrepancies between what people should doand what they actually do.

Unfortunately, much of the recent researchhas used the objective form of Bayes' theoremas the null hypothesis, without allowing formisperception or response bias. In fact, eventhe objective form of Bayes' theorem has notbeen provided in its full complexity (Pitz,1975). As Schum (1981) has shown, the algebracan accommodate many subtle patterns ofconditioning of evidence. Birnbaum (1983)noted that research on the "base-rate fallacy"used an incomplete Bayesian analysis. Birn-baum showed that behavior described as "ne-glect of base rate" may be consistent with ra-tional Bayesian utilization of the base rate.Thus, it is not at all clear that Bayes' theoremdeserves the bad press it has received in recentyears as a framework for the study of humaninference.

The purpose of this article is to test twomodels of how judges use base-rate infor-mation. A subjective version of Bayes' theoremand scale-adjustment averaging model (Birn-baum & Stegner, 1979) are examined as de-scriptions of probability judgments.

,, Inference Task

Let P(L) be the probability of the event Lthat a used car will remain in working con-dition, or last, for 3 years. The base rate, orproportion of cars that last, is assumed to varyfor different types of cars. To denote varyingbase rates, r, we write Pr(L)~as the prior prob-ability that the car will last, given base rate r.

A source examines the car and reportswhether the car seems to be in "good" shape("G") or "bad" shape ("B"), mutually exclu-sive and exhaustive categories. The followingcharacteristics of the source are known:

• P("G"|L) = probability that the sourcesays the car is in good condition, given thatthe car lasts (L). In signal-detection theory,this would be termed the hit rate, /'(HIT).

• />("G"|£) = probability that the sourcedescribes the ear as good, given it actually failsto last (L). P("G"\L) is termed that false-alarmrate, P(FA).

Suppose the source's hit and false-alarmrates are independent of the base rate. Thatis, suppose Pr("G"\L) = /T'G"|L) for all r,similarly, Pr("G"\L) = P("G"\L), for all r (seeFootnote 1).

The judges in this experiment were askedto infer the probability that the car will last,given the base rate and the source's report.Their judgments were compared with Bayes'theorem and with a subjective version of Bayes'theorem that has the same algebraic structurebut allows for conservatism due to misper-ception and response bias.

Bayesian Model

Because the source's hit and false-alarmrates are assumed to be independent of baserate, Bayes' theorem can be written:

P,(L\"G") =

Pr(D-P("G"\L)P,(L)-P("G"\L) 5*7v 0)

where P,(L\"G") is the probability that thecar lasts given the source says the car is ingood shape, and the base rate for this type ofcar is r. The prior odds that the car lasts givenonly the base-rate information follows:

nr = - Pr(L)(2)

The posterior odds, Q = Pr(L\"G")/[l -Pr(L|"G")], given the source's opinion can beexpressed as follows:

(3)P(FA)

If the source had said the car was bad ("B"),the hit/false-alarm ratio would be replaced bythe miss/correct-rejection ratio [P("B"\L)/

Equation 3 implies that the effects of baserate should multiply the source information.Taking logarithms of both sides yields

logQ =

log (Or) + log [/'(HIT)] - log [/'(FA)]. (4)

Equation 4 shows that if the Bayesian modelis descriptive of human inference, then itshould be possible to monotonically transformthe subjects' probability judgments to an ad-

794 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

ditive decomposition of base rate, source's hitrate, and false-alarm rate for each level of thesource's opinion. This implication follows be-cause log odds (log Q) is a monotonic functionof posterior probability [P,(L\"G")].

If the Bayesian model fails due to misper-ception (i.e., subjects combine informationconsistent with Bayes' theorem but are "con-servative" in their probability estimates), onecould replace the objective probabilities inEquations 1 and 2 with subjective scale values.According to this hypothesis, the model stillmight provide an accurate representation ofthe process by which the subject combines oraggregates the information. Similarly, if sub-jects differ from Bayes' theorem due to a re-sponse nonlinearity (e.g., a reluctance to useextreme values on the response scale), then itshould still be possible to find a monotonicfunction of the judge's responses, such thatthe additive alegbraic structure is satisfied.However, if the data cannot be transformedto fit the subjective Bayesian model, then onecan reject the Bayesian models as theories ofinformation aggregation.

Scale-Adjustment Averaging Model

Birnbaum and Stegner (1979) developed amodel for the process by which judges combineinformation from sources of varying expertiseand bias. With the assumption that expertiseand bias depend on hit rate and false-alarmrate, the scale-adjustment averaging model canbe adapted to the present task, as shown inthe following equation:

p =•* rxbows w*sr wxbsbo

W + W* + Wxt,(5)

where Prx\,0 is the judged probability the carlasts, given that a source of expertise (x) andbias (b) gave an opinion (o), and the base ratewas r. The weights w, w*, and wxb are theweights of the initial impression, the base-rateinformation, and the source's opinion givenby a source with expertise x and bias b, re-spectively. The scale value of the source's re-port, sbo, is assumed to depend on the opinion("G" or "B") and the source's bias. The otherscale values, s and sr, are for the initial impres-sion and the base rate, respectively.

When a piece of information (either baserate or source's opinion) is not presented, its

weight is set to zero. Thus, the scale-adjust-ment averaging model implies that the effectof a piece of information (e.g., base rate orsource's report) is directly proportional to itsweight and inversely related to the numberand weight of other pieces of information.Further, the model implies that the effect ofa source's bias is amplified by the source'sexpertise. On the other hand, Bayesian modelsimply that the effect of the base rate (Equation4) should be independent of the characteristicsof the source, P(HIT) and P(FA). Similarly,the effect of a source's opinion should be thesame regardless of the value (or even absence)of base-rate information.

Method

To assure that the used car problem is comparable tothose used in previous research on the base-rate fallacy,a preliminary study used the same procedure as in thosestudies; that is, each subject received only one problem.The main portion of the experiment used a within-subjectdesign, in which each subject received many problems.

Single-Judgment Task

Sixty-five undergraduates received a single problemsimilar to that used in previous research on the base-ratefallacy. Instructions were modified from the cab problem(see Tversky & Kahneman, 1980) and read as follows:

In this experiment, you will be asked to judge theprobability that a used car will remain in working con-dition for 3 years. For cars of the same year, make,model, and mileage as the car you will be judging, 30%are expected to last for 3 years.

A source, who examined the car, gave his opinion.He had previously been tested in his ability to distinguishbetween cars that actually lasted for 3 years and thosethat failed. Of 100 cars that actually lasted, he correctlyidentified 80 as being in "good shape." Of 100 cars thatfailed to last, he correctly judged 80 as being in "badshape." The source stated that the car is in "good shape."What is the probability expressed as a percentage thatthe car will remain in working condition for 3 years?

Multiple-Judgment Task

Subjects were asked to make many judgments of theprobability that cars would last given base-rate informationand/or the reports of sources who examined the cars. Trialsvaried in base rate, sources' opinion, and source charac-teristics (hit and false-alarm rates).

Instructions. Base rates were represented by blue bookvalues. Judges were told to assume that the probability aused car would last was directly proportional to its bluebook value. They were told that of 100 cars with bluebook values of $300, 30 are expected to last 3 years andthe remaining 70 are expected to fail; with blue bookvalues of $400, 40 of 100 are expected to last, and so on.

BAYESIAN INFERENCE AND BASE RATES 795

A. LOW EXPERTISEFRIEND OF

SELLER

SOURCE SAID-.

OF 100 CARSTHAT FAILED

OF 100 CARSTHAT LASTED

OF ALL 200CARS (TOTAL)

'0 .2 .4 .6 .8 1.0FALSE ALARM RATE P("G"|U

Figure 1. Source characteristics. (Panel A shows the format used to present conditional probabilities inwithin-subjects design. Panel B shows hit rate and false-alarm rates of the eight sources used. Expertise ishit rate minus false-alarm rate. Bias is hit rate plus false-alarm rate. VL = very low; L = low; H = high;and VH = very high.)

They were told that the blue book value represented aclass of cars rather than any individual car. Judges weretold to assume that for the entire population of cars theywould be judging, there was an even chance that a carwould last 3 years.

Specific information for each car was represented bythe opinion of a source of known expertise and bias whoexamined the car. Instructions stated that each source wasgiven a test of ability to distinguish between cars thatactually lasted or failed. The test results were said to beindependent of base rate (blue book value); that is, theconditional probabilities that the source reports the car isin good shape, given it lasted or failed, are the same forall levels of blue book value.1

Each source was described in terms of hit rate, P("G"\L),and false-alarm rate, P("G"|£), and also described in termsof expertise and bias. There were four levels of expertise(very low, low, high, very high) and three levels of bias(friend of the buyer, independent, or friend of the seller).Instructions similar to those of Birnbaum and Stegner(1979) discussed expertise and bias.

For example, Figure 1 (panel A) shows how the testresults for a friend of the seller with low expertisewere presented. Note that of 100 cars that lasted, thissource correctly identified 80 cars as being in good shape;that is, P("G"|L) = .8. Of 100 cars that failed, thesource wrongly identified 40 to be in good shape; that is,P("G"|£) = .4.

The source's bias is reflected in the column totals (hitrate plus false-alarm rate). In the example, the friend ofthe seller said 80 cars were in bad shape and 120 carswere in good shape. The source's expertise is reflected inthe column differences (hit rate minus false-alarm rate).When the low-expertise source said "good shape," therewere 40 more correct than incorrect judgments (80-40).

Each point in Figure 1 (panel B) represents one of theeight sources in terms of false-alarm and hit rates. Forexample, the low-expertise friend of the seller is illustratedin panel B as a point with coordinates (.4, .8) on the false-

alarm and hit rate axes, respectively. The solid diagonallines in Figure 1 show that each source can be representedin terms of bias (hit + false alarm) and expertise (hit -false alarm). The low-expertise friend of the seller is atthe intersection of "sell" on the bias curve and "L" onthe expertise curve. The two squares of dashed lines showthat the eight sources form two 2 X 2 factorial designs ofHit Rate X False-Alarm Rate. The solid lines ,show thatsix of the sources form a 2 X 3 (Expertise X Bias) factorialdesign, in which expertise is either high or low and biasis either seller's friend, independent, or buyer's friend. Theother two sources are independent, providing four levelsof expertise for unbiased sources.2 Judges were instructed

1 Birnbaum (1983) noted that previous demonstrationsof the base-rate fallacy did not give information crucialto a proper Bayesian analysis. Unless the ratio of the source'shit rate to false-alarm rate is known to be independent ofthe base rate, one cannot simply multiply prior odds bya fixed likelihood ratio. Given reasonable assumptionsconcerning the signal-detection'behavior of the source, itcan be shown that Bayes' theorem implies a solution thathas previously been described as "neglect of base rate."

2 The concepts of expertise and bias, as defined by Birn-baum and Stegner (1979) are analogous to the conceptsof discriminability and bias used in signal-detection theory.When it assumed that the stimuli produce normal distri-butions of sensation with equal variance, the index ofdiscriminability, d', is defined as z[-P(HIT)] - z[P(FA)],where P(HIT) and P(FA) are the hit rate and false-alarmrates, respectively, and z is the inverse standard normaldistribution function. The labels in Figure 1 (panel B)treat expertise as hit rate minus false-alarm rate, whichwould be d' if the distributions were rectangular with equalvariance. For the purposes of fitting the model, a separateweight was estimated for each source.

796 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

Base Rate = .3and

Sourcesays

"GoodShape

His accuracy:

«/> 30

c 25&£ 20

». l5

« I0

f 5•3

Between - SubjectDesign

A •

Within-SubjectDesign

n = 65

J

= 65

i I I i i IT i i i i i i i i IT i r t

.1 .2 .3 .4 .5 6] .7 .8 .9 .1 .2 .3 .4 .5 .61.7 .8 .9NormativeResponse

NormativeResponse

Probability Used Car Will Last 3 Years

Figure 2. Comparison of between- and within-subjects response distributions for one problem. ("B" = badshape; "G" = good shape; £ = car fails to last; L = car lasts.)

to use a page with matrices as in Figure 1 (panel A) foreach of the eight sources as an aid in making their judg-ments.

Design and procedure. Eighty test trials were constructedfrom an 8 X 2 X 5 (Source X Source's Opinion X BaseRate) factorial design, in which the eight sources of Figure1 (panel B) gave opinions of either good shape or badshape, and the five levels of base rate were .1, .3, .5, .7,or ,9. In addition, the opinions of sources were also pre-sented alone (without explicit base-rate information) toproduce 16 trials—from an 8 X 2 (Source X Source'sOpinion) factorial design. On five additional trials thesubjects made their judgments given base rate alone.

Each test booklet contained 11 pages of instructions,12 representative warm-up trials, and 101 randomly or-dered test trials. Page orders were shuffled to provide dif-ferent trial orders for different subjects.

Subjects. The judges were 65 undergraduates at theUniversity of Illinois, who received extra credit in intro-ductory psychology and were tested in 2-hour sessions.

Results

Single-Judgment TaskFigure 2 compares frequency histograms of

responses to a single problem presented alone(on the left) and to the same problem embed-ded among many other problems in the within-subject design (on the right). The base rate inthis case was .3, the source's characteristicsare given by P("G"|L) = .8 and P("G"|L) =.2; the source said "good shape." Bayes' theo-rem (Equation 1) applied to the objectiveprobabilities implies P(L\"G") = .63.

The modal response for subjects who judgeda single problem is .80, which is the hit rate

of the source. The second most frequent re-sponse is .30, the base rate, and the third mostfrequent response is .24, the product of .30and .80. Fewer than 10% of the subjects re-sponded between .4 and .6. The results aresimilar to those of Kahneman and Tversky(1973), Bar-Hillel (1980), and others. There-fore, this inference problem appears similarto those that led to the notion of a base-ratefallacy.

In sharp contrast, subjects who received thesame problem embedded among many othersgive a very different distribution of responses.Most of the responses fall between .4 and .6,and only one person responded .8. The modeis .4, which is actually closer to the base ratethan to either the normative response or thesource's hit rate. Thus, in the one-judgmenttask, subjects appear to respond with one ofthe values given, whereas when given manyproblems, they appear to integrate the infor-mation. Fischhoff et al. (1979) also concludedthat the response distribution differs for be-tween- and within-subject designs.3

3 There are many possible interpretations of the differentresults for the within- versus between-subjects experimentsin Figure 2. Although the same subject population andgeneral procedures were used, the amount of practice, thevariation of variables, and the instructions were differentfor the two conditions (see also Fischhoff et al., 1979).

BAYESIAN INFERENCE AND BASE RATES 797

i.o

.9

.1.8

.2.7

.8o>.2 .4

1.3

I,*. 1

0

WJ^iT

B"'

.3 .5 .7 .9

Base Rate

o .9

. .«in8, .5

.§.4

.3 .5 .7 .9

Base Rate.3 .5 .7 .9Base Rate

Figure 3. Predictions of three theories as a function of base rate. (Separate curves represent predictions forvery high [VH] or very low [VL] expertise independents who gave reports of "good" ["G"] or "bad" ["B"],as in Figure 1, Dashed curves in all panels are for base-rate information alone. Panel A shows predictionsof Bayes' theorem applied to the numerical values. Panel B shows predictions of best-fit subjective Bayesianmodel. Curves in Panel C show predictions of scale-adjustment averaging model. Open squares show predictionwhen VH expertise independent said either "good" [top square] or "bad" and no base-rate informationwas given. Solid squares, solid circles, and open circles in Panel C show mean judgments.)

Within-Subject Results

Figure 3 shows predictions of three modelsand data (panel C) for a portion of the within-subject design (independent sources of veryhigh and very low expertise). Predictions ofBayes' theorem with objective probabilities areshown in panel A; predictions of a least squaresfit of a subjective Bayesian model (with sub-jective scale values instead of objective prob-abilities) are shown in panel B. The lines inpanel C show best-fit predictions of the scale-adjustment averaging model (Equation 5). Themodel fitting (predictions in panels B and C)is discussed in later sections.4

In each panel of Figure 3, predictions areplotted against the base rate with a separatecurve for independent sources with very lowor very high expertise who gave opinions ofgood or bad shape. Solid lines show the pre-dicted effects when both the base rate and thesource's report were provided. The dashed lineshows the predicted effect of base rate alone.The open squares show the predictions for avery high expertise independent source,P("G"\L) = .9, P("G"\L) = . 1, who said either"good" or "bad," in the absence of specificinformation concerning the base rate (althoughthe overall base rate was specified as .5).

The solid circles in panel C of Figure 3show mean judgments based on source's reportand the base rate. The open circles show mean

judgments given only base rate; the solidsquares show means for two judgments basedonly on the source's opinion (the very highexpertise source said either "good" or "bad").

The slope of the solid points in Figure 3(panel C) shows the effect of the base rate. Ifthere were no effect of base rate, the curveswould be horizontal. The vertical spread be-tween the curves shows the effect of thesource's report. Note that the vertical spreadbetween the two curves labeled VH (very highexpertise source) is much greater than the ver-tical spread between the two curves labeledVL (very low expertise source). The data in

4 A computer program was written to estimate weightsand scale values, to calculate predictions, and to computefit. The sum of squared deviations between predicted andobtained judgments can be expressed as a function of theparameter estimates as follows:

F(w, w*, wxb, s, s,, Sbo) = 2 (Pi ~ O!)1,

where F is a function of the estimated parameters, PI isthe predicted mean response, O, is the observed meanjudgment, and the summation is over all 101 cells in theexperimental design. The best-fit parameters are those thatminimize F. The value of F at the minimum is an indexof the badness of fit of the model. Predictions for eachobserved datum can be calculated by substituting best-fitparameter estimates into the model (e.g., Equation S). Thecomputer program reads in the O,, defines Pt, defines F,and initializes the parameter estimates; this program drivesthe subroutine STEPIT (Chandler, 1969), which varies theparameters in order to minimize F.

798 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

Figure 3 (panel C) show that judges are sen-sitive to the base rate, the source's opinion,and the expertise of the source. All of the maineffects and interactions in panel C are statis-tically significant by analysis of variance. Ofthe 65 subjects, 64 gave higher probability in-ferences for larger base rates. Averaged oversources and opinions, the effect of base rateis also large and statistically significant, F(4,256) = 625.28. (All main effects and inter-actions in Figures 4 and 5 are statistically sig-nificant, p<.Q5. Individual data were ex-amined, and it was found that the means inFigures 3, 4, and 5 are representative of in-dividual data.)

Fit of Subjective Bayesian Model

The subjective Bayesian model (Equation1) was fit to the 101 mean judgments by meansof a specially written computer program thatutilized Chandler's (1969) STEPIT subroutineto minimize the sum of squared data-modeldiscrepancies (see Footnote 4). The programestimated subjective probabilities for the fivelevels of base rate. For each of the eight sources,subjective hit rates and false-alarm rates werealso estimated. It was assumed that probabil-ities of complementary events sum to one. Anoverall prior was also;estimated, yielding 22parameters to be fit to 101 judgments.

The subjective Bayesian model (panel B ofFigure 3) gives a better approximation to thedata than the Bayesian model with objectiveprobabilities (panel A). Although it does a rea-sonably good job of describing the effect ofbase rate and source's opinions (compare panelB of Figure 3 with points in panel C), it makessome systematic errors.

Both versions of the Bayesian model implythat it should be possible to rescale monoton-ically the solid and dashed lines in panels Aand B (Figure 3) to parallelism (Equation 4).However, the judgments in panel C show thatthe effect of base-rate information is smallerwhen the source's report is provided. Theslopes of the curves connecting the solid pointsare less steep than the slope of dashed lineconnecting the open points. This crossover in-teraction is an ordinal violation of the Bayesianmodels and implies that deviations from theBayesian models cannot be explained by mis-perception or response bias.

In addition, notice that if the base rate is.9 (without a source's opinion), the meanprobability judgment is .87. However, whenthe very low expertise source gives a "good"report, the judged probability of the car lastingdrops to .74. According to the Bayesian model,inclusion of that source's report should raisethe probability judgment.

The Bayesian model also implies that theeffect of the source's report alone should bethe same as the effect of the source informationwhen accompanied by base-rate information.However, the effect of source information issmaller when presented with base rate thanwithout. To illustrate, note that if a very highexpertise source says the car is in good shape,the mean judgment is about .9 (upper solidsquare in Figure 3, panel C). However, whenthe judges are also told that the base rate forthis type of car is .5, the mean judgment isreduced to .7. According to the Bayesianmodel, these two probability judgments shouldbe identical. (Note that open squares fall di-rectly on the solid lines in panels A and B.)These aspects of the data can be explained bythe scale-adjustment averaging model (opensquares in panel C).

Fit of the Scale-Adjustment Averaging Model

The scale-adjustment averaging model(Equation 5) was also fit to the data by meansof a program that iteratively minimized thesum of squared data-model discrepancies (seeFootnote 4). This model also requires esti-mation of 22 parameters from the data. Thesum of squared discrepancies was less thanhalf that for the Bayesian model. The root-mean squared deviation was .022; the largestabsolute discrepancy was .06.

The 22 parameters of Equation 5 are asfollows: five values of s, (base-rate scale values),eight values of wxb (weights of the eightsources), six values of Sb0 (scale values of thetwo opinions by three levels of bias), two valuesof w* (weight of the base rate alone or withsource's report), and the prior scale value, 5.The weight of the initial impression (prior)was set to 1.00 with no loss of generality.Weight of information when it is not presentedwas set to 0.

The estimated weights and scale values forthe scale-adjustment averaging model are given

BAYESIAN INFERENCE AND BASE RATES 799

Table 1Estimated Parameters for Scale-AdjustmentAveraging Model

Source Seller

Source's bias

Independent Buyer

Weights of sources

Expertise „Very lowLowHighVery high

.111

.361

.082

.153

.582

.935

.129,408

Scale values of opinions

OpinionGood shapeBad shape

1.309-.166

1.359-.003

1.577.235

Note. Weight of initial impression (w) = 1.00; scale valueof initial impression (s) = .368; weight of base-rate in-formation (H>*) = 1.195; weight of base-rate informationalone (w* [alone]) = 2.903; scale values of base-rates (sr)are -.049, .218, .504, .755, and 1.013, for objective baserates of .1, .3, .5, .7, and .9, respectively (see Footnotes 4and 5).

in Table 1. Note that estimated weights ofsources depend mostly on the source's exper-tise, but biased sources have slightly less weightthan unbiased sources with the same value ofhit rate minus false-alarm rate. The fit of themodel was improved by allowing the weightof the base rate to increase (from 1.19 to 2.90)when only base-rate information was pre-sented.

The scale values for the sources' opinionsshow higher values for good reports than badreports, and scale values are higher for thebuyer (who is presumably biased to say carsare in bad shape) than for the seller (who hasthe opposite bias). Likewise, the scale value ofbad shape for the seller (who tends to say carsare in good shape), is lower than the scalevalues for the other two sources. The patternsof weights and scale values are similar to thoseof Birnbaum and Stegner (1979).5

Figures 4 and 5 show six two-way interac-tions among the four variables with data andpredictions of the scale-adjustment averagingmodel in the same panels to allow examinationof the model's success at describing other as-pects of the data, Figure 4 shows three effectsof source's expertise. The points representmean judgments (averaged over other factors),

and the dashed lines are the predictions of thescale-adjustment averaging model. In panel A,judgments based on source's opinion alone areshown with open points. Solid points are av-eraged over trials in which both source's opin-ion and base rate were given. The effect ofsource's opinion is predicted to be greater whenopinion is alone than with base rate, becauseit is then proportional to wxb/(w + wxt,), whichis greater than wxb/(w + wxb + w*).

In panel B of Figure 4, the divergent Ex-pertise X Bias interaction is predicted by thescale-adjustment averaging model (see Birn-baum & Stegner, 1979), because expertisemultiplies the effect of bias, which is incor-porated as an adjustment in the scale value.The greater the expertise of a 'source, thegreater the effect of that source's bias.

In panel C of Figure 4, judgments are plottedagainst expertise of independent sources, witha separate curve for each level of base rate.The vertical spread between the curves is pre-dicted to diminish from left to right becauseit is proportional to w*/(w + w* + wxb), whichdecreases as the expertise of the source, wxb,increases. This finding is consistent with pre-vious results (Birnbaum, 1976; Birnbaum &Stegner, 1979; Birnbaum, Wong, & Wong,1976).

The scale-adjustment averaging model(Equation 5) implies no interaction betweenbase rate and source's opinion. However, thereis a slight divergent interaction in panel A ofFigure 5, F(4, 256) = 11.16. The same inter-action is more pronounced for the very highexpertise source, shown in Figure 3 (panel C).This finding is analogous to previous resultsand may be accounted for by the configural-weight, scale-adjustment averaging model ofBirnbaum and Stegner (1979), in which thelower valued source of information receivesgreater weight. It seems likely that the formof this interaction can be affected by changing

5 In an averaging model the scale values often have greaterrange than the response scale. The most extreme responseis assumed to be a weighted average of the initial impres-sion, which is usually near the center of the scale; and themost extreme scale value, which must often be beyondthe end point of the response scale. Scale values shouldnot be compared with the response scale, as a judgmentfunction intervenes (Mellers & Birnbaum, 1982).

800 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

Mea

n P

roba

bilit

y Ju

dgm

ent

ro

bi

^

i*

o>

*-•!

a>

<o . A. Expertise x Opinion

^"Good-?' Shape"

/ — **"*

o o "~^««-_•^^ o ••"Bad •

**--. Shape"o

1 1 1 1

1 1

. B. Expertise x Bias .

_^ Buyer

. l*Sss:==="* Seller.

i i

. C. Expertise

i i

x Base Rate .

„.» 1 .5 -

1 1

VL VH VL VH

Expertise (hit rate -false alarm rate)

Figure 4. Effects of source's expertise. (Points ire mean judgments; curves are predictions of scale-adjustmentaveraging model. Panel A shows probability judgments as a function of the expertise of independent sourceswith a separate curve for each level of opinion; Open circles show judgments based on source only; innerdashed curves are predictions averaged over levels of base rate. Panel B shows judgments as a function ,ofsource's expertise with a separate curve for levels of source's bias. Panel C shows judgments as a functionof expertise of independent sources with a separate curve for each level of base rate. VL = very low; L =low; H = high; VH = very high.)

the judge's point of view (Birnbaum & Stegner,1979).

Panels B and C of Figure 5 show the in-teraction between base rate and bias and be-tween bias and opinion. In both cases, themodel does a fairly good job reproducing thedata. Because independents have higher weightthan biased sources of the same expertise (Ta-ble 1), the effect of an independent's opinion(vertical separation between curves in Figure5, panel C) is greater than that of either biasedsource. Furthermore, the weights explain whythe effect of base rate (slope in Figure 5, panelB) is less when the source is independent thanit is when sources are biased. In summary,despite some systematic deviations (panel A),the scale-adjustment averaging model generallygives a good fit to major features of the data.

Discussion

Base-Rate Fallacy

The term base-rate fallacy was coined todescribe the supposed tendency to neglectbase-rate information in favor of individuatinginformation. Many of the studies that claimedto find evidence of a base-rate fallacy presentedonly one problem to a given subject. Our con-dition with one judgment yields similar datato those of Tversky and Kahneman (1980,

1982), Bar-Hillel (1980), and others. However,the present results show that when judges arepresented with many problems, they utilizethe base rate and are sensitive to expertise andbias of the source. Fischhoffet al. (1979) alsofound an effect of base-rate information inwithin-subjects designs.

It is difficult to interpret the effect or lackof effect of variables that have been manip-ulated between subjects. The problem is thatwhen different subjects experience differentstimulus contexts, responses cannot be com-pared without taking the different contexts intoaccount. For example, by comparing judg-ments between subjects, it has been found thatthe number 450 can be judged greater thanthe number 550; however, one need not con-clude that 450 actually seems greater than 550,because when judgments are compared withinsubjects, one finds that 550 is indeed judgedgreater than 450 (Birnbaum, 1974b, Figure2). For a more complete discussion of between-and within-subjects designs, see Birnbaum(1982) and Mellers and Birnbaum (1982).

The case for a base-rate fallacy is also weak-ened by a theoretical analysis of the problemsused in previous research. Birnbaum (1983)noted that several of the problems (cab prob-lem, lawyer vs. engineer, light bulb) have acomplex Bayesian solution in which behaviorthat has been described as "neglect" of base

BAYESIAN INFERENCE AND BASE RATES 801

I 9

I '.7™O a

,D

•JT -5

•8 -4o •£5 .3

£ .2

§ •'r

- A. Base Rate-

_ ^

X- /• ^x

s S- s s- /_ X

i 1 1

». Opinion."Good

XShape*X

X M j

xx Shope"x^

i i

- B, Base Rate n. Bias

.Buyer"jilndep-

^f Sellervf

XJW

^1 1 1 1 1

• C. Bias x Opinion•

_. — •*• '"Good• r Shape"-i- _, -«"Bad-

« — *—~~ Shape"-

.

.

1 1 1

.1 .3 .5 .7 .9 .1 .3 .5 .7

Base Rate

Seller Indep Buyer

Bias (hit + false alarm)Figure 5. Effects of base rate, source's opinion, and source's bias. (Curves are predictions of scale-adjustmentaveraging model; points are data. Panel A shows mean judgments as a function of base rate with a separatecurve for each level of source's opinion, averaged over the eight sources. Panel B shows mean judgments asa function of base rate with a separate curve for each level of bias of the sources, averaged over source'sopinion and two levels of expertise. Panel C shows mean judgments as a function of source's bias, with aseparate curve for each level of source's opinion, averaged over base rate and source's expertise.)

rate may actually be consistent with a rationalBayesian utilization of base rate.

In sum, despite the fact that the result ofKahneman and Tversky, Bar-Hillel, and otherscan be replicated in the single-judgment task,there are three reasons for a skeptic to doubtthe claim that humans neglect base rate. Thethree reasons can be termed empirical, meth-odological, and theoretical. First, the empiricalevidence concerning the base-rate fallacy isinconsistent. The major result does not appearin within-subjects designs or with certain ver-sions of instructions (see Bar-Hillel, 1980;Birnbaum, 1983; Tversky & Kahneman,1982). Second, the methodological problem isthat comparison of numerical judgments be-tween groups of subjects assumes that themapping from subjective values to overt re-sponses is the same for all subjects, an as-sumption that has been challenged by researchon contextual effects (Birnbaum, 1974b; Mell-ers & Birnbaum, 1982). Third, there is a theo-retical problem with much of the previous re-search. The problems posed to the subject havebeen vague enough to allow several rationalsolutions, depending on how the subject in-terprets the problem. Birnbaum (1983) hasshown that the Bayesian solution to the cabproblem used in previous studies requires theassumption that the hit to false-alarm ratiowill be independent of base rate. However, re-search in signal detection shows that this ratio

can increase as base rate decreases; therefore,Bayes' theorem does not necessarily imply aneffect of base rate for problems such as thecab problem.

Bayes' Theorem and Conservatism

Bayes' theorem gives a better first approx-imation to the data than would the theory thatsubjects disregard base rate and source char-acteristics. For example, if subjects ignoredbase rate, the data curves in Figures 3 (panelC) and 5 (panels A and B) would be horizontal.If subjects ignored source characteristics, thecurves marked VH and VL would coincide inFigure 3 (panel C), curves in Figure 4 wouldbe horizontal, and the curves in Figures 4(panel B) and 5 (panel B) would coincide.Bayes' theorem correctly predicts the directionof the major effects of base rate and sourcecharacteristics.

The objective form of Bayes' theorem (Fig-ure 3, panel A) predicts judgments that aretoo extreme. The present data show a patternof deviations between objective Bayesian pre-dictions and judgments that might be de-scribed at first glance as "conservatism," as inEdwards (1968). For example, when the baserate is .9 and the very high expertise sourcesays "good shape," the Bayesian prediction is.99, whereas the mean judgment is only .92(Figure 3, panel C). The subjective form of

802 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

Bayes' theorem (Figure 3, panel B) gives abetter second approximation to the data. Byallowing the subjective impact of evidence tobe less than the objective probabilities dictate,the subjective form of Bayes' theorem providesa better fit to the judgments.

However, even the subjective form of Bayes'theorem does not provide a satisfactory ac-count of all of the data. The misperceptionand response bias interpretations of conser-vatism will not suffice to explain how increas-ing the evidence can decrease the response,and vice versa, (as in Figure 3, panel C). Forexample, when the very high expertise inde-pendent source says the car is in good shape,the Bayesian prediction is .9 and the meanresponse is .87. When subjects are also in-formed that the base rate is .7, Bayes' theoremimplies an increase in the probability the carwill last (to .96). However, the mean responsewas only .81, a decrease in probability. Thisexample and others indicate that subjects arenot simply being conservative: They can ac-tually do the opposite of that implied by eitherform of Bayes' theorem.

The scale-adjustment averaging model canexplain the above results as well as the overallconservatism. The model implies that addi-tional faint praise can actually hurt the overallevaluation if the original evidence was alreadystrong. Thus, humans appear to aggregate ev-idence by judging the average strength of ev-idence rather than by using Bayesian algebrain which a lot of weak evidence can imply astrong conclusion. The results in Figure 3(panel C) are directly analogous to findings inintuitive regression (Birnbaum, 1976).

Conceptually, the scale adjustment aver-aging model can be interpreted as an attemptby the subject to strike a balance among thevalues of the evidence (Birnbaum & Stegner,1979). The value of the evidence depends onthe source's report and is adjusted accordingto the source's bias. This value is then mul-tiplied by the source's expertise (or discrim-inability). These premises explain how exper-tise amplifies bias in Figure 4 (panel B). Byinterpreting the response as a balance betweenthe source's report and the base rate, the modelexplains how weak but favorable evidence candecrease the overall response (Figure 3, panelC), and it explains how increasing source's

expertise simultaneously amplifies the effectof source's opinion (Figure 4, panel A) anddecreases the effect of base rate (Figure 4,panel C).

Research in Source CredibilityThe present results fit in well with related

research on how judges combine informationfrom sources that vary in credibility (see Table2). Birnbaum and Stegner (1979) found thatthe scale-adjustment averaging model gave agood description of estimates of the value ofused cars based on estimates provided bysources. It is interesting that the same modeldescribes both estimates of value and proba-bility inferences. This finding appears consis-tent with Shanteau (1970) who concluded thatjudges use the same model for probability es-timates and inferences based on samples drawnfrom an urn.

The scale-adjustment averaging model canbe viewed as an extention of earlier averagingmodels of information integration (N. An-derson, 1971; Rosenbaum & Levin, 1968,1969; Shanteau, 1975). It is important to notethat there are a variety of different averagingmodels and there has been considerable theo-retical discussion of how to represent devia-tions from early forms of the averaging model(T. Anderson & Birnbaum, 1976; Birnbaum,1974a; Birnbaum & Stegner, 1979; Riskey &Birnbaum, 1974). At present, a configural-weighted, scale-adjustment averaging modelappears to provide a consistent account of awide variety of data (Birnbaum, 1982).

There now exists an array of consistent re-sults linked by analogy, as in Table 2. Thesame model has given a good account of datain the following tasks: intuitive numerical pre-dictions, based on independent numerical cuesof known correlation with the criterion (Birn-baum, 1976); likableness of hypothetical per-sons described by adjectives provided by ac-quaintances of varied length of acquaintancewith the target (Birnbaum et al., 1976); judg-ments of value of used cars based on bluebook value and the opinions of sources whoexamined the cars (Birnbaum et al., 1976;Birnbaum & Stegner, 1979); predictions of IQsof adopted children based on IQs of biologicaland adoptive parents and environmental so-cioeconomic status (Birnbaum & Stegner,

BAYESIAN INFERENCE AND BASE RATES 803

Table 2Analogies Among Variables in Source Credibility Research

Judgment Source expertise (w)

Numerical predictions'

Likableness of hypothetical peopleb

Value of used cars'"

Predicted IQd

Predicted exam performance'

Probability inferences: P(L\"G")f

Numerical cues

Adjectives

Source's estimates; Blue book value

IQs of biological and adoptiveparents; SES of environment

IQ test; Study time

Sources report, "G"; Base rates,P(L)

Cue-criterion correlations

Length of acquaintance

Mechanical knowledge

Individual differences in importanceof heredity vs. environment

Reliability of IQ and study timemeasures

Hit and false-alarm rates of sources;P("G"\L) and P("G"|£)

Note. Each row represents a different experimental situation.' Birnbaum (1976). b Birnbaum, Wong, & Wong (1976).c Birnbaum & Stegner (1979). " Birnbaum & Stegner (1981).'Surber (1981). 'Present experiment: For inferences, P(L) = probability the car will last; "G" is the report of thesource ("good shape").

1981); and predictions of performance basedon unreliable or reliable measures of IQ andstudy time (Surber, 1981).

In these studies, weight is used to representcue-criterion correlation, source's length ofacquaintance with the target person, source'smechanical expertise, perceived importance ofheredity or environmental information, andreliability of information (length of test) of IQtest or study time sample, respectively. In thepresent case of probability inferences, weightis mostly a function of source's hit rate minusfalse-alarm rate. Thus, weight is based on asource's perceived ability to discriminate thetrue states of nature.

In this set of analogies, the scale value ofinformation depends on the source's message:the value of the cue, the likableness of theadjective, the source's estimate, the value ofparents' IQs, and the value of IQ test or studytime, respectively. In the present study, scalevalue depends on the source's opinion (goodor bad shape).

In the scale-adjustment averaging model, thesource's bias also affects the scale value. Inboth the present study and that of Birnbaumand Stegner (1979), it was found that the effectof bias on the judgments was greater for sourcesof higher expertise. Thus, when a source isbelieved to be biased, judges appear to adjustthe value of the source's message before com-bining it with other information.

Conclusions

In sum, subjects utilize base-rate infor-mation in the within-subject design but givedata comparable to previous findings whenthey are given only a single problem. Althoughsubjects utilize the base rate, neither the.ob-jective nor subjective versions of Bayes' theo-rem gives a. satisfactory account of the judg-ments. Instead, a scale-adjustment averagingmodel gives a reasonably accurate account ofthe data. In this model, base-rate informationis averaged with information from othersources. The scale value of the source's opiniondepends on the report and the bias of thesource, whereas the weight of the source de-pends primarily on the source's expertise.

ReferencesAnderson, N. H. Integration theory and attitude change.

Psychological Review, 1971, 78, 171-206.Anderson, T., & Birnbaum, M. H. Test of an additive

model of social inference. Journal of Personality andSocial Psychology. 1976, 33, 655-662.

Ajzen, I. Intuitive theories of events and the effects of baserate information on prediction. Journal of Personalityand Social Psychology, 1977, 35, 303-314.

Bar-Hillel, M. The base-rate fallacy in probability judg-ments. Acta Psychologica, 1980, 44, 211-233.

Birnbaum, M. H. The nonadditivity of personalityimpressions. Journal of Experimental Psychology, 1974,702, 543-561. (Monograph) (a)

Birnbaum, M. H. Using contextual effects to derive psy-chophysical scales. Perception & Psychophysics, 1974,15, 89-96. (b)

804 MICHAEL H. BIRNBAUM AND BARBARA A. MELLERS

Birnbaum, M. H. Intuitive numerical prediction. AmericanJournal of Psychology, 1976, 89, 417-429.

Birnbaum, M. H. Controversies in psychological mea-surement. In B. Wegener (Ed.), Social attitudes andpsychophysical judgment. Hillsdale, N.J.: Erlbaum,1982.

Birnbaum, M. H. Base rates in Bayesian inference: Signaldetection analysis of the cab problem. American Journalof Psychology, 1983, 96, 85-94.

Birnbaum, M. H., & Stegner, S. E. Source credibility insocial judgment: Bias, expertise and the judge's pointof view. Journal of Personality and Social Psychology,1979, 37, 48-74.

Birnbaum, M. H., & Stegner, S. E. Measuring the im-portance of cues in judgment for individuals: Subjectivetheories of IQ as a function of heredity and environment.Journal of Experimental Social Psychology, 1981, 17,159-182.

Birnbaum, M. H., Wong, R., & Wong, L. Combining in-formation from sources who vary in credibility. Memory& Cognition, 1976, 4. 330-336.

Carroll, J. S., & Siegler, R. S. Strategies for the use of baserate information. Organizational Behavior and HumanPerformance, 1977, 19, 392-402.

Chandler, J. P. STEPIT: Finds local minima of a smoothfunction of several parameters. Behavioral Science, 1969,14, 81-82.

Edwards, W. Conservatism in human information pro-cessing. In B. Kleinmuntz (Ed.), Formal representationof human judgment. New York: Wiley, 1968.

Fischhoff, B., Slovic, P., & Lichtenstein, S. Subjective sen-sitivity analysis. Organizational Behavior and HumanPerformance, 1979, 23, 339-359.

Kahneman, D., & Tversky, A. On the psychology of pre-diction. Psychological Review, 1973, 80, 237-251.

Lyon, D., & Slovic, P. Dominance of accuracy informationand neglect of base rates in probability estimation. AdaPsychologica, 1976, 40, 287-298.

Manis, M., Dovalina, I., Avis, N., & Cardoze, S. Base ratescan affect individual predictions. Journal of Personalityand Social Psychology, 1980, 38, 231-248.

Mellers, B. A., & Birnbaum, M. H. Loci of contextualeffects in judgment. Journal of Experimental Psychology:Human Perception and Performance, 1982,5, 582-601.

Peterson, C. R., & Beach, L. R. Man as an intuitive statis-tician. Psychological Bulletin, 1967, 68, 29-46.

Pitz, G. F. Bayes' theorem: Can a theory of judgment andinference do without it? In F. Restle, R. M. Shiflrin,

N. J. Castellan, H. R. Lindman, & D. B. Pisoni (Eds.),Cognitive theory (Vol. 1). Hillsdale, N.J.: Erlbaum, 1975.

Riskey, D. R., & Birnbaum, M. H. Compensatory effectsin moral judgment: Two rights don't make up for awrong. Journal of Experimental Psychology, 1974,103,171-173.

Rosenbaum, M. E., & Levin, I. P. Impression formationas a function of source credibility and order of presen-tation of contradictory information. Journal of Person-ality and Social Psychology, 1968, 10, 167-174.

Rosenbaum, M. E., & Levin, I. P. Impression formationas a function of source credibility and the polarity ofinformation. Journal of Personality and Social Psy-chology, 1969, 12, 34-37.

Schum, D. A. Sorting out the effects of witness sensitivityand response-criterion placement upon the inferentialvalue of testimonial evidence. Organizational Behaviorand Human Performance, 1981,27, 153-196.

Shanteau, J. An additive model for sequential decisionmaking. Journal of Experimental Psychology, 1970,85,181-191.

Shanteau, J. Averaging versus multiplying combinationrules of inference judgment. Acta Psychologica, 1975,39, 83-89.

Slovic, P., & Lichtenstein, S. Comparison of Bayesian andregression approaches to the study of information pro-cessing in judgment. Organizational Behavior and Hu-man Performance, 1971, 6, 649-744.

Surber, C. F. Effects of information reliability in predictingtask performance using ability and effort. Journal ofPersonality and Social Psychology, 1981, 40, 977-989.

Troutman, C. M., & Shanteau, J. Inferences based onnondiagnostic information. Organizational Behaviorand Human Performance, 1977, 19, 43-55.

Tversky, A., & Kahneman, D. Causal schemas in judgmentsunder uncertainty. In M. Fishbein (Ed.), Progress insocial psychology (Vol. 1). Hillsdale, N.J.: Erlbaum, 1980.

Tversky, A., & Kahneman, D. Evidential impact of baserates. In D. Kahneman, P. Slovic, & A. Tversky (Eds.),Judgment under uncertainty: Heuristics and biases.Cambridge University Press, 1982.

Wallsten, T. Conjoint-measurement framework for thestudy of probabilistic information processing. Psycho-logical Review, 1972, 79, 245-260.

Received February 10, 1982Revision received February 24, 1983 •