
Journal of Behavioral Decision Making, Vol. 2, 221-238 (1989)

Psychological Conceptions of Randomness

PETER AYTON City of London Polytechnic, U.K.

ANNE J. HUNT Tulane University, New Orleans, U.S.A.

and

GEORGE WRIGHT Bristol Business School, U.K.

ABSTRACT

This article presents a critique of the concept of randomness as it occurs in the psychological literature. The first section of our article outlines the significance of a concept of randomness to the process of induction; we need to distinguish random and non-random events in order to perceive lawful regularities and formulate theories concerning events in the world. Next we evaluate the psychological research that has suggested that human concepts of randomness are not normative. We argue that, because the tasks set to experimental subjects are logically problematic, observed biases may be an artifact of the experimental situation and that even if such biases do generalise they may not have pejorative implications for induction in the real world. Thirdly we investigate the statistical methodology utilised in tests for randomness and find it riddled with paradox. In a fourth section we find various branches of scientific endeavour that are stymied by the problems posed by randomness. Finally we briefly mention the social significance of randomness and conclude by arguing that such a fundamental concept merits and requires more serious consideration.

KEY WORDS Randomness Subjective probability Induction Rationality Biases

INTRODUCTION

Uncertainty is a ubiquitous aspect of life and we all have to cope with it somehow. This possibly explains why investigations into human conceptions of chance are often credited with greater importance than can be attributed to their purely intrinsic worth; presumably such conceptions are fundamental to some crucial cognitive faculties - particularly that of reasoning under conditions of uncertainty. So, when the claim is made (and it frequently is) that people demonstrably suffer from all sorts of misconceptions regarding the nature of chance we should pay close attention. A plausible and commonly drawn inference is not merely that people are poor at tasks contrived by psychologists but, if the research is at all relevant to the world outside the laboratory, that human thought and behaviour are sub-optimal or irrational when compared with what they ought, logically, to be.

0894-3257/89/040221-18$09.00 © 1989 by John Wiley & Sons, Ltd.

Received 27 July 1987 Revised 16 January 1989


Research exploring human conceptions of randomness can be construed as following such a pattern. Psychological experiments (reviewed by Tune, 1964a, 1964b, and Wagenaar, 1972) suggest that, in general, people are not capable of generating responses that pass tests of randomness. This has been attributed, in part, to a faulty concept of randomness because of evidence suggesting that subjects cannot correctly distinguish between random and non-random patterns of data (Falk, 1981; Green, 1982; Teigen, 1984; Wagenaar, 1970). Nonetheless some researchers have explored the scope for explaining some inadequacies of human random response generation as a function of limitations in performance rather than competence (e.g. Baddeley, 1966; Wiegersma, 1982). A recent paper by Lopes (1982) has pointed to the potential significance of the role of a concept of randomness in inductive reasoning. The ability to notice contingencies in our environment depends on our ability to detect non-randomness or pattern against a background of randomness or noise. It follows, then, that a flawed concept of randomness will lead to less than perfect performance in inductive reasoning. However, Lopes suggests a number of reasons why we should doubt that the story, even thus far, is so simple.

One reason derives from a consideration of the mysterious and somewhat paradoxical nature of the process of induction. Induction is often discussed as if it comprised two stages. Firstly, theories are created and secondly, via observation of data, these theories are tested and thereby justified or rejected. More than two hundred years ago though, Hume (1739/1969) pointed out that to generalise from particular observations to general laws cannot be logically justified. He showed that there can be no valid arguments that permit us to establish ‘that those instances, of which we have had no experience, resemble those, of which we have had experience’. So, no matter how strongly the evidence supports our current beliefs, the possibility always exists that they are incorrect. The choice of theory and, more fundamentally, the framework for categorising events - which is a vital prerequisite for noticing anything but chaos in the first place - are always arbitrary. Consequently the possibility of error is inherent in the process; it is always possible that some new evidence will prove any existing theory wrong. In fact, according to Popper, this is a necessary defining characteristic of a scientific theory - that there are conceivable circumstances which would show the theory to be false. Lopes argues that this property of induction makes it difficult to evaluate whether the process is being performed rationally.

It would seem quite significant therefore that although there is a quite substantial body of psychological research suggesting that performance on the second of these two stages is suboptimal or irrational (e.g. Wason's (1966) reasoning experiments with the four card problem) there is relatively little to read about the standard of performance on the first of these two stages. For if induction (or this part of it) cannot be justified logically, then what criteria can be used to assess whether it is being performed rationally? In this context we will, later in this article, suggest that a major problem with the research purporting to evaluate performance on the second, hypothesis testing, stage of the inductive process is that it assumes that it is valid to treat this stage as an isolatable independent process. One possible explanation for the apparently poor performance observed with such tasks is that it is normally a product of interaction with the operation of the first stage - which is often not fully considered or controlled in experimental investigations.

Grounds for doubting that the competence of inductive reasoning is adversely influenced by inappropriate notions of randomness stem from a consideration of the process in action. One quite consistently reported finding from the psychological research is that subjects asked to produce random sequences do not generate enough long runs (i.e. fewer than a statistically representative number) and, similarly, when asked to select random sequences avoid those with long runs. This bias against repetition has been termed negative recency. In a simulation Lopes showed that such a decrement need not lead to a severely impaired performance as measured by hit rate on a task involving discrimination of randomly and non-randomly generated event sequences. Further, anyone operating with such a bias would actually do better in detecting types of non-randomness that are biased towards repetition than someone whose suspicions of the presence of a non-random process were aroused more symmetrically; that is, if the non-representativeness of runs per se, irrespective of whether there were too few or too many, were a cue. As she comments: '. . . in order to evaluate fairly whether people's false expectations about alternation are helpful or harmful over a lifetime's opportunities for induction, one would have to know whether in the world non-random events are more often biased towards alternation or towards repetition - which raises tantalising, but probably unanswerable, questions concerning the natural ecology of non-randomness' (pp. 633-634).
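The asymmetry Lopes describes is easy to exhibit in a small simulation. The sketch below is our own minimal illustration, not her actual procedure; the sequence length of 20, the run-length threshold of 5 and the repetition probabilities are arbitrary choices made for the example. A detector that cries 'non-random!' only when it sees a long run catches repetition-biased processes but is almost blind to alternation-biased ones.

```python
import random

def make_seq(n, p_repeat):
    """Binary sequence in which each symbol repeats the previous one
    with probability p_repeat (0.5 gives a fair Bernoulli process)."""
    seq = [random.randint(0, 1)]
    for _ in range(n - 1):
        seq.append(seq[-1] if random.random() < p_repeat else 1 - seq[-1])
    return seq

def longest_run(seq):
    """Length of the longest run of identical symbols."""
    best = cur = 1
    for a, b in zip(seq, seq[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

def flag_rate(p_repeat, threshold=5, n=20, trials=20_000):
    """Proportion of sequences flagged 'non-random' because they
    contain a run at least `threshold` long."""
    return sum(longest_run(make_seq(n, p_repeat)) >= threshold
               for _ in range(trials)) / trials

random.seed(1)
print("fair process flagged:       ", flag_rate(0.5))  # false-alarm rate
print("repetition-biased flagged:  ", flag_rate(0.7))  # detected fairly often
print("alternation-biased flagged: ", flag_rate(0.3))  # almost never noticed
```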

Thus, it would seem possible that people's apparently biased concepts may perhaps be more fairly described as being 'tuned' to capitalise on properties of our environment. So, from an ecological viewpoint, perhaps repetition of outcomes is actually correctly considered to be more likely than alternation in non-random sequences. Or, it may be that the utilities associated with non-random events are structured so that it is more cost-effective to notice those non-random processes biased towards repetition at the expense of missing some of those non-random processes that are biased towards alternation. Certainly it makes sense to expect that the information-processing characteristics of a fairly successfully adapted species should reflect something of the nature of the environment that the species is adapted to. The alternative view is that the conceptual biases affecting information processing are not reflective of the environment and thus instrumentally degrade performance. Nevertheless, some sort of conceptual orientation, which can never be logically justified, is necessary before induction can proceed. Labelling this as a bias without reference to the actual, rather than the potential, consequences of adopting such an orientation can be viewed as shortsighted.

It is apparent though that, although distinguishing pattern from randomness is essential for induction, there are, a priori, no particular rational criteria by which this act may be achieved. Therefore any conception of randomness, or of pattern (i.e. nonrandomness), that can be inferred from any individual’s attempts at induction is no more or no less rational than any other possible conception.

To appreciate this somewhat paradoxical state of affairs it may help to consider Popper's (1972) description of the problem. In considering how knowledge of regularities in the world is obtained, Popper argues that we cannot start with pure observation: 'Clearly the instruction, "Observe!" is absurd . . . Observation is always selective. It needs a chosen object, a definite task, an interest, a point of view, a problem. And its description presupposes a descriptive language, with property words; it presupposes similarity and classification, which in its turn presupposes interests, points of view, and problems' (ibid, p. 46).

Psychologists are of course quite familiar with the notion that perception of the world requires perceptual organisation, sometimes referred to as top-down or concept-driven processes. For Popper such an assumptive system constitutes a theoretical position with regard to the data. He also notes that the ability to learn from experience requires the recognition of a set of circumstances as being similar in some relevant respect to some previously encountered situation. Such recognition also implies the existence of a theoretical framework. To illustrate his argument Popper imagines building an induction machine. Anticipating a strategy now commonly used in artificial intelligence research, he suggests that it would be quite possible to build such a machine to operate in a simplified 'world'. Yet such a machine could not function without some inbuilt assumptions to guide it:

'In constructing an induction machine we, the architects of the machine, must decide a priori what constitutes its 'world'; what things are to be taken as similar or equal; and what kind of 'laws' we wish the machine to be able to 'discover' in its 'world'. In other words we must build into the machine a framework determining what is relevant or interesting in its world: the machine will have its 'inborn' selection principles. The problems of similarity will have been solved for it by its makers who thus have interpreted the 'world' for the machine.' (ibid, p. 48)

Before any observation can occur, then, a particular theoretical orientation must be adopted. Since, of necessity, no data can be observed before some theory is offered, no theory can a priori be judged more appropriate than any other.


We have taken some trouble to spell out this argument because, we believe, it provides a valuable perspective with which to view the psychological research on subjective randomness. In much of the work in this area psychologists have acted as if valid objective criteria are available with which to judge the adequacy of subjective conceptions of randomness and nonrandomness. The preceding discussion indicates that these criteria are, strictly speaking, no more justifiable than any which the subject may use.

The idea that the criteria used for assessing the rationality of human reasoning are of questionable validity has recently been the subject of some controversy (e.g. Beach, Christensen-Szalanski and Barnes, 1987; Cohen 1981, 1983). In the case of pure induction, where a naive individual has no idea what to be on the lookout for or theorise about, the only recourse is some arbitrary scheme; otherwise the task cannot be attempted. The aims of induction are, in a sense, quite opposite to those of tasks which require an individual to judge the randomness of some sequence or configuration. In the latter case the idea, presumably, is that the sequence or configuration should have no features or characteristics. In both cases however the same constraint applies: one cannot determine whether a sequence or configuration contains or does not contain patterns until the definition of a pattern has been set. Yet, in many published investigations, psychologists have set their subjects the task of generating or recognising random sequences without explicitly defining what sort of sequence would count as patterned, and therefore nonrandom, and then do not demur from passing judgement on the adequacy of the performance of the task.

As we shall see, though, there is some evidence that some kind of definition of pattern is often implicitly offered to subjects in these experiments and that, in any case, implicit definitions of pattern and randomness are inherent in the methods used to evaluate their responses.

THE PSYCHOLOGICAL RESEARCH

Experimental investigations of subjective randomness have attended to a number of different aspects and have utilised a range of tasks to examine a variety of hypotheses. However, it is nonetheless possible to identify some common assumptions and conclusions in this work. The most general reported finding is that human subjects are not good at either generating or recognising random patterns or sequences. Specifically, as we have already mentioned, subjects do not consider sequences with a representative distribution of runs as being random; they prefer to nominate those with more alternations (fewer long runs) than would be expected by chance.

Our criticisms of this research derive from a consideration of randomness as a mathematically, as well as a psychologically, problematic concept.

Firstly, we note that some of the observed bias may be attributable to an instructional bias delivered by the experimenter, which in turn seems to stem from misconceptions held by the experimenter. In many of the reports where the instructions are recorded we can identify logically redundant phrases which would seem, potentially, to bias subjects in their production or recognition strategies. For example, Baddeley (1966), after specifying an adequate instructional set ('. . . imagine they were drawing letters from a hat one at a time, calling them out, and replacing them . . .'), goes on to state 'it was further pointed out that such a sequence would be completely jumbled and would not therefore be likely to comprise English words or alphabetic sequences such as ABC or XYZ' (p. 119).

The second quote would appear to act as a signal to the subject, warning that they should not produce sequences which had any identifiable patterns. However, it is not the case that random generators do not produce identifiable patterns; occasionally, in fact inevitably, truly random generators would produce output which resembles that of a systematic, i.e. non-random, device. By telling their subjects that they should attempt to produce sequences which appear to be jumbled or orderless the experimenters are in fact dictating explicitly that certain outputs are not acceptable and thereby the subjects are being required to simulate a particular type of non-random process.


This kind of instructional bias can be seen in various reports. For example Wagenaar (1971) records: ‘it was checked carefully whether they understood that any systematic trend would make the sequence too predictable’.

Cook (1967) explicitly asks subjects to decide which strings are more patterned and treats the responses as if they are judgements of non-randomness.

Teigen (1984), requiring subjects to place imaginary stones in a random pattern, asked them to imagine that they were attempting to 'spoil the tracks' that might be indicated by a 'man-made' pattern.

It hardly seems fair to the subjects to evaluate their efforts according to one definition when they are being asked, albeit implicitly, to perform according to another. Experimenters may conclude that the subjects are operating inadequately within a given situation but it seems possible that the subjects may not be actually operating under the same set of assumptions that the experimenter believes he has defined for the subjects. This potential instructional bias seems to us symptomatic of a deeper tacit uneasiness among investigators of the lay concept of randomness. Running somewhat hauntingly through the published reports on the psychology of randomness is the disconcerting notion that it may not be reasonable to judge the competence of experimental subjects when, strictly speaking, the task they are set requires them to do what cannot, logically, be justified. Any doubts concerning the purely logical status of the concept, and consequently the validity of the task demands experimenters place on their experimental subjects are, it would seem, well founded. In introducing a section in her article which describes how philosophers and mathematicians have attempted to define randomness Lopes comments:

‘To conclude . . . that naive people’s conceptions of randomness are poor in general implies that randomness is clearly defined and well understood by those who are not naive. Nothing could be further from the truth’. (ibid p. 628).

More recent research has acknowledged that empirical investigation of subjective concepts of randomness is inescapably problematic. Diener and Thomson’s (1985) task required subjects to determine whether sequences of Hs and Ts presented on a video monitor were produced by ‘tossing a fair coin’. They comment that:

‘Although it might be argued that these instructions placed the subjects in a logically untenable position, none of the subjects seemed to regard the task as unreasonable’. (ibid p. 446)

The results they obtained from this experiment are interesting but, nonetheless, one may question whether their findings would generalise to all instructional sets or inductive frameworks adopted by subjects.

Further argument in this vein is given by Macdonald (1986) who has provided a plausible account of some of the ways in which subjects may perceive the meaning of natural language requests by invoking assumptions about the motivation for the request. He argues that experimental subjects will presume that there is some underlying reason for the request and that this may guide the way they conceptualise it and respond to it. Macdonald shows how tasks involving reasoning with likelihoods are potentially open to critical reconstrual; specifically the sample space may be undefined or ambiguous. We have argued here that tasks which require subjects to make judgements of randomness, such as Diener and Thomson's, risk obtaining results which merely reflect the implicit assumptions adopted by the subject in order to perform at all. So in this case subjects may make crucial assumptions about the proportion of sequences which come from 'a fair coin' and the nature of the alternative method(s) (if any) that might have been used to generate the sequences. The fact that each of the presented sequences was twenty events long may also induce certain conceptions about the meaning of the task.

Recently Lopes and Oden (1987) have found that variations in the instructional set given to subjects can produce variation in performance of a task which required subjects to decide whether strings of noughts and ones were produced by a random machine or a non-random machine. Their results emphasised the significance of possessing information about the nature of the alternative hypothesis when making judgments of randomness. Subjects not told anything about the characteristics of the non-random process were worse at the task than informed subjects. For eight (though not all) of their subjects the mere mention that the non-random machine tended to alternate characters too often was sufficient to undo their strong preconceptions concerning repetition and non-randomness and to produce performance near theoretically optimal levels.

There is some empirical evidence which suggests that subjective concepts of chance and randomness are particularly sensitive to subtle contextual cues and assumptions that may be conveyed by the experimental situation. An experimental demonstration of the influence of subjects’ tacit assumptions about the task was provided by Winefield (1966), who examined the importance of convincing subjects required to predict the output of a random mechanism that they really were operating with a randomly generated situation. He showed that, in a card guessing task, the usual negative recency disappeared when subjects could see that the sampled card was being replaced in the pack which was then shuffled - so ensuring the subject’s belief in the independence of each draw from the pack.

When attempting to anticipate a purportedly random system any suspicion that it is really systematic might well be very influential in affecting the subject’s guessing strategy as he or she will have nothing to lose by experimenting. For example, a person erroneously suspecting that a coin is biased would have the same expected hit rate (50%) if they called all heads, all tails, or a jumbled (or patterned) mixture of the two. On the other hand, if the coin really were biased in some way, failure to notice would probably be characterised by the experimenter as suboptimal and, particularly in a competitive environment, prove maladaptive.
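The arithmetic of that example is worth making concrete. In the sketch below (our illustration; the guessing strategies are hypothetical) every policy, however patterned or jumbled, earns the same expected 50% hit rate against a genuinely fair coin:

```python
import random

def hit_rate(strategy, flips=200_000):
    """Proportion of correct guesses against a fair coin; `strategy`
    maps the previous outcome to the next guess (0 or 1)."""
    hits, prev = 0, 0
    for _ in range(flips):
        guess = strategy(prev)
        outcome = random.randint(0, 1)
        hits += (guess == outcome)
        prev = outcome
    return hits / flips

random.seed(0)
print(hit_rate(lambda prev: 0))                      # call heads every time
print(hit_rate(lambda prev: 1 - prev))               # always switch (negative recency)
print(hit_rate(lambda prev: random.randint(0, 1)))   # jumbled guessing
# All three come out at about 0.5: against a genuinely fair coin no
# strategy gains or loses anything.
```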

Research reported by Van den Brink and Van Heerden (1980) has demonstrated that subjects can be influenced to change their conception of randomness by subtle instructional variables. Showing subjects different illustrative graphical representations of the random 'walk'¹ produced different estimates of the likely outcome of a series of coin tosses. (Jakes and Hemsley (1986) have reported data which suggest that personality traits may influence perceptions of patterns in 'unpatterned' visual displays.) So it would appear that, for a given individual, randomness may not be a fixed concept; what counts as representative of a random process may be altered. Plainly, the assumptive framework adopted by experimental subjects crucially influences their judgements and is susceptible to rather subtle influences.

¹ In this case the walk was represented by a graph where, on the y-axis, each head scored +1 and each tail -1. The x-axis was the cumulative number of tosses, 1, 2, 3, . . ., n. The subject's task is to estimate, for a given number of tosses, how many times the plotted line is most likely to cross y = 0.
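Intuitions about such walks are notoriously poor: returns to zero are far rarer than most people expect. A minimal Monte Carlo sketch (our illustration, not the authors' procedure):

```python
import random

def zero_crossings(n_tosses):
    """Returns-to-zero of a walk scoring +1 per head and -1 per tail."""
    pos, count = 0, 0
    for _ in range(n_tosses):
        pos += 1 if random.random() < 0.5 else -1
        if pos == 0:
            count += 1
    return count

random.seed(5)
trials, n = 20_000, 100
mean = sum(zero_crossings(n) for _ in range(trials)) / trials
print(mean)   # about 7 for 100 tosses - far fewer than intuition suggests
```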

Even experimental subjects who have received no explicitly redundant and potentially bias-inducing instructions might reasonably be forgiven for assuming that what is required by the experimenter is a sequence that appears random according to some social convention. Such an understanding might forbid the production of sequences which by virtue of their obvious patterns could have been produced by some simple algorithm. Indeed this is one conception of randomness favoured by some philosophers (cf. Chaitin, 1975; Martin-Löf, 1966). For example, a subject instructed to generate a random sequence of one hundred 0s and 1s might fully appreciate that one hundred 0s is just as likely as any other sequence but suspect they would be considered unco-operative (cf. Grice, 1975) or even pathological if they produced any sequence containing very long runs. Such a policy would result in negative recency.

Implications for human rationality in the real world

Of course, negative recency in guessing is a quite rational strategy for some circumstances. For example, Winefield's (1966) study referred to above also experimentally demonstrated its appropriate manifestation by human subjects guessing outcomes in a situation where sampling from a set of playing cards gradually exhausted the set (sampling without replacement). In this case sampling provides information about the contents of the set but also changes the contents; there is now one fewer of the type just drawn and thus a weaker chance of drawing that type next time. Moreover when subjects are not told of the relative proportions of the types one might even expect a degree of positive recency. Drawing many reds in a row might increase an estimate of the proportion of reds in the pack and thus increase the tendency to predict red in future. However, if subjects know, or suspect, that they are sampling from a finite population, then this tendency should, logically, diminish.

Perhaps the behaviour of the systems producing variable output in the real world that individuals commonly encounter and attempt to anticipate resembles, more often than not, sampling without replacement. Such systems might plausibly have in common certain characteristics such as the tendency not to show a statistically representative number of repetitions (particularly after long runs) because of the ubiquitous nature of what we might generally refer to as fatigue effects. After all the second law of thermodynamics, that entropy increases, is consistent with that view. Thus systematic behaviour in the long run is unlikely because all systems tend to break down. So, whatever apparently principled outcome is exhibited becomes less probable for the future. This line of speculation suggests that negative recency may be a manifestation of successful adaptation and therefore good news about human performance rather than bad. Of course in the real world 'given' information analogous to the number of cards in the deck, the nature and number of possible types, the relative proportions of types etc. may not be available or may itself be of an uncertain nature. As a consequence, a large number of experiments purportedly showing that humans are illogical or poor intuitive statisticians may be examining performance on inappropriate tasks. For example, in the realm of deductive reasoning, Cohen (1981) has suggested that the poor performance on Wason's (1966) selection task be considered as an example of a 'cognitive illusion'. The existence of visual illusions is not usually construed as implying a generally poor level of visual competence. Analogously, Cohen argues, suboptimal performance on a reasoning task - particularly when it is known, as in this case, that performance is much better on a similar task which is logically isomorphic - might be similarly interpreted. It remains possible that, within the usual naturally occurring framework for human induction, performance is highly successful.

TESTS FOR RANDOMNESS?

The statistical evaluation of the randomness of the output generated by experimental subjects can be shown to depend on problematic procedures. The underlying assumption behind the various methods employed is that the sequences produced should be representative of the output of a random device in the long run. However, as Horwich (1982) has pointed out, anything less than an infinite series may legitimately not be representative of the long-term output of a stochastic process. Plainly human beings, with their limited life spans, never encounter infinite series and so it is not possible to verify beyond all doubt that any given real sequence is, or is not, random. For example a string of one hundred 0s would fail the tests of randomness applied by experimental psychologists. However, it is quite possible, and indeed in the long run inevitable, that a random device would generate such a sequence.

Related arguments in this respect can be perceived in the work of Kahneman and Tversky (1972) and their discovery and discussion of the representativeness heuristic. Kahneman and Tversky noted that when their subjects were asked to identify chance outcomes they presumed that 'orderless' sequences were more likely than apparently systematic ones because the former were more representative of chance outcomes; systematic patterns being more representative of systematic processes. Tversky and Kahneman (1971) coined the term 'the law of small numbers' to christen the misconception that even small samples of output should reflect the properties of the parent distribution - for example the disorder expected from a random process. A point that should be emphasized however is that, unless one can justify an a priori conception to define disorder, the representativeness heuristic is not, in principle, any more valid for analysing large samples for randomness. This is because, for stochastic processes where the elementary events are equally likely (e.g. coin tossing), an apparently systematic sequence involving many thousands of events is, of course, no less likely than any other sequence of equal length. Although one expects larger samples to be more accurate reflections of the underlying process, it is still an error to assume that any sample of output, however large, that does not exhibit 'representativeness' of some selected criteria cannot be the product of a random generator. It might be claimed that, by using the binomial theorem, one could calculate the conditional probability of occurrence of some configuration within a sequence given that the process that generated it was random - and then use this probability as a measure of confidence in the hypothesis that the sequence was generated by a random process. Unfortunately though, without a favouring of one, or at least a limited subset, of the alternative hypotheses this probability is equivocal with respect to the issue. For, what is the conditional probability of the sequence being generated given that the process that generated it was not random? If we have no reason to doubt that any of the infinite set of possible non-random processes might have been responsible for the sequence then the two conditional probabilities will be equal. In other words one is not automatically entitled to assume that the probability of a given sequence being generated randomly is equivalent to the probability that a random process, rather than any other, generated the sequence. It is not justifiable to use the former probability as a measure of confidence in the random hypothesis [p(observed sequence | random hypothesis) ≠ p(random hypothesis | observed sequence)]. Following Popper's view of induction, we shall contend that the drawing of apparently purely statistical inferences of this type is actually entirely dependent upon an implicit pre-judgement of the issue on non-statistical and logically arbitrary grounds.

The underlying logic commonly utilised by researchers attempting to derive tests of randomness is that the 'randomness' of some given sequence is in some way indexed by the probability of that sequence occurring given that it was generated randomly [treating p(random process | observed sequence) as if it were p(observed sequence | random process)]. However, the probability of any sequence being generated randomly is never as small as nought and is always equal to the probability of any other sequence being generated randomly.

Exhibit 1. The probability of a sequence being random. (The original exhibit is a probability tree; its four branch probabilities are:)

(1) p(sequence X | random process)
(2) p(all other sequences | random process)
(3) p(sequence X | nonrandom process)
(4) p(all other sequences | nonrandom process)


The point is demonstrated by the probability tree shown in Exhibit 1. Imagine being confronted with a sequence called X which may or may not be a randomly produced sequence. We may, e.g. from the binomial theorem, work out (1) and then by subtraction work out (2) ((1) + (2) must sum to unity). However, without additional and specific assumptions about the type of nonrandom process that may have produced X we can only assume that (3) = (1) and therefore (2) = (4). If we can make no a priori assumptions about the putative nonrandom process we have no means of revising the probabilities of sequences derived from the random process assumption. This results in equal probabilities that X is random or non-random. Thus inspection of the sequence does not alter the logical probability of it being random. This can be illustrated with Bayes' theorem.

From Bayes' theorem we know that:

p(H1 | D) / p(H2 | D) = [p(H1) / p(H2)] × [p(D | H1) / p(D | H2)]

If H1 is the hypothesis that the sequence X is generated by a random process, H2 is the hypothesis that the sequence X is generated by a non-random process, and D (the observed data) is sequence X, then substituting the terms from Exhibit 1 we get:

p(Hrandom | X) / p(Hnonrandom | X) = [p(Hrandom) / p(Hnonrandom)] × [p(X | Hrandom) / p(X | Hnonrandom)]

But if p(sequence X | Hrandom) is equal to p(sequence X | Hnonrandom) then the likelihood ratio is one, and the left-hand side of the equation reduces to p(Hrandom) / p(Hnonrandom).

Therefore, in this case, the posterior probabilities of the two hypotheses are equal to their prior probabilities. The data do not change our opinion.
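A few lines of code make the point tangible. This is our own minimal sketch; the sequence length of 20 and the repetition probability of 0.7 for the specified alternative are illustrative assumptions, not values from the text:

```python
# Bayes' theorem in odds form: posterior odds = prior odds x likelihood ratio.
def posterior_odds(prior_odds, p_data_h1, p_data_h2):
    return prior_odds * (p_data_h1 / p_data_h2)

n = 20                               # length of the observed binary sequence
p_given_random = 0.5 ** n            # identical for EVERY sequence, even 'HHHH...'

# Unspecified alternative: no basis for assigning it a different likelihood.
print(posterior_odds(1.0, p_given_random, 0.5 ** n))   # 1.0 - data uninformative

# A concrete alternative (hypothetical: a machine repeating the previous
# symbol with probability 0.7) makes an all-heads sequence evidential:
p_all_heads_repeater = 0.5 * 0.7 ** (n - 1)
print(posterior_odds(1.0, p_given_random, p_all_heads_repeater))  # well below 1
```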

The psychological research, as we have noted, suggests that human subjective concepts of randomness suffer from negative recency which implies that people would prefer some sequences with negative recency to ones with no recency at all (as conventionally measured) as representative of a random process. This is often taken to indicate that unaided lay intuitions of randomness are biased. However this conclusion implies that application of an unbiased representativeness heuristic would result in valid judgements of the randomness of any sequence under scrutiny. But actually, as we shall see, any heuristic utilising representativeness would fail to produce or recognise sequences which could truthfully be described as random.

There are a number of formal procedures, available in the literature, which purport to test for the randomness of sequences (e.g. Good, 1953; Good and Cover, 1967; Knuth, 1969; Green, 1982, but see Ayton and Wright, 1987a, for a discussion; Strube, 1983), or which generate sequences which meet sufficient criteria to be defined as random or, perhaps more modestly, 'pseudorandom' (Fragaszy and Fragaszy, 1978).

Yet actually these amount to tests of the representativeness of the sequences. For example in testing the randomness of a simple binary sequence (e.g. successive coin tosses) one might check that all possible triples (HHH, HHT, HTH, . . . etc.) occur equally often in the sample - or at least do not occur in frequencies so different as to render a χ² test significant. A crucial aspect of such tests is that they require local representativeness within the sample. So if one is testing a series of one thousand coin tosses for the relative frequency of triples we will reject as non-random a sequence of 1,000 heads.
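A sketch of such a test, under our own simplifying assumption that the triples are taken non-overlapping (so that the ordinary χ² reference distribution, with its 5% critical value of 14.07 at seven degrees of freedom, applies):

```python
import random
from itertools import product

def chi_square_triples(seq):
    """Chi-square statistic for the frequencies of the eight possible
    triples, taken non-overlapping (df = 7; 5% critical value 14.07)."""
    triples = [tuple(seq[i:i + 3]) for i in range(0, len(seq) - 2, 3)]
    expected = len(triples) / 8
    return sum((triples.count(t) - expected) ** 2 / expected
               for t in product((0, 1), repeat=3))

random.seed(2)
fair = [random.randint(0, 1) for _ in range(999)]
print(chi_square_triples(fair))        # typically below 14.07: 'random'
print(chi_square_triples([1] * 999))   # huge: all 333 triples are (1,1,1)
```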


However, when testing a much larger sample for the relative frequency of all the possible 1,000-long sequences we will require there to be a certain (representative) number of sequences of 1,000 heads. The size of the sample under scrutiny limits the order of the test that can be performed and this in turn specifies the 'localness' of the representative criteria. A consequence of all this is that if we took many millions of samples of 1,000 coin tosses that all passed the statistical tests of randomness applicable to them, and then joined them together, the resulting sequence would necessarily fail tests of randomness that tested the representativeness of larger subsequences that would now be statistically testable; for example there would not be any runs of length 1,000 or 999, or 998 etc. Clearly, for the tests of representativeness to be considered valid indicators of randomness per se they should be at least consistent.

Essentially the same paradox manifests itself in any attempt to produce an objective measure of randomness. For example, Strube (1983) describes a number of tests for randomness including a correlation test for serial dependencies. In this test each number in the sequence under test is correlated with its adjacent number, with the number two positions away and so on. In this way a whole set of correlations can be computed by successively lagging the series. Following the argument given, we would expect a random series to exhibit no significant correlations, and indeed, for twenty correlations in a computer-generated sequence, none were significant, indicating that the computer supplied numbers that were 'adequately random for practical purposes'. However, the repetition of statistical tests for significance is problematic because the probability of making a Type I error increases; the probability of not obtaining a significant statistic when the null hypothesis is true decreases the more tests are conducted. So, paradoxically, if we treat a null statistic as evidence for the null hypothesis, as the evidence accretes the hypothesis becomes increasingly implausible. Consequently with a large batch of independent statistics as performed by Strube we will begin to expect to observe statistically significant correlations in our sample even if there was no serial dependency in the process that generated it. In fact, with as few as fourteen tests, it is more probable than not that at least one of the tests will achieve statistical significance at the 5% level. With larger numbers of tests this expectation approaches a critical point such that, with the conventional 5% significance level, the discovery of 59 non-significant correlations is itself evidence for a significant pattern in the data. As 0.95^59 < 0.05, when the null hypothesis is true the probability of obtaining 59 correlations (or of course any significance tests at the 5% level) all of which are non-significant is less than the critical 5% level of probability. So, in other words, we should, in a sample of random numbers, expect to see evidence for some apparently systematic dependencies just by chance. But how are we to interpret any we do see? Are they the chance dependencies inevitable in a genuinely random output? Or are they reflections of some systematic bias in the underlying process? There seems to be no way of knowing.
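The arithmetic behind those two figures, checked directly (assuming, as in the text, independent tests at the 5% level):

```python
# Chance of at least one 'significant' result among k independent tests
# at the 5% level when the null hypothesis is true throughout.
for k in (1, 14, 20, 59):
    print(k, round(1 - 0.95 ** k, 3))
# k = 14 gives 0.512: more probable than not.

# And the probability that ALL 59 tests come out non-significant:
print(round(0.95 ** 59, 4))   # 0.0485 - itself below the 5% criterion
```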

The attempt to establish evidence for the randomness of a process by statistical analysis of a sample of its output is somewhat analogous to the more familiar problem (to scientists) of attempting to prove the null hypothesis. When following the conventions of scientific method one normally attempts to reject the null hypothesis (that there are no 'patterns') in favour of some alternative, mutually exclusive, hypothesis that implies quite specific patterns in the data under consideration. Attempting to work the other way around is problematic for various reasons. (For example failure to observe any systematic patterns in data may be attributed to the weakness of the statistical test.) One lesson that can be drawn from this analogy should serve to caution those who seek to conduct tests for randomness. Endeavouring to prove that the configuration of some data is random is logically equivalent to trying to prove that all non-null theories about how the data were generated are not true. Such an inference is beyond the scope of any analysis of any finite sample of data. The larger the finite sample under consideration the more hypotheses will be susceptible to statistical examination, but, because infinite samples cannot be analysed, all the possible non-random features cannot be tested for.


RANDOMNESS AND INDUCTION

The tangible consequences of these theoretical difficulties may not always be apparent because the ostensible applications of randomness are, in psychology at any rate, usually for such relatively humdrum uses as the generation of subjectively unpredictable event sequences in experimental research.

However, actually induction is a vital part of science as well as everyday learning and reasoning and it would appear that discrimination between random and patterned event relationships is sometimes a major problem for observationally based scientific methods in particular. For instance Silvertown (1985) has described the struggles that ecologists face when attempting to determine what patterns (or structure) exist in ecological communities and the role of competition and other types of interaction between species in producing those patterns. Although the detection of pattern has hitherto seemed straightforward, ecologists are beginning to appreciate that, as human perception and intuition are so predisposed to construe patterns, they should attempt a formal evaluation of the probability that any observed pattern is the product of chance. For example, if we found two species of birds distributed throughout an archipelago such that they never coincided on any one island, then we ought to compute the probability that this outcome would occur if the null hypothesis, that there are no ecological interactions, were true; this probability would then guide our inferences about the existence of patterns.
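A null model of the simplest kind can be sketched in a few lines. The archipelago size and occupancy probabilities below are hypothetical numbers of ours, chosen only to show the shape of the computation:

```python
import random

def p_never_cooccur(n_islands, p1, p2, trials=100_000):
    """Monte Carlo estimate of the chance that two species, settling
    islands independently, never share a single island."""
    hits = 0
    for _ in range(trials):
        if not any(random.random() < p1 and random.random() < p2
                   for _ in range(n_islands)):
            hits += 1
    return hits / trials

random.seed(3)
# 15 islands, each species independently present on ~40% of them:
print(p_never_cooccur(15, 0.4, 0.4))   # about (1 - 0.16)**15 = 0.073
```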

Use of null models to generate random reallocations of observed communities has apparently, in general, led ecologists to conclude that the presence of structure may have been overstated (e.g. May, 1984). But, of course, without an a priori alternative model the probability of an observed community being generated by a null model is not informative. This is because we need to know the relative likelihoods of the observed outcome being generated by chance and by some specified interaction before we can decide which model provides the most plausible account. As we have already seen, without specifying an alternative model the a priori probability of any outcome if all of the infinite number of possible non-null models may be true is the same as under the null model. In practice one might attempt to find the best alternative to the null hypothesis post hoc, by seeing which model had the greatest probability of generating the observed pattern and then testing this model on a new set of observations. However, ecologists are faced with the problem that they only have one world; they are thus obliged to attempt to determine the nonrandomness of patterns which have no replicates. So, even if it were possible to identify another archipelago, entirely uninfluenced by the first, it would not have an identical geography, meteorology or the same selection of species.

The issue of whether an observed pattern is authentic - i.e. symptomatic of a particular underlying process - occurs in other domains of science as well. For example Arp and Bahcall (1973) have debated whether there are, or are not, significant spatial associations between astronomical objects with large but different red shifts. And, in psychology, arguments about the evidence for a particular configuration within data representing behaviour are not unknown. We are aware of one dispute in which a key issue was how best to identify the random element of behaviour in any particular research design; that is, the variance in behaviour not predictable from measurements of the controlled variables in an experiment (Ayton and Wright, 1985, 1986; Furnham and Jaspars, 1985). In the context of the person-situation debate in personality and social psychology the extent to which behaviour can be predicted from experimental variables has been controversial and analysis of variance has often been construed as a method for induction with the residual or 'error' term representing randomness and the main effects and interactions representing predictable patterns. Though as one might expect, such a strategy is fraught with difficulties (Golding, 1975).

Another, perhaps rather ironic, example of the difficulties inherent in segregating pattern from its complement can be seen in parapsychology. Randomness is an essential concept in most experiments designed to test for the psi phenomenon - that is the putative ability to perceive extra-sensorily as manifested, for example, by a greater than chance hit rate on a card guessing task. It is usually accepted that as the cards are subject to 'randomisation' (e.g. by shuffling) the resulting sequence is unpredictable. Alcock (1981, chapter 7) has provided an interesting review of some of the sometimes extremely subtle ways in which artifacts can invade this type of design to generate spuriously high hit rates. More recently Tart and Dronek (1982) and Vassy (1984) have found themselves confronted head-on by one of the central and arguably unsolvable problems in this field of study, namely: how can one guarantee that the sequence of targets is such that a subject cannot increase their hit rate by using (perhaps unconsciously) some inference strategy? What those authors propose is a computer-based 'probabilistic predictor program' that uncovers certain possible types of regularity in sequences of random numbers. However, as we have already seen, although one might devise tests for certain types of regularity it is impossible to devise a test which guarantees the complete absence of every possible type of pattern, and thereby pure randomness, in any finite sample of data. ESP research attempts to eliminate the possibilities for induction, in order to attribute hit rates to the psi phenomenon, by using sequences of events that are random by virtue of the fact that they are in principle impossible to predict. But, sadly, such an endeavour is logically doomed to failure. It could be argued that one might attempt to construct sequences of stimuli that are psychologically unpredictable by virtue of the non-existence of those types of pattern that are salient to subjects in experiments. But, especially as chance and putative psi hits are indistinguishable (see Marks, 1986), how one could ever prove that hit rates above chance were due to psi and not some inference strategy - even an unconscious one - is beyond our understanding.

Such conundrums recall the distinction between process and product discussed by Lopes. As she pointed out, we usually assume that it makes sense to talk of randomness as a property of processes not of products. However we cannot directly observe the randomness of a process; we have to test samples of output from the process and attempt to make inferences. In other words we are obliged to consider the randomness of products if we wish to judge the randomness of processes.

In order to determine whether a sequence is plausibly nonrandom one necessarily requires an a priori rationale for a theory that justifies the selection of generic types of outcome or patterns because they are specifically predicted by the theory. It is necessary for the rationale to be available on a priori grounds (i.e. independent of any observation of the data) in order to avoid the circularity inherent in the process of seeking statistical support for the notion that there are patterns in the data of the sort one has already 'noticed' by casual observation. Otherwise the possibility of Type I error is increased and hence the diagnostic validity of the statistical inferences is weakened.

Martin (1984) has wryly described, in the context of an example drawn from medical research, the perils of conducting statistical analyses of data without specific a priori hypotheses:

‘If you wish to study, for instance, the relation between backache and changes on the lumbar spine x-ray, you examine a large but unspecified number of features (say forty) of the x-ray and relate them to the presence or absence of pain. Statistical tests are applied, using a probability level of one in twenty, and you find that there is more back pain with square osteophytes than with round ones . . . The main message of your paper then becomes that square osteophytes cause back pain. You ignore the second statistically significant result among the forty or so tested, that there is a relation between the amount of bowel gas on the film and back pain, because you are not a gastroenterologist. This method has been described as ‘casting the net widely’ a technique known for centuries to improve the chances of catching a fish. Moreover, submitting a larger number of factors to statistical examination not only improves your chances of a positive result but also enhances your reputation for diligence.’ (p. 1457).

The most impressive experiments in all fields of science are those which demonstrate surprising or 'risky' theoretical predictions to hold true. Popper has argued that the 'riskiness' of theoretical predictions is a vital characteristic of successful science. The riskiness of a theoretical prediction is indexed by the number of identifiable, possible, mutually exclusive outcomes there are for an experiment; a good experiment provides enormous potential for falsifying the theoretical prediction, so that on a priori grounds the theoretical prediction looks a risky bet. The problem with testing data for randomness or nonrandomness as if these were general properties is that in this sense there is no risk entailed in the inductive method. It is possible, of course, to pick some categories of outcome and test to see if they have observed frequencies significantly different from those expected under the null hypothesis. However here one runs into problems, the main one being that the selection of categories of outcome is entirely arbitrary.

Our point may be illustrated by considering an imaginary situation where one is interested in determining whether some supposedly random device, say a die, is really random. Now the simplest way that dice may be biased (stemming from our knowledge of the physics of dice) is for the weight to be unevenly distributed such that one, or perhaps more than one, outcome has a greater likelihood of occurrence, at the expense of one, or perhaps more than one, of the alternatives. So we could test for a biased die by rolling it a number of times and seeing if the expected frequencies of the individual outcomes were significantly different from the observed frequencies. But, there are ever more subtle ways of biasing even a simple device like this so it would pass this test and yet still not be random: imagine a mass inside the die that moved around in such a way as to increase the likelihood of a repeat of the previous outcome. Providing it did not entirely determine the outcome (i.e. there was a finite probability of it coming to any other outcome) then our die would pass a test based on the expected frequencies of individual outcomes but presumably fail a test based on more complex patterns. It is not difficult to imagine ever more subtle forms of bias that would pass whatever form of test was derived to analyse it. (Imagine a tiny computer inside programmed to selectively bias the outcome depending on the chance occurrence of one specific sequence of a thousand previous outcomes.)
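The moving-mass die is easy to caricature in code. In the sketch below (our illustration; the 'stickiness' parameter is an arbitrary assumption) the die's marginal face frequencies remain uniform, so it passes the simple frequency test, while its repetition rate betrays it to any test keyed to that subtler pattern:

```python
import random
from collections import Counter

def sticky_die(n, stickiness=0.25):
    """Die that repeats its previous face with extra probability
    `stickiness` and otherwise rolls uniformly; the marginal face
    frequencies remain uniform."""
    rolls = [random.randint(1, 6)]
    for _ in range(n - 1):
        if random.random() < stickiness:
            rolls.append(rolls[-1])
        else:
            rolls.append(random.randint(1, 6))
    return rolls

random.seed(4)
rolls = sticky_die(60_000)
print(Counter(rolls))   # all six faces near 10,000: frequency test passed

repeats = sum(a == b for a, b in zip(rolls, rolls[1:]))
print(repeats / (len(rolls) - 1))   # ~0.375 versus 1/6 = 0.167 for a fair die
```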

Now although in this example it may appear rather perverse to continue to test for increasingly subtle forms of bias, note that the subtlety is defined by reference to knowledge of, or assumptions about, the ways in which dice might plausibly be biased. But when confronted with the outputs from a completely unknown mechanism (radio waves from deep space, data on subatomic particles, human attempts to generate random sequences, etc.) how could we rationally justify the selection of certain patterns rather than others for inclusion in our tests for randomness? It would seem that, without resorting to logically arbitrary, intuitively generated and often practically untestable assumptions about the precise nature of the non-randomness in any particular instance, it is an impossible task.

The same paradox can be seen in the one-sample runs test. Siegel (1956, p. 52) explains the function of the test as a means of determining whether a sample sequence is random. The rationale is based on the order of individual events; the argument being that if too few or too many runs occur then the sequence cannot be random. However, again note that the choice of pattern (runs) is arbitrary. It is quite possible for a sequence to pass the runs test and fail a test based on some other pattern. And again note that, until we specify a non-random model, the probability distribution for the various numbers of runs for all the possible non-random processes is identical to the null (or random) distribution. If we know a die is biased, but do not know how, we cannot revise the probabilities we would derive for a fair die. Similarly within a framework where we have no reason to suspect the presence of one possible non-random process any more than any other, the runs test provides no pertinent information on the issue.
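For concreteness, a minimal implementation of the one-sample runs test (a sketch using the standard normal approximation; the example sequences are ours). Note how a sequence built from a repeating block of four passes the runs test comfortably while it would fail any test keyed to quadruples:

```python
import math

def runs_test_z(seq):
    """One-sample runs test: z-score of the observed number of runs
    against its expectation under a random ordering."""
    n1 = sum(1 for x in seq if x == seq[0])
    n2 = len(seq) - n1
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - mean) / math.sqrt(var)

print(runs_test_z([0, 1] * 25))        # z = +6.9: far too many runs
print(runs_test_z([0, 0, 1, 1] * 25))  # z = -0.2: passes, yet the sequence
                                       # is perfectly periodic and would fail
                                       # any test keyed to blocks of four
```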

Why we notice some patterns (in this case runs) and not others in attempting this impossible task is unclear. Lopes (1982) has suggested that it may be related to the cost-benefit functions occurring in the natural world. This, as she indicates, is almost certainly impossible to verify but it follows that such pre-tuning to certain patterns may only be valuable within a relatively parochial setting. It certainly does not imply that in some other part of the universe there could not be creatures who do not raise an eyebrow at, or even notice, sequences containing very few or very many runs. Perhaps the constraints governing their existence require them to be sensitive to 'patterns' of events that we would ordinarily be unable to distinguish from randomness.

CONCLUDING REMARKS

Research investigating subjective concepts of randomness continues to be published. When such studies investigate specific hypotheses about psychological (mis)conceptions that are derived on a priori grounds, and do not merely explore experimental demand characteristics, we see no reason to doubt the value of the approach. However, as we have seen, very often no sufficient rationale is developed for the hypotheses, which creates problems for testing them in experimental tasks and major problems for the conventional statistical treatments employed in these experimental studies.

There is evidence that, under the right conditions, human subjects are subtle and skillful intuitive statisticians (e.g. Nisbett, Krantz, Jepson and Kunda, 1983), so it is important to determine the reasons for characteristic deviations from what is considered normative behaviour. We have argued that the criteria for rational behaviour may not be those which experimenters have used to gauge their subjects' behaviour. If, as we have argued, a more ecological and pragmatic perspective is adopted, then it may be seen that laboratory studies of human statistical inference are rather narrow and that apparently suboptimal strategies may have hidden benefits when applied in the real world (cf. Winkler and Murphy, 1973; Navon, 1978).

The significance of subjective judgements of randomness has not often received much consideration from those reporting empirical investigations. However, some of the psychological implications have been discussed by Neuringer (1986), who trained subjects to produce sequences of 1s and 2s that would meet the requirements for randomness specified by a number of statistical tests similar to those we discussed earlier in this article. He found that subjects could learn to emit responses meeting these criteria, and speculates that people may have an endogenous source of variability from which, with practice and feedback, the usually apparent biases (presumably acquired through interaction with the environment) may be extinguished. Neuringer acknowledges the impossibility of proving that a set of responses is random, but suggests that such an internal source of variability could aid the learning of new operant responses, problem solving and creativity. Moreover, if the source of such variability were internal, then instances of behaviour determined by this means would be in principle unpredictable by an external observer 'independent of the observer's knowledge of the subject's conditioning history and current environmental influences' (ibid., p. 74, italics in original).

The ability to behave randomly, or at least unpredictably to an observer, may have significant adaptive consequences. For example, Bovet (1983) has emphasised the importance of a random element in his efforts to model the optimal pattern of movement of an animal foraging for food. He suggests that random behaviours may be a very efficient means of ensuring important biological functions such as diffusion, dispersal, exhaustive exploration and so on. Random movement might also help to maintain unpredictable locations for defence against predators - or for attacks upon prey.
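As a purely illustrative sketch (ours, not Bovet's central-place model; the grid, step count and seed are arbitrary assumptions), an unbiased random walk shows how memoryless random movement yields broad exploration while remaining unpredictable to an observer:

```python
# An unbiased random walk on a grid visits a large set of distinct
# cells without any map or memory, and its path cannot be anticipated
# by a predator (or prey) watching the walker.
import random

random.seed(1)
pos, visited = (0, 0), {(0, 0)}
for _ in range(5000):
    dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    pos = (pos[0] + dx, pos[1] + dy)
    visited.add(pos)

print(len(visited), "distinct cells visited")
```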

A finding with some bearing on the adaptive significance of randomness comes from Gilovich, Vallone and Tversky (1985). They discovered a quite prevalent belief, amongst basketball players and fans alike, in 'the hot hand' and 'streak shooting' - i.e. runs of success in scoring attempts such that a player's attempts at scoring with a shot are more likely to succeed following a hit than following a miss. However, analysis of the shooting records of one team for all the home games in one season showed no evidence of any dependency between the outcomes of successive shots. This is an interesting misconception because it seems likely that, on the basis of it, a defence will allocate more resources to marking (denying possession and shooting opportunities to) a player with the hot hand, even though he has no greater chance of scoring than usual. So an attack could exploit this misconception by giving scoring opportunities to other players without hot hands who, by virtue of being less closely marked, should be easier to give possession to. However, as the attacking players apparently share the same misconception, it seems unlikely that the opportunity is exploited.
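The dependency check at the heart of this finding can be sketched minimally (our illustration, not Gilovich et al.'s published analysis; the shot record below is invented): under independence, a player's hit rate after a hit should match the hit rate after a miss up to sampling error.

```python
# Compare a player's hit rate following a hit with the hit rate
# following a miss. A genuine 'hot hand' would inflate the former.
def conditional_hit_rates(shots):
    """shots: sequence of 1 (hit) and 0 (miss) in game order."""
    after_hit = [b for a, b in zip(shots, shots[1:]) if a == 1]
    after_miss = [b for a, b in zip(shots, shots[1:]) if a == 0]
    return (sum(after_hit) / len(after_hit),
            sum(after_miss) / len(after_miss))

# Hypothetical shot record for one player:
shots = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
print(conditional_hit_rates(shots))
```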

We find this case study illuminating because it shows how it might be maladaptive to believe there are patterns in data when there are not - in contrast to the more obvious problems created by failing to notice patterns in data. One case where it might be adaptive to believe that there are patterns when there are not is given in Aubert's (1959/1980) discussion of the social function of chance and random phenomena in different cultures. He describes the practice of the North American Naskapi Indians, who heat the shoulder blade of the caribou so that it cracks. It is then fitted to a wooden handle and, while held in specified ways, the cracks of the bone are read so as to give directions for the hunt. Although, apparently, the Naskapi are quite unaware of the randomness underlying their decisions and believe they are seeking and getting guidance from the supernatural (i.e. that the decision is 'systematic'), such a strategy would make it difficult to predict their behaviour, which might well contribute to the success of the hunt.

In more 'sophisticated' societies randomness also has its uses in government and the judiciary. For instance, in the UK, a private member's bill may be introduced to parliament by members of parliament chosen by lottery, and juries are composed by random selection from the electoral register. There is currently some debate as to whether random breath testing of motorists would be justified as a deterrent to driving under the influence of alcohol. Exemption from military duty on the basis of drawing lots is a well-known practice in several countries. Aubert suggests that such procedures may be justified in instances where outcomes should be ensured to be unrelated to criteria deemed irrelevant to the decision, or where it is unclear which criteria should be applied. In the face of great uncertainty the best bet may be the randomly distributed response.

Bork (1967/1980) has also discussed randomness from a cultural perspective. He argues that our contemporary culture differs from that of previous centuries in that randomness and chance are now much more of a manifest preoccupation in music, literature and art. He suggests that this is a consequence of developments in science, in particular the utilisation of randomness in evolutionary theory and statistical thermodynamics.

Explanations of phenomena, including those developed by sophisticated scientists, are themselves psychological phenomena. We have noticed that randomness, or apparent randomness, is becoming a topic of increasing interest to those attempting to develop explanations for certain types of complex phenomena in the physical sciences (cf. Ford, 1983; Crutchfield, Farmer, Packard and Shaw, 1986), and even in pure mathematics (Chaitin, 1988). The study of chaotic systems has led to some interesting discoveries which have provoked a reappraisal of the relationship between determinism and randomness. In general, forecasting the behaviour of some system presupposes a deterministic world - one in which the past determines the future. We usually assume that uncertainty arises only because of lack of knowledge of the underlying causality or lack of information about the values of relevant parameters (Ayton and Wright, 1987b). However, when observations are made of a real system it is impossible to specify the current state of the system exactly; there is always a limit to observational accuracy.

A number of mathematical models have now been developed to describe situations whereby discrepancies in the current state - too small to measure initially - can influence the behaviour of the system so greatly that any predictive power is rapidly lost; there is no discernible causal connection between past and future. Such circumstances have been found in phenomena as disparate as the beating of chicken-heart cells, oscillating concentration levels in stirred chemical reactions, and computer models of phenomena such as epidemics, stellar oscillations and the electrical activity of a nerve cell (Crutchfield et al., 1986; Gleick, 1987). Investigators of such chaos have shown that unpredictably complicated behaviour can emerge from very simple systems as a consequence of simple nonlinear interaction of just a few elements.
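By way of illustration, here is a minimal sketch (ours, not from the sources cited) using the logistic map, a standard textbook example of such a system. Two trajectories whose starting points differ by one part in a billion diverge to order one within a few dozen iterations, although the governing rule is simple and fully deterministic:

```python
# Sensitive dependence on initial conditions in the logistic map
# x -> r*x*(1-x) with r = 4 (a chaotic parameter value). An
# immeasurably small initial discrepancy grows until prediction fails.
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.4)
b = logistic_trajectory(0.4 + 1e-9)   # tiny initial discrepancy

for n in (0, 10, 25, 50):
    print(n, abs(a[n] - b[n]))        # the gap grows to order 1
```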


Given Heisenberg's uncertainty principle - that there are finite limits to the accuracy with which fundamental measurements can be made - it may be time to reconsider the role of randomness in explanatory effort. Perhaps, instead of construing randomness as the uninteresting epistemological 'left-overs' from deterministic endeavours, it should have a more prominent position in accounts of the nature of things. How such issues will develop in the future is, appropriately enough, somewhat uncertain. Nonetheless we are of the persuasion that a more rigorous examination of psychological concepts of randomness is fundamental to the development of theoretical accounts of a wide range of psychological competence.

ACKNOWLEDGEMENTS

Preparation of this paper was funded by an Economic and Social Research Council Project Grant: “Judging the Likelihood of Future Events” (Grant No. C00232037) awarded to the third author.

We would like to thank Nigel Harvey, A. R. Jonckheere and Alastair McClelland for helpful discussion and critical comments on earlier drafts of this article.

REFERENCES

Alcock, J. E. (1981). Parapsychology: Science or Magic? A Psychological Perspective. Oxford: Pergamon Press.

Arp, H. and Bahcall, J. N. (1973). The Redshift Controversy. New York: Addison Wesley.

Aubert, V. (1980). 'Chance in social affairs,' in Dowie, J. and Lefrere, P. (Eds) Risk and Chance. Milton Keynes: The Open University Press.

Ayton, P. and Wright, G. (1985). 'The evidence for interactionism in psychology: a reply to Furnham and Jaspars.' Personality and Individual Differences, 6, 509-512.

Ayton, P. and Wright, G. (1986). 'Persons, situations, interactions and error: consistency, variability and confusion.' Personality and Individual Differences, 7, 233-235.

Ayton, P. and Wright, G. (1987a). 'Tests for randomness?' Teaching Mathematics and its Applications, 6, 83-87.

Ayton, P. and Wright, G. (1987b). 'Assessing and improving judgemental probability forecasts.' Omega, The International Journal of Management Science, 15, 191-196.

Baddeley, A. D. (1966). 'The capacity for generating information by randomization.' Quarterly Journal of Experimental Psychology, 18, 119-129.

Beach, L. R., Christensen-Szalanski, J. and Barnes, V. (1987). 'Assessing human judgement: Has it been done, can it be done, should it be done?' in Wright, G. and Ayton, P. (Eds) Judgemental Forecasting. Chichester: Wiley.

Bork, A. (1967). 'Randomness and the twentieth century.' Antioch Review, 27, 40-61. Reprinted in Dowie, J. and Lefrere, P. (1980) (Eds) Risk and Chance. Milton Keynes: The Open University Press.

Bovet, P. (1983). 'Optimal randomness in foraging movement: A central place model,' in Cosnard, M., Demongeot, J. and LeBreton, A. (Eds) Rhythms in Biology and Other Fields of Application: Deterministic and Stochastic Approaches (Proceedings of the Journées de la Société Mathématique de France, held at Luminy, France, 1981). Berlin: Springer-Verlag.

Chaitin, G. J. (1975). 'Randomness and mathematical proof.' Scientific American, 232, 47-52.

Chaitin, G. J. (1988). 'Randomness in arithmetic.' Scientific American, 259, 52-57.

Cohen, L. J. (1981). 'Can human irrationality be experimentally demonstrated?' Behavioral and Brain Sciences, 4, 317-331.

Cohen, L. J. (1981). 'Are there any a priori constraints on the study of rationality?' Behavioral and Brain Sciences, 4, 359-367.

Cohen, L. J. (1983). 'The controversy about irrationality.' Behavioral and Brain Sciences, 6, 510-517.

Cook, A. (1967). 'Recognition of bias in strings of binary digits.' Perceptual and Motor Skills, 24, 1003-1006.

Crutchfield, J. P., Farmer, J. D., Packard, N. H. and Shaw, R. S. (1986). 'Chaos.' Scientific American, 255 (6), 38-49.

Diener, D. and Thompson, W. P. (1985). 'Recognizing randomness.' American Journal of Psychology, 98, 433-447.

Falk, R. (1981). 'The perception of randomness.' Paper presented at the Fifth Conference of the International Group for the Psychology of Mathematical Education, Grenoble, France.

Ford, J. (1983). 'How random is a coin toss?' Physics Today, April, 40-47.

Fragaszy, R. L. and Fragaszy, D. M. (1978). 'A program to generate Gellermann (pseudorandom) series of binary states.' Behaviour Research Methods and Instrumentation, 10, 83-88.

Furnham, A. and Jaspars, J. (1985). 'More confusion.' Personality and Individual Differences, 6, 513-514.

Gilovich, T., Vallone, R. and Tversky, A. (1985). 'The hot hand in basketball: On the misperception of random sequences.' Cognitive Psychology, 17, 295-314.

Gleick, J. (1987). Chaos. New York: Viking Penguin.

Golding, S. (1975). 'Flies in the ointment: methodological problems in the analysis of the percentages of variance due to persons and situations.' Psychological Bulletin, 82, 278-288.

Good, I. J. (1953). 'The serial test for sampling numbers and other tests for randomness.' Proceedings of the Cambridge Philosophical Society, 49, 276-284.

Good, I. J. and Gover, T. N. (1967). 'The generalized serial test and the binary expansion of √2.' Journal of the Royal Statistical Society (Series A), 130, 102-107.

Green, D. R. (1982). 'Testing randomness.' Teaching Mathematics and its Applications, 1, 95-100.

Grice, H. P. (1975). 'Logic and conversation,' in Cole, P. and Morgan, J. L. (Eds) Syntax and Semantics, Vol. 3: Speech Acts. New York: Seminar Press.

Horwich, P. (1982). Probability and Evidence. Cambridge: Cambridge University Press.

Hume, D. (1739). A Treatise of Human Nature. London: John Noon. Reprinted Harmondsworth: Penguin, 1969.

Jakes, S. and Hemsley, D. R. (1986). 'Individual differences in reaction to brief exposure to unpatterned visual stimulation.' Personality and Individual Differences, 7, 121-123.

Kahneman, D. and Tversky, A. (1972). 'Subjective probability: a judgement of representativeness.' Cognitive Psychology, 3, 430-454.

Knuth, D. E. (1969). The Art of Computer Programming. Reading, MA: Addison-Wesley.

Lopes, L. L. (1982). 'Doing the impossible: A note on induction and the experience of randomness.' Journal of Experimental Psychology: Learning, Memory and Cognition, 8, 626-636.

Lopes, L. L. and Oden, G. C. (1987). 'Distinguishing between random and non-random events.' Journal of Experimental Psychology: Learning, Memory and Cognition, 13, 392-400.

Macdonald, R. R. (1986). 'Credible conceptions and implausible probabilities.' British Journal of Mathematical and Statistical Psychology, 39, 15-27.

Marks, D. F. (1986). 'Investigating the paranormal.' Nature, 320, 119-124.

Martin, G. (1984). 'Munchausen's statistical grid, which makes all trials significant.' The Lancet, December 22/29, 1457.

Martin-Löf, P. (1966). 'The definition of random sequences.' Information and Control, 9, 602-619.

May, R. M. (1984). 'An overview: Real and apparent patterns in community structure,' in Strong, D. R., Simberloff, D., Abele, L. G. and Thistle, A. B. (Eds) Ecological Communities: Conceptual Issues and the Evidence. Princeton: Princeton University Press.

Navon, D. (1978). 'The importance of being conservative: Some reflections on human Bayesian behaviour.' British Journal of Mathematical and Statistical Psychology, 31, 33-48.

Neuringer, A. (1986). 'Can people behave "randomly"?: The role of feedback.' Journal of Experimental Psychology: General, 115, 62-75.

Nisbett, R. E., Krantz, D. H., Jepson, C. and Kunda, Z. (1983). 'The use of statistical heuristics in everyday inductive reasoning.' Psychological Review, 90, 339-363.

Oakes, M. (1986). Statistical Inference: A Commentary for the Social and Behavioural Sciences. Chichester: Wiley.

Popper, K. R. (1972). Conjectures and Refutations: The Growth of Scientific Knowledge (fourth edition). London: Routledge and Kegan Paul.

Siegel, S. (1956). Non-parametric Statistics for the Behavioural Sciences. New York: McGraw-Hill.

Silvertown, J. (1985). 'Random patterns.' (Review of Ecological Communities: Conceptual Issues and the Evidence, edited by D. R. Strong, Jr., D. Simberloff, L. G. Abele and A. B. Thistle, 1984, Princeton University Press.) The Times Higher Education Supplement, 12 July, 24.

Strube, M. J. (1983). 'Tests of randomness for pseudorandom number generators.' Behaviour Research Methods and Instrumentation, 15, 536-537.

Tart, C. T. and Dronek, E. (1982). 'Mathematical inference strategies versus psi: initial explorations with the probabilistic predictor program.' European Journal of Parapsychology, 4, 325-355.

Teigen, K. H. (1984). 'Studies in subjective probability V: Chance vs. structure in visual patterns.' Scandinavian Journal of Psychology, 25, 315-323.

Tune, G. S. (1964a). 'A brief survey of variables that influence random generation.' Perceptual and Motor Skills, 18, 705-710.

Tune, G. S. (1964b). 'Response preferences: A review of some relevant literature.' Psychological Bulletin, 61, 286-302.

Tversky, A. and Kahneman, D. (1971). 'Belief in the law of small numbers.' Psychological Bulletin, 76, 105-110.

Van den Brink, W. P. and Van Heerden, J. (1980). 'Estimation of chance fluctuation under the influence of a representational figure.' Journal of Mathematical Psychology, 21, 83-87.

Vassy, Z. (1984). 'Improvement of the "probabilistic predictor program" of Tart and Dronek for testing random target generators.' European Journal of Parapsychology, 5, 203-218.

Wagenaar, W. A. (1970). 'Appreciation of conditional probabilities in binary sequences.' Acta Psychologica, 34, 348-356.

Wagenaar, W. A. (1971). 'Serial non-randomness as a function of duration and monotony of a randomisation task.' Acta Psychologica, 35, 78-87.

Wagenaar, W. A. (1972). 'Generation of random sequences by human subjects: A critical survey of literature.' Psychological Bulletin, 77, 65-72.

Wason, P. C. (1966). 'Reasoning,' in Foss, B. (Ed.) New Horizons in Psychology. Middlesex, England: Penguin.

Wiegersma, S. (1982). 'Sequential response bias in randomized response sequences: A computer simulation.' Acta Psychologica, 52, 249-256.

Winefield, A. H. (1966). 'Negative recency and event-dependence.' Quarterly Journal of Experimental Psychology, 18, 47-54.

Winkler, R. L. and Murphy, A. H. (1973). 'Experiments in the laboratory and the real world.' Organizational Behaviour and Human Performance, 10, 252-270.

Authors' biographies: Peter Ayton is a senior lecturer at the City of London Polytechnic where his responsibilities include the teaching of Cognitive Psychology and contributing to a multidisciplinary postgraduate course in Decision Making. His current research interests include metaphorical comprehension and judgment under uncertainty.

Anne J. Hunt holds degrees in Mathematics and Philosophy from Tulane University where she is currently reading for her doctorate in Philosophy. Her interests are in mathematical logic.

George Wright is reader in Business at Bristol Business School. His interests are in the human aspects of decision making and forecasting.

Authors' addresses: Peter Ayton, Decision Analysis Group, Department of Psychology, City of London Polytechnic, Old Castle Street, London E1 7NT.

Anne J. Hunt, Department of Philosophy, Tulane University, New Orleans, LA 70118, U.S.A.

George Wright, Bristol Business School, Coldharbour Lane, Frenchay, Bristol BS16 1QY.